-
The Rise of Small Language Models in Healthcare: A Comprehensive Survey
Authors:
Muskan Garg,
Shaina Raza,
Shebuti Rayana,
Xingyi Liu,
Sunghwan Sohn
Abstract:
Despite substantial progress in healthcare applications driven by large language models (LLMs), growing concerns around data privacy, and limited resources; the small language models (SLMs) offer a scalable and clinically viable solution for efficient performance in resource-constrained environments for next-generation healthcare informatics. Our comprehensive survey presents a taxonomic framework…
▽ More
Despite substantial progress in healthcare applications driven by large language models (LLMs), growing concerns around data privacy, and limited resources; the small language models (SLMs) offer a scalable and clinically viable solution for efficient performance in resource-constrained environments for next-generation healthcare informatics. Our comprehensive survey presents a taxonomic framework to identify and categorize them for healthcare professionals and informaticians. The timeline of healthcare SLM contributions establishes a foundational framework for analyzing models across three dimensions: NLP tasks, stakeholder roles, and the continuum of care. We present a taxonomic framework to identify the architectural foundations for building models from scratch; adapting SLMs to clinical precision through prompting, instruction fine-tuning, and reasoning; and accessibility and sustainability through compression techniques. Our primary objective is to offer a comprehensive survey for healthcare professionals, introducing recent innovations in model optimization and equipping them with curated resources to support future research and development in the field. Aiming to showcase the groundbreaking advancements in SLMs for healthcare, we present a comprehensive compilation of experimental results across widely studied NLP tasks in healthcare to highlight the transformative potential of SLMs in healthcare. The updated repository is available at Github
△ Less
Submitted 25 April, 2025; v1 submitted 23 April, 2025;
originally announced April 2025.
-
OccRobNet : Occlusion Robust Network for Accurate 3D Interacting Hand-Object Pose Estimation
Authors:
Mallika Garg,
Debashis Ghosh,
Pyari Mohan Pradhan
Abstract:
Occlusion is one of the challenging issues when estimating 3D hand pose. This problem becomes more prominent when hand interacts with an object or two hands are involved. In the past works, much attention has not been given to these occluded regions. But these regions contain important and beneficial information that is vital for 3D hand pose estimation. Thus, in this paper, we propose an occlusio…
▽ More
Occlusion is one of the challenging issues when estimating 3D hand pose. This problem becomes more prominent when hand interacts with an object or two hands are involved. In the past works, much attention has not been given to these occluded regions. But these regions contain important and beneficial information that is vital for 3D hand pose estimation. Thus, in this paper, we propose an occlusion robust and accurate method for the estimation of 3D hand-object pose from the input RGB image. Our method includes first localising the hand joints using a CNN based model and then refining them by extracting contextual information. The self attention transformer then identifies the specific joints along with the hand identity. This helps the model to identify the hand belongingness of a particular joint which helps to detect the joint even in the occluded region. Further, these joints with hand identity are then used to estimate the pose using cross attention mechanism. Thus, by identifying the joints in the occluded region, the obtained network becomes robust to occlusion. Hence, this network achieves state-of-the-art results when evaluated on the InterHand2.6M, HO3D and H$_2$O3D datasets.
△ Less
Submitted 27 March, 2025;
originally announced March 2025.
-
ReviewEval: An Evaluation Framework for AI-Generated Reviews
Authors:
Madhav Krishan Garg,
Tejash Prasad,
Tanmay Singhal,
Chhavi Kirtani,
Murari Mandal,
Dhruv Kumar
Abstract:
The escalating volume of academic research, coupled with a shortage of qualified reviewers, necessitates innovative approaches to peer review. In this work, we propose: 1. ReviewEval, a comprehensive evaluation framework for AI-generated reviews that measures alignment with human assessments, verifies factual accuracy, assesses analytical depth, identifies degree of constructiveness and adherence…
▽ More
The escalating volume of academic research, coupled with a shortage of qualified reviewers, necessitates innovative approaches to peer review. In this work, we propose: 1. ReviewEval, a comprehensive evaluation framework for AI-generated reviews that measures alignment with human assessments, verifies factual accuracy, assesses analytical depth, identifies degree of constructiveness and adherence to reviewer guidelines; and 2. ReviewAgent, an LLM-based review generation agent featuring a novel alignment mechanism to tailor feedback to target conferences and journals, along with a self-refinement loop that iteratively optimizes its intermediate outputs and an external improvement loop using ReviewEval to improve upon the final reviews. ReviewAgent improves actionable insights by 6.78% and 47.62% over existing AI baselines and expert reviews respectively. Further, it boosts analytical depth by 3.97% and 12.73%, enhances adherence to guidelines by 10.11% and 47.26% respectively. This paper establishes essential metrics for AIbased peer review and substantially enhances the reliability and impact of AI-generated reviews in academic research.
△ Less
Submitted 24 May, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
-
Dynamic Neural Style Transfer for Artistic Image Generation using VGG19
Authors:
Kapil Kashyap,
Mehak Garg,
Sean Fargose,
Sindhu Nair
Abstract:
Throughout history, humans have created remarkable works of art, but artificial intelligence has only recently started to make strides in generating visually compelling art. Breakthroughs in the past few years have focused on using convolutional neural networks (CNNs) to separate and manipulate the content and style of images, applying texture synthesis techniques. Nevertheless, a number of curren…
▽ More
Throughout history, humans have created remarkable works of art, but artificial intelligence has only recently started to make strides in generating visually compelling art. Breakthroughs in the past few years have focused on using convolutional neural networks (CNNs) to separate and manipulate the content and style of images, applying texture synthesis techniques. Nevertheless, a number of current techniques continue to encounter obstacles, including lengthy processing times, restricted choices of style images, and the inability to modify the weight ratio of styles. We proposed a neural style transfer system that can add various artistic styles to a desired image to address these constraints allowing flexible adjustments to style weight ratios and reducing processing time. The system uses the VGG19 model for feature extraction, ensuring high-quality, flexible stylization without compromising content integrity.
△ Less
Submitted 16 January, 2025;
originally announced January 2025.
-
Multiscaled Multi-Head Attention-based Video Transformer Network for Hand Gesture Recognition
Authors:
Mallika Garg,
Debashis Ghosh,
Pyari Mohan Pradhan
Abstract:
Dynamic gesture recognition is one of the challenging research areas due to variations in pose, size, and shape of the signer's hand. In this letter, Multiscaled Multi-Head Attention Video Transformer Network (MsMHA-VTN) for dynamic hand gesture recognition is proposed. A pyramidal hierarchy of multiscale features is extracted using the transformer multiscaled head attention model. The proposed mo…
▽ More
Dynamic gesture recognition is one of the challenging research areas due to variations in pose, size, and shape of the signer's hand. In this letter, Multiscaled Multi-Head Attention Video Transformer Network (MsMHA-VTN) for dynamic hand gesture recognition is proposed. A pyramidal hierarchy of multiscale features is extracted using the transformer multiscaled head attention model. The proposed model employs different attention dimensions for each head of the transformer which enables it to provide attention at the multiscale level. Further, in addition to single modality, recognition performance using multiple modalities is examined. Extensive experiments demonstrate the superior performance of the proposed MsMHA-VTN with an overall accuracy of 88.22\% and 99.10\% on NVGesture and Briareo datasets, respectively.
△ Less
Submitted 1 January, 2025;
originally announced January 2025.
-
Efficient and Verified Continuous Double Auctions
Authors:
Mohit Garg,
Suneel Sarswat
Abstract:
Continuous double auctions are commonly used to match orders at currency, stock, and commodities exchanges. A verified implementation of continuous double auctions is a useful tool for market regulators as they give rise to automated checkers that are guaranteed to detect errors in the trade logs of an existing exchange if they contain trades that violate the matching rules. We provide an efficien…
▽ More
Continuous double auctions are commonly used to match orders at currency, stock, and commodities exchanges. A verified implementation of continuous double auctions is a useful tool for market regulators as they give rise to automated checkers that are guaranteed to detect errors in the trade logs of an existing exchange if they contain trades that violate the matching rules. We provide an efficient and formally verified implementation of continuous double auctions that takes $O(n \log n)$ time to match $n$ orders. This improves an earlier $O(n^2)$ verified implementation. We also prove a matching $Ω(n\log n)$ lower bound on the running time for continuous double auctions. Our new implementation takes only a couple of minutes to run on ten million randomly generated orders as opposed to a few days taken by the earlier implementation. Our new implementation gives rise to an efficient automatic checker.
We use the Coq proof assistant for verifying our implementation and extracting a verified OCaml program. While using Coq's standard library implementation of red-black trees to obtain our improvement, we observed that its specification has serious gaps, which we fill in this work; this might be of independent interest.
△ Less
Submitted 11 December, 2024;
originally announced December 2024.
-
ConvMixFormer- A Resource-efficient Convolution Mixer for Transformer-based Dynamic Hand Gesture Recognition
Authors:
Mallika Garg,
Debashis Ghosh,
Pyari Mohan Pradhan
Abstract:
Transformer models have demonstrated remarkable success in many domains such as natural language processing (NLP) and computer vision. With the growing interest in transformer-based architectures, they are now utilized for gesture recognition. So, we also explore and devise a novel ConvMixFormer architecture for dynamic hand gestures. The transformers use quadratic scaling of the attention feature…
▽ More
Transformer models have demonstrated remarkable success in many domains such as natural language processing (NLP) and computer vision. With the growing interest in transformer-based architectures, they are now utilized for gesture recognition. So, we also explore and devise a novel ConvMixFormer architecture for dynamic hand gestures. The transformers use quadratic scaling of the attention features with the sequential data, due to which these models are computationally complex and heavy. We have considered this drawback of the transformer and designed a resource-efficient model that replaces the self-attention in the transformer with the simple convolutional layer-based token mixer. The computational cost and the parameters used for the convolution-based mixer are comparatively less than the quadratic self-attention. Convolution-mixer helps the model capture the local spatial features that self-attention struggles to capture due to their sequential processing nature. Further, an efficient gate mechanism is employed instead of a conventional feed-forward network in the transformer to help the model control the flow of features within different stages of the proposed model. This design uses fewer learnable parameters which is nearly half the vanilla transformer that helps in fast and efficient training. The proposed method is evaluated on NVidia Dynamic Hand Gesture and Briareo datasets and our model has achieved state-of-the-art results on single and multimodal inputs. We have also shown the parameter efficiency of the proposed ConvMixFormer model compared to other methods. The source code is available at https://github.com/mallikagarg/ConvMixFormer.
△ Less
Submitted 2 December, 2024; v1 submitted 11 November, 2024;
originally announced November 2024.
-
Double Auctions: Formalization and Automated Checkers
Authors:
Mohit Garg,
N. Raja,
Suneel Sarswat,
Abhishek Kr Singh
Abstract:
Double auctions are widely used in financial markets, such as those for stocks, derivatives, currencies, and commodities, to match demand and supply. Once all buyers and sellers have placed their trade requests, the exchange determines how these requests are to be matched. The two most common objectives for determining the matching are maximizing trade volume at a uniform price and maximizing trad…
▽ More
Double auctions are widely used in financial markets, such as those for stocks, derivatives, currencies, and commodities, to match demand and supply. Once all buyers and sellers have placed their trade requests, the exchange determines how these requests are to be matched. The two most common objectives for determining the matching are maximizing trade volume at a uniform price and maximizing trade volume through dynamic pricing. Prior research has primarily focused on single-quantity trade requests. In this work, we extend the framework to handle multiple-quantity trade requests and present fully formalized matching algorithms for double auctions, along with their correctness proofs. We establish new uniqueness theorems, enabling automatic detection of violations in exchange systems by comparing their output to that of a verified program. All proofs are formalized in the Coq Proof Assistant, and we extract verified OCaml and Haskell programs that could serve as a resource for exchanges and market regulators. We demonstrate the practical applicability of our work by running the verified program on real market data from an exchange to automatically check for violations in the exchange algorithm.
△ Less
Submitted 24 October, 2024;
originally announced October 2024.
-
MVTN: A Multiscale Video Transformer Network for Hand Gesture Recognition
Authors:
Mallika Garg,
Debashis Ghosh,
Pyari Mohan Pradhan
Abstract:
In this paper, we introduce a novel Multiscale Video Transformer Network (MVTN) for dynamic hand gesture recognition, since multiscale features can extract features with variable size, pose, and shape of hand which is a challenge in hand gesture recognition. The proposed model incorporates a multiscale feature hierarchy to capture diverse levels of detail and context within hand gestures which enh…
▽ More
In this paper, we introduce a novel Multiscale Video Transformer Network (MVTN) for dynamic hand gesture recognition, since multiscale features can extract features with variable size, pose, and shape of hand which is a challenge in hand gesture recognition. The proposed model incorporates a multiscale feature hierarchy to capture diverse levels of detail and context within hand gestures which enhances the model's ability. This multiscale hierarchy is obtained by extracting different dimensions of attention in different transformer stages with initial stages to model high-resolution features and later stages to model low-resolution features. Our approach also leverages multimodal data, utilizing depth maps, infrared data, and surface normals along with RGB images from NVGesture and Briareo datasets. Experiments show that the proposed MVTN achieves state-of-the-art results with less computational complexity and parameters. The source code is available at https://github.com/mallikagarg/MVTN.
△ Less
Submitted 5 September, 2024;
originally announced September 2024.
-
A $5/4$-Approximation for Two-Edge Connectivity
Authors:
Miguel Bosch-Calvo,
Mohit Garg,
Fabrizio Grandoni,
Felix Hommelsheim,
Afrouz Jabal Ameli,
Alexander Lindermayr
Abstract:
The 2-Edge-Connected Spanning Subgraph problem (2ECSS) is among the most basic survivable network design problems: given an undirected and unweighted graph, the task is to find a spanning subgraph with the minimum number of edges that is 2-edge-connected (i.e., it remains connected after the removal of any single edge). 2ECSS is an NP-hard problem that has been extensively studied in the context o…
▽ More
The 2-Edge-Connected Spanning Subgraph problem (2ECSS) is among the most basic survivable network design problems: given an undirected and unweighted graph, the task is to find a spanning subgraph with the minimum number of edges that is 2-edge-connected (i.e., it remains connected after the removal of any single edge). 2ECSS is an NP-hard problem that has been extensively studied in the context of approximation algorithms. The best known approximation ratio for 2ECSS prior to this work was $1.3+\varepsilon$, for any constant $\varepsilon>0$ [Garg, Grandoni, Jabal-Ameli'23; Kobayashi, Noguchi'23]. In this paper, we present a 5/4-approximation algorithm. Our algorithm is also faster for small values of $\varepsilon$: its running time is $n^{O(1)}$ instead of $n^{O(1/\varepsilon)}$.
△ Less
Submitted 28 March, 2025; v1 submitted 13 August, 2024;
originally announced August 2024.
-
Two-Edge Connectivity via Pac-Man Gluing
Authors:
Mohit Garg,
Felix Hommelsheim,
Alexander Lindermayr
Abstract:
We study the 2-edge-connected spanning subgraph (2-ECSS) problem: Given a graph $G$, compute a connected subgraph $H$ of $G$ with the minimum number of edges such that $H$ is spanning, i.e., $V(H) = V(G)$, and $H$ is 2-edge-connected, i.e., $H$ remains connected upon the deletion of any single edge, if such an $H$ exists. The $2$-ECSS problem is known to be NP-hard. In this work, we provide a poly…
▽ More
We study the 2-edge-connected spanning subgraph (2-ECSS) problem: Given a graph $G$, compute a connected subgraph $H$ of $G$ with the minimum number of edges such that $H$ is spanning, i.e., $V(H) = V(G)$, and $H$ is 2-edge-connected, i.e., $H$ remains connected upon the deletion of any single edge, if such an $H$ exists. The $2$-ECSS problem is known to be NP-hard. In this work, we provide a polynomial-time $(\frac 5 4 + \varepsilon)$-approximation for the problem for an arbitrarily small $\varepsilon>0$, improving the previous best approximation ratio of $\frac{13}{10}+\varepsilon$.
Our improvement is based on two main innovations: First, we reduce solving the problem on general graphs to solving it on structured graphs with high vertex connectivity. This high vertex connectivity ensures the existence of a 4-matching across any bipartition of the vertex set with at least 10 vertices in each part. Second, we exploit this property in a later gluing step, where isolated 2-edge-connected components need to be merged without adding too many edges. Using the 4-matching property, we can repeatedly glue a huge component (containing at least 10 vertices) to other components. This step is reminiscent of the Pac-Man game, where a Pac-Man (a huge component) consumes all the dots (other components) as it moves through a maze. These two innovations lead to a significantly simpler algorithm and analysis for the gluing step compared to the previous best approximation algorithm, which required a long and tedious case analysis.
△ Less
Submitted 9 August, 2024;
originally announced August 2024.
-
Evaluation and Continual Improvement for an Enterprise AI Assistant
Authors:
Akash V. Maharaj,
Kun Qian,
Uttaran Bhattacharya,
Sally Fang,
Horia Galatanu,
Manas Garg,
Rachel Hanessian,
Nishant Kapoor,
Ken Russell,
Shivakumar Vaithyanathan,
Yunyao Li
Abstract:
The development of conversational AI assistants is an iterative process with multiple components. As such, the evaluation and continual improvement of these assistants is a complex and multifaceted problem. This paper introduces the challenges in evaluating and improving a generative AI assistant for enterprises, which is under active development, and how we address these challenges. We also share…
▽ More
The development of conversational AI assistants is an iterative process with multiple components. As such, the evaluation and continual improvement of these assistants is a complex and multifaceted problem. This paper introduces the challenges in evaluating and improving a generative AI assistant for enterprises, which is under active development, and how we address these challenges. We also share preliminary results and discuss lessons learned.
△ Less
Submitted 8 December, 2024; v1 submitted 15 June, 2024;
originally announced July 2024.
-
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Authors:
Mallika Garg,
Debashis Ghosh,
Pyari Mohan Pradhan
Abstract:
Transformer model have achieved state-of-the-art results in many applications like NLP, classification, etc. But their exploration in gesture recognition task is still limited. So, we propose a novel GestFormer architecture for dynamic hand gesture recognition. The motivation behind this design is to propose a resource efficient transformer model, since transformers are computationally expensive a…
▽ More
Transformer model have achieved state-of-the-art results in many applications like NLP, classification, etc. But their exploration in gesture recognition task is still limited. So, we propose a novel GestFormer architecture for dynamic hand gesture recognition. The motivation behind this design is to propose a resource efficient transformer model, since transformers are computationally expensive and very complex. So, we propose to use a pooling based token mixer named PoolFormer, since it uses only pooling layer which is a non-parametric layer instead of quadratic attention. The proposed model also leverages the space-invariant features of the wavelet transform and also the multiscale features are selected using multi-scale pooling. Further, a gated mechanism helps to focus on fine details of the gesture with the contextual information. This enhances the performance of the proposed model compared to the traditional transformer with fewer parameters, when evaluated on dynamic hand gesture datasets, NVidia Dynamic Hand Gesture and Briareo datasets. To prove the efficacy of the proposed model, we have experimented on single as well multimodal inputs such as infrared, normals, depth, optical flow and color images. We have also compared the proposed GestFormer in terms of resource efficiency and number of operations. The source code is available at https://github.com/mallikagarg/GestFormer.
△ Less
Submitted 18 May, 2024;
originally announced May 2024.
-
The Exchange Problem
Authors:
Mohit Garg,
Suneel Sarswat
Abstract:
Auctions are widely used in exchanges to match buy and sell requests. Once the buyers and sellers place their requests, the exchange determines how these requests are to be matched. The two most popular objectives used while determining the matching are maximizing volume at a uniform price and maximizing volume with dynamic pricing. In this work, we study the algorithmic complexity of the problems…
▽ More
Auctions are widely used in exchanges to match buy and sell requests. Once the buyers and sellers place their requests, the exchange determines how these requests are to be matched. The two most popular objectives used while determining the matching are maximizing volume at a uniform price and maximizing volume with dynamic pricing. In this work, we study the algorithmic complexity of the problems arising from these matching tasks.
We present a linear time algorithm for uniform price matching which is an improvement over the previous algorithms that take $O(n\log n)$ time to match $n$ requests. For dynamic price matching, we establish a lower bound of $Ω(n \log n)$ on the running time, thereby proving that the currently known best algorithm is time-optimal.
△ Less
Submitted 5 March, 2024;
originally announced March 2024.
-
Random-Order Online Independent Set of Intervals and Hyperrectangles
Authors:
Mohit Garg,
Debajyoti Kar,
Arindam Khan
Abstract:
In the Maximum Independent Set of Hyperrectangles problem, we are given a set of $n$ (possibly overlapping) $d$-dimensional axis-aligned hyperrectangles, and the goal is to find a subset of non-overlapping hyperrectangles of maximum cardinality. For $d=1$, this corresponds to the classical Interval Scheduling problem, where a simple greedy algorithm returns an optimal solution. In the offline sett…
▽ More
In the Maximum Independent Set of Hyperrectangles problem, we are given a set of $n$ (possibly overlapping) $d$-dimensional axis-aligned hyperrectangles, and the goal is to find a subset of non-overlapping hyperrectangles of maximum cardinality. For $d=1$, this corresponds to the classical Interval Scheduling problem, where a simple greedy algorithm returns an optimal solution. In the offline setting, for $d$-dimensional hyperrectangles, polynomial time $(\log n)^{O(d)}$-approximation algorithms are known. However, the problem becomes notably challenging in the online setting, where the input objects (hyperrectangles) appear one by one in an adversarial order, and on the arrival of an object, the algorithm needs to make an immediate and irrevocable decision whether or not to select the object while maintaining the feasibility. Even for interval scheduling, an $Ω(n)$ lower bound is known on the competitive ratio.
To circumvent these negative results, in this work, we study the online maximum independent set of axis-aligned hyperrectangles in the random-order arrival model, where the adversary specifies the set of input objects which then arrive in a uniformly random order. Starting from the prototypical secretary problem, the random-order model has received significant attention to study algorithms beyond the worst-case competitive analysis. Surprisingly, we show that the problem in the random-order model almost matches the best-known offline approximation guarantees, up to polylogarithmic factors. In particular, we give a simple $(\log n)^{O(d)}$-competitive algorithm for $d$-dimensional hyperrectangles in this model, which runs in $\tilde{O_d}(n)$ time. Our approach also yields $(\log n)^{O(d)}$-competitive algorithms in the random-order model for more general objects such as $d$-dimensional fat objects and ellipsoids. Furthermore, our guarantees hold with high probability.
△ Less
Submitted 26 June, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
"Which LLM should I use?": Evaluating LLMs for tasks performed by Undergraduate Computer Science Students
Authors:
Vibhor Agarwal,
Madhav Krishan Garg,
Sahiti Dharmavaram,
Dhruv Kumar
Abstract:
This study evaluates the effectiveness of various large language models (LLMs) in performing tasks common among undergraduate computer science students. Although a number of research studies in the computing education community have explored the possibility of using LLMs for a variety of tasks, there is a lack of comprehensive research comparing different LLMs and evaluating which LLMs are most ef…
▽ More
This study evaluates the effectiveness of various large language models (LLMs) in performing tasks common among undergraduate computer science students. Although a number of research studies in the computing education community have explored the possibility of using LLMs for a variety of tasks, there is a lack of comprehensive research comparing different LLMs and evaluating which LLMs are most effective for different tasks. Our research systematically assesses some of the publicly available LLMs such as Google Bard, ChatGPT(3.5), GitHub Copilot Chat, and Microsoft Copilot across diverse tasks commonly encountered by undergraduate computer science students in India. These tasks include code explanation and documentation, solving class assignments, technical interview preparation, learning new concepts and frameworks, and email writing. Evaluation for these tasks was carried out by pre-final year and final year undergraduate computer science students and provides insights into the models' strengths and limitations. This study aims to guide students as well as instructors in selecting suitable LLMs for any specific task and offers valuable insights on how LLMs can be used constructively by students and instructors.
△ Less
Submitted 3 April, 2024; v1 submitted 22 January, 2024;
originally announced February 2024.
-
Reliability Analysis of Psychological Concept Extraction and Classification in User-penned Text
Authors:
Muskan Garg,
MSVPJ Sathvik,
Amrit Chadha,
Shaina Raza,
Sunghwan Sohn
Abstract:
The social NLP research community witness a recent surge in the computational advancements of mental health analysis to build responsible AI models for a complex interplay between language use and self-perception. Such responsible AI models aid in quantifying the psychological concepts from user-penned texts on social media. On thinking beyond the low-level (classification) task, we advance the ex…
▽ More
The social NLP research community witness a recent surge in the computational advancements of mental health analysis to build responsible AI models for a complex interplay between language use and self-perception. Such responsible AI models aid in quantifying the psychological concepts from user-penned texts on social media. On thinking beyond the low-level (classification) task, we advance the existing binary classification dataset, towards a higher-level task of reliability analysis through the lens of explanations, posing it as one of the safety measures. We annotate the LoST dataset to capture nuanced textual cues that suggest the presence of low self-esteem in the posts of Reddit users. We further state that the NLP models developed for determining the presence of low self-esteem, focus more on three types of textual cues: (i) Trigger: words that triggers mental disturbance, (ii) LoST indicators: text indicators emphasizing low self-esteem, and (iii) Consequences: words describing the consequences of mental disturbance. We implement existing classifiers to examine the attention mechanism in pre-trained language models (PLMs) for a domain-specific psychology-grounded task. Our findings suggest the need of shifting the focus of PLMs from Trigger and Consequences to a more comprehensive explanation, emphasizing LoST indicators while determining low self-esteem in Reddit posts.
△ Less
Submitted 12 January, 2024;
originally announced January 2024.
-
AutoNumerics-Zero: Automated Discovery of State-of-the-Art Mathematical Functions
Authors:
Esteban Real,
Yao Chen,
Mirko Rossini,
Connal de Souza,
Manav Garg,
Akhil Verghese,
Moritz Firsching,
Quoc V. Le,
Ekin Dogus Cubuk,
David H. Park
Abstract:
Computers calculate transcendental functions by approximating them through the composition of a few limited-precision instructions. For example, an exponential can be calculated with a Taylor series. These approximation methods were developed over the centuries by mathematicians, who emphasized the attainability of arbitrary precision. Computers, however, operate on few limited precision types, su…
▽ More
Computers calculate transcendental functions by approximating them through the composition of a few limited-precision instructions. For example, an exponential can be calculated with a Taylor series. These approximation methods were developed over the centuries by mathematicians, who emphasized the attainability of arbitrary precision. Computers, however, operate on few limited precision types, such as the popular float32. In this study, we show that when aiming for limited precision, existing approximation methods can be outperformed by programs automatically discovered from scratch by a simple evolutionary algorithm. In particular, over real numbers, our method can approximate the exponential function reaching orders of magnitude more precision for a given number of operations when compared to previous approaches. More practically, over float32 numbers and constrained to less than 1 ULP of error, the same method attains a speedup over baselines by generating code that triggers better XLA/LLVM compilation paths. In other words, in both cases, evolution searched a vast space of possible programs, without knowledge of mathematics, to discover previously unknown optimized approximations to high precision, for the first time. We also give evidence that these results extend beyond the exponential. The ubiquity of transcendental functions suggests that our method has the potential to reduce the cost of scientific computing applications.
△ Less
Submitted 13 December, 2023;
originally announced December 2023.
-
InterPrompt: Interpretable Prompting for Interrelated Interpersonal Risk Factors in Reddit Posts
Authors:
MSVPJ Sathvik,
Surjodeep Sarkar,
Chandni Saxena,
Sunghwan Sohn,
Muskan Garg
Abstract:
Mental health professionals and clinicians have observed the upsurge of mental disorders due to Interpersonal Risk Factors (IRFs). To simulate the human-in-the-loop triaging scenario for early detection of mental health disorders, we recognized textual indications to ascertain these IRFs : Thwarted Belongingness (TBe) and Perceived Burdensomeness (PBu) within personal narratives. In light of this,…
▽ More
Mental health professionals and clinicians have observed the upsurge of mental disorders due to Interpersonal Risk Factors (IRFs). To simulate the human-in-the-loop triaging scenario for early detection of mental health disorders, we recognized textual indications to ascertain these IRFs : Thwarted Belongingness (TBe) and Perceived Burdensomeness (PBu) within personal narratives. In light of this, we use N-shot learning with GPT-3 model on the IRF dataset, and underscored the importance of fine-tuning GPT-3 model to incorporate the context-specific sensitivity and the interconnectedness of textual cues that represent both IRFs.
In this paper, we introduce an Interpretable Prompting (InterPrompt)} method to boost the attention mechanism by fine-tuning the GPT-3 model. This allows a more sophisticated level of language modification by adjusting the pre-trained weights. Our model learns to detect usual patterns and underlying connections across both the IRFs, which leads to better system-level explainability and trustworthiness. The results of our research demonstrate that all four variants of GPT-3 model, when fine-tuned with InterPrompt, perform considerably better as compared to the baseline methods, both in terms of classification and explanation generation.
△ Less
Submitted 21 November, 2023;
originally announced November 2023.
-
WellXplain: Wellness Concept Extraction and Classification in Reddit Posts for Mental Health Analysis
Authors:
Muskan Garg
Abstract:
During the current mental health crisis, the importance of identifying potential indicators of mental issues from social media content has surged. Overlooking the multifaceted nature of mental and social well-being can have detrimental effects on one's mental state. In traditional therapy sessions, professionals manually pinpoint the origins and outcomes of underlying mental challenges, a process…
▽ More
During the current mental health crisis, the importance of identifying potential indicators of mental issues from social media content has surged. Overlooking the multifaceted nature of mental and social well-being can have detrimental effects on one's mental state. In traditional therapy sessions, professionals manually pinpoint the origins and outcomes of underlying mental challenges, a process both detailed and time-intensive. We introduce an approach to this intricate mental health analysis by framing the identification of wellness dimensions in Reddit content as a wellness concept extraction and categorization challenge. We've curated a unique dataset named WELLXPLAIN, comprising 3,092 entries and totaling 72,813 words. Drawing from Halbert L. Dunn's well-regarded wellness theory, our team formulated an annotation framework along with guidelines. This dataset also includes human-marked textual segments, offering clear reasoning for decisions made in the wellness concept categorization process. Our aim in publishing this dataset and analyzing initial benchmarks is to spearhead the creation of advanced language models tailored for healthcare-focused concept extraction and categorization.
△ Less
Submitted 25 August, 2023;
originally announced August 2023.
-
NBIAS: A Natural Language Processing Framework for Bias Identification in Text
Authors:
Shaina Raza,
Muskan Garg,
Deepak John Reji,
Syed Raza Bashir,
Chen Ding
Abstract:
Bias in textual data can lead to skewed interpretations and outcomes when the data is used. These biases could perpetuate stereotypes, discrimination, or other forms of unfair treatment. An algorithm trained on biased data may end up making decisions that disproportionately impact a certain group of people. Therefore, it is crucial to detect and remove these biases to ensure the fair and ethical u…
▽ More
Bias in textual data can lead to skewed interpretations and outcomes when the data is used. These biases could perpetuate stereotypes, discrimination, or other forms of unfair treatment. An algorithm trained on biased data may end up making decisions that disproportionately impact a certain group of people. Therefore, it is crucial to detect and remove these biases to ensure the fair and ethical use of data. To this end, we develop a comprehensive and robust framework NBIAS that consists of four main layers: data, corpus construction, model development and an evaluation layer. The dataset is constructed by collecting diverse data from various domains, including social media, healthcare, and job hiring portals. As such, we applied a transformer-based token classification model that is able to identify bias words/ phrases through a unique named entity BIAS. In the evaluation procedure, we incorporate a blend of quantitative and qualitative measures to gauge the effectiveness of our models. We achieve accuracy improvements ranging from 1% to 8% compared to baselines. We are also able to generate a robust understanding of the model functioning. The proposed approach is applicable to a variety of biases and contributes to the fair and ethical use of textual data.
△ Less
Submitted 29 August, 2023; v1 submitted 3 August, 2023;
originally announced August 2023.
-
LOST: A Mental Health Dataset of Low Self-esteem in Reddit Posts
Authors:
Muskan Garg,
Manas Gaur,
Raxit Goswami,
Sunghwan Sohn
Abstract:
Low self-esteem and interpersonal needs (i.e., thwarted belongingness (TB) and perceived burdensomeness (PB)) have a major impact on depression and suicide attempts. Individuals seek social connectedness on social media to boost and alleviate their loneliness. Social media platforms allow people to express their thoughts, experiences, beliefs, and emotions. Prior studies on mental health from soci…
▽ More
Low self-esteem and interpersonal needs (i.e., thwarted belongingness (TB) and perceived burdensomeness (PB)) have a major impact on depression and suicide attempts. Individuals seek social connectedness on social media to boost and alleviate their loneliness. Social media platforms allow people to express their thoughts, experiences, beliefs, and emotions. Prior studies on mental health from social media have focused on symptoms, causes, and disorders. Whereas an initial screening of social media content for interpersonal risk factors and low self-esteem may raise early alerts and assign therapists to at-risk users of mental disturbance. Standardized scales measure self-esteem and interpersonal needs from questions created using psychological theories. In the current research, we introduce a psychology-grounded and expertly annotated dataset, LoST: Low Self esTeem, to study and detect low self-esteem on Reddit. Through an annotation approach involving checks on coherence, correctness, consistency, and reliability, we ensure gold-standard for supervised learning. We present results from different deep language models tested using two data augmentation techniques. Our findings suggest developing a class of language models that infuses psychological and clinical knowledge.
△ Less
Submitted 8 June, 2023;
originally announced June 2023.
-
Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health
Authors:
Chandreen Liyanage,
Muskan Garg,
Vijay Mago,
Sunghwan Sohn
Abstract:
Amid ongoing health crisis, there is a growing necessity to discern possible signs of Wellness Dimensions (WD) manifested in self-narrated text. As the distribution of WD on social media data is intrinsically imbalanced, we experiment the generative NLP models for data augmentation to enable further improvement in the pre-screening task of classifying WD. To this end, we propose a simple yet effec…
▽ More
Amid ongoing health crisis, there is a growing necessity to discern possible signs of Wellness Dimensions (WD) manifested in self-narrated text. As the distribution of WD on social media data is intrinsically imbalanced, we experiment the generative NLP models for data augmentation to enable further improvement in the pre-screening task of classifying WD. To this end, we propose a simple yet effective data augmentation approach through prompt-based Generative NLP models, and evaluate the ROUGE scores and syntactic/semantic similarity among existing interpretations and augmented data. Our approach with ChatGPT model surpasses all the other methods and achieves improvement over baselines such as Easy-Data Augmentation and Backtranslation. Introducing data augmentation to generate more training samples and balanced dataset, results in the improved F-score and the Matthew's Correlation Coefficient for upto 13.11% and 15.95%, respectively.
△ Less
Submitted 6 June, 2023;
originally announced June 2023.
-
LonXplain: Lonesomeness as a Consequence of Mental Disturbance in Reddit Posts
Authors:
Muskan Garg,
Chandni Saxena,
Debabrata Samanta,
Bonnie J. Dorr
Abstract:
Social media is a potential source of information that infers latent mental states through Natural Language Processing (NLP). While narrating real-life experiences, social media users convey their feeling of loneliness or isolated lifestyle, impacting their mental well-being. Existing literature on psychological theories points to loneliness as the major consequence of interpersonal risk factors,…
▽ More
Social media is a potential source of information that infers latent mental states through Natural Language Processing (NLP). While narrating real-life experiences, social media users convey their feeling of loneliness or isolated lifestyle, impacting their mental well-being. Existing literature on psychological theories points to loneliness as the major consequence of interpersonal risk factors, propounding the need to investigate loneliness as a major aspect of mental disturbance. We formulate lonesomeness detection in social media posts as an explainable binary classification problem, discovering the users at-risk, suggesting the need of resilience for early control. To the best of our knowledge, there is no existing explainable dataset, i.e., one with human-readable, annotated text spans, to facilitate further research and development in loneliness detection causing mental disturbance. In this work, three experts: a senior clinical psychologist, a rehabilitation counselor, and a social NLP researcher define annotation schemes and perplexity guidelines to mark the presence or absence of lonesomeness, along with the marking of text-spans in original posts as explanation, in 3,521 Reddit posts. We expect the public release of our dataset, LonXplain, and traditional classifiers as baselines via GitHub.
△ Less
Submitted 30 May, 2023;
originally announced May 2023.
-
An Annotated Dataset for Explainable Interpersonal Risk Factors of Mental Disturbance in Social Media Posts
Authors:
Muskan Garg,
Amirmohammad Shahbandegan,
Amrit Chadha,
Vijay Mago
Abstract:
With a surge in identifying suicidal risk and its severity in social media posts, we argue that a more consequential and explainable research is required for optimal impact on clinical psychology practice and personalized mental healthcare. The success of computational intelligence techniques for inferring mental illness from social media resources, points to natural language processing as a lens…
▽ More
With a surge in identifying suicidal risk and its severity in social media posts, we argue that a more consequential and explainable research is required for optimal impact on clinical psychology practice and personalized mental healthcare. The success of computational intelligence techniques for inferring mental illness from social media resources, points to natural language processing as a lens for determining Interpersonal Risk Factors (IRF) in human writings. Motivated with limited availability of datasets for social NLP research community, we construct and release a new annotated dataset with human-labelled explanations and classification of IRF affecting mental disturbance on social media: (i) Thwarted Belongingness (TBe), and (ii) Perceived Burdensomeness (PBu). We establish baseline models on our dataset facilitating future research directions to develop real-time personalized AI models by detecting patterns of TBe and PBu in emotional spectrum of user's historical social media profile.
△ Less
Submitted 30 May, 2023;
originally announced May 2023.
-
Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering
Authors:
Fangkai Yang,
Pu Zhao,
Zezhong Wang,
Lu Wang,
Jue Zhang,
Mohit Garg,
Qingwei Lin,
Saravan Rajmohan,
Dongmei Zhang
Abstract:
Large Language Model (LLM) has gained popularity and achieved remarkable results in open-domain tasks, but its performance in real industrial domain-specific scenarios is average due to its lack of specific domain knowledge. This issue has attracted widespread attention, but there are few relevant benchmarks available. In this paper, we provide a benchmark Question Answering (QA) dataset named MSQ…
▽ More
Large Language Model (LLM) has gained popularity and achieved remarkable results in open-domain tasks, but its performance in real industrial domain-specific scenarios is average due to its lack of specific domain knowledge. This issue has attracted widespread attention, but there are few relevant benchmarks available. In this paper, we provide a benchmark Question Answering (QA) dataset named MSQA, centered around Microsoft products and IT technical problems encountered by customers. This dataset contains industry cloud-specific QA knowledge, an area not extensively covered in general LLMs, making it well-suited for evaluating methods aiming to enhance LLMs' domain-specific capabilities. In addition, we propose a new model interaction paradigm that can empower LLM to achieve better performance on domain-specific tasks where it is not proficient. Extensive experiments demonstrate that the approach following our method outperforms the commonly used LLM with retrieval methods. We make our source code and sample data available at: https://aka.ms/Microsoft_QA.
△ Less
Submitted 16 October, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
Towards Explainable and Safe Conversational Agents for Mental Health: A Survey
Authors:
Surjodeep Sarkar,
Manas Gaur,
L. Chen,
Muskan Garg,
Biplav Srivastava,
Bhaktee Dongaonkar
Abstract:
Virtual Mental Health Assistants (VMHAs) are seeing continual advancements to support the overburdened global healthcare system that gets 60 million primary care visits, and 6 million Emergency Room (ER) visits annually. These systems are built by clinical psychologists, psychiatrists, and Artificial Intelligence (AI) researchers for Cognitive Behavioral Therapy (CBT). At present, the role of VMHA…
▽ More
Virtual Mental Health Assistants (VMHAs) are seeing continual advancements to support the overburdened global healthcare system that gets 60 million primary care visits, and 6 million Emergency Room (ER) visits annually. These systems are built by clinical psychologists, psychiatrists, and Artificial Intelligence (AI) researchers for Cognitive Behavioral Therapy (CBT). At present, the role of VMHAs is to provide emotional support through information, focusing less on developing a reflective conversation with the patient. A more comprehensive, safe and explainable approach is required to build responsible VMHAs to ask follow-up questions or provide a well-informed response. This survey offers a systematic critical review of the existing conversational agents in mental health, followed by new insights into the improvements of VMHAs with contextual knowledge, datasets, and their emerging role in clinical decision support. We also provide new directions toward enriching the user experience of VMHAs with explainability, safety, and wholesome trustworthiness. Finally, we provide evaluation metrics and practical considerations for VMHAs beyond the current literature to build trust between VMHAs and patients in active communications.
△ Less
Submitted 25 April, 2023;
originally announced April 2023.
-
Multi-class Categorization of Reasons behind Mental Disturbance in Long Texts
Authors:
Muskan Garg
Abstract:
Motivated with recent advances in inferring users' mental state in social media posts, we identify and formulate the problem of finding causal indicators behind mental illness in self-reported text. In the past, we witness the presence of rule-based studies for causal explanation analysis on curated Facebook data. The investigation on transformer-based model for multi-class causal categorization i…
▽ More
Motivated with recent advances in inferring users' mental state in social media posts, we identify and formulate the problem of finding causal indicators behind mental illness in self-reported text. In the past, we witness the presence of rule-based studies for causal explanation analysis on curated Facebook data. The investigation on transformer-based model for multi-class causal categorization in Reddit posts point to a problem of using long-text which contains as many as 4000 words. Developing end-to-end transformer-based models subject to the limitation of maximum-length in a given instance. To handle this problem, we use Longformer and deploy its encoding on transformer-based classifier. The experimental results show that Longformer achieves new state-of-the-art results on M-CAMS, a publicly available dataset with 62\% F1-score. Cause-specific analysis and ablation study prove the effectiveness of Longformer. We believe our work facilitates causal analysis of depression and suicide risk on social media data, and shows potential for application on other mental health conditions.
△ Less
Submitted 8 April, 2023;
originally announced April 2023.
-
Technology-Circuit-Algorithm Tri-Design for Processing-in-Pixel-in-Memory (P2M)
Authors:
Md Abdullah-Al Kaiser,
Gourav Datta,
Sreetama Sarkar,
Souvik Kundu,
Zihan Yin,
Manas Garg,
Ajey P. Jacob,
Peter A. Beerel,
Akhilesh R. Jaiswal
Abstract:
The massive amounts of data generated by camera sensors motivate data processing inside pixel arrays, i.e., at the extreme-edge. Several critical developments have fueled recent interest in the processing-in-pixel-in-memory paradigm for a wide range of visual machine intelligence tasks, including (1) advances in 3D integration technology to enable complex processing inside each pixel in a 3D integ…
▽ More
The massive amounts of data generated by camera sensors motivate data processing inside pixel arrays, i.e., at the extreme-edge. Several critical developments have fueled recent interest in the processing-in-pixel-in-memory paradigm for a wide range of visual machine intelligence tasks, including (1) advances in 3D integration technology to enable complex processing inside each pixel in a 3D integrated manner while maintaining pixel density, (2) analog processing circuit techniques for massively parallel low-energy in-pixel computations, and (3) algorithmic techniques to mitigate non-idealities associated with analog processing through hardware-aware training schemes. This article presents a comprehensive technology-circuit-algorithm landscape that connects technology capabilities, circuit design strategies, and algorithmic optimizations to power, performance, area, bandwidth reduction, and application-level accuracy metrics. We present our results using a comprehensive co-design framework incorporating hardware and algorithmic optimizations for various complex real-life visual intelligence tasks mapped onto our P2M paradigm.
△ Less
Submitted 6 April, 2023;
originally announced April 2023.
-
Methods and Mechanisms for Interactive Novelty Handling in Adversarial Environments
Authors:
Tung Thai,
Ming Shen,
Mayank Garg,
Ayush Kalani,
Nakul Vaidya,
Utkarsh Soni,
Mudit Verma,
Sriram Gopalakrishnan,
Neeraj Varshney,
Chitta Baral,
Subbarao Kambhampati,
Jivko Sinapov,
Matthias Scheutz
Abstract:
Learning to detect, characterize and accommodate novelties is a challenge that agents operating in open-world domains need to address to be able to guarantee satisfactory task performance. Certain novelties (e.g., changes in environment dynamics) can interfere with the performance or prevent agents from accomplishing task goals altogether. In this paper, we introduce general methods and architectu…
▽ More
Learning to detect, characterize and accommodate novelties is a challenge that agents operating in open-world domains need to address to be able to guarantee satisfactory task performance. Certain novelties (e.g., changes in environment dynamics) can interfere with the performance or prevent agents from accomplishing task goals altogether. In this paper, we introduce general methods and architectural mechanisms for detecting and characterizing different types of novelties, and for building an appropriate adaptive model to accommodate them utilizing logical representations and reasoning methods. We demonstrate the effectiveness of the proposed methods in evaluations performed by a third party in the adversarial multi-agent board game Monopoly. The results show high novelty detection and accommodation rates across a variety of novelty types, including changes to the rules of the game, as well as changes to the agent's action capabilities.
△ Less
Submitted 5 March, 2023; v1 submitted 27 February, 2023;
originally announced February 2023.
-
NLP as a Lens for Causal Analysis and Perception Mining to Infer Mental Health on Social Media
Authors:
Muskan Garg,
Chandni Saxena,
Usman Naseem,
Bonnie J Dorr
Abstract:
Interactions among humans on social media often convey intentions behind their actions, yielding a psychological language resource for Mental Health Analysis (MHA) of online users. The success of Computational Intelligence Techniques (CIT) for inferring mental illness from such social media resources points to NLP as a lens for causal analysis and perception mining. However, we argue that more con…
▽ More
Interactions among humans on social media often convey intentions behind their actions, yielding a psychological language resource for Mental Health Analysis (MHA) of online users. The success of Computational Intelligence Techniques (CIT) for inferring mental illness from such social media resources points to NLP as a lens for causal analysis and perception mining. However, we argue that more consequential and explainable research is required for optimal impact on clinical psychology practice and personalized mental healthcare. To bridge this gap, we posit two significant dimensions: (1) Causal analysis to illustrate a cause and effect relationship in the user generated text; (2) Perception mining to infer psychological perspectives of social effects on online users intentions. Within the scope of Natural Language Processing (NLP), we further explore critical areas of inquiry associated with these two dimensions, specifically through recent advancements in discourse analysis. This position paper guides the community to explore solutions in this space and advance the state of practice in developing conversational agents for inferring mental health from social media. We advocate for a more explainable approach toward modeling computational psychology problems through the lens of language as we observe an increased number of research contributions in dataset and problem formulation for causal relation extraction and perception enhancements while inferring mental states.
△ Less
Submitted 22 August, 2023; v1 submitted 26 January, 2023;
originally announced January 2023.
-
Causal Categorization of Mental Health Posts using Transformers
Authors:
Simranjeet Kaur,
Ritika Bhardwaj,
Aastha Jain,
Muskan Garg,
Chandni Saxena
Abstract:
With recent developments in digitization of clinical psychology, NLP research community has revolutionized the field of mental health detection on social media. Existing research in mental health analysis revolves around the cross-sectional studies to classify users' intent on social media. For in-depth analysis, we investigate existing classifiers to solve the problem of causal categorization whi…
▽ More
With recent developments in digitization of clinical psychology, NLP research community has revolutionized the field of mental health detection on social media. Existing research in mental health analysis revolves around the cross-sectional studies to classify users' intent on social media. For in-depth analysis, we investigate existing classifiers to solve the problem of causal categorization which suggests the inefficiency of learning based methods due to limited training samples. To handle this challenge, we use transformer models and demonstrate the efficacy of a pre-trained transfer learning on "CAMS" dataset. The experimental result improves the accuracy and depicts the importance of identifying cause-and-effect relationships in the underlying text.
△ Less
Submitted 15 January, 2023; v1 submitted 6 January, 2023;
originally announced January 2023.
-
Matching Augmentation via Simultaneous Contractions
Authors:
Mohit Garg,
Felix Hommelsheim,
Nicole Megow
Abstract:
We consider the matching augmentation problem (MAP), where a matching of a graph needs to be extended into a $2$-edge-connected spanning subgraph by adding the minimum number of edges to it. We present a polynomial-time algorithm with an approximation ratio of $13/8 = 1.625$ improving upon an earlier $5/3$-approximation. The improvement builds on a new $α$-approximation preserving reduction for an…
▽ More
We consider the matching augmentation problem (MAP), where a matching of a graph needs to be extended into a $2$-edge-connected spanning subgraph by adding the minimum number of edges to it. We present a polynomial-time algorithm with an approximation ratio of $13/8 = 1.625$ improving upon an earlier $5/3$-approximation. The improvement builds on a new $α$-approximation preserving reduction for any $α\geq 3/2$ from arbitrary MAP instances to well-structured instances that do not contain certain forbidden structures like parallel edges, small separators, and contractible subgraphs. We further introduce, as key ingredients, the technique of repeated simultaneous contractions and provide improved lower bounds for instances that cannot be contracted.
△ Less
Submitted 26 May, 2023; v1 submitted 3 November, 2022;
originally announced November 2022.
-
Explainable Causal Analysis of Mental Health on Social Media Data
Authors:
Chandni Saxena,
Muskan Garg,
Gunjan Ansari
Abstract:
With recent developments in Social Computing, Natural Language Processing and Clinical Psychology, the social NLP research community addresses the challenge of automation in mental illness on social media. A recent extension to the problem of multi-class classification of mental health issues is to identify the cause behind the user's intention. However, multi-class causal categorization for menta…
▽ More
With recent developments in Social Computing, Natural Language Processing and Clinical Psychology, the social NLP research community addresses the challenge of automation in mental illness on social media. A recent extension to the problem of multi-class classification of mental health issues is to identify the cause behind the user's intention. However, multi-class causal categorization for mental health issues on social media has a major challenge of wrong prediction due to the overlapping problem of causal explanations. There are two possible mitigation techniques to solve this problem: (i) Inconsistency among causal explanations/ inappropriate human-annotated inferences in the dataset, (ii) in-depth analysis of arguments and stances in self-reported text using discourse analysis. In this research work, we hypothesise that if there exists the inconsistency among F1 scores of different classes, there must be inconsistency among corresponding causal explanations as well. In this task, we fine tune the classifiers and find explanations for multi-class causal categorization of mental illness on social media with LIME and Integrated Gradient (IG) methods. We test our methods with CAMS dataset and validate with annotated interpretations. A key contribution of this research work is to find the reason behind inconsistency in accuracy of multi-class causal categorization. The effectiveness of our methods is evident with the results obtained having category-wise average scores of $81.29 \%$ and $0.906$ using cosine similarity and word mover's distance, respectively.
△ Less
Submitted 9 November, 2022; v1 submitted 15 October, 2022;
originally announced October 2022.
-
The Design and Regulation of Exchanges: A Formal Approach
Authors:
Mohit Garg,
Suneel Sarswat
Abstract:
We use formal methods to specify, design, and monitor continuous double auctions, which are widely used to match buyers and sellers at exchanges of foreign currencies, stocks, and commodities. We identify three natural properties of such auctions and formally prove that these properties completely determine the input-output relationship. We then formally verify that a natural algorithm satisfies t…
▽ More
We use formal methods to specify, design, and monitor continuous double auctions, which are widely used to match buyers and sellers at exchanges of foreign currencies, stocks, and commodities. We identify three natural properties of such auctions and formally prove that these properties completely determine the input-output relationship. We then formally verify that a natural algorithm satisfies these properties. All definitions, theorems, and proofs are formalized in an interactive theorem prover. We extract a verified program of our algorithm to build an automated checker that is guaranteed to detect errors in the trade logs of exchanges if they generate transactions that violate any of the natural properties.
△ Less
Submitted 11 October, 2022;
originally announced October 2022.
-
Improved Approximation for Two-Edge-Connectivity
Authors:
Mohit Garg,
Fabrizio Grandoni,
Afrouz Jabal Ameli
Abstract:
The basic goal of survivable network design is to construct low-cost networks which preserve a sufficient level of connectivity despite the failure or removal of a few nodes or edges. One of the most basic problems in this area is the $2$-Edge-Connected Spanning Subgraph problem (2-ECSS): given an undirected graph $G$, find a $2$-edge-connected spanning subgraph $H$ of $G$ with the minimum number…
▽ More
The basic goal of survivable network design is to construct low-cost networks which preserve a sufficient level of connectivity despite the failure or removal of a few nodes or edges. One of the most basic problems in this area is the $2$-Edge-Connected Spanning Subgraph problem (2-ECSS): given an undirected graph $G$, find a $2$-edge-connected spanning subgraph $H$ of $G$ with the minimum number of edges (in particular, $H$ remains connected after the removal of one arbitrary edge).
2-ECSS is NP-hard and the best-known (polynomial-time) approximation factor for this problem is $4/3$. Interestingly, this factor was achieved with drastically different techniques by [Hunkenschr{ö}der, Vempala and Vetta '00,'19] and [Seb{ö} and Vygen, '14]. In this paper we present an improved $\frac{118}{89}+ε<1.326$ approximation for 2-ECSS.
The key ingredient in our approach (which might also be helpful in future work) is a reduction to a special type of structured graphs: our reduction preserves approximation factors up to $6/5$. While reducing to 2-vertex-connected graphs is trivial (and heavily used in prior work), our structured graphs are "almost" 3-vertex-connected: more precisely, given any 2-vertex-cut $\{u,v\}$ of a structured graph $G=(V,E)$, $G[V\setminus \{u,v\}]$ has exactly 2 connected components, one of which contains exactly one node of degree $2$ in $G$.
△ Less
Submitted 12 November, 2022; v1 submitted 21 September, 2022;
originally announced September 2022.
-
An event detection technique using social media data
Authors:
Muskan Garg
Abstract:
People post information about different topics which are in their active vocabulary over social media platforms (like Twitter, Facebook, PInterest and Google+). They follow each other and it is more likely that the person who posts information about current happenings will receive better response. Manual analysis of huge amount of data on social media platforms is difficult. This has opened new re…
▽ More
People post information about different topics which are in their active vocabulary over social media platforms (like Twitter, Facebook, PInterest and Google+). They follow each other and it is more likely that the person who posts information about current happenings will receive better response. Manual analysis of huge amount of data on social media platforms is difficult. This has opened new research directions for automatic analysis of usercontributed social media documents. Automatic social media data analysis is difficult due to abundant information shared by users. Many researchers use Twitter data for Social Media Analysis (SMA) as the Twitter data is freely available in the public domain. One of the most this research work. Event Detection from social media data is used for different applications like traffic congestion detection, disaster and emergency management, and live news detection. Nature of the information which is shared on twitter platform is short-text, noisy, and ambiguous. Thus, event detection and extraction of event phrases from user-generated and illformed data becomes challenging. To address these challenges, events are extracted from streaming social media data in the form of keyphrases using different cognitive properties. The motivation behind this research work is to provide substantial improvements in the lexical variation of event phrases while detecting events and sub-events from twitter data. In this research work, the approach towards event detection from social media data is divided into three phases namely: Identifying sub-graphs in Microblog Word Co-occurrence Network (WCN) which provides important information about keyphrases; Identifying multiple events from social media data; and Ranking contextual information of event phrases.
△ Less
Submitted 27 August, 2022;
originally announced August 2022.
-
Minimal Feature Analysis for Isolated Digit Recognition for varying encoding rates in noisy environments
Authors:
Muskan Garg,
Naveen Aggarwal
Abstract:
This research work is about recent development made in speech recognition. In this research work, analysis of isolated digit recognition in the presence of different bit rates and at different noise levels has been performed. This research work has been carried using audacity and HTK toolkit. Hidden Markov Model (HMM) is the recognition model which was used to perform this experiment. The feature…
▽ More
This research work is about recent development made in speech recognition. In this research work, analysis of isolated digit recognition in the presence of different bit rates and at different noise levels has been performed. This research work has been carried using audacity and HTK toolkit. Hidden Markov Model (HMM) is the recognition model which was used to perform this experiment. The feature extraction techniques used are Mel Frequency Cepstrum coefficient (MFCC), Linear Predictive Coding (LPC), perceptual linear predictive (PLP), mel spectrum (MELSPEC), filter bank (FBANK). There were three types of different noise levels which have been considered for testing of data. These include random noise, fan noise and random noise in real time environment. This was done to analyse the best environment which can used for real time applications. Further, five different types of commonly used bit rates at different sampling rates were considered to find out the most optimum bit rate.
△ Less
Submitted 27 August, 2022;
originally announced August 2022.
-
CAMS: An Annotated Corpus for Causal Analysis of Mental Health Issues in Social Media Posts
Authors:
Muskan Garg,
Chandni Saxena,
Veena Krishnan,
Ruchi Joshi,
Sriparna Saha,
Vijay Mago,
Bonnie J Dorr
Abstract:
Research community has witnessed substantial growth in the detection of mental health issues and their associated reasons from analysis of social media. We introduce a new dataset for Causal Analysis of Mental health issues in Social media posts (CAMS). Our contributions for causal analysis are two-fold: causal interpretation and causal categorization. We introduce an annotation schema for this ta…
▽ More
Research community has witnessed substantial growth in the detection of mental health issues and their associated reasons from analysis of social media. We introduce a new dataset for Causal Analysis of Mental health issues in Social media posts (CAMS). Our contributions for causal analysis are two-fold: causal interpretation and causal categorization. We introduce an annotation schema for this task of causal analysis. We demonstrate the efficacy of our schema on two different datasets: (i) crawling and annotating 3155 Reddit posts and (ii) re-annotating the publicly available SDCNL dataset of 1896 instances for interpretable causal analysis. We further combine these into the CAMS dataset and make this resource publicly available along with associated source code: https://github.com/drmuskangarg/CAMS. We present experimental results of models learned from CAMS dataset and demonstrate that a classic Logistic Regression model outperforms the next best (CNN-LSTM) model by 4.9\% accuracy.
△ Less
Submitted 11 July, 2022;
originally announced July 2022.
-
Data Augmentation for Mental Health Classification on Social Media
Authors:
Gunjan Ansari,
Muskan Garg,
Chandni Saxena
Abstract:
The mental disorder of online users is determined using social media posts. The major challenge in this domain is to avail the ethical clearance for using the user generated text on social media platforms. Academic re searchers identified the problem of insufficient and unlabeled data for mental health classification. To handle this issue, we have studied the effect of data augmentation techniques…
▽ More
The mental disorder of online users is determined using social media posts. The major challenge in this domain is to avail the ethical clearance for using the user generated text on social media platforms. Academic re searchers identified the problem of insufficient and unlabeled data for mental health classification. To handle this issue, we have studied the effect of data augmentation techniques on domain specific user generated text for mental health classification. Among the existing well established data augmentation techniques, we have identified Easy Data Augmentation (EDA), conditional BERT, and Back Translation (BT) as the potential techniques for generating additional text to improve the performance of classifiers. Further, three different classifiers Random Forest (RF), Support Vector Machine (SVM) and Logistic Regression (LR) are employed for analyzing the impact of data augmentation on two publicly available social media datasets. The experiments mental results show significant improvements in classifiers performance when trained on the augmented data.
△ Less
Submitted 19 December, 2021;
originally announced December 2021.
-
A Novel Corpus of Discourse Structure in Humans and Computers
Authors:
Babak Hemmatian,
Sheridan Feucht,
Rachel Avram,
Alexander Wey,
Muskaan Garg,
Kate Spitalnic,
Carsten Eickhoff,
Ellie Pavlick,
Bjorn Sandstede,
Steven Sloman
Abstract:
We present a novel corpus of 445 human- and computer-generated documents, comprising about 27,000 clauses, annotated for semantic clause types and coherence relations that allow for nuanced comparison of artificial and natural discourse modes. The corpus covers both formal and informal discourse, and contains documents generated using fine-tuned GPT-2 (Zellers et al., 2019) and GPT-3(Brown et al.,…
▽ More
We present a novel corpus of 445 human- and computer-generated documents, comprising about 27,000 clauses, annotated for semantic clause types and coherence relations that allow for nuanced comparison of artificial and natural discourse modes. The corpus covers both formal and informal discourse, and contains documents generated using fine-tuned GPT-2 (Zellers et al., 2019) and GPT-3(Brown et al., 2020). We showcase the usefulness of this corpus for detailed discourse analysis of text generation by providing preliminary evidence that less numerous, shorter and more often incoherent clause relations are associated with lower perceived quality of computer-generated narratives and arguments.
△ Less
Submitted 10 November, 2021;
originally announced November 2021.
-
Quantifying the Suicidal Tendency on Social Media: A Survey
Authors:
Muskan Garg
Abstract:
Amid lockdown period more people express their feelings over social media platforms due to closed third-place and academic researchers have witnessed strong associations between the mental healthcare and social media posts. The stress for a brief period may lead to clinical depressions and the long-lasting traits of prevailing depressions can be life threatening with suicidal ideation as the possi…
▽ More
Amid lockdown period more people express their feelings over social media platforms due to closed third-place and academic researchers have witnessed strong associations between the mental healthcare and social media posts. The stress for a brief period may lead to clinical depressions and the long-lasting traits of prevailing depressions can be life threatening with suicidal ideation as the possible outcome. The increasing concern towards the rise in number of suicide cases is because it is one of the leading cause of premature but preventable death. Recent studies have shown that mining social media data has helped in quantifying the suicidal tendency of users at risk. This potential manuscript elucidates the taxonomy of mental healthcare and highlights some recent attempts in examining the potential of quantifying suicidal tendency on social media data. This manuscript presents the classification of heterogeneous features from social media data and handling feature vector representation. Aiming to identify the new research directions and advances in the development of Machine Learning (ML) and Deep Learning (DL) based models, a quantitative synthesis and a qualitative review was carried out with corpus of over 77 potential research articles related to stress, depression and suicide risk from 2013 to 2021.
△ Less
Submitted 27 August, 2022; v1 submitted 4 October, 2021;
originally announced October 2021.
-
Can Connected Autonomous Vehicles really improve mixed traffic efficiency in realistic scenarios?
Authors:
Mohit Garg,
Cian Johnston,
Mélanie Bouroche
Abstract:
Connected autonomous vehicles (CAVs) can supplement the information from their own sensors with information from surrounding CAVs for decision making and control. This has the potential to improve traffic efficiency. CAVs face additional challenges in their driving, however, when they interact with human-driven vehicles (HDVs) in mixed-traffic environments due to the uncertainty in human's driving…
▽ More
Connected autonomous vehicles (CAVs) can supplement the information from their own sensors with information from surrounding CAVs for decision making and control. This has the potential to improve traffic efficiency. CAVs face additional challenges in their driving, however, when they interact with human-driven vehicles (HDVs) in mixed-traffic environments due to the uncertainty in human's driving behavior e.g. larger reaction times, perception errors, etc. While a lot of research has investigated the impact of CAVs on traffic safety and efficiency at different penetration rates, all have assumed either perfect communication or very simple scenarios with imperfect communication. In practice, the presence of communication delays and packet losses means that CAVs might receive only partial information from surrounding vehicles, and this can have detrimental effects on their performance. This paper investigates the impact of CAVs on traffic efficiency in realistic communication and road network scenarios (i.e. imperfect communication and large-scale road network). We analyze the effect of unreliable communication links on CAVs operation in mixed traffic with various penetration rates and evaluate traffic performance in congested traffic scenarios on a large-scale road network (the M50 motorway, in Ireland). Results show that CAVs can significantly improve traffic efficiency in congested traffic scenarios at high penetration rates. The scale of the improvement depends on communication reliability, with a packet drop rate of 70% leading to an increase in traffic congestion by 28.7% and 11.88% at 40% and 70% penetration rates respectively compared to perfect communication.
△ Less
Submitted 11 July, 2021; v1 submitted 7 July, 2021;
originally announced July 2021.
-
Deterministic (1/2 + ε)-Approximation for Submodular Maximization over a Matroid
Authors:
Niv Buchbinder,
Moran Feldman,
Mohit Garg
Abstract:
We study the problem of maximizing a monotone submodular function subject to a matroid constraint and present a deterministic algorithm that achieves (1/2 + ε)-approximation for the problem. This algorithm is the first deterministic algorithm known to improve over the 1/2-approximation ratio of the classical greedy algorithm proved by Nemhauser, Wolsely and Fisher in 1978.
We study the problem of maximizing a monotone submodular function subject to a matroid constraint and present a deterministic algorithm that achieves (1/2 + ε)-approximation for the problem. This algorithm is the first deterministic algorithm known to improve over the 1/2-approximation ratio of the classical greedy algorithm proved by Nemhauser, Wolsely and Fisher in 1978.
△ Less
Submitted 15 July, 2018;
originally announced July 2018.
-
Online Submodular Maximization: Beating 1/2 Made Simple
Authors:
Niv Buchbinder,
Moran Feldman,
Yuval Filmus,
Mohit Garg
Abstract:
The Submodular Welfare Maximization problem (SWM) captures an important subclass of combinatorial auctions and has been studied extensively from both computational and economic perspectives. In particular, it has been studied in a natural online setting in which items arrive one-by-one and should be allocated irrevocably upon arrival. In this setting, it is well known that the greedy algorithm ach…
▽ More
The Submodular Welfare Maximization problem (SWM) captures an important subclass of combinatorial auctions and has been studied extensively from both computational and economic perspectives. In particular, it has been studied in a natural online setting in which items arrive one-by-one and should be allocated irrevocably upon arrival. In this setting, it is well known that the greedy algorithm achieves a competitive ratio of 1/2, and recently Kapralov et al. (2013) showed that this ratio is optimal for the problem. Surprisingly, despite this impossibility result, Korula et al. (2015) were able to show that the same algorithm is 0.5052-competitive when the items arrive in a uniformly random order, but unfortunately, their proof is very long and involved. In this work, we present an (arguably) much simpler analysis that provides a slightly better guarantee of 0.5096-competitiveness for the greedy algorithm in the random-arrival model. Moreover, this analysis applies also to a generalization of online SWM in which the sets defining a (simple) partition matroid arrive online in a uniformly random order, and we would like to maximize a monotone submodular function subject to this matroid. Furthermore, for this more general problem, we prove an upper bound of 0.576 on the competitive ratio of the greedy algorithm, ruling out the possibility that the competitiveness of this natural algorithm matches the optimal offline approximation ratio of 1-1/e.
△ Less
Submitted 19 November, 2018; v1 submitted 15 July, 2018;
originally announced July 2018.
-
Set membership with non-adaptive bit probes
Authors:
Mohit Garg,
Jaikumar Radhakrishnan
Abstract:
We consider the non-adaptive bit-probe complexity of the set membership problem, where a set S of size at most n from a universe of size m is to be represented as a short bit vector in order to answer membership queries of the form "Is x in S?" by non-adaptively probing the bit vector at t places. Let s_N(m,n,t) be the minimum number of bits of storage needed for such a scheme. In this work, we sh…
▽ More
We consider the non-adaptive bit-probe complexity of the set membership problem, where a set S of size at most n from a universe of size m is to be represented as a short bit vector in order to answer membership queries of the form "Is x in S?" by non-adaptively probing the bit vector at t places. Let s_N(m,n,t) be the minimum number of bits of storage needed for such a scheme. In this work, we show existence of non-adaptive and adaptive schemes for a range of t that improves an upper bound of Buhrman, Miltersen, Radhakrishnan and Srinivasan (2002) on s_N(m,n,t). For three non-adaptive probes, we improve the previous best lower bound on s_N(m,n,3) by Alon and Feige (2009).
△ Less
Submitted 29 December, 2016;
originally announced December 2016.
-
OCR++: A Robust Framework For Information Extraction from Scholarly Articles
Authors:
Mayank Singh,
Barnopriyo Barua,
Priyank Palod,
Manvi Garg,
Sidhartha Satapathy,
Samuel Bushi,
Kumar Ayush,
Krishna Sai Rohith,
Tulasi Gamidi,
Pawan Goyal,
Animesh Mukherjee
Abstract:
This paper proposes OCR++, an open-source framework designed for a variety of information extraction tasks from scholarly articles including metadata (title, author names, affiliation and e-mail), structure (section headings and body text, table and figure headings, URLs and footnotes) and bibliography (citation instances and references). We analyze a diverse set of scientific articles written in…
▽ More
This paper proposes OCR++, an open-source framework designed for a variety of information extraction tasks from scholarly articles including metadata (title, author names, affiliation and e-mail), structure (section headings and body text, table and figure headings, URLs and footnotes) and bibliography (citation instances and references). We analyze a diverse set of scientific articles written in English language to understand generic writing patterns and formulate rules to develop this hybrid framework. Extensive evaluations show that the proposed framework outperforms the existing state-of-the-art tools with huge margin in structural information extraction along with improved performance in metadata and bibliography extraction tasks, both in terms of accuracy (around 50% improvement) and processing time (around 52% improvement). A user experience study conducted with the help of 30 researchers reveals that the researchers found this system to be very helpful. As an additional objective, we discuss two novel use cases including automatically extracting links to public datasets from the proceedings, which would further accelerate the advancement in digital libraries. The result of the framework can be exported as a whole into structured TEI-encoded documents. Our framework is accessible online at http://cnergres.iitkgp.ac.in/OCR++/home/.
△ Less
Submitted 23 September, 2016; v1 submitted 21 September, 2016;
originally announced September 2016.
-
Set Membership with a Few Bit Probes
Authors:
Mohit Garg,
Jaikumar Radhakrishnan
Abstract:
We consider the bit-probe complexity of the set membership problem, where a set S of size at most n from a universe of size m is to be represented as a short bit vector in order to answer membership queries of the form "Is x in S?" by adaptively probing the bit vector at t places. Let s(m,n,t) be the minimum number of bits of storage needed for such a scheme. Several recent works investigate s(m,n…
▽ More
We consider the bit-probe complexity of the set membership problem, where a set S of size at most n from a universe of size m is to be represented as a short bit vector in order to answer membership queries of the form "Is x in S?" by adaptively probing the bit vector at t places. Let s(m,n,t) be the minimum number of bits of storage needed for such a scheme. Several recent works investigate s(m,n,t) for various ranges of the parameter; we obtain improvements over some of the bounds shown by Buhrman, Miltersen, Radhakrishnan, and Srinivasan (2002) and Alon and Feige (2009).
△ Less
Submitted 8 April, 2015;
originally announced April 2015.
-
Ontology based approach for video transmission over the network
Authors:
Rachit Mohan Garg,
Yamini Sood,
Neha Tyagi
Abstract:
With the increase in the bandwidth & the transmission speed over the internet, transmission of multimedia objects like video, audio, images has become an easier work. In this paper we provide an approach that can be useful for transmission of video objects over the internet without much fuzz. The approach provides a ontology based framework that is used to establish an automatic deployment of vide…
▽ More
With the increase in the bandwidth & the transmission speed over the internet, transmission of multimedia objects like video, audio, images has become an easier work. In this paper we provide an approach that can be useful for transmission of video objects over the internet without much fuzz. The approach provides a ontology based framework that is used to establish an automatic deployment of video transmission system. Further the video is compressed using the structural flow mechanism that uses the wavelet principle for compression of video frames. Finally the video transmission algorithm known as RRDBFSF algorithm is provided that makes use of the concept of restrictive flooding to avoid redundancy thereby increasing the efficiency.
△ Less
Submitted 28 February, 2011;
originally announced February 2011.
-
A Framework Based Approach for the Development of Web Based Applications
Authors:
Rachit Mohan Garg,
Yamini Sood,
Balaji Kottana,
Pallavi Totlani
Abstract:
The sole goal of E-Governance is to allow interaction of government with their citizens in a comfortable & transparent manner. Uniqueness of J2EE makes it a perfect technology for development of any online portal. These involve constancy, easy to replant, construct speedily etc. In this paper we present a procedural approach to develop a web application using the J2EE Struts Framework.
The sole goal of E-Governance is to allow interaction of government with their citizens in a comfortable & transparent manner. Uniqueness of J2EE makes it a perfect technology for development of any online portal. These involve constancy, easy to replant, construct speedily etc. In this paper we present a procedural approach to develop a web application using the J2EE Struts Framework.
△ Less
Submitted 11 February, 2011;
originally announced February 2011.