-
Investigating the effectiveness of multimodal data in forecasting SARS-COV-2 case surges
Authors:
Palur Venkata Raghuvamsi,
Siyuan Brandon Loh,
Prasanta Bhattacharya,
Joses Ho,
Raphael Lee Tze Chuen,
Alvin X. Han,
Sebastian Maurer-Stroh
Abstract:
The COVID-19 pandemic response relied heavily on statistical and machine learning models to predict key outcomes such as case prevalence and fatality rates. These predictions were instrumental in enabling timely public health interventions that helped break transmission cycles. While most existing models are grounded in traditional epidemiological data, the potential of alternative datasets, such as those derived from genomic information and human behavior, remains underexplored. In the current study, we investigated the usefulness of diverse modalities of feature sets in predicting case surges. Our results highlight the relative effectiveness of biological (e.g., mutations), public health (e.g., case counts, policy interventions) and human behavioral features (e.g., mobility and social media conversations) in predicting country-level case surges. Importantly, we uncover considerable heterogeneity in predictive performance across countries and feature modalities, suggesting that surge prediction models may need to be tailored to specific national contexts and pandemic phases. Overall, our work highlights the value of integrating alternative data sources into existing disease surveillance frameworks to enhance the prediction of pandemic dynamics.
Submitted 29 May, 2025; v1 submitted 27 May, 2025;
originally announced May 2025.
-
What Contributes to Affective Polarization in Networked Online Environments? Evidence from an Agent-Based Model
Authors:
Narayani Vedam,
Subhayan Mukerjee,
Prasanta Bhattacharya
Abstract:
Affective polarization, or, inter-party hostility, is increasingly recognized as a pervasive issue in democracies worldwide, posing a threat to social cohesion. The digital media ecosystem, now widely accessible and ever-present, has often been implicated in accelerating this phenomenon. However, the precise causal mechanisms responsible for driving affective polarization have been a subject of extensive debate. While the concept of echo chambers, characterized by individuals ensconced within like-minded groups, bereft of counter-attitudinal content, has long been the prevailing hypothesis, accumulating empirical evidence suggests a more nuanced picture. This study aims to contribute to the ongoing debate by employing an agent-based model to illustrate how affective polarization is either fostered or hindered by individual news consumption and dissemination patterns based on ideological alignment. To achieve this, we parameterize three key aspects: (1) The affective asymmetry of individuals' engagement with in-party versus out-party content, (2) The proportion of in-party members within one's social neighborhood, and (3) The degree of partisan bias among the elites within the population. Subsequently, we observe macro-level changes in affective polarization within the population under various conditions stipulated by these parameters. This approach allows us to explore the intricate dynamics of affective polarization within digital environments, shedding light on the interplay between individual behaviors, social networks, and information exposure.
Submitted 10 April, 2025;
originally announced April 2025.
-
ReXCL: A Tool for Requirement Document Extraction and Classification
Authors:
Paheli Bhattacharya,
Manojit Chakraborty,
Santhosh Kumar Arumugam,
Rishabh Gupta
Abstract:
This paper presents the ReXCL tool, which automates the extraction and classification processes in requirement engineering, enhancing the software development lifecycle. The tool features two main modules: Extraction, which processes raw requirement documents into a predefined schema using heuristics and predictive modeling, and Classification, which assigns class labels to requirements using adaptive fine-tuning of encoder-based models. The final output can be exported to external requirement engineering tools. Performance evaluations indicate that ReXCL significantly improves efficiency and accuracy in managing requirements, marking a novel approach to automating the schematization of semi-structured requirement documents.
Submitted 10 April, 2025;
originally announced April 2025.
-
Application of muon absorption tomography in imaging of civil structures
Authors:
Piyush Pallav,
Purba Bhattacharya,
Supratik Mukhopadhyay,
Nayana Majumdar
Abstract:
The research focuses on the non-invasive imaging technique using cosmic muon absorption tomography to monitor the internals of archaeological / civil / industrial structures of intermediate size. It integrates experimental measurements and numerical simulations with Geant4, ascertaining the reliability and precision of muon absorption tomography using easily available components for the stated purpose. The experiment probes muon interactions across a range of materials, including those commonly used in building civil and industrial structures. An experiment, fondly named MARS (Muon Absorption in Rigid Structures), was carried out to explore the possibility of using overlapped scintillation paddles for improved mapping of inhomogeneities in structures made of concrete. Good correlation between experimental and simulated results for all tests indicates that this simple approach can be implemented for non-destructive evaluations of structures of civil and industrial interest.
Submitted 30 March, 2025;
originally announced March 2025.
-
Observation of giant remnant polarization in ultrathin AlScN at cryogenic temperatures
Authors:
Seunguk Song,
Dhiren K. Pradhan,
Zekun Hu,
Yinuo Zhang,
Rachael N. Keneipp,
Michael A. Susner,
Pijush Bhattacharya,
Marija Drndić,
Roy H. Olsson III,
Deep Jariwala
Abstract:
The discovery of wurtzite ferroelectrics opens new frontiers in polar materials, yet their behavior at cryogenic temperatures remains unexplored. Here, we reveal unprecedented ferroelectric properties in ultrathin (10 nm) Al$_{0.68}$Sc$_{0.32}$N (AlScN) at cryogenic temperatures where the properties are fundamentally distinct from those of conventional oxide ferroelectrics. At 12 K, we demonstrate a giant remnant polarization exceeding 250 $\mu$C/cm$^2$ -- more than twice that of any known ferroelectric -- driven by an enhanced c/a ratio in the wurtzite structure. Our devices sustain remarkably high electric fields (~13 MV/cm) while maintaining reliable switching, achieving over $10^4$ polarization reversal cycles at 12 K. Critically, this breakdown field strength approaches that of passive dielectric materials while maintaining ferroelectric functionality. The extraordinary polarization enhancement and high-field stability at cryogenic temperatures contrast sharply with oxide ferroelectrics, establishing wurtzite ferroelectrics as a distinct class of polar materials with implications spanning fundamental physics to cryogenic non-volatile memory and quantum technologies.
Submitted 25 March, 2025;
originally announced March 2025.
-
Training Video Foundation Models with NVIDIA NeMo
Authors:
Zeeshan Patel,
Ethan He,
Parth Mannan,
Xiaowei Ren,
Ryan Wolf,
Niket Agarwal,
Jacob Huffman,
Zhuoyao Wang,
Carl Wang,
Jack Chang,
Yan Bai,
Tommy Huang,
Linnan Wang,
Sahil Jain,
Shanmugam Ramasamy,
Joseph Jennings,
Ekaterina Sirazitdinova,
Oleg Sudakov,
Mingyuan Ma,
Bobby Chen,
Forrest Lin,
Hao Wang,
Vasanth Rao Naik Sabavat,
Sriharsha Niverty,
Rong Ou
, et al. (4 additional authors not shown)
Abstract:
Video Foundation Models (VFMs) have recently been used to simulate the real world to train physical AI systems and develop creative visual experiences. However, there are significant challenges in training large-scale, high-quality VFMs that can generate high-quality videos. We present a scalable, open-source VFM training pipeline with NVIDIA NeMo, providing accelerated video dataset curation, multimodal data loading, and parallelized video diffusion model training and inference. We also provide a comprehensive performance analysis highlighting best practices for efficient VFM training and inference.
Submitted 17 March, 2025;
originally announced March 2025.
-
MARRO: Multi-headed Attention for Rhetorical Role Labeling in Legal Documents
Authors:
Purbid Bambroo,
Subinay Adhikary,
Paheli Bhattacharya,
Abhijnan Chakraborty,
Saptarshi Ghosh,
Kripabandhu Ghosh
Abstract:
Identification of rhetorical roles like facts, arguments, and final judgments is central to understanding a legal case document and can lend power to other downstream tasks like legal case summarization and judgment prediction. However, there are several challenges to this task. Legal documents are often unstructured and contain a specialized vocabulary, making it hard for conventional transformer models to understand them. Additionally, these documents run into several pages, which makes it difficult for neural models to capture the entire context at once. Lastly, there is a dearth of annotated legal documents to train deep learning models. Previous state-of-the-art approaches for this task have focused on using neural models like BiLSTM-CRF or have explored different embedding techniques to achieve decent results. While such techniques have shown that better embedding can result in improved model performance, not many models have focused on utilizing attention for learning better embeddings in sentences of a document. Additionally, it has been recently shown that advanced techniques like multi-task learning can help the models learn better representations, thereby improving performance. In this paper, we combine these two aspects by proposing a novel family of multi-task learning-based models for rhetorical role labeling, named MARRO, that uses transformer-inspired multi-headed attention. Using label shift as an auxiliary task, we show that models from the MARRO family achieve state-of-the-art results on two labeled datasets for rhetorical role labeling, from the Indian and UK Supreme Courts.
Submitted 8 March, 2025;
originally announced March 2025.
-
When Incentives Backfire, Data Stops Being Human
Authors:
Sebastin Santy,
Prasanta Bhattacharya,
Manoel Horta Ribeiro,
Kelsey Allen,
Sewoong Oh
Abstract:
Progress in AI has relied on human-generated data, from annotator marketplaces to the wider Internet. However, the widespread use of large language models now threatens the quality and integrity of human-generated data on these very platforms. We argue that this issue goes beyond the immediate challenge of filtering AI-generated content -- it reveals deeper flaws in how data collection systems are designed. Existing systems often prioritize speed, scale, and efficiency at the cost of intrinsic human motivation, leading to declining engagement and data quality. We propose that rethinking data collection systems to align with contributors' intrinsic motivations -- rather than relying solely on external incentives -- can help sustain high-quality data sourcing at scale while maintaining contributor trust and long-term participation.
Submitted 7 June, 2025; v1 submitted 11 February, 2025;
originally announced February 2025.
-
Drone Detection and Tracking with YOLO and a Rule-based Method
Authors:
Purbaditya Bhattacharya,
Patrick Nowak
Abstract:
Drones or unmanned aerial vehicles are traditionally used for military missions, warfare, and espionage. However, the usage of drones has significantly increased due to multiple industrial applications involving security and inspection, transportation, research purposes, and recreational drone flying. Such an increased volume of drone activity in public spaces requires regulatory actions for purposes of privacy protection and safety. Hence, detection of illegal drone activities such as boundary encroachment becomes a necessity. Such detection tasks are usually automated and performed by deep learning models which are trained on annotated image datasets. This paper builds on a previous work and extends an already published open source dataset. A description and analysis of the entire dataset is provided. The dataset is used to train the YOLOv7 deep learning model and some of its minor variants and the results are provided. Since the detection models are based on a single image input, a simple cross-correlation based tracker is used to reduce detection drops and improve tracking performance in videos. Finally, the entire drone detection system is summarized.
Submitted 7 February, 2025;
originally announced February 2025.
-
Rethinking stance detection: A theoretically-informed research agenda for user-level inference using language models
Authors:
Prasanta Bhattacharya,
Hong Zhang,
Yiming Cao,
Wei Gao,
Brandon Siyuan Loh,
Joseph J. P. Simons,
Liang Ze Wong
Abstract:
Stance detection has emerged as a popular task in natural language processing research, enabled largely by the abundance of target-specific social media data. While there has been considerable research on the development of stance detection models, datasets, and application, we highlight important gaps pertaining to (i) a lack of theoretical conceptualization of stance, and (ii) the treatment of stance at an individual- or user-level, as opposed to message-level. In this paper, we first review the interdisciplinary origins of stance as an individual-level construct to highlight relevant attributes (e.g., psychological features) that might be useful to incorporate in stance detection models. Further, we argue that recent pre-trained and large language models (LLMs) might offer a way to flexibly infer such user-level attributes and/or incorporate them in modelling stance. To better illustrate this, we briefly review and synthesize the emerging corpus of studies on using LLMs for inferring stance, and specifically on incorporating user attributes in such tasks. We conclude by proposing a four-point agenda for pursuing stance detection research that is theoretically informed, inclusive, and practically impactful.
Submitted 4 February, 2025;
originally announced February 2025.
-
Is your LLM trapped in a Mental Set? Investigative study on how mental sets affect the reasoning capabilities of LLMs
Authors:
Saiful Haq,
Niyati Chhaya,
Piyush Pandey,
Pushpak Bhattacharya
Abstract:
In this paper, we present an investigative study on how Mental Sets influence the reasoning capabilities of LLMs. LLMs have excelled in diverse natural language processing (NLP) tasks, driven by advancements in parameter-efficient fine-tuning (PEFT) and emergent capabilities like in-context learning (ICL). For complex reasoning tasks, selecting the right model for PEFT or ICL is critical, often relying on scores on benchmarks such as MMLU, MATH, and GSM8K. However, current evaluation methods, based on metrics like F1 Score or reasoning chain assessments by larger models, overlook a key dimension: adaptability to unfamiliar situations and overcoming entrenched thinking patterns. In cognitive psychology, Mental Set refers to the tendency to persist with previously successful strategies, even when they become inefficient - a challenge for problem solving and reasoning. We compare the performance of LLM models like Llama-3.1-8B-Instruct, Llama-3.1-70B-Instruct and GPT-4o in the presence of mental sets. To the best of our knowledge, this is the first study to integrate cognitive psychology concepts into the evaluation of LLMs for complex reasoning tasks, providing deeper insights into their adaptability and problem-solving efficacy.
Submitted 20 January, 2025;
originally announced January 2025.
-
Selective Shot Learning for Code Explanation
Authors:
Paheli Bhattacharya,
Rishabh Gupta
Abstract:
Code explanation plays a crucial role in the software engineering domain, aiding developers in grasping code functionality efficiently. Recent work shows that the performance of LLMs for code explanation improves in a few-shot setting, especially when the few-shot examples are selected intelligently. State-of-the-art approaches for such Selective Shot Learning (SSL) include token-based and embedding-based methods. However, these SSL approaches have been evaluated on proprietary LLMs, with little exploration of open-source Code-LLMs. Additionally, these methods do not take programming language syntax into account. To bridge these gaps, we present a comparative study and propose a novel SSL method (SSL_ner) that utilizes entity information for few-shot example selection. We present several insights and show the effectiveness of the SSL_ner approach over state-of-the-art methods across two datasets. To the best of our knowledge, this is the first systematic benchmarking of open-source Code-LLMs that also assesses the performance of the various few-shot example selection approaches for the code explanation task.
Submitted 17 December, 2024;
originally announced December 2024.
-
Equivariant Weiss Calculus
Authors:
Prasit Bhattacharya,
Yang Hu
Abstract:
In this paper, we introduce an equivariant analog of Weiss calculus of functors for all finite groups $\mathrm{G}$. In our theory, Taylor approximations and derivatives are indexed by finite-dimensional $\mathrm{G}$-representations, and homogeneous layers are classified by orthogonal $\mathrm{G}$-spectra. Further, our framework permits a notion of restriction as well as a notion of fixed points at the level of Weiss functors. We establish various results comparing Taylor approximations and derivatives of fixed-point (resp. restriction) functors to the fixed points (resp. restrictions) of Taylor approximations and derivatives.
Submitted 26 October, 2024; v1 submitted 15 October, 2024;
originally announced October 2024.
-
Predicting User Stances from Target-Agnostic Information using Large Language Models
Authors:
Siyuan Brandon Loh,
Liang Ze Wong,
Prasanta Bhattacharya,
Joseph Simons,
Wei Gao,
Hong Zhang
Abstract:
We investigate Large Language Models' (LLMs) ability to predict a user's stance on a target given a collection of his/her target-agnostic social media posts (i.e., user-level stance prediction). While we show early evidence that LLMs are capable of this task, we highlight considerable variability in the performance of the model across (i) the type of stance target, (ii) the prediction strategy and (iii) the number of target-agnostic posts supplied. Post-hoc analyses further hint at the usefulness of target-agnostic posts in providing relevant information to LLMs through the presence of both surface-level (e.g., target-relevant keywords) and user-level features (e.g., encoding users' moral values). Overall, our findings suggest that LLMs might offer a viable method for determining public stances towards new topics based on historical and target-agnostic data. At the same time, we also call for further research to better understand LLMs' strong performance on the stance prediction task and how their effectiveness varies across task contexts.
Submitted 22 September, 2024;
originally announced September 2024.
-
Nemotron-4 340B Technical Report
Authors:
Nvidia,
Bo Adler,
Niket Agarwal,
Ashwath Aithal,
Dong H. Anh,
Pallab Bhattacharya,
Annika Brundyn,
Jared Casper,
Bryan Catanzaro,
Sharon Clay,
Jonathan Cohen,
Sirshak Das,
Ayush Dattagupta,
Olivier Delalleau,
Leon Derczynski,
Yi Dong,
Daniel Egert,
Ellie Evans,
Aleksander Ficek,
Denys Fridman,
Shaona Ghosh,
Boris Ginsburg,
Igor Gitman,
Tomasz Grzegorzek
, et al. (58 additional authors not shown)
Abstract:
We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and their outputs. These models perform competitively with open access models on a wide range of evaluation benchmarks, and were sized to fit on a single DGX H100 with 8 GPUs when deployed in FP8 precision. We believe that the community can benefit from these models in various research studies and commercial applications, especially for generating synthetic data to train smaller language models. Notably, over 98% of the data used in our model alignment process is synthetically generated, showcasing the effectiveness of these models in generating synthetic data. To further support open research and facilitate model development, we are also open-sourcing the synthetic data generation pipeline used in our model alignment process.
Submitted 6 August, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
From traces to measures: Large language models as a tool for psychological measurement from text
Authors:
Joseph J. P. Simons,
Wong Liang Ze,
Prasanta Bhattacharya,
Brandon Siyuan Loh,
Wei Gao
Abstract:
Large language models are increasingly being used to label or rate psychological features in text data. This approach helps address one of the limiting factors of digital trace data - their lack of an inherent target of measurement. However, this approach is also a form of psychological measurement (using observable variables to quantify a hypothetical latent construct). As such, these ratings are subject to the same psychometric considerations of reliability and validity as more standard psychological measures. Here we present a workflow for developing and evaluating large language model based measures of psychological features which incorporate these considerations. We also provide an example, attempting to measure the previously established constructs of attitude certainty, importance and moralization from text. Using a pool of prompts adapted from existing measurement instruments, we find they have good levels of internal consistency but only partially meet validity criteria.
Submitted 13 October, 2024; v1 submitted 12 May, 2024;
originally announced May 2024.
-
Towards Universal Performance Modeling for Machine Learning Training on Multi-GPU Platforms
Authors:
Zhongyi Lin,
Ning Sun,
Pallab Bhattacharya,
Xizhou Feng,
Louis Feng,
John D. Owens
Abstract:
Characterizing and predicting the training performance of modern machine learning (ML) workloads on compute systems with compute and communication spread between CPUs, GPUs, and network devices is not only the key to optimization and planning but also a complex goal to achieve. The primary challenges include the complexity of synchronization and load balancing between CPUs and GPUs, the variance in input data distribution, and the use of different communication devices and topologies (e.g., NVLink, PCIe, network cards) that connect multiple compute devices, coupled with the desire for flexible training configurations. Building on our prior work for single-GPU platforms, we address these challenges and enable multi-GPU performance modeling by incorporating (1) data-distribution-aware performance models for embedding table lookup, and (2) data movement prediction of communication collectives, into our upgraded performance modeling pipeline equipped with inter- and intra-rank synchronization for ML workloads trained on multi-GPU platforms. Beyond accurately predicting the per-iteration training time of DLRM models with random configurations with a geomean error of 5.21% on two multi-GPU platforms, our prediction pipeline generalizes well to other types of ML workloads, such as Transformer-based NLP models with a geomean error of 3.00%. Moreover, even without actually running ML workloads like DLRMs on the hardware, it is capable of generating insights such as quickly selecting the fastest embedding table sharding configuration (with a success rate of 85%).
Submitted 26 November, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
New infinite families in the stable homotopy groups of spheres
Authors:
Prasit Bhattacharya,
Irina Bobkova,
J. D. Quigley
Abstract:
We identify seven new $192$-periodic infinite families of elements in the $2$-primary stable homotopy groups of spheres. Although their Hurewicz image is trivial for topological modular forms, they remain nontrivial after $\mathrm{T}(2)$- as well as $\mathrm{K}(2)$-localization. We also obtain new information about $2$-torsion and $2$-divisibility of some of the previously known $192$-periodic infinite families in the stable stems.
Submitted 9 May, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Numerical simulation of charging up, accumulation of space charge and formation of discharges
Authors:
Purba Bhattacharya,
Promita Roy,
Tanay Dey,
Jaydeep Datta,
Prasant K. Rout,
Nayana Majumdar,
Supratik Mukhopadhyay
Abstract:
Aging and stability of gaseous ionization detectors are intricately related to charging up, accumulation of space charge and formation of discharges. All these phenomena, in turn, depend on the dynamics of charged particles within the device. Because of the large number of particles involved and their complex interactions, the dynamic processes of generation and loss of charged particles, and their transport within the detector volume, are extremely expensive to simulate numerically. In this work, we propose and evaluate possible algorithms and approaches that show some promise in relation to the above-mentioned problems. Several important ionization detectors having parallel plate configurations, such as GEM, Micromegas, RPCs and THGEMs, are considered for this purpose. Information related to primary ionization is obtained from HEED, while all the transport properties are evaluated using MAGBOLTZ. The transport dynamics have been followed using two different approaches. In the first, a particle description using the neBEM-Garfield++ combination has been used. For this purpose, the neBEM solver has been significantly improved so that perturbations due to the charged particles present within the device are considered while estimating the electric field. In the second approach, the transport is simulated following a hydrodynamic model using COMSOL, which also provides the electric field and makes it easy to set up space charge effects. A comparison between these approaches will be presented. The effect of different simulation parameters will also be demonstrated using simple examples.
Submitted 5 March, 2024;
originally announced March 2024.
-
Exploring Large Language Models for Code Explanation
Authors:
Paheli Bhattacharya,
Manojit Chakraborty,
Kartheek N S N Palepu,
Vikas Pandey,
Ishan Dindorkar,
Rakesh Rajpurohit,
Rishabh Gupta
Abstract:
Automating code documentation through explanatory text can prove highly beneficial in code understanding. Large Language Models (LLMs) have made remarkable strides in Natural Language Processing, especially within software engineering tasks such as code generation and code summarization. This study specifically delves into the task of generating natural-language summaries for code snippets, using various LLMs. The findings indicate that Code LLMs outperform their generic counterparts, and zero-shot methods yield superior results when dealing with datasets with dissimilar distributions between training and testing sets.
Submitted 25 October, 2023;
originally announced October 2023.
-
Enhancing Stance Classification on Social Media Using Quantified Moral Foundations
Authors:
Hong Zhang,
Quoc-Nam Nguyen,
Prasanta Bhattacharya,
Wei Gao,
Liang Ze Wong,
Brandon Siyuan Loh,
Joseph J. P. Simons,
Jisun An
Abstract:
This study enhances stance detection on social media by incorporating deeper psychological attributes, specifically individuals' moral foundations. These theoretically-derived dimensions aim to provide a comprehensive profile of an individual's moral concerns which, in recent work, have been linked to behaviour in a range of domains, including society, politics, health, and the environment. In this paper, we investigate how moral foundation dimensions can contribute to predicting an individual's stance on a given target. Specifically, we incorporate moral foundation features extracted from text, along with message semantic features, to classify stances at both message- and user-levels using both traditional machine learning models and large language models. Our preliminary results suggest that encoding moral foundations can enhance the performance of stance detection tasks and help illuminate the associations between specific moral foundations and online stances on target topics. The results highlight the importance of considering deeper psychological attributes in stance analysis and underscore the role of moral foundations in guiding online social behavior.
Submitted 29 September, 2024; v1 submitted 15 October, 2023;
originally announced October 2023.
-
On the Steenrod module structure of $\mathbb{R}$-motivic Spanier-Whitehead duals
Authors:
Prasit Bhattacharya,
Bertrand J. Guillou,
Ang Li
Abstract:
The $\mathbb{R}$-motivic cohomology of an $\mathbb{R}$-motivic spectrum is a module over the $\mathbb{R}$-motivic Steenrod algebra $\mathcal{A}^{\mathbb{R}}$. In this paper, we describe how to recover the $\mathbb{R}$-motivic cohomology of the Spanier-Whitehead dual $\mathrm{DX}$ of an $\mathbb{R}$-motivic finite complex $\mathrm{X}$, as an $\mathcal{A}^{\mathbb{R}}$-module, given the $\mathcal{A}^{\mathbb{R}}$-module structure on the cohomology of $\mathrm{X}$. As an application, we show that 16 out of 128 different $\mathcal{A}^{\mathbb{R}}$-module structures on $\mathcal{A}^{\mathbb{R}}(1):= \langle \mathrm{Sq}^1, \mathrm{Sq}^2 \rangle$ are self-dual.
Submitted 18 October, 2023; v1 submitted 27 September, 2023;
originally announced September 2023.
-
Estimating Policy Effects in a Social Network with Independent Set Sampling
Authors:
Eugene Ang,
Prasanta Bhattacharya,
Andrew Lim
Abstract:
Evaluating the impact of policy interventions on respondents who are embedded in a social network is often challenging due to the presence of network interference within the treatment groups, as well as between treatment and non-treatment groups throughout the network. In this paper, we propose a novel empirical strategy that combines network sampling based on the identification of independent sets with a stochastic actor-oriented model (SAOM) to infer the direct and net effects of a policy. By assigning respondents from an independent set to the treatment, we are able to block direct spillover of the treatment among the treated respondents for an extended period of time, during which the direct effect of the treatment can be isolated from the associated network interference. We empirically demonstrate this using a simulation-based evaluation of a fictitious policy implementation using both real-life and generated networks, and use a counterfactual approach to estimate the treatment effect of the policy. Our results highlight the effectiveness of our proposed empirical strategy, and notably, the role of network sampling techniques in influencing the evaluation of policy effects. The findings from this study have the potential to help researchers and policymakers with planning, designing, and anticipating policy responses in a networked society.
Submitted 28 October, 2024; v1 submitted 25 June, 2023;
originally announced June 2023.
-
Equivariant orientation of vector bundles over disconnected base spaces
Authors:
Prasit Bhattacharya,
Foling Zou
Abstract:
In this paper, we view the equivariant orientation theory of equivariant vector bundles through the lens of equivariant Picard spectra. This viewpoint allows us to identify, for a finite group $\mathrm{G}$, a precise condition under which an $\mathrm{R}$-orientation of a $\mathrm{G}$-equivariant vector bundle is encoded by a Thom class. Consequently, we are able to construct a generalization of the first Stiefel-Whitney class of a "homogeneous" $\mathrm{G}$-equivariant bundle with respect to an $\mathbb{E}_\infty^{\mathrm{G}}$-ring spectrum $\mathrm{R}$. As an application, we show that the $2$-fold direct sum of any homogeneous bundle is $\mathrm{H}\underline{\mathcal{A}}_{\mathrm{G}}$-orientable, where $\underline{\mathcal{A}}_{\mathrm{G}}$ is the Burnside Mackey functor. We notice that $\mathrm{H}\underline{\mathcal{A}}_{\mathrm{G}}$-orientability is equivalent to $\mathrm{H}\underline{\mathbb{Z}}$-orientability when the order of $\mathrm{G}$ is odd. When the order of $\mathrm{G}$ is even, we show that a $\mathrm{G}$-equivariant analog of the tautological line bundle over $\mathbb{RP}^\infty$ is $\mathrm{H}\underline{\mathbb{Z}}$-orientable but not $\mathrm{H}\underline{\mathcal{A}}_{\mathrm{G}}$-orientable.
Submitted 21 September, 2024; v1 submitted 17 March, 2023;
originally announced March 2023.
-
The structure of the v_2-local algebraic tmf resolution
Authors:
Mark Behrens,
Prasit Bhattacharya,
Dominic Culver
Abstract:
We give a complete description of the E_1-term of the v_2-local as well as g-local algebraic tmf resolution.
Submitted 23 February, 2025; v1 submitted 26 January, 2023;
originally announced January 2023.
-
FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models
Authors:
Geet Sethi,
Pallab Bhattacharya,
Dhruv Choudhary,
Carole-Jean Wu,
Christos Kozyrakis
Abstract:
Sequence-based deep learning recommendation models (DLRMs) are an emerging class of DLRMs showing great improvements over their prior sum-pooling based counterparts at capturing users' long term interests. These improvements come at immense system cost however, with sequence-based DLRMs requiring substantial amounts of data to be dynamically materialized and communicated by each accelerator during a single iteration. To address this rapidly growing bottleneck, we present FlexShard, a new tiered sequence embedding table sharding algorithm which operates at a per-row granularity by exploiting the insight that not every row is equal. Through precise replication of embedding rows based on their underlying probability distribution, along with the introduction of a new sharding strategy adapted to the heterogeneous, skewed performance of real-world cluster network topologies, FlexShard is able to significantly reduce communication demand while using no additional memory compared to the prior state-of-the-art. When evaluated on production-scale sequence DLRMs, FlexShard was able to reduce overall global all-to-all communication traffic by over 85%, resulting in end-to-end training communication latency improvements of almost 6x over the prior state-of-the-art approach.
Submitted 7 January, 2023;
originally announced January 2023.
-
What You Like: Generating Explainable Topical Recommendations for Twitter Using Social Annotations
Authors:
Parantapa Bhattacharya,
Saptarshi Ghosh,
Muhammad Bilal Zafar,
Soumya K. Ghosh,
Niloy Ganguly
Abstract:
With over 500 million tweets posted per day on Twitter, it is difficult for Twitter users to discover interesting content from the deluge of uninteresting posts. In this work, we present a novel, explainable, topical recommendation system that utilizes social annotations to help Twitter users discover tweets on topics of their interest. A major challenge in using traditional rating-dependent recommendation systems, like collaborative filtering and content-based systems, in high-volume social networks is that, due to attention scarcity, most items do not get any ratings. Additionally, the fact that most Twitter users are passive consumers, with 44% of users never tweeting, makes it very difficult to use user ratings for generating recommendations. Further, a key challenge in developing recommendation systems is that in many cases users reject relevant recommendations if they are totally unfamiliar with the recommended item. Providing a suitable explanation for why the item is recommended significantly improves the acceptability of the recommendation. By virtue of being a topical recommendation system, our method is able to present simple topical explanations for the generated recommendations. Comparisons with state-of-the-art matrix-factorization-based collaborative filtering, content-based and social recommendations demonstrate the efficacy of the proposed approach.
Submitted 23 December, 2022;
originally announced December 2022.
-
Analyzing Regrettable Communications on Twitter: Characterizing Deleted Tweets and Their Authors
Authors:
Parantapa Bhattacharya,
Saptarshi Ghosh,
Niloy Ganguly
Abstract:
Over 500 million tweets are posted on Twitter each day, of which about 11% are deleted by the users posting them. This phenomenon of widespread deletion of tweets leads to a number of questions: what kind of content posted by users makes them want to delete it later? Are users of certain predispositions more likely to post regrettable tweets, deleting them later? In this paper we provide a detailed characterization of tweets posted and then later deleted by their authors. We collected tweets from over 200 thousand Twitter users during a period of four weeks. Our characterization shows significant personality differences between users who delete their tweets and those who do not. We find that users who delete their tweets are more likely to be extroverted and neurotic while being less conscientious. Also, we find that deleted tweets, while containing less information and being less conversational, contain significant indications of regrettable content. Since users of online communication do not have instant social cues (like a listener's body language) to gauge the impact of their words, they are often delayed in employing repair strategies. Finally, we build a classifier which takes textual, contextual, as well as user features to predict whether a tweet will be deleted or not. The classifier achieves an F1-score of 0.78, and the precision increases when we consider response features of the tweets.
Submitted 23 December, 2022;
originally announced December 2022.
-
Task Preferences across Languages on Community Question Answering Platforms
Authors:
Sebastin Santy,
Prasanta Bhattacharya,
Rishabh Mehrotra
Abstract:
With the steady emergence of community question answering (CQA) platforms like Quora, StackExchange, and WikiHow, users now have unprecedented access to information on various kinds of queries and tasks. Moreover, the rapid proliferation and localization of these platforms spanning geographic and linguistic boundaries offer a unique opportunity to study the task requirements and preferences of users in different socio-linguistic groups. In this study, we implement an entity-embedding model trained on a large longitudinal dataset of multi-lingual and task-oriented question-answer pairs to uncover and quantify the (i) prevalence and distribution of various online tasks across linguistic communities, and (ii) emerging and receding trends in task popularity over time in these communities. Our results show that there exists substantial variance in task preference as well as popularity trends across linguistic communities on the platform. Findings from this study will help Q&A platforms better curate and personalize content for non-English users, while also offering valuable insights to businesses looking to target non-English speaking communities online.
Submitted 18 December, 2022;
originally announced December 2022.
-
Numerical simulation of the response of single gap timing RPCs with the space charge effects and Garfield++
Authors:
Tanay Dey,
Purba Bhattacharya,
Supratik Mukhopadhyay,
Nayana Majumdar,
Abhishek Seal,
Subhasis Chattopadhyay
Abstract:
In this article, we report the simulated response of timing RPCs of different gas gaps. A 3D Monte Carlo code was developed and integrated with Garfield++ to simulate the avalanche processes with space charge effects, which allows realistic charge and timing spectra to be obtained. The results of this study are presented with examples of timing RPCs of gas gaps 0.02 cm and 0.03 cm.
Submitted 10 December, 2022;
originally announced December 2022.
-
Parallelization of Garfield++ and neBEM to simulate space charge effects in RPCs
Authors:
Tanay Dey,
Purba Bhattacharya,
Supratik Mukhopadhyay,
Nayana Majumdar,
Abhishek Seal,
Subhasis Chattopadhyay
Abstract:
Numerical simulation of avalanches, saturated avalanches, and streamers can help us understand the detector physics of Resistive Plate Chambers (RPCs). A 3D Monte Carlo simulation of an avalanche inside an RPC, and of the transition from avalanche to saturated avalanche to streamer, may help the search for the optimum voltage and alternative gas mixtures. This task is dauntingly resource-hungry, especially when space charge effects become important, which often coincides with important regimes of operation of these devices. By modifying the electric field inside the RPC dynamically, the space charge plays a crucial role in determining the response of the detector. In this work, a numerical model has been proposed to calculate the dynamic space-charge field inside an RPC, and it has been implemented in the Garfield++ framework. By modeling space charge as a large number of line charges and using OpenMP multithreading to calculate electric field, drift line, electron gain, and space charge field, it has been possible to maintain time consumption within reasonable limits. For this purpose, a new class, pAvalancheMC, has been introduced in Garfield++. The calculations have been successfully verified against those from existing solvers, and an example is provided to show the performance of pAvalancheMC. Moreover, the details of the transition of an avalanche into a saturated avalanche have been discussed. The induced charge distribution is calculated for a timing RPC and the results are verified against experiment.
Submitted 5 October, 2023; v1 submitted 11 November, 2022;
originally announced November 2022.
-
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages
Authors:
Anusha Prakash,
Arun Kumar,
Ashish Seth,
Bhagyashree Mukherjee,
Ishika Gupta,
Jom Kuriakose,
Jordan Fernandes,
K V Vikram,
Mano Ranjith Kumar M,
Metilda Sagaya Mary,
Mohammad Wajahat,
Mohana N,
Mudit Batra,
Navina K,
Nihal John George,
Nithya Ravi,
Pruthwik Mishra,
Sudhanshu Srivastava,
Vasista Sai Lodagala,
Vandan Mujadia,
Kada Sai Venkata Vineeth,
Vrunda Sukhadia,
Dipti Sharma,
Hema Murthy,
Pushpak Bhattacharya
, et al. (2 additional authors not shown)
Abstract:
Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, and text-to-speech synthesis followed by isochronous lipsyncing to the original video. This task becomes challenging when the source and target languages belong to different language families, resulting in differences in generated audio duration. This is further compounded by the original speaker's rhythm, especially for extempore speech. This paper describes the challenges in regenerating English lecture videos in Indian languages semi-automatically. A prototype is developed for dubbing lectures into 9 Indian languages. A mean-opinion-score (MOS) is obtained for two languages, Hindi and Tamil, on two different courses. The output video is compared with the original video in terms of MOS (1-5) and lip synchronisation with scores of 4.09 and 3.74, respectively. Human effort is also reduced by 75%.
Submitted 1 November, 2022;
originally announced November 2022.
-
Design and studies of thick Gas Electron Multipliers fabricated in India
Authors:
Promita Roy,
Purba Bhattacharya,
Vishal Kumar,
Supratik Mukhopadhyay,
Nayana Majumdar,
Sandip Sarkar
Abstract:
THick Gas Electron Multipliers (THGEMs) are robust and high gain Micro Pattern Gaseous Detectors which are economically manufactured by standard drilling and etching of thin printed circuit boards. In this paper, we present our recent simulation as well as experimental studies on THGEMs which have been fabricated in India using local expertise. Two types of THGEMs have been fabricated; one set has holes without any external rim and another set has holes with rims. These detectors have been characterized using argon-carbon dioxide and argon-isobutane gas mixtures. Electron transmission, effective gain, energy resolution and optimized working range studies have been presented for both the sets of THGEMs.
Submitted 17 December, 2022; v1 submitted 19 October, 2022;
originally announced October 2022.
-
Deepfake Text Detection: Limitations and Opportunities
Authors:
Jiameng Pu,
Zain Sarwar,
Sifat Muhammad Abdullah,
Abdullah Rehman,
Yoonjin Kim,
Parantapa Bhattacharya,
Mobin Javed,
Bimal Viswanath
Abstract:
Recent advances in generative models for language have enabled the creation of convincing synthetic text or deepfake text. Prior work has demonstrated the potential for misuse of deepfake text to mislead content consumers. Therefore, deepfake text detection, the task of discriminating between human and machine-generated text, is becoming increasingly critical. Several defenses have been proposed for deepfake text detection. However, we lack a thorough understanding of their real-world applicability. In this paper, we collect deepfake text from 4 online services powered by Transformer-based tools to evaluate the generalization ability of the defenses on content in the wild. We develop several low-cost adversarial attacks, and investigate the robustness of existing defenses against an adaptive attacker. We find that many defenses show significant degradation in performance under our evaluation scenarios compared to their original claimed performance. Our evaluation shows that tapping into the semantic information in the text content is a promising approach for improving the robustness and generalization performance of deepfake text detection schemes.
Submitted 17 October, 2022;
originally announced October 2022.
-
Legal Case Document Summarization: Extractive and Abstractive Methods and their Evaluation
Authors:
Abhay Shukla,
Paheli Bhattacharya,
Soham Poddar,
Rajdeep Mukherjee,
Kripabandhu Ghosh,
Pawan Goyal,
Saptarshi Ghosh
Abstract:
Summarization of legal case judgement documents is a challenging problem in Legal NLP. However, few analyses exist of how different families of summarization models (e.g., extractive vs. abstractive) perform when applied to legal case documents. This question is particularly important since many recent transformer-based abstractive summarization models have restrictions on the number of input tokens, and legal documents are known to be very long. Also, how best to evaluate legal case document summarization systems remains an open question. In this paper, we carry out extensive experiments with several extractive and abstractive summarization methods (both supervised and unsupervised) over three legal summarization datasets that we have developed. Our analyses, which include evaluation by law practitioners, lead to several interesting insights on legal summarization in particular and long document summarization in general.
Submitted 14 October, 2022;
originally announced October 2022.
-
Reduction in turbulence-induced non-linear dynamic vibration using tuned liquid damper (TLD)
Authors:
Ananya Majumdar,
Biplab Ranjan Adhikary,
Partha Bhattacharya
Abstract:
In the present research work, an attempt is made to develop a coupled non-linear turbulence-structure-damper model in a finite volume-finite difference (FV-FD) framework. A tuned liquid damper (TLD) is used as the additional damping system along with inherent structural damping. Real-time simulation of a flow-excited bridge box girder or chimney section and the vibration reduction using a TLD can be performed using the developed model. The turbulent flow field around a structure is modeled using an OpenFOAM transient PISO solver, and the time-varying drag force is calculated. This force perturbs the structure, causing sloshing in the attached TLD, which is modeled using a shallow depth approximation and damps the flow-induced vibration of the structure. The structural motion with and without the attached TLD is modeled with the FD-based Newmark-Beta method using in-house MATLAB codes. The TLD is tuned to the vortex-shedding frequency of the low-Reynolds number flows, and it is found to reduce the structural excitation significantly. On the other hand, the high-Reynolds number turbulent flow exhibits a broadband excitation, for which a good reduction in vibration is observed by tuning the TLD to a few frequencies obtained through investigations.
Submitted 19 October, 2023; v1 submitted 2 October, 2022;
originally announced October 2022.
-
Legal Case Document Similarity: You Need Both Network and Text
Authors:
Paheli Bhattacharya,
Kripabandhu Ghosh,
Arindam Pal,
Saptarshi Ghosh
Abstract:
Estimating the similarity between two legal case documents is an important and challenging problem, having various downstream applications such as prior-case retrieval and citation recommendation. There are two broad approaches for the task -- citation network-based and text-based. Prior citation network-based approaches consider citations only to prior-cases (also called precedents) (PCNet). This approach misses important signals inherent in Statutes (written laws of a jurisdiction). In this work, we propose Hier-SPCNet that augments PCNet with a heterogeneous network of Statutes. We incorporate domain knowledge for legal document similarity into Hier-SPCNet, thereby obtaining state-of-the-art results for network-based legal document similarity. Both textual and network similarity provide important signals for legal case similarity, but until now, only trivial attempts have been made to unify the two signals. In this work, we apply several methods for combining textual and network information for estimating legal case similarity. We perform extensive experiments over legal case documents from the Indian judiciary, where the gold standard similarity between document-pairs is judged by law experts from two reputed Law institutes in India. Our experiments establish that our proposed network-based methods significantly improve the correlation with domain experts' opinion when compared to the existing methods for network-based legal document similarity. Our best-performing combination method (that combines network-based and text-based similarity) improves the correlation with domain experts' opinion by 11.8% over the best text-based method and 20.6% over the best network-based method. We also establish that our best-performing method can be used to recommend / retrieve citable and similar cases for a source (query) case, which are well appreciated by legal experts.
Submitted 26 September, 2022;
originally announced September 2022.
-
Sensitivity mapping of TBL wall-pressure spectra with CFD turbulence models for wind tunnel test result prediction
Authors:
Biplab Ranjan Adhikary,
Ananya Majumdar,
Subhadeep Sarkar,
Partha Bhattacharya
Abstract:
In the present work, an attempt is made to map the sensitivity of existing zero pressure gradient (ZPG) turbulent boundary layer (TBL) wall-pressure spectrum models to different TBL parameters and, eventually, to different Reynolds-Averaged Navier-Stokes (RANS) turbulence models simulated in the OpenFOAM and ANSYS Fluent solvers. This study will help future researchers pair a particular RANS turbulence model with a particular wall-pressure spectrum model in order to predict wind tunnel test results with reasonable accuracy. First, the best-predicting pressure spectrum models are selected by comparing them with wind tunnel test data. Next, considering the experimental TBL parameters as benchmarks, errors in RANS-produced data are estimated. Furthermore, wall-pressure spectra are calculated from semi-empirical spectrum models using TBL parameters obtained from experiments and computational fluid dynamics (CFD) simulations. Finally, sensitivity mapping is performed between the spectrum models and the RANS models at different normalized wall-normal distances (y+).
Submitted 24 September, 2022;
originally announced September 2022.
-
Study of space charge phenomena in GEM-based detectors
Authors:
Promita Roy,
Prasant Kumar Rout,
Jaydeep Datta,
Purba Bhattacharya,
Supratik Mukhopadhyay,
Nayana Majumdar,
Sandip Sarkar
Abstract:
Space charge accumulation within GEM holes is one of the vital phenomena that affect many of the key working parameters of the detector. This accumulation is found to be significantly affected by the initial primary charge configurations and the applied GEM voltages, since they determine charge sharing and the subsequent evolution of the detector response. In this work, we have studied the effects of space charge phenomena on different parameters for single GEM detectors using a hybrid numerical model.
Submitted 21 September, 2022;
originally announced September 2022.
-
TBL-induced energy transmission into a double wall backed enclosure system computed in a cloud-based Python-FE environment
Authors:
Biplab Ranjan Adhikary,
Atanu Sahu,
Partha Bhattacharya
Abstract:
We propose a fully coupled numerical model to predict turbulent boundary layer (TBL) induced energy transmission behavior for a double-wall backed enclosure system in a finite element (FE) framework computed in a cloud-based Python environment. The Goody single-point wall-pressure spectrum and the Corcos spatial correlation function are used to generate the TBL cross-power spectra. Mindlin's first-order shear deformation model is considered for the panels, and a fully coupled TBL-structure-acoustic model is developed using the FE approach to predict the acoustic power level inside the enclosure for variable gap distance between the panels. The model is developed to capture the contribution of orthotropic lamina sequence, frequency-dependent structural damping, and stiffening orientation in predicting the energy transmission into a double-wall backed enclosure. Thus, a new numerical model is presented that gives designers more precise energy transmission quantification and greater flexibility in terms of the number of panel leaves, geometry, and boundary conditions of the enclosure system, backed by a double wall made of isotropic or orthotropic laminates.
Submitted 17 September, 2022;
originally announced September 2022.
-
A coupled FE-BE approach for vibro-acoustic response prediction of laminated composite panels due to turbulent boundary layer excitation involving Cholesky decomposition
Authors:
Biplab Ranjan Adhikary,
Atanu Sahu,
Partha Bhattacharya
Abstract:
An original numerical framework is developed in the present research work to estimate the free-field sound radiation from baffled structural panels subjected to turbulent boundary layer (TBL) induced excitation. A semi-analytical method is used to estimate the TBL wall-pressure spectrum, which is decomposed using the Cholesky technique to obtain random wall pressure in the frequency domain. Structural panels are modeled using the finite element technique, and a coupled finite element-boundary element modeling technique is developed to estimate the sound power level radiating into the free field. Results are obtained for laminated composite structural panels with various fiber orientations and significant findings are discussed. The developed technique has the potential to be further extended to complex structures in terms of geometry, material properties, and boundary conditions. The complete numerical toolbox, developed in-house in a MATLAB environment, enables the prediction of coupled turbulence-structure-acoustic behavior at an early design stage.
Submitted 15 September, 2022;
originally announced September 2022.
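Illustrative note on the entry above: the step of decomposing a wall-pressure cross-spectrum with the Cholesky technique can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' MATLAB toolbox. It assumes a Corcos-type cross-spectrum (the abstract only says "semi-analytical"), a flat point spectrum `phi`, and illustrative decay coefficients `alpha_x` and `alpha_y`; all function and parameter names are hypothetical.

```python
import numpy as np

def corcos_cross_spectrum(x, y, omega, phi, u_c, alpha_x=0.1, alpha_y=0.77):
    """Cross-spectral matrix between points (x[i], y[i]) at angular frequency omega.

    Assumed Corcos-type form: point spectrum phi times exponential decay in the
    streamwise (x) and spanwise (y) separations, with a convective phase term.
    The decay coefficients are illustrative values, not taken from the paper.
    """
    dx = x[:, None] - x[None, :]
    dy = y[:, None] - y[None, :]
    k_c = omega / u_c  # convective wavenumber
    decay = np.exp(-alpha_x * k_c * np.abs(dx)) * np.exp(-alpha_y * k_c * np.abs(dy))
    phase = np.exp(-1j * k_c * dx)
    return phi * decay * phase

def random_wall_pressure(S, rng):
    """Draw one random complex pressure vector whose expected cross-spectrum is S."""
    L = np.linalg.cholesky(S)  # S = L L^H, L lower triangular
    n = S.shape[0]
    # Unit-variance circular complex Gaussian samples
    xi = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2.0)
    return L @ xi, L

# Small 3 x 3 grid of pressure points, 5 cm apart (hypothetical panel mesh)
gx, gy = np.meshgrid(np.arange(3) * 0.05, np.arange(3) * 0.05)
S = corcos_cross_spectrum(gx.ravel(), gy.ravel(),
                          omega=2 * np.pi * 500.0, phi=1.0, u_c=35.0)
p, L = random_wall_pressure(S, np.random.default_rng(0))
```

Averaging the outer product of many such samples approaches `S`, which is why the Cholesky factor is a convenient way to generate correlated random loads at each frequency.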
-
Test-Beam and Simulation Studies Towards RPWELL-based DHCAL
Authors:
Dan Shaked-Renous,
Fernando Domingues Amaro,
Purba Bhattacharya,
Amos Breskin,
Maximilien Chefdeville,
Cyril Drancourt,
Theo Geralis,
Yannis Karyotakis,
Luca Moleri,
Andrea Tesi,
Maxim Titov,
Joao Veloso,
Guillaum Vouters,
Shikma Bressler
Abstract:
Digital Hadronic Calorimeters (DHCAL) were suggested for future colliders as part of the particle-flow concept. Though studied mainly with Resistive Plate Chambers (RPC), studies focusing on Micro-Pattern Gaseous Detector (MPGD)-based sampling elements have shown potential advantages: they can be operated with environmentally friendly gases and reach similar detection efficiency at lower average pad-multiplicity. We summarize here the experimental test-beam results of a small-size DHCAL prototype, incorporating six Micromegas (MM) and two Resistive-Plate WELL (RPWELL) sampling elements, interlaced with steel-absorber plates. It was investigated with a 2-6 GeV pion beam at the CERN/PS beam facility. The data permitted validating a GEANT4 simulation framework of a DHCAL, and evaluating the expected pion energy resolution of a full-scale RPWELL-based calorimeter. The pion energy resolution of $\frac{\sigma}{E[\mathrm{GeV}]}=\frac{50.8\%}{\sqrt{E[\mathrm{GeV}]}} \oplus 10.3\%$ expected with the RPWELL concept is competitive with that of glass RPC and MM sampling techniques.
Submitted 7 October, 2022; v1 submitted 26 August, 2022;
originally announced August 2022.
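Illustrative note on the entry above: the $\oplus$ in the quoted resolution denotes a sum in quadrature of the stochastic and constant terms. The short sketch below evaluates that expression at a given pion energy. The function name and its defaults are hypothetical; the two coefficients are the ones quoted in the abstract.

```python
import math

def pion_energy_resolution(energy_gev, stochastic=0.508, constant=0.103):
    """Relative resolution sigma/E = stochastic/sqrt(E) (+) constant.

    The (+) is a quadrature sum; the default coefficients are the 50.8% and
    10.3% quoted in the abstract, expressed as fractions.
    """
    if energy_gev <= 0:
        raise ValueError("energy must be positive")
    return math.hypot(stochastic / math.sqrt(energy_gev), constant)

# Evaluate across the 2-6 GeV range used in the test beam
resolutions = {e: pion_energy_resolution(e) for e in (2.0, 4.0, 6.0)}
```

The stochastic term shrinks with energy while the constant term does not, so the relative resolution improves with energy but never falls below 10.3%.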
-
A coupled FE-RRM-based numerical model for analysis of energy transmission loss through stiffened double-wall panel due to TBL excitation
Authors:
Biplab Ranjan Adhikary,
Atanu Sahu,
Partha Bhattacharya
Abstract:
We propose a fully coupled numerical model to predict energy transmission through a turbulent boundary layer (TBL) excited stiffened double-leaf flexible aircraft panel using a finite element (FE) framework. The Mindlin first-order shear deformation model is adopted for the panels, and a TBL-structure-acoustic coupling model is developed using a finite element-radiation resistance matrix (FE-RRM) approach to predict the transmission loss (TL) through double-leaf panels with variable thickness and stiffener orientation. The model can also capture the contribution of orthotropic lamina sequence and frequency-dependent structural damping in predicting the TL. Thus, a new numerical model is proposed that gives designers greater flexibility in terms of the number of panel leaves and the boundary and stiffening conditions of the aircraft panel-cavity-panel system, made of isotropic or orthotropic laminates.
Submitted 23 August, 2022;
originally announced August 2022.
-
TPP: Transparent Page Placement for CXL-Enabled Tiered-Memory
Authors:
Hasan Al Maruf,
Hao Wang,
Abhishek Dhanotia,
Johannes Weiner,
Niket Agarwal,
Pallab Bhattacharya,
Chris Petersen,
Mosharaf Chowdhury,
Shobhit Kanaujia,
Prakash Chauhan
Abstract:
The increasing demand for memory in hyperscale applications has led to memory becoming a large portion of the overall datacenter spend. The emergence of coherent interfaces like CXL enables main memory expansion and offers an efficient solution to this problem. In such systems, the main memory can constitute different memory technologies with varied characteristics. In this paper, we characterize memory usage patterns of a wide range of datacenter applications across the server fleet of Meta. We, therefore, demonstrate the opportunities to offload colder pages to slower memory tiers for these applications. Without efficient memory management, however, such systems can significantly degrade performance.
We propose a novel OS-level application-transparent page placement mechanism (TPP) for CXL-enabled memory. TPP employs a lightweight mechanism to identify and place hot/cold pages to appropriate memory tiers. It enables a proactive page demotion from local memory to CXL-Memory. This technique ensures a memory headroom for new page allocations that are often related to request processing and tend to be short-lived and hot. At the same time, TPP can promptly promote performance-critical hot pages trapped in the slow CXL-Memory to the fast local memory, while minimizing both sampling overhead and unnecessary migrations. TPP works transparently without any application-specific knowledge and can be deployed globally as a kernel release.
We evaluate TPP in the production server fleet with early samples of new x86 CPUs with CXL 1.1 support. TPP makes a tiered memory system perform nearly as well as an ideal baseline (<1% gap) that has all the memory in the local tier. It is 18% better than today's Linux, and 5-17% better than existing solutions including NUMA Balancing and AutoTiering. Most of the TPP patches have been merged in the Linux v5.18 release.
Submitted 28 May, 2023; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Double-hit separation and dE/dx resolution of a time projection chamber with GEM readout
Authors:
Yumi Aoki,
David Attié,
Ties Behnke,
Alain Bellerive,
Oleg Bezshyyko,
Deb Bhattacharya Sankar,
Purba Bhattacharya,
Sudeb Bhattacharya,
Yue Chang,
Paul Colas,
Gilles De Lentdecker,
Klaus Dehmelt,
Klaus Desch,
Ralf Diener,
Madhu Dixit,
Ulrich Einhaus,
Oleksiy Fedorchuk,
Ivor Fleck,
Keisuke Fujii,
Takahiro Fusayasu,
Serguei Ganjour,
Philippe Gros,
Peter Hayman,
Katsumasa Ikematsu,
Leif Jönsson
, et al. (46 additional authors not shown)
Abstract:
A time projection chamber (TPC) with micropattern gaseous detector (MPGD) readout is investigated as the main tracking device of the International Large Detector (ILD) concept at the planned International Linear Collider (ILC). A prototype TPC equipped with a triple gas electron multiplier (GEM) readout has been built and operated in an electron test beam. The TPC was placed in a 1 T solenoidal field at the DESY II Test Beam Facility, which provides an electron beam of up to 6 GeV/c. The performance of the readout modules, in particular the spatial point resolution, is determined and compared to earlier tests. New studies are presented with first results on the separation of close-by tracks and the capability of the system to measure the specific energy loss dE/dx. This is complemented by a simulation study on the optimization of the readout granularity to improve particle identification by dE/dx.
Submitted 25 November, 2022; v1 submitted 24 May, 2022;
originally announced May 2022.
-
A Tour of Visualization Techniques for Computer Vision Datasets
Authors:
Bilal Alsallakh,
Pamela Bhattacharya,
Vanessa Feng,
Narine Kokhlikyan,
Orion Reblitz-Richardson,
Rahul Rajan,
David Yan
Abstract:
We survey a number of data visualization techniques for analyzing Computer Vision (CV) datasets. These techniques help us understand properties and latent patterns in such data, by applying dataset-level analysis. We present various examples of how such analysis helps predict the potential impact of the dataset properties on CV models and informs appropriate mitigation of their shortcomings. Finally, we explore avenues for further visualization techniques of different modalities of CV datasets as well as ones that are tailored to support specific CV tasks and analysis needs.
Submitted 18 April, 2022;
originally announced April 2022.
-
Effect of hole geometry on charge sharing and other parameters in GEM-based detectors
Authors:
Promita Roy,
Purba Bhattacharya,
Prasant Kumar Rout,
Supratik Mukhopadhyay,
Nayana Majumdar,
Sandip Sarkar
Abstract:
Gas Electron Multipliers (GEM) are among the more prominent Micro-Pattern Gaseous Detectors (MPGDs) and are widely used in high energy particle physics experiments and various related applications. Adoption of different production techniques leads to holes of varying geometries in GEM foils. Since the response of a GEM-based detector is closely related to the hole geometry through the influence of the latter on charge sharing and transport through GEM foils, attempts have been made to relate hole configurations to different figures of merit of a detector. Numerical simulations have been performed to study the effects of hole geometry on important parameters such as charge sharing, collection efficiency, extraction efficiency, gain, and the possibility of transition from avalanche to streamer modes for single-, double- and triple-layer GEM detectors. The numerical estimates have been compared to available experimental data. The comparisons, although not always in agreement, are found to be generally encouraging.
Submitted 5 March, 2024; v1 submitted 23 December, 2021;
originally announced December 2021.
-
Incorporating Domain Knowledge for Extractive Summarization of Legal Case Documents
Authors:
Paheli Bhattacharya,
Soham Poddar,
Koustav Rudra,
Kripabandhu Ghosh,
Saptarshi Ghosh
Abstract:
Automatic summarization of legal case documents is an important and practical challenge. Apart from many domain-independent text summarization algorithms that can be used for this purpose, several algorithms have been developed specifically for summarizing legal case documents. However, most of the existing algorithms do not systematically incorporate domain knowledge that specifies what information should ideally be present in a legal case document summary. To address this gap, we propose an unsupervised summarization algorithm DELSumm which is designed to systematically incorporate guidelines from legal experts into an optimization setup. We conduct detailed experiments over case documents from the Indian Supreme Court. The experiments show that our proposed unsupervised method outperforms several strong baselines in terms of ROUGE scores, including both general summarization algorithms and legal-specific ones. In fact, though our proposed algorithm is unsupervised, it outperforms several supervised summarization models that are trained over thousands of document-summary pairs.
Submitted 30 June, 2021;
originally announced June 2021.
-
On realizations of the subalgebra $A^R(1)$ of the $R$-motivic Steenrod Algebra
Authors:
Prasit Bhattacharya,
Bertrand J. Guillou,
Ang Li
Abstract:
In this paper, we show that the finite subalgebra $\mathcal{A}^{\mathbb{R}}(1)$, generated by $\mathrm{Sq}^1$ and $\mathrm{Sq}^2$, of the $\mathbb{R}$-motivic Steenrod algebra $\mathcal{A}^{\mathbb{R}}$ can be given $128$ different $\mathcal{A}^{\mathbb{R}}$-module structures. We also show that all of these $\mathcal{A}^{\mathbb{R}}$-modules can be realized as the cohomology of a $2$-local finite $\mathbb{R}$-motivic spectrum. The realization results are obtained using an $\mathbb{R}$-motivic analogue of the Toda realization theorem. We notice that each realization of $\mathcal{A}^{\mathbb{R}}(1)$ can be expressed as a cofiber of an $\mathbb{R}$-motivic $v_1$-self-map. The $\mathrm{C}_2$-equivariant analogue of the above results then follows because of the Betti realization functor. We identify a relationship between the $\mathrm{RO}(\mathrm{C}_2)$-graded Steenrod operations on a $\mathrm{C}_2$-equivariant space and the classical Steenrod operations on both its underlying space and its fixed-points. This technique is then used to identify the geometric fixed-point spectra of the $\mathrm{C}_2$-equivariant realizations of $\mathcal{A}^{\mathrm{C}_2}(1)$. We find another application of the $\mathbb{R}$-motivic Toda realization theorem: we produce an $\mathbb{R}$-motivic, and consequently a $\mathrm{C}_2$-equivariant, analogue of the Bhattacharya-Egger spectrum $\mathcal{Z}$, which could be of independent interest.
Submitted 11 July, 2021; v1 submitted 20 June, 2021;
originally announced June 2021.
-
A Discussion on Building Practical NLP Leaderboards: The Case of Machine Translation
Authors:
Sebastin Santy,
Prasanta Bhattacharya
Abstract:
Recent advances in AI and ML applications have benefited from rapid progress in NLP research. Leaderboards have emerged as a popular mechanism to track and accelerate progress in NLP through competitive model development. While this has increased interest and participation, the over-reliance on single, accuracy-based metrics has shifted focus away from other important metrics that might be equally pertinent in real-world contexts. In this paper, we offer a preliminary discussion of the risks associated with focusing exclusively on accuracy metrics and draw on recent discussions to highlight prescriptive suggestions on how to develop more practical and effective leaderboards that can better reflect the real-world utility of models.
Submitted 30 December, 2022; v1 submitted 11 June, 2021;
originally announced June 2021.