-
Beyond Text: Characterizing Domain Expert Needs in Document Research
Authors:
Sireesh Gururaja,
Nupoor Gandhi,
Jeremiah Milbauer,
Emma Strubell
Abstract:
Working with documents is a key part of almost any knowledge work, from contextualizing research in a literature review to reviewing legal precedent. Recently, as their capabilities have expanded, primarily text-based NLP systems have often been billed as able to assist or even automate this kind of work. But to what extent are these systems able to model these tasks as experts conceptualize and p…
▽ More
Working with documents is a key part of almost any knowledge work, from contextualizing research in a literature review to reviewing legal precedent. Recently, as their capabilities have expanded, primarily text-based NLP systems have often been billed as able to assist or even automate this kind of work. But to what extent are these systems able to model these tasks as experts conceptualize and perform them now? In this study, we interview sixteen domain experts across two domains to understand their processes of document research, and compare it to the current state of NLP systems. We find that our participants processes are idiosyncratic, iterative, and rely extensively on the social context of a document in addition its content; existing approaches in NLP and adjacent fields that explicitly center the document as an object, rather than as merely a container for text, tend to better reflect our participants' priorities, though they are often less accessible outside their research communities. We call on the NLP community to more carefully consider the role of the document in building useful tools that are accessible, personalizable, iterative, and socially aware.
△ Less
Submitted 16 April, 2025;
originally announced April 2025.
-
Data Efficient Subset Training with Differential Privacy
Authors:
Ninad Jayesh Gandhi,
Moparthy Venkata Subrahmanya Sri Harsha
Abstract:
Private machine learning introduces a trade-off between the privacy budget and training performance. Training convergence is substantially slower and extensive hyper parameter tuning is required. Consequently, efficient methods to conduct private training of models is thoroughly investigated in the literature. To this end, we investigate the strength of the data efficient model training methods in…
▽ More
Private machine learning introduces a trade-off between the privacy budget and training performance. Training convergence is substantially slower and extensive hyper parameter tuning is required. Consequently, efficient methods to conduct private training of models is thoroughly investigated in the literature. To this end, we investigate the strength of the data efficient model training methods in the private training setting. We adapt GLISTER (Killamsetty et al., 2021b) to the private setting and extensively assess its performance. We empirically find that practical choices of privacy budgets are too restrictive for data efficient training in the private setting.
△ Less
Submitted 9 March, 2025;
originally announced March 2025.
-
Evaluating Differentially Private Synthetic Data Generation in High-Stakes Domains
Authors:
Krithika Ramesh,
Nupoor Gandhi,
Pulkit Madaan,
Lisa Bauer,
Charith Peris,
Anjalie Field
Abstract:
The difficulty of anonymizing text data hinders the development and deployment of NLP in high-stakes domains that involve private data, such as healthcare and social services. Poorly anonymized sensitive data cannot be easily shared with annotators or external researchers, nor can it be used to train public models. In this work, we explore the feasibility of using synthetic data generated from dif…
▽ More
The difficulty of anonymizing text data hinders the development and deployment of NLP in high-stakes domains that involve private data, such as healthcare and social services. Poorly anonymized sensitive data cannot be easily shared with annotators or external researchers, nor can it be used to train public models. In this work, we explore the feasibility of using synthetic data generated from differentially private language models in place of real data to facilitate the development of NLP in these domains without compromising privacy. In contrast to prior work, we generate synthetic data for real high-stakes domains, and we propose and conduct use-inspired evaluations to assess data quality. Our results show that prior simplistic evaluations have failed to highlight utility, privacy, and fairness issues in the synthetic data. Overall, our work underscores the need for further improvements to synthetic data generation for it to be a viable way to enable privacy-preserving data sharing.
△ Less
Submitted 10 October, 2024;
originally announced October 2024.
-
Annotated Hands for Generative Models
Authors:
Yue Yang,
Atith N Gandhi,
Greg Turk
Abstract:
Generative models such as GANs and diffusion models have demonstrated impressive image generation capabilities. Despite these successes, these systems are surprisingly poor at creating images with hands. We propose a novel training framework for generative models that substantially improves the ability of such systems to create hand images. Our approach is to augment the training images with three…
▽ More
Generative models such as GANs and diffusion models have demonstrated impressive image generation capabilities. Despite these successes, these systems are surprisingly poor at creating images with hands. We propose a novel training framework for generative models that substantially improves the ability of such systems to create hand images. Our approach is to augment the training images with three additional channels that provide annotations to hands in the image. These annotations provide additional structure that coax the generative model to produce higher quality hand images. We demonstrate this approach on two different generative models: a generative adversarial network and a diffusion model. We demonstrate our method both on a new synthetic dataset of hand images and also on real photographs that contain hands. We measure the improved quality of the generated hands through higher confidence in finger joint identification using an off-the-shelf hand detector.
△ Less
Submitted 26 January, 2024;
originally announced January 2024.
-
Machine Vision Using Cellphone Camera: A Comparison of deep networks for classifying three challenging denominations of Indian Coins
Authors:
Keyur D. Joshi,
Dhruv Shah,
Varshil Shah,
Nilay Gandhi,
Sanket J. Shah,
Sanket B. Shah
Abstract:
Indian currency coins come in a variety of denominations. Off all the varieties Rs.1, RS.2, and Rs.5 have similar diameters. Majority of the coin styles in market circulation for denominations of Rs.1 and Rs.2 coins are nearly the same except for numerals on its reverse side. If a coin is resting on its obverse side, the correct denomination is not distinguishable by humans. Therefore, it was hypo…
▽ More
Indian currency coins come in a variety of denominations. Off all the varieties Rs.1, RS.2, and Rs.5 have similar diameters. Majority of the coin styles in market circulation for denominations of Rs.1 and Rs.2 coins are nearly the same except for numerals on its reverse side. If a coin is resting on its obverse side, the correct denomination is not distinguishable by humans. Therefore, it was hypothesized that a digital image of a coin resting on its either size could be classified into its correct denomination by training a deep neural network model. The digital images were generated by using cheap cell phone cameras. To find the most suitable deep neural network architecture, four were selected based on the preliminary analysis carried out for comparison. The results confirm that two of the four deep neural network models can classify the correct denomination from either side of a coin with an accuracy of 97%.
△ Less
Submitted 12 May, 2023;
originally announced June 2023.
-
Examining risks of racial biases in NLP tools for child protective services
Authors:
Anjalie Field,
Amanda Coston,
Nupoor Gandhi,
Alexandra Chouldechova,
Emily Putnam-Hornstein,
David Steier,
Yulia Tsvetkov
Abstract:
Although much literature has established the presence of demographic bias in natural language processing (NLP) models, most work relies on curated bias metrics that may not be reflective of real-world applications. At the same time, practitioners are increasingly using algorithmic tools in high-stakes settings, with particular recent interest in NLP. In this work, we focus on one such setting: chi…
▽ More
Although much literature has established the presence of demographic bias in natural language processing (NLP) models, most work relies on curated bias metrics that may not be reflective of real-world applications. At the same time, practitioners are increasingly using algorithmic tools in high-stakes settings, with particular recent interest in NLP. In this work, we focus on one such setting: child protective services (CPS). CPS workers often write copious free-form text notes about families they are working with, and CPS agencies are actively seeking to deploy NLP models to leverage these data. Given well-established racial bias in this setting, we investigate possible ways deployed NLP is liable to increase racial disparities. We specifically examine word statistics within notes and algorithmic fairness in risk prediction, coreference resolution, and named entity recognition (NER). We document consistent algorithmic unfairness in NER models, possible algorithmic unfairness in coreference resolution models, and little evidence of exacerbated racial bias in risk prediction. While there is existing pronounced criticism of risk prediction, our results expose previously undocumented risks of racial bias in realistic information extraction systems, highlighting potential concerns in deploying them, even though they may appear more benign. Our work serves as a rare realistic examination of NLP algorithmic fairness in a potential deployed setting and a timely investigation of a specific risk associated with deploying NLP in CPS settings.
△ Less
Submitted 30 May, 2023;
originally announced May 2023.
-
Mention Annotations Alone Enable Efficient Domain Adaptation for Coreference Resolution
Authors:
Nupoor Gandhi,
Anjalie Field,
Emma Strubell
Abstract:
Although recent neural models for coreference resolution have led to substantial improvements on benchmark datasets, transferring these models to new target domains containing out-of-vocabulary spans and requiring differing annotation schemes remains challenging. Typical approaches involve continued training on annotated target-domain data, but obtaining annotations is costly and time-consuming. W…
▽ More
Although recent neural models for coreference resolution have led to substantial improvements on benchmark datasets, transferring these models to new target domains containing out-of-vocabulary spans and requiring differing annotation schemes remains challenging. Typical approaches involve continued training on annotated target-domain data, but obtaining annotations is costly and time-consuming. We show that annotating mentions alone is nearly twice as fast as annotating full coreference chains. Accordingly, we propose a method for efficiently adapting coreference models, which includes a high-precision mention detection objective and requires annotating only mentions in the target domain. Extensive evaluation across three English coreference datasets: CoNLL-2012 (news/conversation), i2b2/VA (medical notes), and previously unstudied child welfare notes, reveals that our approach facilitates annotation-efficient transfer and results in a 7-14% improvement in average F1 without increasing annotator time.
△ Less
Submitted 30 May, 2023; v1 submitted 14 October, 2022;
originally announced October 2022.
-
Improving Span Representation for Domain-adapted Coreference Resolution
Authors:
Nupoor Gandhi,
Anjalie Field,
Yulia Tsvetkov
Abstract:
Recent work has shown fine-tuning neural coreference models can produce strong performance when adapting to different domains. However, at the same time, this can require a large amount of annotated target examples. In this work, we focus on supervised domain adaptation for clinical notes, proposing the use of concept knowledge to more efficiently adapt coreference models to a new domain. We devel…
▽ More
Recent work has shown fine-tuning neural coreference models can produce strong performance when adapting to different domains. However, at the same time, this can require a large amount of annotated target examples. In this work, we focus on supervised domain adaptation for clinical notes, proposing the use of concept knowledge to more efficiently adapt coreference models to a new domain. We develop methods to improve the span representations via (1) a retrofitting loss to incentivize span representations to satisfy a knowledge-based distance function and (2) a scaffolding loss to guide the recovery of knowledge from the span representation. By integrating these losses, our model is able to improve our baseline precision and F-1 score. In particular, we show that incorporating knowledge with end-to-end coreference models results in better performance on the most challenging, domain-specific spans.
△ Less
Submitted 20 September, 2021;
originally announced September 2021.
-
Modelling resource allocation in uncertain system environment through deep reinforcement learning
Authors:
Neel Gandhi,
Shakti Mishra
Abstract:
Reinforcement Learning has applications in field of mechatronics, robotics, and other resource-constrained control system. Problem of resource allocation is primarily solved using traditional predefined techniques and modern deep learning methods. The drawback of predefined and most deep learning methods for resource allocation is failing to meet the requirements in cases of uncertain system envir…
▽ More
Reinforcement Learning has applications in field of mechatronics, robotics, and other resource-constrained control system. Problem of resource allocation is primarily solved using traditional predefined techniques and modern deep learning methods. The drawback of predefined and most deep learning methods for resource allocation is failing to meet the requirements in cases of uncertain system environment. We can approach problem of resource allocation in uncertain system environment alongside following certain criteria using deep reinforcement learning. Also, reinforcement learning has ability for adapting to new uncertain environment for prolonged period of time. The paper provides a detailed comparative analysis on various deep reinforcement learning methods by applying different components to modify architecture of reinforcement learning with use of noisy layers, prioritized replay, bagging, duelling networks, and other related combination to obtain improvement in terms of performance and reduction of computational cost. The paper identifies problem of resource allocation in uncertain environment could be effectively solved using Noisy Bagging duelling double deep Q network achieving efficiency of 97.7% by maximizing reward with significant exploration in given simulated environment for resource allocation.
△ Less
Submitted 17 June, 2021;
originally announced June 2021.
-
DeepGamble: Towards unlocking real-time player intelligence using multi-layer instance segmentation and attribute detection
Authors:
Danish Syed,
Naman Gandhi,
Arushi Arora,
Nilesh Kadam
Abstract:
Annually the gaming industry spends approximately $15 billion in marketing reinvestment. However, this amount is spent without any consideration for the skill and luck of the player. For a casino, an unskilled player could fetch ~4 times more revenue than a skilled player. This paper describes a video recognition system that is based on an extension of the Mask R-CNN model. Our system digitizes th…
▽ More
Annually the gaming industry spends approximately $15 billion in marketing reinvestment. However, this amount is spent without any consideration for the skill and luck of the player. For a casino, an unskilled player could fetch ~4 times more revenue than a skilled player. This paper describes a video recognition system that is based on an extension of the Mask R-CNN model. Our system digitizes the game of blackjack by detecting cards and player bets in real-time and processes decisions they took in order to create accurate player personas. Our proposed supervised learning approach consists of a specialized three-stage pipeline that takes images from two viewpoints of the casino table and does instance segmentation to generate masks on proposed regions of interest. These predicted masks along with derivative features are used to classify image attributes that are passed onto the next stage to assimilate the gameplay understanding. Our end-to-end model yields an accuracy of ~95% for the main bet detection and ~97% for card detection in a controlled environment trained using transfer learning approach with 900 training examples. Our approach is generalizable and scalable and shows promising results in varied gaming scenarios and test data. Such granular level gathered data, helped in understanding player's deviation from optimum strategy and thereby separate the skill of the player from the luck of the game. Our system also assesses the likelihood of card counting by correlating the player's betting pattern to the deck's scaled count. Such a system lets casinos flag fraudulent activity and calculate expected personalized profitability for each player and tailor their marketing reinvestment decisions.
△ Less
Submitted 14 December, 2020;
originally announced December 2020.
-
EdgeBench: A Workflow-based Benchmark for Edge Computing
Authors:
Qirui Yang,
Runyu Jin,
Nabil Gandhi,
Xiongzi Ge,
Hoda Aghaei Khouzani,
Ming Zhao
Abstract:
Edge computing has been developed to utilize multiple tiers of resources for privacy, cost and Quality of Service (QoS) reasons. Edge workloads have the characteristics of data-driven and latency-sensitive. Because of this, edge systems have developed to be both heterogeneous and distributed. The unique characteristics of edge workloads and edge systems have motivated EdgeBench, a workflow-based b…
▽ More
Edge computing has been developed to utilize multiple tiers of resources for privacy, cost and Quality of Service (QoS) reasons. Edge workloads have the characteristics of data-driven and latency-sensitive. Because of this, edge systems have developed to be both heterogeneous and distributed. The unique characteristics of edge workloads and edge systems have motivated EdgeBench, a workflow-based benchmark aims to provide the ability to explore the full design space of edge workloads and edge systems. EdgeBench is both customizable and representative. It allows users to customize the workflow logic of edge workloads, the data storage backends, and the distribution of the individual workflow stages to different computing tiers. To illustrate the usability of EdgeBench, we also implements two representative edge workflows, a video analytics workflow and an IoT hub workflow that represents two distinct but common edge workloads. Both workflows are evaluated using the workflow-level and function-level metrics reported by EdgeBench to illustrate both the performance bottlenecks of the edge systems and the edge workloads.
△ Less
Submitted 26 October, 2020;
originally announced October 2020.
-
Realization of MIMO Channel Model for Spatial Diversity with Capacity and SNR Multiplexing Gains
Authors:
Subrato Bharati,
Prajoy Podder,
Niketa Gandhi,
Ajith Abraham
Abstract:
Multiple input multiple output (MIMO) system transmission is a popular diversity technique to improve the reliability of a communication system where transmitter, communication channel and receiver are the important elements. Data transmission reliability can be ensured when the bit error rate is very low. Normally, multiple antenna elements are used at both the transmitting and receiving section…
▽ More
Multiple input multiple output (MIMO) system transmission is a popular diversity technique to improve the reliability of a communication system where transmitter, communication channel and receiver are the important elements. Data transmission reliability can be ensured when the bit error rate is very low. Normally, multiple antenna elements are used at both the transmitting and receiving section in MIMO Systems. MIMO system utilizes antenna diversity or spatial diversity coding system in wireless channels because wireless channels severely suffer from multipath fading in which the transmitted signal is reflected along various multiple paths before reaching to the destination or receiving section.Parallel transmission of MIMO system has also been implemented where both the real part and imaginary part of the original, detected and the corresponding received data sequence has been described graphically. The MIMO channel average capacity is achieved more than 80% for dissimilar levels of impairments in transceiver when the value of kappa (Level of impairments in transmitter hardware) reduces from 0.02 to 0.005. The finite-SNR multiplexing gain (Proportion of MIMO system capacity to SISO system capacity) has been observed for deterministic and uncorrelated Rayleigh fading channels correspondingly. The core difference is in the high SNR level. It may occur for two reasons: (a) there is a quicker convergence to the limits under transceiver impairments (b) deterministic channels that are built on digital architectural plans or topographical maps of the propagation environment acquire an asymptotic gain superior than multiplexing gain when the number of transmitting antenna is greater than the number of receiving antenna.
△ Less
Submitted 24 April, 2020;
originally announced May 2020.
-
Multi-dimensional Features for Prediction with Tweets
Authors:
Nupoor Gandhi,
Alex Morales,
Dolores Albarracin
Abstract:
With the rise of opioid abuse in the US, there has been a growth of overlapping hotspots for overdose-related and HIV-related deaths in Springfield, Boston, Fall River, New Bedford, and parts of Cape Cod. With a large part of population, including rural communities, active on social media, it is crucial that we leverage the predictive power of social media as a preventive measure. We explore the p…
▽ More
With the rise of opioid abuse in the US, there has been a growth of overlapping hotspots for overdose-related and HIV-related deaths in Springfield, Boston, Fall River, New Bedford, and parts of Cape Cod. With a large part of population, including rural communities, active on social media, it is crucial that we leverage the predictive power of social media as a preventive measure. We explore the predictive power of micro-blogging social media website Twitter with respect to HIV new diagnosis rates per county. While trending work in Twitter NLP has focused on primarily text-based features, we show that multi-dimensional feature construction can significantly improve the predictive power of topic features alone with respect STI's (sexually transmitted infections). By multi-dimensional features, we mean leveraging not only the topical features (text) of a corpus, but also location-based information (counties) about the tweets in feature-construction. We develop novel text-location-based smoothing features to predict new diagnoses of HIV.
△ Less
Submitted 15 October, 2019;
originally announced October 2019.
-
Increase Apparent Public Speaking Fluency By Speech Augmentation
Authors:
Sagnik Das,
Nisha Gandhi,
Tejas Naik,
Roy Shilkrot
Abstract:
Fluent and confident speech is desirable to every speaker. But professional speech delivering requires a great deal of experience and practice. In this paper, we propose a speech stream manipulation system which can help non-professional speakers to produce fluent, professional-like speech content, in turn contributing towards better listener engagement and comprehension. We propose to achieve thi…
▽ More
Fluent and confident speech is desirable to every speaker. But professional speech delivering requires a great deal of experience and practice. In this paper, we propose a speech stream manipulation system which can help non-professional speakers to produce fluent, professional-like speech content, in turn contributing towards better listener engagement and comprehension. We propose to achieve this task by manipulating the disfluencies in human speech, like the sounds 'uh' and 'um', the filler words and awkward long silences. Given any unrehearsed speech we segment and silence the filled pauses and doctor the duration of imposed silence as well as other long pauses ('disfluent') by a predictive model learned using professional speech dataset. Finally, we output a audio stream in which speaker sounds more fluent, confident and practiced compared to the original speech he/she recorded. According to our quantitative evaluation, we significantly increase the fluency of speech by reducing rate of pauses and fillers.
△ Less
Submitted 3 August, 2019; v1 submitted 8 December, 2018;
originally announced December 2018.