Search | arXiv e-print repository

Think Global, Act Local: Bayesian Causal Discovery with Language Models in Sequential Data

Authors: Prakhar Verma, David Arbour, Sunav Choudhary, Harshita Chopra, Arno Solin, Atanu R. Sinha

Abstract: Causal discovery from observational data typically assumes full access to data and availability of domain experts. In practice, data often arrive in batches, and expert knowledge is scarce. Language Models (LMs) offer a surrogate but come with their own issues-hallucinations, inconsistencies, and bias. We present BLANCE (Bayesian LM-Augmented Causal Estimation)-a hybrid Bayesian framework that bri… ▽ More Causal discovery from observational data typically assumes full access to data and availability of domain experts. In practice, data often arrive in batches, and expert knowledge is scarce. Language Models (LMs) offer a surrogate but come with their own issues-hallucinations, inconsistencies, and bias. We present BLANCE (Bayesian LM-Augmented Causal Estimation)-a hybrid Bayesian framework that bridges these gaps by adaptively integrating sequential batch data with LM-derived noisy, expert knowledge while accounting for both data-induced and LM-induced biases. Our proposed representation shift from Directed Acyclic Graph (DAG) to Partial Ancestral Graph (PAG) accommodates ambiguities within a coherent Bayesian framework, allowing grounding the global LM knowledge in local observational data. To guide LM interaction, we use a sequential optimization scheme that adaptively queries the most informative edges. Across varied datasets, BLANCE outperforms prior work in structural accuracy and extends to Bayesian parameter estimation, showing robustness to LM noise. △ Less

Submitted 19 June, 2025; originally announced June 2025.

Comments: 24 pages, preprint

arXiv:2502.00682 [pdf, other]

doi 10.1145/3708359.3712166

Guidance Source Matters: How Guidance from AI, Expert, or a Group of Analysts Impacts Visual Data Preparation and Analysis

Authors: Arpit Narechania, Alex Endert, Atanu R Sinha

Abstract: The progress in generative AI has fueled AI-powered tools like co-pilots and assistants to provision better guidance, particularly during data analysis. However, research on guidance has not yet examined the perceived efficacy of the source from which guidance is offered and the impact of this source on the user's perception and usage of guidance. We ask whether users perceive all guidance sources… ▽ More The progress in generative AI has fueled AI-powered tools like co-pilots and assistants to provision better guidance, particularly during data analysis. However, research on guidance has not yet examined the perceived efficacy of the source from which guidance is offered and the impact of this source on the user's perception and usage of guidance. We ask whether users perceive all guidance sources as equal, with particular interest in three sources: (i) AI, (ii) human expert, and (iii) a group of human analysts. As a benchmark, we consider a fourth source, (iv) unattributed guidance, where guidance is provided without attribution to any source, enabling isolation of and comparison with the effects of source-specific guidance. We design a five-condition between-subjects study, with one condition for each of the four guidance sources and an additional (v) no-guidance condition, which serves as a baseline to evaluate the influence of any kind of guidance. We situate our study in a custom data preparation and analysis tool wherein we task users to select relevant attributes from an unfamiliar dataset to inform a business report. Depending on the assigned condition, users can request guidance, which the system then provides in the form of attribute suggestions. To ensure internal validity, we control for the quality of guidance across source-conditions. Through several metrics of usage and perception, we statistically test five preregistered hypotheses and report on additional analysis. We find that the source of guidance matters to users, but not in a manner that matches received wisdom. For instance, users utilize guidance differently at various stages of analysis, including expressing varying levels of regret, despite receiving guidance of similar quality. Notably, users in the AI condition reported both higher post-task benefit and regret. △ Less

Submitted 2 February, 2025; originally announced February 2025.

Comments: 21 pages, 10 figures, 6 figures, to appear in proceedings of ACM IUI 2025

arXiv:2402.19292 [pdf, other]

Fundamental Limits of Throughput and Availability: Applications to prophet inequalities & transaction fee mechanism design

Authors: Aadityan Ganesh, Jason Hartline, Atanu R Sinha, Matthew vonAllmen

Abstract: This paper studies the fundamental limits of availability and throughput for independent and heterogeneous demands of a limited resource. Availability is the probability that the demands are below the capacity of the resource. Throughput is the expected fraction of the resource that is utilized by the demands. We offer a concentration inequality generator that gives lower bounds on feasible availa… ▽ More This paper studies the fundamental limits of availability and throughput for independent and heterogeneous demands of a limited resource. Availability is the probability that the demands are below the capacity of the resource. Throughput is the expected fraction of the resource that is utilized by the demands. We offer a concentration inequality generator that gives lower bounds on feasible availability and throughput pairs with a given capacity and independent but not necessarily identical distributions of up-to-unit demands. We show that availability and throughput cannot both be poor. These bounds are analogous to tail inequalities on sums of independent random variables, but hold throughout the support of the demand distribution. This analysis gives analytically tractable bounds supporting the unit-demand characterization of Chawla, Devanur, and Lykouris (2023) and generalizes to up-to-unit demands. Our bounds also provide an approach towards improved multi-unit prophet inequalities (Hajiaghayi, Kleinberg, and Sandholm, 2007). They have applications to transaction fee mechanism design (for blockchains) where high availability limits the probability of profitable user-miner coalitions (Chung and Shi, 2023). △ Less

Submitted 14 July, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: 34 pages, 7 figures; updated author information to include institutions and email addresses 35 pages, 7 figures; updated the TFM section and the last paragraph in the Applications section

arXiv:2402.03388 [pdf, other]

doi 10.1145/3583780.3614839

Delivery Optimized Discovery in Behavioral User Segmentation under Budget Constraint

Authors: Harshita Chopra, Atanu R. Sinha, Sunav Choudhary, Ryan A. Rossi, Paavan Kumar Indela, Veda Pranav Parwatala, Srinjayee Paul, Aurghya Maiti

Abstract: Users' behavioral footprints online enable firms to discover behavior-based user segments (or, segments) and deliver segment specific messages to users. Following the discovery of segments, delivery of messages to users through preferred media channels like Facebook and Google can be challenging, as only a portion of users in a behavior segment find match in a medium, and only a fraction of those… ▽ More Users' behavioral footprints online enable firms to discover behavior-based user segments (or, segments) and deliver segment specific messages to users. Following the discovery of segments, delivery of messages to users through preferred media channels like Facebook and Google can be challenging, as only a portion of users in a behavior segment find match in a medium, and only a fraction of those matched actually see the message (exposure). Even high quality discovery becomes futile when delivery fails. Many sophisticated algorithms exist for discovering behavioral segments; however, these ignore the delivery component. The problem is compounded because (i) the discovery is performed on the behavior data space in firms' data (e.g., user clicks), while the delivery is predicated on the static data space (e.g., geo, age) as defined by media; and (ii) firms work under budget constraint. We introduce a stochastic optimization based algorithm for delivery optimized discovery of behavioral user segmentation and offer new metrics to address the joint optimization. We leverage optimization under a budget constraint for delivery combined with a learning-based component for discovery. Extensive experiments on a public dataset from Google and a proprietary dataset show the effectiveness of our approach by simultaneously improving delivery metrics, reducing budget spend and achieving strong predictive performance in discovery. △ Less

Submitted 15 March, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

arXiv:2312.16177 [pdf, other]

Learning to Infer Unobserved Behaviors: Estimating User's Preference for a Site over Other Sites

Authors: Atanu R Sinha, Tanay Anand, Paridhi Maheshwari, A V Lakshmy, Vishal Jain

Abstract: A site's recommendation system relies on knowledge of its users' preferences to offer relevant recommendations to them. These preferences are for attributes that comprise items and content shown on the site, and are estimated from the data of users' interactions with the site. Another form of users' preferences is material too, namely, users' preferences for the site over other sites, since that s… ▽ More A site's recommendation system relies on knowledge of its users' preferences to offer relevant recommendations to them. These preferences are for attributes that comprise items and content shown on the site, and are estimated from the data of users' interactions with the site. Another form of users' preferences is material too, namely, users' preferences for the site over other sites, since that shows users' base level propensities to engage with the site. Estimating users' preferences for the site, however, faces major obstacles because (a) the focal site usually has no data of its users' interactions with other sites; these interactions are users' unobserved behaviors for the focal site; and (b) the Machine Learning literature in recommendation does not offer a model of this situation. Even if (b) is resolved, the problem in (a) persists since without access to data of its users' interactions with other sites, there is no ground truth for evaluation. Moreover, it is most useful when (c) users' preferences for the site can be estimated at the individual level, since the site can then personalize recommendations to individual users. We offer a method to estimate individual user's preference for a focal site, under this premise. In particular, we compute the focal site's share of a user's online engagements without any data from other sites. We show an evaluation framework for the model using only the focal site's data, allowing the site to test the model. We rely upon a Hierarchical Bayes Method and perform estimation in two different ways - Markov Chain Monte Carlo and Stochastic Gradient with Langevin Dynamics. Our results find good support for the approach to computing personalized share of engagement and for its evaluation. △ Less

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2303.01575 [pdf, other]

doi 10.1145/3544548.3581509

DataPilot: Utilizing Quality and Usage Information for Subset Selection during Visual Data Preparation

Authors: Arpit Narechania, Fan Du, Atanu R Sinha, Ryan A. Rossi, Jane Hoffswell, Shunan Guo, Eunyee Koh, Shamkant B. Navathe, Alex Endert

Abstract: Selecting relevant data subsets from large, unfamiliar datasets can be difficult. We address this challenge by modeling and visualizing two kinds of auxiliary information: (1) quality - the validity and appropriateness of data required to perform certain analytical tasks; and (2) usage - the historical utilization characteristics of data across multiple users. Through a design study with 14 data w… ▽ More Selecting relevant data subsets from large, unfamiliar datasets can be difficult. We address this challenge by modeling and visualizing two kinds of auxiliary information: (1) quality - the validity and appropriateness of data required to perform certain analytical tasks; and (2) usage - the historical utilization characteristics of data across multiple users. Through a design study with 14 data workers, we integrate this information into a visual data preparation and analysis tool, DataPilot. DataPilot presents visual cues about "the good, the bad, and the ugly" aspects of data and provides graphical user interface controls as interaction affordances, guiding users to perform subset selection. Through a study with 36 participants, we investigate how DataPilot helps users navigate a large, unfamiliar tabular dataset, prepare a relevant subset, and build a visualization dashboard. We find that users selected smaller, effective subsets with higher quality and usage, and with greater success and confidence. △ Less

Submitted 2 March, 2023; originally announced March 2023.

Comments: 18 pages, 5 figures, 1 table, ACM CHI 2023

arXiv:2209.14250 [pdf]

B2B Advertising: Joint Dynamic Scoring of Account and Users

Authors: Atanu R. Sinha, Gautam Choudhary, Mansi Agarwal, Shivansh Bindal, Abhishek Pande, Camille Girabawe

Abstract: When a business sells to another business (B2B), the buying business is represented by a group of individuals, termed account, who collectively decide whether to buy. The seller advertises to each individual and interacts with them, mostly by digital means. The sales cycle is long, most often over a few months. There is heterogeneity among individuals belonging to an account in seeking information… ▽ More When a business sells to another business (B2B), the buying business is represented by a group of individuals, termed account, who collectively decide whether to buy. The seller advertises to each individual and interacts with them, mostly by digital means. The sales cycle is long, most often over a few months. There is heterogeneity among individuals belonging to an account in seeking information and hence the seller needs to score the interest of each individual over a long horizon to decide which individuals must be reached and when. Moreover, the buy decision rests with the account and must be scored to project the likelihood of purchase, a decision that is subject to change all the way up to the actual decision, emblematic of group decision making. We score decision of the account and its individuals in a dynamic manner. Dynamic scoring allows opportunity to influence different individual members at different time points over the long horizon. The dataset contains behavior logs of each individual's communication activities with the seller; but, there are no data on consultations among individuals which result in the decision. Using neural network architecture, we propose several ways to aggregate information from individual members' activities, to predict the group's collective decision. Multiple evaluations find strong model performance. △ Less

Submitted 28 September, 2022; originally announced September 2022.

Comments: Published at KDD Workshop: AdKDD 2022

arXiv:2206.15129 [pdf, other]

Personalized Detection of Cognitive Biases in Actions of Users from Their Logs: Anchoring and Recency Biases

Authors: Atanu R Sinha, Navita Goyal, Sunny Dhamnani, Tanay Asija, Raja K Dubey, M V Kaarthik Raja, Georgios Theocharous

Abstract: Cognitive biases are mental shortcuts humans use in dealing with information and the environment, and which result in biased actions and behaviors (or, actions), unbeknownst to themselves. Biases take many forms, with cognitive biases occupying a central role that inflicts fairness, accountability, transparency, ethics, law, medicine, and discrimination. Detection of biases is considered a necessa… ▽ More Cognitive biases are mental shortcuts humans use in dealing with information and the environment, and which result in biased actions and behaviors (or, actions), unbeknownst to themselves. Biases take many forms, with cognitive biases occupying a central role that inflicts fairness, accountability, transparency, ethics, law, medicine, and discrimination. Detection of biases is considered a necessary step toward their mitigation. Herein, we focus on two cognitive biases - anchoring and recency. The recognition of cognitive bias in computer science is largely in the domain of information retrieval, and bias is identified at an aggregate level with the help of annotated data. Proposing a different direction for bias detection, we offer a principled approach along with Machine Learning to detect these two cognitive biases from Web logs of users' actions. Our individual user level detection makes it truly personalized, and does not rely on annotated data. Instead, we start with two basic principles established in cognitive psychology, use modified training of an attention network, and interpret attention weights in a novel way according to those principles, to infer and distinguish between these two biases. The personalized approach allows detection for specific users who are susceptible to these biases when performing their tasks, and can help build awareness among them so as to undertake bias mitigation. △ Less

Submitted 1 July, 2022; v1 submitted 30 June, 2022; originally announced June 2022.

arXiv:2006.06323 [pdf, other]

doi 10.1609/aaai.v33i01.3301257

Surveys without Questions: A Reinforcement Learning Approach

Authors: Atanu R Sinha, Deepali Jain, Nikhil Sheoran, Sopan Khosla, Reshmi Sasidharan

Abstract: The 'old world' instrument, survey, remains a tool of choice for firms to obtain ratings of satisfaction and experience that customers realize while interacting online with firms. While avenues for survey have evolved from emails and links to pop-ups while browsing, the deficiencies persist. These include - reliance on ratings of very few respondents to infer about all customers' online interactio… ▽ More The 'old world' instrument, survey, remains a tool of choice for firms to obtain ratings of satisfaction and experience that customers realize while interacting online with firms. While avenues for survey have evolved from emails and links to pop-ups while browsing, the deficiencies persist. These include - reliance on ratings of very few respondents to infer about all customers' online interactions; failing to capture a customer's interactions over time since the rating is a one-time snapshot; and inability to tie back customers' ratings to specific interactions because ratings provided relate to all interactions. To overcome these deficiencies we extract proxy ratings from clickstream data, typically collected for every customer's online interactions, by developing an approach based on Reinforcement Learning (RL). We introduce a new way to interpret values generated by the value function of RL, as proxy ratings. Our approach does not need any survey data for training. Yet, on validation against actual survey data, proxy ratings yield reasonable performance results. Additionally, we offer a new way to draw insights from values of the value function, which allow associating specific interactions to their proxy ratings. We introduce two new metrics to represent ratings - one, customer-level and the other, aggregate-level for click actions across customers. Both are defined around proportion of all pairwise, successive actions that show increase in proxy ratings. This intuitive customer-level metric enables gauging the dynamics of ratings over time and is a better predictor of purchase than customer ratings from survey. The aggregate-level metric allows pinpointing actions that help or hurt experience. In sum, proxy ratings computed unobtrusively from clickstream, for every action, for each customer, and for every session can offer interpretable and more insightful alternative to surveys. △ Less

Submitted 11 June, 2020; originally announced June 2020.

Comments: The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, July 2019, pp. 257-64

arXiv:2004.09900 [pdf, other]

An RNN-Survival Model to Decide Email Send Times

Authors: Harvineet Singh, Moumita Sinha, Atanu R. Sinha, Sahil Garg, Neha Banerjee

Abstract: Email communications are ubiquitous. Firms control send times of emails and thereby the instants at which emails reach recipients (it is assumed email is received instantaneously from the send time). However, they do not control the duration it takes for recipients to open emails, labeled as time-to-open. Importantly, among emails that are opened, most occur within a short window from their send t… ▽ More Email communications are ubiquitous. Firms control send times of emails and thereby the instants at which emails reach recipients (it is assumed email is received instantaneously from the send time). However, they do not control the duration it takes for recipients to open emails, labeled as time-to-open. Importantly, among emails that are opened, most occur within a short window from their send times. We posit that emails are likely to be opened sooner when send times are convenient for recipients, while for other send times, emails can get ignored. Thus, to compute appropriate send times it is important to predict times-to-open accurately. We propose a recurrent neural network (RNN) in a survival model framework to predict times-to-open, for each recipient. Using that we compute appropriate send times. We experiment on a data set of emails sent to a million customers over five months. The sequence of emails received by a person from a sender is a result of interactions with past emails from the sender, and hence contain useful signal that inform our model. This sequential dependence affords our proposed RNN-Survival (RNN-S) approach to outperform survival analysis approaches in predicting times-to-open. We show that best times to send emails can be computed accurately from predicted times-to-open. This approach allows a firm to tune send times of emails, which is in its control, to favorably influence open rates and engagement. △ Less

Submitted 21 April, 2020; originally announced April 2020.

Comments: 11 pages, 3 figures, 2 tables

arXiv:1901.02412 [pdf, other]

Forecasting Granular Audience Size for Online Advertising

Authors: Ritwik Sinha, Dhruv Singal, Pranav Maneriker, Kushal Chawla, Yash Shrivastava, Deepak Pai, Atanu R Sinha

Abstract: Orchestration of campaigns for online display advertising requires marketers to forecast audience size at the granularity of specific attributes of web traffic, characterized by the categorical nature of all attributes (e.g. {US, Chrome, Mobile}). With each attribute taking many values, the very large attribute combination set makes estimating audience size for any specific attribute combination c… ▽ More Orchestration of campaigns for online display advertising requires marketers to forecast audience size at the granularity of specific attributes of web traffic, characterized by the categorical nature of all attributes (e.g. {US, Chrome, Mobile}). With each attribute taking many values, the very large attribute combination set makes estimating audience size for any specific attribute combination challenging. We modify Eclat, a frequent itemset mining (FIM) algorithm, to accommodate categorical variables. For consequent frequent and infrequent itemsets, we then provide forecasts using time series analysis with conditional probabilities to aid approximation. An extensive simulation, based on typical characteristics of audience data, is built to stress test our modified-FIM approach. In two real datasets, comparison with baselines including neural network models, shows that our method lowers computation time of FIM for categorical data. On hold out samples we show that the proposed forecasting method outperforms these baselines. △ Less

Submitted 8 January, 2019; originally announced January 2019.

Comments: Published at AdKDD & TargetAd 2018

Showing 1–11 of 11 results for author: Sinha, A R