Search | arXiv e-print repository

arXiv:2012.04137 [pdf, ps, other]

Adaptive Sampling for Estimating Distributions: A Bayesian Upper Confidence Bound Approach

Authors: Dhruva Kartik, Neeraj Sood, Urbashi Mitra, Tara Javidi

Abstract: The problem of adaptive sampling for estimating probability mass functions (pmf) uniformly well is considered. Performance of the sampling strategy is measured in terms of the worst-case mean squared error. A Bayesian variant of the existing upper confidence bound (UCB) based approaches is proposed. It is shown analytically that the performance of this Bayesian variant is no worse than the existing approaches. The posterior distribution on the pmfs in the Bayesian setting allows for a tighter computation of upper confidence bounds which leads to significant performance gains in practice. Using this approach, adaptive sampling protocols are proposed for estimating SARS-CoV-2 seroprevalence in various groups such as location and ethnicity. The effectiveness of this strategy is discussed using data obtained from a seroprevalence survey in Los Angeles county.

Submitted 7 December, 2020; originally announced December 2020.

arXiv:2010.09905 [pdf, other]

SmartTriage: A system for personalized patient data capture, documentation generation, and decision support

Authors: Ilya Valmianski, Nave Frost, Navdeep Sood, Yang Wang, Baodong Liu, James J. Zhu, Sunil Karumuri, Ian M. Finn, Daniel S. Zisook

Abstract: Symptom checkers have emerged as an important tool for collecting symptoms and diagnosing patients, minimizing the involvement of clinical personnel. We developed a machine-learning-backed system, SmartTriage, which goes beyond conventional symptom checking through a tight bi-directional integration with the electronic medical record (EMR). Conditioned on EMR-derived patient history, our system identifies the patient's chief complaint from a free-text entry and then asks a series of discrete questions to obtain relevant symptomatology. The patient-specific data are used to predict detailed ICD-10-CM codes as well as medication, laboratory, and imaging orders. Patient responses and clinical decision support (CDS) predictions are then inserted back into the EMR. To train the machine learning components of SmartTriage, we employed novel data sets of over 25 million primary care encounters and 1 million patient free-text reason-for-visit entries. These data sets were used to construct: (1) a long short-term memory (LSTM) based patient history representation, (2) a fine-tuned transformer model for chief complaint extraction, (3) a random forest model for question sequencing, and (4) a feed-forward network for CDS predictions. In total, our system supports 337 patient chief complaints, which together make up $>90\%$ of all primary care encounters at Kaiser Permanente.

Submitted 11 November, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

Comments: Accepted as a proceeding for ML4H 2021

ACM Class: J.3; I.2.7

arXiv:2007.01972 [pdf, other]

Building a Competitive Associative Classifier

Authors: Nitakshi Sood, Osmar Zaiane

Abstract: With the huge success of deep learning, other machine learning paradigms have had to take a back seat. Yet other models, particularly rule-based, are more readable and explainable and can even be competitive when labelled data is not abundant. However, most of the existing rule-based classifiers suffer from the production of a large number of classification rules, affecting the model readability. This hampers the classification accuracy as noisy rules might not add any useful information for classification and also lead to longer classification time. In this study, we propose SigD2 which uses a novel, two-stage pruning strategy which prunes most of the noisy, redundant and uninteresting rules and makes the classification model more accurate and readable. To make SigDirect more competitive with the most prevalent but uninterpretable machine learning-based classifiers like neural networks and support vector machines, we propose bagging and boosting on the ensemble of the SigDirect classifier. The results of the proposed algorithms are quite promising and we are able to obtain a minimal set of statistically significant rules for classification without jeopardizing the classification accuracy. We use 15 UCI datasets and compare our approach with eight existing systems. The SigD2 and boosted SigDirect (ACboost) ensemble model outperform various state-of-the-art classifiers not only in terms of classification accuracy but also in terms of the number of rules.

Submitted 3 July, 2020; originally announced July 2020.

Comments: To be published in - The 22nd International Conference on Big Data Analytics and Knowledge Discovery - DaWaK2020, Bratislava, Slovakia, September 14-17, 2020

arXiv:2006.11628 [pdf, other]

Learning and Testing Sub-groups with Heterogeneous Treatment Effects: A Sequence of Two Studies

Authors: Rahul Ladhania, Amelia Haviland, Neeraj Sood, Edward Kennedy, Ateev Mehrotra

Abstract: There is strong interest in estimating how the magnitude of treatment effects of an intervention varies across sub-groups of the population of interest. In our paper, we propose a two-study approach to first propose and then test heterogeneous treatment effects. In Study 1, we use a large observational dataset to learn sub-groups with the most distinctive treatment-outcome relationships ('high/low-impact sub-groups'). We adopt a model-based recursive partitioning approach to propose the high/low impact sub-groups, and validate them by using sample-splitting. While the first study rules out noise, there is potential bias in our estimated heterogeneous treatment effects. Study 2 uses an experimental design, and here we classify our sample units based on sub-groups learned in Study 1. We then estimate treatment effects within each of the groups, thereby testing the causal hypotheses proposed in Study 1. Using patient claims data from the NBER MarketScan database, we apply our approach to estimate heterogeneous effects of a switch to a high-deductible health insurance plan on use of outpatient care by patients with a common chronic condition. We extend the method to non-parametrically learn the sub-groups in Study 1. We also compare the methods' performance to other state-of-the-art methods in the literature that make use only of the Study 2 data.

Submitted 20 June, 2020; originally announced June 2020.

arXiv:1609.05162 [pdf, other]

No-Regret Replanning under Uncertainty

Authors: Wen Sun, Niteesh Sood, Debadeepta Dey, Gireeja Ranade, Siddharth Prakash, Ashish Kapoor

Abstract: This paper explores the problem of path planning under uncertainty. Specifically, we consider online receding horizon based planners that need to operate in a latent environment where the latent information can be modeled via Gaussian Processes. Online path planning in latent environments is challenging since the robot needs to explore the environment to get a more accurate model of latent information for better planning later and also achieve the task as quickly as possible. We propose UCB style algorithms that are popular in the bandit settings and show how those analyses can be adapted to the online robotic path planning problems. The proposed algorithm trades off exploration and exploitation in a near-optimal manner and has appealing no-regret properties. We demonstrate the efficacy of the framework on the application of aircraft flight path planning when the winds are partially observed.

Submitted 16 September, 2016; originally announced September 2016.

Comments: 8 pages

Showing 1–5 of 5 results for author: Sood, N