Showing 1–50 of 50 results for author: Rao, A

Searching in archive stat.

  1. arXiv:2504.02693  [pdf, other]

    stat.AP

    Joint Modeling of Spatial Dependencies Across Multiple Subjects in Multiplexed Tissue Imaging

    Authors: Joel Eliason, Arvind Rao, Timothy L Frankel, Michele Peruzzi

    Abstract: The tumor microenvironment (TME) is a spatially heterogeneous ecosystem where cellular interactions shape tumor progression and response to therapy. Multiplexed imaging technologies enable high-resolution spatial characterization of the TME, yet statistical methods for analyzing multi-subject spatial tissue data remain limited. We propose a Bayesian hierarchical model for inferring spatial depende… ▽ More

    Submitted 3 April, 2025; originally announced April 2025.

  2. arXiv:2412.14503  [pdf, other]

    stat.CO

    dapper: Data Augmentation for Private Posterior Estimation in R

    Authors: Kevin Eng, Jordan A. Awan, Nianqiao Phyllis Ju, Vinayak A. Rao, Ruobin Gong

    Abstract: This paper serves as a reference and introduction to using the R package dapper. dapper encodes a sampling framework which allows exact Markov chain Monte Carlo simulation of parameters and latent variables in a statistical model given privatized data. The goal of this package is to fill an urgent need by providing applied researchers with a flexible tool to perform valid Bayesian inference on dat… ▽ More

    Submitted 18 December, 2024; originally announced December 2024.

  3. arXiv:2412.08828  [pdf, other]

    stat.ME

    A Two-Stage Approach for Segmenting Spatial Point Patterns Applied to Multiplex Imaging

    Authors: Alvin Sheng, Brian J. Reich, Ana-Maria Staicu, Santhoshi N. Krishnan, Arvind Rao, Timothy L. Frankel

    Abstract: Recent advances in multiplex imaging have enabled researchers to locate different types of cells within a tissue sample. This is especially relevant for tumor immunology, as clinical regimes corresponding to different stages of disease or responses to treatment may manifest as different spatial arrangements of tumor and immune cells. Spatial point pattern modeling can be used to partition multiple… ▽ More

    Submitted 11 December, 2024; originally announced December 2024.

    Comments: 13 pages, 5 figures, 1 table

  4. arXiv:2411.02956  [pdf, other]

    stat.ME

    On Distributional Discrepancy for Experimental Design with General Assignment Probabilities

    Authors: Anup B. Rao, Peng Zhang

    Abstract: We investigate experimental design for randomized controlled trials (RCTs) with both equal and unequal treatment-control assignment probabilities. Our work makes progress on the connection between the distributional discrepancy minimization (DDM) problem introduced by Harshaw et al. (2024) and the design of RCTs. We make two main contributions: First, we prove that approximating the optimal soluti… ▽ More

    Submitted 18 March, 2025; v1 submitted 5 November, 2024; originally announced November 2024.

    Comments: The first result comes from our previous work at arxiv.org/abs/2211.14658

  5. arXiv:2406.16721  [pdf, other]

    stat.AP

    Spatially Structured Regression for Non-conformable Spaces: Integrating Pathology Imaging and Genomics Data in Cancer

    Authors: Nathaniel Osher, Jian Kang, Arvind Rao, Veerabhadran Baladandayuthapani

    Abstract: The spatial composition and cellular heterogeneity of the tumor microenvironment play a critical role in cancer development and progression. High-definition pathology imaging of tumor biopsies provides a high-resolution view of the spatial organization of different types of cells. This allows for systematic assessment of intra- and inter-patient spatial cellular interactions and heterogeneity by i… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  6. arXiv:2403.00975  [pdf, other]

    cs.LG cs.AI math.FA stat.AP

    Equipment Health Assessment: Time Series Analysis for Wind Turbine Performance

    Authors: Jana Backhus, Aniruddha Rajendra Rao, Chandrasekar Venkatraman, Abhishek Padmanabhan, A. Vinoth Kumar, Chetan Gupta

    Abstract: In this study, we leverage SCADA data from diverse wind turbines to predict power output, employing advanced time series methods, specifically Functional Neural Networks (FNN) and Long Short-Term Memory (LSTM) networks. A key innovation lies in the ensemble of FNN and LSTM models, capitalizing on their collective learning. This ensemble approach outperforms individual models, ensuring stable and a… ▽ More

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 19 Pages, 17 Figures, 3 Tables, Submitted at Applied Sciences (MDPI)

  7. arXiv:2401.14498  [pdf, other]

    cs.LG eess.SY stat.AP stat.ML

    Predictive Analysis for Optimizing Port Operations

    Authors: Aniruddha Rajendra Rao, Haiyan Wang, Chetan Gupta

    Abstract: Maritime transport is a pivotal logistics mode for the long-distance and bulk transportation of goods. However, the intricate planning involved in this mode is often hindered by uncertainties, including weather conditions, cargo diversity, and port dynamics, leading to increased costs. Consequently, accurate estimation of the total (stay) time of the vessel and any delays at the port are essential… ▽ More

    Submitted 20 September, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: 21 pages, 9 figures, 4 Tables. Submitted at IEEE BigData 2024

  8. arXiv:2305.05532  [pdf, other]

    eess.SP cs.AI cs.LG stat.AP stat.ML

    An ensemble of convolution-based methods for fault detection using vibration signals

    Authors: Xian Yeow Lee, Aman Kumar, Lasitha Vidyaratne, Aniruddha Rajendra Rao, Ahmed Farahat, Chetan Gupta

    Abstract: This paper focuses on solving a fault detection problem using multivariate time series of vibration signals collected from planetary gearboxes in a test rig. Various traditional machine learning and deep learning methods have been proposed for multivariate time-series classification, including distance-based, functional data-oriented, feature-driven, and convolution kernel-based methods. Recent st… ▽ More

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: 12 Pages, 9 Figures, 2 Tables. Accepted at ICPHM 2023

    Journal ref: 2023 IEEE International Conference on Prognostics and Health Management (ICPHM)

  9. arXiv:2304.02261  [pdf, other]

    cs.DS cs.LG stat.ML

    Optimal Sketching Bounds for Sparse Linear Regression

    Authors: Tung Mai, Alexander Munteanu, Cameron Musco, Anup B. Rao, Chris Schwiegelshohn, David P. Woodruff

    Abstract: We study oblivious sketching for $k$-sparse linear regression under various loss functions such as an $\ell_p$ norm, or from a broad class of hinge-like loss functions, which includes the logistic and ReLU losses. We show that for sparse $\ell_2$ norm regression, there is a distribution over oblivious sketches with $Θ(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor. This ex… ▽ More

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: AISTATS 2023

  10. A Functional approach for Two Way Dimension Reduction in Time Series

    Authors: Aniruddha Rajendra Rao, Haiyan Wang, Chetan Gupta

    Abstract: The rise in data has led to the need for dimension reduction techniques, especially in the area of non-scalar variables, including time series, natural language processing, and computer vision. In this paper, we specifically investigate dimension reduction for time series through functional data analysis. Current methods for dimension reduction in functional data are functional principal component… ▽ More

    Submitted 1 January, 2023; originally announced January 2023.

    Comments: 10 pages, 4 figures, 4 tables

    Journal ref: IEEE BigData 2022

  11. arXiv:2210.06594  [pdf, other]

    cs.LG cs.AI cs.DS econ.EM stat.ME

    Sample Constrained Treatment Effect Estimation

    Authors: Raghavendra Addanki, David Arbour, Tung Mai, Cameron Musco, Anup Rao

    Abstract: Treatment effect estimation is a fundamental problem in causal inference. We focus on designing efficient randomized controlled trials, to accurately estimate the effect of some treatment on a population of $n$ individuals. In particular, we study sample-constrained treatment effect estimation, where we must select a subset of $s \ll n$ individuals from the population to experiment on. This subset… ▽ More

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: Conference on Neural Information Processing Systems (NeurIPS) 2022

  12. arXiv:2206.10174  [pdf, other]

    stat.AP stat.ME stat.ML

    Efficient Inference of Spatially-varying Gaussian Markov Random Fields with Applications in Gene Regulatory Networks

    Authors: Visweswaran Ravikumar, Tong Xu, Wajd N. Al-Holou, Salar Fattahi, Arvind Rao

    Abstract: In this paper, we study the problem of inferring spatially-varying Gaussian Markov random fields (SV-GMRF) where the goal is to learn a network of sparse, context-specific GMRFs representing network relationships between genes. An important application of SV-GMRFs is in inference of gene regulatory networks from spatially-resolved transcriptomics datasets. The current work on inference of SV-GMRFs… ▽ More

    Submitted 21 June, 2022; originally announced June 2022.

  13. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur… ▽ More

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  14. arXiv:2206.00710  [pdf, other]

    stat.ME stat.CO

    Data Augmentation MCMC for Bayesian Inference from Privatized Data

    Authors: Nianqiao Ju, Jordan A. Awan, Ruobin Gong, Vinayak A. Rao

    Abstract: Differentially private mechanisms protect privacy by introducing additional randomness into the data. Restricting access to only the privatized data makes it challenging to perform valid statistical inference on parameters underlying the confidential data. Specifically, the likelihood function of the privatized data requires integrating over the large space of confidential databases and is typical… ▽ More

    Submitted 7 December, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: 17 pages, 3 figures, 2 tables. NeurIPS 2022

  15. arXiv:2203.02025  [pdf, other]

    stat.ME

    Online Balanced Experimental Design

    Authors: David Arbour, Drew Dimmery, Tung Mai, Anup Rao

    Abstract: We consider the experimental design problem in an online environment, an important practical task for reducing the variance of estimates in randomized experiments which allows for greater precision, and in turn, improved decision making. In this work, we present algorithms that build on recent advances in online discrepancy minimization which accommodate both arbitrary treatment probabilities and m… ▽ More

    Submitted 3 March, 2022; originally announced March 2022.

  16. arXiv:2111.14674  [pdf, ps, other]

    cs.LG cs.AI cs.DS stat.ML

    Online MAP Inference and Learning for Nonsymmetric Determinantal Point Processes

    Authors: Aravind Reddy, Ryan A. Rossi, Zhao Song, Anup Rao, Tung Mai, Nedim Lipka, Gang Wu, Eunyee Koh, Nesreen Ahmed

    Abstract: In this paper, we introduce the online and streaming MAP inference and learning problems for Non-symmetric Determinantal Point Processes (NDPPs) where data points arrive in an arbitrary order and the algorithms are constrained to use a single-pass over the data as well as sub-linear memory. The online setting has an additional requirement of maintaining a valid solution at any point in time. For s… ▽ More

    Submitted 29 November, 2021; originally announced November 2021.

  17. arXiv:2107.14151  [pdf, other]

    stat.ME cs.AI cs.LG stat.ML

    Modern Non-Linear Function-on-Function Regression

    Authors: Aniruddha Rajendra Rao, Matthew Reimherr

    Abstract: We introduce a new class of non-linear function-on-function regression models for functional data using neural networks. We propose a framework using a hidden layer consisting of continuous neurons, called a continuous hidden layer, for functional response modeling and give two model fitting strategies, Functional Direct Neural Network (FDNN) and Functional Basis Neural Network (FBNN). Both are de… ▽ More

    Submitted 7 October, 2023; v1 submitted 29 July, 2021; originally announced July 2021.

    Comments: 6 figures, 6 tables (including supplementary material), 16 pages (including supplementary material). arXiv admin note: text overlap with arXiv:2104.09371

    Journal ref: Statistics and Computing 2023

  18. arXiv:2106.12199  [pdf, other]

    math.ST cs.LG math.OC stat.ME

    Bayesian Joint Chance Constrained Optimization: Approximations and Statistical Consistency

    Authors: Prateek Jaiswal, Harsha Honnappa, Vinayak A. Rao

    Abstract: This paper considers data-driven chance-constrained stochastic optimization problems in a Bayesian framework. Bayesian posteriors afford a principled mechanism to incorporate data and prior knowledge into stochastic optimization problems. However, the computation of Bayesian posteriors is typically an intractable problem, and has spawned a large literature on approximate Bayesian computation. Here… ▽ More

    Submitted 30 September, 2022; v1 submitted 23 June, 2021; originally announced June 2021.

  19. arXiv:2106.10941  [pdf, other]

    stat.ME stat.AP

    Tumor Radiogenomics with Bayesian Layered Variable Selection

    Authors: Shariq Mohammed, Sebastian Kurtek, Karthik Bharath, Arvind Rao, Veerabhadran Baladandayuthapani

    Abstract: We propose a statistical framework to integrate radiological magnetic resonance imaging (MRI) and genomic data to identify the underlying radiogenomic associations in lower grade gliomas (LGG). We devise a novel imaging phenotype by dividing the tumor region into concentric spherical layers that mimics the tumor evolution process. MRI data within each layer is represented by voxel--intensity-based… ▽ More

    Submitted 21 June, 2021; originally announced June 2021.

  20. arXiv:2105.02526  [pdf, other]

    cs.CR stat.AP

    Honeyboost: Boosting honeypot performance with data fusion and anomaly detection

    Authors: Sevvandi Kandanaarachchi, Hideya Ochiai, Asha Rao

    Abstract: With cyber incidents and data breaches becoming increasingly common, being able to predict a cyberattack has never been more crucial. The ability of Network Anomaly Detection Systems (NADS) to identify unusual behavior makes them useful in predicting such attacks. However, NADS often suffer from high false positive rates. In this paper, we introduce a novel framework called Honeyboost that enhance… ▽ More

    Submitted 7 September, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

    Comments: 26 pages

  21. arXiv:2104.09371  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Non-linear Functional Modeling using Neural Networks

    Authors: Aniruddha Rajendra Rao, Matthew Reimherr

    Abstract: We introduce a new class of non-linear models for functional data based on neural networks. Deep learning has been very successful in non-linear modeling, but there has been little work done in the functional data setting. We propose two variations of our framework: a functional neural network with continuous hidden layers, called the Functional Direct Neural Network (FDNN), and a second version t… ▽ More

    Submitted 3 May, 2023; v1 submitted 19 April, 2021; originally announced April 2021.

    Comments: 3 figures, 10 tables (including supplementary material), 14 pages (including supplementary material)

    Journal ref: Journal of Computational and Graphical Statistics, 2023

  22. arXiv:2104.00510  [pdf, other]

    stat.ME stat.AP

    RADIOHEAD: Radiogenomic Analysis Incorporating Tumor Heterogeneity in Imaging Through Densities

    Authors: Shariq Mohammed, Karthik Bharath, Sebastian Kurtek, Arvind Rao, Veerabhadran Baladandayuthapani

    Abstract: Recent technological advancements have enabled detailed investigation of associations between the molecular architecture and tumor heterogeneity, through multi-source integration of radiological imaging and genomic (radiogenomic) data. In this paper, we integrate and harness radiogenomic data in patients with lower grade gliomas (LGG), a type of brain cancer, in order to develop a regression frame… ▽ More

    Submitted 7 April, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

  23. arXiv:2103.04557  [pdf, other]

    stat.ML cs.LG

    Asymptotics of Ridge Regression in Convolutional Models

    Authors: Mojtaba Sahraee-Ardakan, Tung Mai, Anup Rao, Ryan Rossi, Sundeep Rangan, Alyson K. Fletcher

    Abstract: Understanding generalization and estimation error of estimators for simple models such as linear and generalized linear models has attracted a lot of attention recently. This is in part due to an interesting observation made in machine learning community that highly over-parameterized neural networks achieve zero training error, and yet they are able to generalize well over the test samples. This… ▽ More

    Submitted 8 March, 2021; originally announced March 2021.

  24. arXiv:2102.13179  [pdf, other]

    cs.LG stat.ML

    Machine Unlearning via Algorithmic Stability

    Authors: Enayat Ullah, Tung Mai, Anup Rao, Ryan Rossi, Raman Arora

    Abstract: We study the problem of machine unlearning and identify a notion of algorithmic stability, Total Variation (TV) stability, which we argue, is suitable for the goal of exact unlearning. For convex risk minimization problems, we design TV-stable algorithms based on noisy Stochastic Gradient Descent (SGD). Our key contribution is the design of corresponding efficient unlearning algorithms, which are… ▽ More

    Submitted 25 February, 2021; originally announced February 2021.

  25. arXiv:2101.06309  [pdf, other]

    cs.LG math.ST stat.ML

    Fundamental Tradeoffs in Distributionally Adversarial Training

    Authors: Mohammad Mehrabi, Adel Javanmard, Ryan A. Rossi, Anup Rao, Tung Mai

    Abstract: Adversarial training is among the most effective techniques to improve the robustness of models against adversarial perturbations. However, the full effect of this approach on models is not well understood. For example, while adversarial training can reduce the adversarial risk (prediction error against an adversary), it sometimes increases standard risk (generalization error when there is no adver… ▽ More

    Submitted 15 January, 2021; originally announced January 2021.

    Comments: 23 pages, 3 figures

  26. arXiv:2011.12509  [pdf, other]

    stat.ME cs.LG stat.ML

    Modern Multiple Imputation with Functional Data

    Authors: Aniruddha Rajendra Rao, Matthew Reimherr

    Abstract: This work considers the problem of fitting functional models with sparsely and irregularly sampled functional data. It overcomes the limitations of the state-of-the-art methods, which face major challenges in the fitting of more complex non-linear models. Currently, many of these models cannot be consistently estimated unless the number of observed points per curve grows sufficiently quickly with… ▽ More

    Submitted 24 November, 2020; originally announced November 2020.

    Comments: 7 figures (including supplementary material), 8 tables (including supplementary material), 14 pages (including supplementary material)

    Journal ref: Stat, 2021

  27. arXiv:2011.12378  [pdf, other]

    cs.LG stat.ML

    A Non-linear Function-on-Function Model for Regression with Time Series Data

    Authors: Qiyao Wang, Haiyan Wang, Chetan Gupta, Aniruddha Rajendra Rao, Hamed Khorasgani

    Abstract: In the last few decades, building regression models for non-scalar variables, including time series, text, image, and video, has attracted increasing interests of researchers from the data analytic community. In this paper, we focus on a multivariate time series regression problem. Specifically, we aim to learn mathematical mappings from multiple chronologically measured numerical variables within… ▽ More

    Submitted 24 November, 2020; originally announced November 2020.

    Comments: Accepted by IEEE Big Data 2020

  28. arXiv:2010.11332  [pdf, other]

    stat.ME stat.ML

    Efficient Balanced Treatment Assignments for Experimentation

    Authors: David Arbour, Drew Dimmery, Anup Rao

    Abstract: In this work, we reframe the problem of balanced treatment assignment as optimization of a two-sample test between test and control units. Using this lens we provide an assignment algorithm that is optimal with respect to the minimum spanning tree test of Friedman and Rafsky (1979). This assignment to treatment groups may be performed exactly in polynomial time. We provide a probabilistic interpre… ▽ More

    Submitted 21 October, 2020; originally announced October 2020.

  29. arXiv:2009.13566  [pdf, other]

    cs.LG cs.SI stat.ML

    Graph Neural Networks with Heterophily

    Authors: Jiong Zhu, Ryan A. Rossi, Anup Rao, Tung Mai, Nedim Lipka, Nesreen K. Ahmed, Danai Koutra

    Abstract: Graph Neural Networks (GNNs) have proven to be useful for many different practical applications. However, many existing GNN models have implicitly assumed homophily among the nodes connected in the graph, and therefore have largely overlooked the important setting of heterophily, where most connected nodes are from different classes. In this work, we propose a novel framework called CPGNN that gen… ▽ More

    Submitted 14 June, 2021; v1 submitted 28 September, 2020; originally announced September 2020.

    Comments: Proceedings version of AAAI 2021 with appendix and additional typo fixes; 12 pages, 4 figures

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence. 35, 12 (May 2021), 11168-11176

  30. arXiv:2009.05665  [pdf, other]

    cs.LG stat.ML

    Spatio-Temporal Functional Neural Networks

    Authors: Aniruddha Rajendra Rao, Qiyao Wang, Haiyan Wang, Hamed Khorasgani, Chetan Gupta

    Abstract: Explosive growth in spatio-temporal data and its wide range of applications have attracted increasing interests of researchers in the statistical and machine learning fields. The spatio-temporal regression problem is of paramount importance from both the methodology development and real-world application perspectives. Given the observed spatially encoded time series covariates and real-valued resp… ▽ More

    Submitted 11 September, 2020; originally announced September 2020.

    Comments: Accepted by 2020 IEEE International Conference on Data Science and Advanced Analytics (DSAA)

  31. arXiv:2009.03860  [pdf, other]

    stat.ME

    Designing Transportable Experiments

    Authors: My Phan, David Arbour, Drew Dimmery, Anup B. Rao

    Abstract: We consider the problem of designing a randomized experiment on a source population to estimate the Average Treatment Effect (ATE) on a target population. We propose a novel approach which explicitly considers the target when designing the experiment on the source. Under the covariate shift assumption, we design an unbiased importance-weighted estimator for the target population's ATE. To reduce t… ▽ More

    Submitted 4 September, 2021; v1 submitted 8 September, 2020; originally announced September 2020.

  32. arXiv:2006.02672  [pdf, other]

    cs.LG stat.ML

    Sample Efficient Graph-Based Optimization with Noisy Observations

    Authors: Tan Nguyen, Ali Shameli, Yasin Abbasi-Yadkori, Anup Rao, Branislav Kveton

    Abstract: We study sample complexity of optimizing "hill-climbing friendly" functions defined on a graph under noisy observations. We define a notion of convexity, and we show that a variant of best-arm identification can find a near-optimal solution after a small number of queries that is independent of the size of the graph. For functions that have local minima and are nearly convex, we show a sample comp… ▽ More

    Submitted 4 June, 2020; originally announced June 2020.

    Comments: The first version of this paper appeared in AISTATS 2019. Thanks to community feedback, some typos and a minor issue have been identified. Specifically, on page 4, column 2, line 18, the statement $Δ_{1,s} \ge (1+m)^{S-1-s} Δ_1$ is not valid, and in the proof of Theorem 2, "By Lemma 1" should be "By Definition 2". These problems are fixed in this updated version published here on arXiv.

    Journal ref: AISTATS 2019

  33. arXiv:2004.12012  [pdf, other]

    stat.AP stat.CO stat.ME

    Integrative Bayesian models using Post-selective Inference: a case study in Radiogenomics

    Authors: Snigdha Panigrahi, Shariq Mohammed, Arvind Rao, Veerabhadran Baladandayuthapani

    Abstract: Integrative analyses based on statistically relevant associations between genomics and a wealth of intermediary phenotypes (such as imaging) provide vital insights into their clinical relevance in terms of the disease mechanisms. Estimates for uncertainty in the resulting integrative models are however unreliable unless inference accounts for the selection of these associations with accuracy. In t… ▽ More

    Submitted 12 August, 2022; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: 45 pages, 7 Figures

  34. arXiv:2003.01704  [pdf, other]

    cs.LG stat.ML

    Model Selection in Contextual Stochastic Bandit Problems

    Authors: Aldo Pacchiano, My Phan, Yasin Abbasi-Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari

    Abstract: We study bandit model selection in stochastic environments. Our approach relies on a meta-algorithm that selects between candidate base algorithms. We develop a meta-algorithm-base algorithm abstraction that can work with general classes of base algorithms and different type of adversarial meta-algorithms. Our methods rely on a novel and generic smoothing transformation for bandit algorithms that… ▽ More

    Submitted 4 December, 2022; v1 submitted 3 March, 2020; originally announced March 2020.

    Comments: 33 main pages, 15 appendix pages

  35. arXiv:2003.01074  [pdf, other]

    cs.LG stat.ML

    Gaussian Process Policy Optimization

    Authors: Ashish Rao, Bidipta Sarkar, Tejas Narayanan

    Abstract: We propose a novel actor-critic, model-free reinforcement learning algorithm which employs a Bayesian method of parameter space exploration to solve environments. A Gaussian process is used to learn the expected return of a policy given the policy's parameters. The system is trained by updating the parameters using gradient descent on a new surrogate loss function consisting of the Proximal Policy… ▽ More

    Submitted 2 March, 2020; originally announced March 2020.

  36. arXiv:2001.01404  [pdf, other]

    stat.ML cs.LG

    Variational Bayesian Methods for Stochastically Constrained System Design Problems

    Authors: Prateek Jaiswal, Harsha Honnappa, Vinayak A. Rao

    Abstract: We study system design problems stated as parameterized stochastic programs with a chance-constraint set. We adopt a Bayesian approach that requires the computation of a posterior predictive integral which is usually intractable. In addition, for the problem to be a well-defined convex program, we must retain the convexity of the feasible set. Consequently, we propose a variational Bayes-based met… ▽ More

    Submitted 6 January, 2020; originally announced January 2020.

    Journal ref: 2nd Symposium on Advances in Approximate Bayesian Inference, 2019

  37. arXiv:1911.01288  [pdf, other]

    stat.ML cs.LG

    Asymptotic Consistency of Loss-Calibrated Variational Bayes

    Authors: Prateek Jaiswal, Harsha Honnappa, Vinayak A. Rao

    Abstract: This paper establishes the asymptotic consistency of the loss-calibrated variational Bayes (LCVB) method. LCVB was proposed in [LaSiGh2011] as a method for approximately computing Bayesian posteriors in a 'loss aware' manner. This methodology is also highly relevant in general data-driven decision-making contexts. Here, we not only establish the asymptotic consistency of the calibrated… ▽ More

    Submitted 4 November, 2019; originally announced November 2019.

  38. arXiv:1906.05059  [pdf, other]

    cs.LG cs.IR cs.SI stat.ML

    Higher-Order Ranking and Link Prediction: From Closing Triangles to Closing Higher-Order Motifs

    Authors: Ryan A. Rossi, Anup Rao, Sungchul Kim, Eunyee Koh, Nesreen K. Ahmed, Gang Wu

    Abstract: In this paper, we introduce the notion of motif closure and describe higher-order ranking and link prediction methods based on the notion of closing higher-order network motifs. The methods are fast and efficient for real-time ranking and link prediction-based applications such as web search, online advertising, and recommendation. In such applications, real-time performance is critical. The propo… ▽ More

    Submitted 12 June, 2019; originally announced June 2019.

  39. arXiv:1902.01902  [pdf, other]

    math.ST cs.LG stat.ML

    Asymptotic Consistency of $α-$Rényi-Approximate Posteriors

    Authors: Prateek Jaiswal, Vinayak A. Rao, Harsha Honnappa

    Abstract: We study the asymptotic consistency properties of $α$-Rényi approximate posteriors, a class of variational Bayesian methods that approximate an intractable Bayesian posterior with a member of a tractable family of distributions, the member chosen to minimize the $α$-Rényi divergence from the true posterior. Unique to our work is that we consider settings with $α> 1$, resulting in approximations th… ▽ More

    Submitted 14 August, 2020; v1 submitted 5 February, 2019; originally announced February 2019.

  40. arXiv:1810.03496  [pdf, other]

    stat.ME

    Regression Analyses of Distributions using Quantile Functional Regression

    Authors: Hojin Yang, Veerabhadran Baladandayuthapani, Arvind U. K. Rao, Jeffrey S. Morris

    Abstract: Radiomics involves the study of tumor images to identify quantitative markers explaining cancer heterogeneity. The predominant approach is to extract hundreds to thousands of image features, including histogram features comprised of summaries of the marginal distribution of pixel intensities, which leads to multiple testing problems and can miss out on insights not contained in the selected featur… ▽ More

    Submitted 4 October, 2018; originally announced October 2018.

    Comments: 83 pages, 32 figures. arXiv admin note: substantial text overlap with arXiv:1711.00031

  41. arXiv:1805.09793  [pdf, other]

    cs.LG stat.ML

    New Insights into Bootstrapping for Bandits

    Authors: Sharan Vaswani, Branislav Kveton, Zheng Wen, Anup Rao, Mark Schmidt, Yasin Abbasi-Yadkori

    Abstract: We investigate the use of bootstrapping in the bandit setting. We first show that the commonly used non-parametric bootstrapping (NPB) procedure can be provably inefficient and establish a near-linear lower bound on the regret incurred by it under the bandit model with Bernoulli rewards. We show that NPB with an appropriate amount of forced exploration can result in sub-linear albeit sub-optimal r… ▽ More

    Submitted 24 May, 2018; originally announced May 2018.

  42. arXiv:1801.09303  [pdf, other]

    stat.ML cs.AI cs.SI

    HONE: Higher-Order Network Embeddings

    Authors: Ryan A. Rossi, Nesreen K. Ahmed, Eunyee Koh, Sungchul Kim, Anup Rao, Yasin Abbasi Yadkori

    Abstract: This paper describes a general framework for learning Higher-Order Network Embeddings (HONE) from graph data based on network motifs. The HONE framework is highly expressive and flexible with many interchangeable components. The experimental results demonstrate the effectiveness of learning higher-order network representations. In all cases, HONE outperforms recent embedding methods that are unabl…

    Submitted 29 May, 2018; v1 submitted 28 January, 2018; originally announced January 2018.

  43. arXiv:1712.04644  [pdf, other]

    cs.LG stat.ML

    Stochastic Low-Rank Bandits

    Authors: Branislav Kveton, Csaba Szepesvari, Anup Rao, Zheng Wen, Yasin Abbasi-Yadkori, S. Muthukrishnan

    Abstract: Many problems in computer vision and recommender systems involve low-rank matrices. In this work, we study the problem of finding the maximum entry of a stochastic low-rank matrix from sequential observations. At each step, a learning agent chooses pairs of row and column arms, and receives the noisy product of their latent values as a reward. The main challenge is that the latent values are unobs…

    Submitted 13 December, 2017; originally announced December 2017.

  44. arXiv:1702.01191  [pdf, other]

    stat.AP

    Radiologic Image-based Statistical Shape Analysis of Brain Tumors

    Authors: Karthik Bharath, Sebastian Kurtek, Arvind Rao, Veerabhadran Baladandayuthapani

    Abstract: We propose a curve-based Riemannian-geometric approach for general shape-based statistical analyses of tumors obtained from radiologic images. A key component of the framework is a suitable metric that (1) enables comparisons of tumor shapes, (2) provides tools for computing descriptive statistics and implementing principal component analysis on the space of tumor shapes, and (3) allows for a rich…

    Submitted 3 February, 2017; originally announced February 2017.

  45. Going off the Grid: Iterative Model Selection for Biclustered Matrix Completion

    Authors: Eric Chi, Liuiyi Hu, Arvind K. Saibaba, Arvind U. K. Rao

    Abstract: We consider the problem of performing matrix completion with side information on row-by-row and column-by-column similarities. We build upon recent proposals for matrix estimation with smoothness constraints with respect to row and column graphs. We present a novel iterative procedure for directly minimizing an information criterion in order to select an appropriate amount of row and column smoothing…

    Submitted 19 October, 2016; v1 submitted 17 October, 2016; originally announced October 2016.

    Comments: 42 pages, 7 figures. Supplementary material (https://github.com/echi/IMS/blob/master/BMC_Supplement_JCGS.pdf) and codes available (https://github.com/echi/IMS)

    Journal ref: Journal of Computational and Graphical Statistics, 28(1):36--47, 2019

  46. arXiv:1604.06968  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Agnostic Estimation of Mean and Covariance

    Authors: Kevin A. Lai, Anup B. Rao, Santosh Vempala

    Abstract: We consider the problem of estimating the mean and covariance of a distribution from iid samples in $\mathbb{R}^n$, in the presence of an $η$ fraction of malicious noise; this is in contrast to much recent work where the noise itself is assumed to be from a distribution of known type. The agnostic problem includes many interesting special cases, e.g., learning the parameters of a single Gaussian (…

    Submitted 14 August, 2016; v1 submitted 23 April, 2016; originally announced April 2016.

  47. arXiv:1604.00376  [pdf, other]

    stat.ME

    Inferring network structure in non-normal and mixed discrete-continuous genomic data

    Authors: Anindya Bhadra, Arvind Rao, Veerabhadran Baladandayuthapani

    Abstract: Inferring dependence structure through undirected graphs is crucial for uncovering the major modes of multivariate interaction among high-dimensional genomic markers that are potentially associated with cancer. Traditionally, conditional independence has been studied using sparse Gaussian graphical models for continuous data and sparse Ising models for discrete data. However, there are two clear s…

    Submitted 1 April, 2016; originally announced April 2016.

  48. arXiv:1404.2910  [pdf, other]

    math.ST stat.ME

    Statistical Tests for Large Tree-structured Data

    Authors: Karthik Bharath, Prabhanjan Kambadur, Dipak. K. Dey, Arvind Rao, Veerabhadran Baladandayuthapani

    Abstract: We develop a general statistical framework for the analysis and inference of large tree-structured data, with a focus on developing asymptotic goodness-of-fit tests. We first propose a consistent statistical model for binary trees, from which we develop a class of invariant tests. Using the model for binary trees, we then construct tests for general trees by using the distributional properties of…

    Submitted 20 September, 2016; v1 submitted 10 April, 2014; originally announced April 2014.

  49. Understanding Theoretically The Impact of Reporting of Disease Cases in Epidemiology

    Authors: Arni S. R. Srinivasa Rao

    Abstract: In conducting preliminary analysis during an epidemic, data on reported disease cases offer key information in guiding the direction to the in-depth analysis. Models for growth and transmission dynamics are heavily dependent on preliminary analysis results. When a particular disease case is reported more than once or alternatively is never reported or detected in the population, then in such a sit…

    Submitted 28 February, 2012; v1 submitted 13 February, 2012; originally announced February 2012.

    Comments: 21 pages, 2 figures. To appear in Journal of Theoretical Biology (Elsevier)

    MSC Class: 92D30; 26.70

    Journal ref: (2012) Journal of Theoretical Biology 302:89-95

  50. arXiv:1104.1775  [pdf]

    stat.AP q-bio.OT q-bio.QM

    Biometric Cards for Indian Population: Role of Mathematical Models in Assisting and Planning

    Authors: Arni S. R. Srinivasa Rao

    Abstract: Mathematical models could be helpful in assisting the Indian Government's new initiative of issuing biometric cards to its citizens. In this note, we look into the role of mathematical models in estimating the missing, non-enumerated population numbers, estimating annual numbers of cards required by age, gender and regions in India. The linkage between National Population Register and biometric ca…

    Submitted 28 April, 2011; v1 submitted 10 April, 2011; originally announced April 2011.

    Comments: Short note

    MSC Class: 92D25

    Journal ref: Arni S. R. Srinivasa Rao (2011): Biometric Cards for Indian Population: Role of Mathematical Models in Assisting and Planning. Asian Population Studies, 7:3, 295-300. Publisher: Routledge: Taylor & Francis Group