-
Joint Modeling of Spatial Dependencies Across Multiple Subjects in Multiplexed Tissue Imaging
Authors:
Joel Eliason,
Arvind Rao,
Timothy L Frankel,
Michele Peruzzi
Abstract:
The tumor microenvironment (TME) is a spatially heterogeneous ecosystem where cellular interactions shape tumor progression and response to therapy. Multiplexed imaging technologies enable high-resolution spatial characterization of the TME, yet statistical methods for analyzing multi-subject spatial tissue data remain limited. We propose a Bayesian hierarchical model for inferring spatial dependencies in multiplexed imaging datasets across multiple subjects. Our model represents the TME as a multivariate log-Gaussian Cox process, where spatial intensity functions of different cell types are governed by a latent multivariate Gaussian process. By pooling information across subjects, we estimate spatial correlation functions that capture within-type and cross-type dependencies, enabling interpretable inference about disease-specific cellular organization. We validate our method using simulations, demonstrating robustness to latent factor specification and spatial resolution. We apply our approach to two multiplexed imaging datasets: pancreatic cancer and colorectal cancer, revealing distinct spatial organization patterns across disease subtypes and highlighting tumor-immune interactions that differentiate immune-permissive and immune-exclusive microenvironments. These findings provide insight into mechanisms of immune evasion and may inform novel therapeutic strategies. Our approach offers a principled framework for modeling spatial dependencies in multi-subject data, with broader applicability to spatially resolved omics and imaging studies. An R package, available online, implements our methods.
Submitted 3 April, 2025;
originally announced April 2025.
-
dapper: Data Augmentation for Private Posterior Estimation in R
Authors:
Kevin Eng,
Jordan A. Awan,
Nianqiao Phyllis Ju,
Vinayak A. Rao,
Ruobin Gong
Abstract:
This paper serves as a reference and introduction to using the R package dapper. dapper encodes a sampling framework which allows exact Markov chain Monte Carlo simulation of parameters and latent variables in a statistical model given privatized data. The goal of this package is to fill an urgent need by providing applied researchers with a flexible tool to perform valid Bayesian inference on data protected by differential privacy, allowing them to properly account for the noise introduced for privacy protection in their statistical analysis. dapper offers a significant step forward in providing general-purpose statistical inference tools for privatized data.
Submitted 18 December, 2024;
originally announced December 2024.
-
A Two-Stage Approach for Segmenting Spatial Point Patterns Applied to Multiplex Imaging
Authors:
Alvin Sheng,
Brian J. Reich,
Ana-Maria Staicu,
Santhoshi N. Krishnan,
Arvind Rao,
Timothy L. Frankel
Abstract:
Recent advances in multiplex imaging have enabled researchers to locate different types of cells within a tissue sample. This is especially relevant for tumor immunology, as clinical regimes corresponding to different stages of disease or responses to treatment may manifest as different spatial arrangements of tumor and immune cells. Spatial point pattern modeling can be used to partition multiplex tissue images according to these regimes. To this end, we propose a two-stage approach: first, local intensities and pair correlation functions are estimated from the spatial point pattern of cells within each image, and the pair correlation functions are reduced in dimension via spectral decomposition of the covariance function. Second, the estimates are clustered in a Bayesian hierarchical model with spatially-dependent cluster labels. The clusters correspond to regimes of interest that are present across subjects; the cluster labels segment the spatial point patterns according to those regimes. Through Markov Chain Monte Carlo sampling, we jointly estimate and quantify uncertainty in the cluster assignment and spatial characteristics of each cluster. Simulations demonstrate the performance of the method, and it is applied to a set of multiplex immunofluorescence images of diseased pancreatic tissue.
Submitted 11 December, 2024;
originally announced December 2024.
-
On Distributional Discrepancy for Experimental Design with General Assignment Probabilities
Authors:
Anup B. Rao,
Peng Zhang
Abstract:
We investigate experimental design for randomized controlled trials (RCTs) with both equal and unequal treatment-control assignment probabilities. Our work makes progress on the connection between the distributional discrepancy minimization (DDM) problem introduced by Harshaw et al. (2024) and the design of RCTs. We make two main contributions: First, we prove that approximating the optimal solution of the DDM problem within a certain constant error is NP-hard. Second, we introduce a new Multiplicative Weights Update (MWU) algorithm for the DDM problem, which improves the Gram-Schmidt walk algorithm used by Harshaw et al. (2024) when assignment probabilities are unequal. Building on the framework of Harshaw et al. (2024) and our MWU algorithm, we then develop the MWU design, which reduces the worst-case mean squared error in estimating the average treatment effect. Finally, we present a comprehensive simulation study comparing our design with commonly used designs.
Submitted 18 March, 2025; v1 submitted 5 November, 2024;
originally announced November 2024.
-
Spatially Structured Regression for Non-conformable Spaces: Integrating Pathology Imaging and Genomics Data in Cancer
Authors:
Nathaniel Osher,
Jian Kang,
Arvind Rao,
Veerabhadran Baladandayuthapani
Abstract:
The spatial composition and cellular heterogeneity of the tumor microenvironment play a critical role in cancer development and progression. High-definition pathology imaging of tumor biopsies provides a high-resolution view of the spatial organization of different types of cells. This allows for systematic assessment of intra- and inter-patient spatial cellular interactions and heterogeneity by integrating accompanying patient-level genomics data. However, joint modeling across tumor biopsies presents unique challenges due to non-conformability (lack of a common spatial domain across biopsies) as well as high-dimensionality. To address this problem, we propose the Dual random effect and main effect selection model for Spatially structured regression model (DreameSpase). DreameSpase employs a Bayesian variable selection framework that facilitates the assessment of spatial heterogeneity with respect to covariates both within (through fixed effects) and between spaces (through spatial random effects) for non-conformable spatial domains. We demonstrate the efficacy of DreameSpase via simulations and integrative analyses of pathology imaging and gene expression data obtained from $335$ melanoma biopsies. Our findings confirm several existing relationships, e.g. neutrophil genes being associated with both inter- and intra-patient spatial heterogeneity, and reveal novel associations. We also provide freely available and computationally efficient software for implementing DreameSpase.
Submitted 24 June, 2024;
originally announced June 2024.
-
Equipment Health Assessment: Time Series Analysis for Wind Turbine Performance
Authors:
Jana Backhus,
Aniruddha Rajendra Rao,
Chandrasekar Venkatraman,
Abhishek Padmanabhan,
A. Vinoth Kumar,
Chetan Gupta
Abstract:
In this study, we leverage SCADA data from diverse wind turbines to predict power output, employing advanced time series methods, specifically Functional Neural Networks (FNN) and Long Short-Term Memory (LSTM) networks. A key innovation lies in the ensemble of FNN and LSTM models, capitalizing on their collective learning. This ensemble approach outperforms individual models, ensuring stable and accurate power output predictions. Additionally, machine learning techniques are applied to detect wind turbine performance deterioration, enabling proactive maintenance strategies and health assessment. Crucially, our analysis reveals the uniqueness of each wind turbine, necessitating tailored models for optimal predictions. This insight underscores the importance of providing automated customization for different turbines to keep human modeling effort low. Importantly, the methodologies developed in this analysis are not limited to wind turbines; they can be extended to predict and optimize performance in various machinery, highlighting the versatility and applicability of our research across diverse industrial contexts.
Submitted 1 March, 2024;
originally announced March 2024.
-
Predictive Analysis for Optimizing Port Operations
Authors:
Aniruddha Rajendra Rao,
Haiyan Wang,
Chetan Gupta
Abstract:
Maritime transport is a pivotal logistics mode for the long-distance and bulk transportation of goods. However, the intricate planning involved in this mode is often hindered by uncertainties, including weather conditions, cargo diversity, and port dynamics, leading to increased costs. Consequently, accurate estimation of the total (stay) time of the vessel and any delays at the port is essential for efficient planning and scheduling of port operations. This study aims to develop predictive analytics for a vessel's Stay Time and Delay Time that address the shortcomings of previous work on port operations, offering a valuable contribution to the field of maritime logistics. The proposed solution is designed to assist decision making in port environments and predict service delays. This is demonstrated through a case study on Brazil's ports. Additionally, feature analysis is used to understand the key factors impacting maritime logistics, enhancing the overall understanding of the complexities involved in port operations. Furthermore, we perform Shapley Additive Explanations (SHAP) analysis to interpret the effects of the features on the outcomes and understand their impact on each sample, providing deeper insights into the factors influencing port operations.
Submitted 20 September, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
An ensemble of convolution-based methods for fault detection using vibration signals
Authors:
Xian Yeow Lee,
Aman Kumar,
Lasitha Vidyaratne,
Aniruddha Rajendra Rao,
Ahmed Farahat,
Chetan Gupta
Abstract:
This paper focuses on solving a fault detection problem using multivariate time series of vibration signals collected from planetary gearboxes in a test rig. Various traditional machine learning and deep learning methods have been proposed for multivariate time-series classification, including distance-based, functional data-oriented, feature-driven, and convolution kernel-based methods. Recent studies have shown that convolution kernel-based methods like ROCKET, and 1D convolutional neural networks with ResNet and FCN, perform robustly on multivariate time-series classification. We propose an ensemble of three convolution kernel-based methods and show its efficacy on this fault detection problem, where it outperforms other approaches and achieves an accuracy of more than 98.8%.
Submitted 4 May, 2023;
originally announced May 2023.
-
Optimal Sketching Bounds for Sparse Linear Regression
Authors:
Tung Mai,
Alexander Munteanu,
Cameron Musco,
Anup B. Rao,
Chris Schwiegelshohn,
David P. Woodruff
Abstract:
We study oblivious sketching for $k$-sparse linear regression under various loss functions such as an $\ell_p$ norm, or from a broad class of hinge-like loss functions, which includes the logistic and ReLU losses. We show that for sparse $\ell_2$ norm regression, there is a distribution over oblivious sketches with $\Theta(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor. This extends to $\ell_p$ loss with an additional additive $O(k\log(k/\varepsilon)/\varepsilon^2)$ term in the upper bound. This establishes a surprising separation from the related sparse recovery problem, which is an important special case of sparse regression. For this problem, under the $\ell_2$ norm, we observe an upper bound of $O(k \log (d)/\varepsilon + k\log(k/\varepsilon)/\varepsilon^2)$ rows, showing that sparse recovery is strictly easier to sketch than sparse regression. For sparse regression under hinge-like loss functions including sparse logistic and sparse ReLU regression, we give the first known sketching bounds that achieve $o(d)$ rows showing that $O(\mu^2 k\log(\mu n d/\varepsilon)/\varepsilon^2)$ rows suffice, where $\mu$ is a natural complexity parameter needed to obtain relative error bounds for these loss functions. We again show that this dimension is tight, up to lower order terms and the dependence on $\mu$. Finally, we show that similar sketching bounds can be achieved for LASSO regression, a popular convex relaxation of sparse regression, where one aims to minimize $\|Ax-b\|_2^2+\lambda\|x\|_1$ over $x\in\mathbb{R}^d$. We show that sketching dimension $O(\log(d)/(\lambda\varepsilon)^2)$ suffices and that the dependence on $d$ and $\lambda$ is tight.
Submitted 5 April, 2023;
originally announced April 2023.
-
A Functional approach for Two Way Dimension Reduction in Time Series
Authors:
Aniruddha Rajendra Rao,
Haiyan Wang,
Chetan Gupta
Abstract:
The rise in data has led to the need for dimension reduction techniques, especially in the area of non-scalar variables, including time series, natural language processing, and computer vision. In this paper, we specifically investigate dimension reduction for time series through functional data analysis. Current methods for dimension reduction in functional data are functional principal component analysis and functional autoencoders; these are limited to linear mappings or scalar representations of the time series, which is inefficient. In real data applications, the nature of the data is much more complex. We propose a non-linear function-on-function approach, consisting of a functional encoder and a functional decoder, that uses continuous hidden layers of continuous neurons to learn the structure inherent in functional data and thereby addresses these limitations of existing approaches. Our approach gives a low-dimensional latent representation by reducing the number of functional features as well as the timepoints at which the functions are observed. The effectiveness of the proposed model is demonstrated through multiple simulations and real data examples.
Submitted 1 January, 2023;
originally announced January 2023.
-
Sample Constrained Treatment Effect Estimation
Authors:
Raghavendra Addanki,
David Arbour,
Tung Mai,
Cameron Musco,
Anup Rao
Abstract:
Treatment effect estimation is a fundamental problem in causal inference. We focus on designing efficient randomized controlled trials, to accurately estimate the effect of some treatment on a population of $n$ individuals. In particular, we study sample-constrained treatment effect estimation, where we must select a subset of $s \ll n$ individuals from the population to experiment on. This subset must be further partitioned into treatment and control groups. Algorithms for partitioning the entire population into treatment and control groups, or for choosing a single representative subset, have been well-studied. The key challenge in our setting is jointly choosing a representative subset and a partition for that set.
We focus on both individual and average treatment effect estimation, under a linear effects model. We give provably efficient experimental designs and corresponding estimators, by identifying connections to discrepancy minimization and leverage-score-based sampling used in randomized numerical linear algebra. Our theoretical results obtain a smooth transition to known guarantees when $s$ equals the population size. We also empirically demonstrate the performance of our algorithms.
Submitted 12 October, 2022;
originally announced October 2022.
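The abstract of "Sample Constrained Treatment Effect Estimation" mentions leverage-score-based sampling for choosing the $s$ individuals to experiment on. The following is a minimal sketch of that generic technique only, not the paper's design: the function names `leverage_scores` and `sample_subset` are hypothetical, and the coin-flip treatment assignment is a placeholder assumption standing in for the discrepancy-minimizing partition the authors describe.

```python
import numpy as np

def leverage_scores(X):
    """Row leverage scores of X: squared row norms of an orthonormal basis Q of col(X)."""
    Q, _ = np.linalg.qr(X, mode="reduced")
    return np.sum(Q ** 2, axis=1)

def sample_subset(X, s, seed=0):
    """Pick s rows with probability proportional to leverage, then split them at random.

    Assumption: the random split below is only a stand-in for a balanced
    (discrepancy-minimizing) treatment/control partition.
    """
    rng = np.random.default_rng(seed)
    lev = leverage_scores(X)
    probs = lev / lev.sum()
    idx = rng.choice(X.shape[0], size=s, replace=False, p=probs)
    treat = rng.random(s) < 0.5  # True = treatment, False = control
    return idx, treat

# Illustrative usage on synthetic covariates (n = 200 individuals, d = 5 features).
X_demo = np.random.default_rng(1).normal(size=(200, 5))
idx_demo, treat_demo = sample_subset(X_demo, s=20)
```

For a covariate matrix with full column rank, the leverage scores sum to the number of columns, which gives a quick sanity check on the computation.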
-
Efficient Inference of Spatially-varying Gaussian Markov Random Fields with Applications in Gene Regulatory Networks
Authors:
Visweswaran Ravikumar,
Tong Xu,
Wajd N. Al-Holou,
Salar Fattahi,
Arvind Rao
Abstract:
In this paper, we study the problem of inferring spatially-varying Gaussian Markov random fields (SV-GMRF) where the goal is to learn a network of sparse, context-specific GMRFs representing network relationships between genes. An important application of SV-GMRFs is in inference of gene regulatory networks from spatially-resolved transcriptomics datasets. Current work on inference of SV-GMRFs is based on regularized maximum likelihood estimation (MLE) and suffers from overwhelmingly high computational cost due to its highly nonlinear nature. To alleviate this challenge, we propose a simple and efficient optimization problem in lieu of MLE that comes equipped with strong statistical and computational guarantees. Our proposed optimization problem is extremely efficient in practice: we can solve instances of SV-GMRFs with more than 2 million variables in less than 2 minutes. We apply the developed framework to study how gene regulatory networks in Glioblastoma are spatially rewired within tissue, and identify prominent activity of the transcription factor HES4 and ribosomal proteins as characterizing the gene expression network in the tumor peri-vascular niche that is known to harbor treatment resistant stem cells.
Submitted 21 June, 2022;
originally announced June 2022.
-
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Authors:
Aarohi Srivastava,
Abhinav Rastogi,
Abhishek Rao,
Abu Awal Md Shoeb,
Abubakar Abid,
Adam Fisch,
Adam R. Brown,
Adam Santoro,
Aditya Gupta,
Adrià Garriga-Alonso,
Agnieszka Kluska,
Aitor Lewkowycz,
Akshat Agarwal,
Alethea Power,
Alex Ray,
Alex Warstadt,
Alexander W. Kocurek,
Ali Safaya,
Ali Tazarv,
Alice Xiang,
Alicia Parrish,
Allen Nie,
Aman Hussain,
Amanda Askell,
Amanda Dsouza
, et al. (426 additional authors not shown)
Abstract:
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
Submitted 12 June, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
Data Augmentation MCMC for Bayesian Inference from Privatized Data
Authors:
Nianqiao Ju,
Jordan A. Awan,
Ruobin Gong,
Vinayak A. Rao
Abstract:
Differentially private mechanisms protect privacy by introducing additional randomness into the data. Restricting access to only the privatized data makes it challenging to perform valid statistical inference on parameters underlying the confidential data. Specifically, the likelihood function of the privatized data requires integrating over the large space of confidential databases and is typically intractable. For Bayesian analysis, this results in a posterior distribution that is doubly intractable, rendering traditional MCMC techniques inapplicable. We propose an MCMC framework to perform Bayesian inference from the privatized data, which is applicable to a wide range of statistical models and privacy mechanisms. Our MCMC algorithm augments the model parameters with the unobserved confidential data, and alternately updates each one conditional on the other. For the potentially challenging step of updating the confidential data, we propose a generic approach that exploits the privacy guarantee of the mechanism to ensure efficiency. We give results on the computational complexity, acceptance rate, and mixing properties of our MCMC. We illustrate the efficacy and applicability of our methods on a naïve-Bayes log-linear model as well as on a linear regression model.
Submitted 7 December, 2022; v1 submitted 1 June, 2022;
originally announced June 2022.
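The alternating update described in the abstract of "Data Augmentation MCMC for Bayesian Inference from Privatized Data" (model parameters given confidential data, then confidential data given parameters) can be illustrated on a toy problem. This is a sketch under stated assumptions, not the authors' implementation: the confidential records are assumed to be Bernoulli($\theta$) with a uniform prior, the released statistic is assumed to be the record sum plus Laplace noise of scale $1/\varepsilon$, and each record is assumed to be re-proposed from the model and accepted with the ratio of privacy-mechanism densities. The names `da_mcmc` and `laplace_logpdf` are hypothetical.

```python
import numpy as np

def laplace_logpdf(z, scale):
    """Log density of a zero-centered Laplace distribution."""
    return -np.log(2.0 * scale) - np.abs(z) / scale

def da_mcmc(s_dp, n, eps, n_iter=2000, seed=0):
    """Toy data-augmentation sampler for theta given a privatized count s_dp.

    Assumed setup: x_i ~ Bernoulli(theta), theta ~ Beta(1, 1),
    s_dp = sum(x) + Laplace(scale = 1/eps).
    """
    rng = np.random.default_rng(seed)
    scale = 1.0 / eps
    theta = 0.5
    x = rng.binomial(1, theta, size=n)  # initial guess at the confidential data
    total = int(x.sum())
    draws = np.empty(n_iter)
    for t in range(n_iter):
        # Step 1: update theta given the imputed confidential data (conjugate Beta).
        theta = rng.beta(1 + total, 1 + n - total)
        # Step 2: update each record given theta and the privatized output.
        for i in range(n):
            prop = rng.binomial(1, theta)  # proposal drawn from the data model
            new_total = total - x[i] + prop
            # Model terms cancel, leaving only the mechanism density ratio.
            log_ratio = (laplace_logpdf(s_dp - new_total, scale)
                         - laplace_logpdf(s_dp - total, scale))
            if np.log(rng.uniform()) < log_ratio:
                x[i] = prop
                total = new_total
        draws[t] = theta
    return draws

# Illustrative run: a noisy count of 30.4 released for n = 50 records, eps = 1.
draws_demo = da_mcmc(s_dp=30.4, n=50, eps=1.0, n_iter=300, seed=1)
```

Because changing one record moves the sum by at most 1, the log acceptance ratio in this toy setup is never below $-\varepsilon$, which is one way a privacy guarantee can keep such an update efficient.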
-
Online Balanced Experimental Design
Authors:
David Arbour,
Drew Dimmery,
Tung Mai,
Anup Rao
Abstract:
We consider the experimental design problem in an online environment, an important practical task for reducing the variance of estimates in randomized experiments which allows for greater precision, and in turn, improved decision making. In this work, we present algorithms that build on recent advances in online discrepancy minimization which accommodate both arbitrary treatment probabilities and multiple treatments. The proposed algorithms are computationally efficient, minimize covariate imbalance, and include randomization which enables robustness to misspecification. We provide worst-case bounds on the expected mean squared error of the causal estimate and show that the proposed estimator is no worse than an implicit ridge regression, which are within a logarithmic factor of the best known results for offline experimental design. We conclude with a detailed simulation study showing favorable results relative to complete randomization as well as to offline methods for experimental design with time complexities exceeding our algorithm.
Submitted 3 March, 2022;
originally announced March 2022.
-
Online MAP Inference and Learning for Nonsymmetric Determinantal Point Processes
Authors:
Aravind Reddy,
Ryan A. Rossi,
Zhao Song,
Anup Rao,
Tung Mai,
Nedim Lipka,
Gang Wu,
Eunyee Koh,
Nesreen Ahmed
Abstract:
In this paper, we introduce the online and streaming MAP inference and learning problems for Non-symmetric Determinantal Point Processes (NDPPs) where data points arrive in an arbitrary order and the algorithms are constrained to use a single-pass over the data as well as sub-linear memory. The online setting has an additional requirement of maintaining a valid solution at any point in time. For solving these new problems, we propose algorithms with theoretical guarantees, evaluate them on several real-world datasets, and show that they give comparable performance to state-of-the-art offline algorithms that store the entire data in memory and take multiple passes over it.
Submitted 29 November, 2021;
originally announced November 2021.
-
Modern Non-Linear Function-on-Function Regression
Authors:
Aniruddha Rajendra Rao,
Matthew Reimherr
Abstract:
We introduce a new class of non-linear function-on-function regression models for functional data using neural networks. We propose a framework using a hidden layer consisting of continuous neurons, called a continuous hidden layer, for functional response modeling and give two model fitting strategies, Functional Direct Neural Network (FDNN) and Functional Basis Neural Network (FBNN). Both are designed explicitly to exploit the structure inherent in functional data and capture the complex relations existing between the functional predictors and the functional response. We fit these models by deriving functional gradients and implement regularization techniques for more parsimonious results. We demonstrate the power and flexibility of our proposed method in handling complex functional models through extensive simulation studies as well as real data examples.
Submitted 7 October, 2023; v1 submitted 29 July, 2021;
originally announced July 2021.
-
Bayesian Joint Chance Constrained Optimization: Approximations and Statistical Consistency
Authors:
Prateek Jaiswal,
Harsha Honnappa,
Vinayak A. Rao
Abstract:
This paper considers data-driven chance-constrained stochastic optimization problems in a Bayesian framework. Bayesian posteriors afford a principled mechanism to incorporate data and prior knowledge into stochastic optimization problems. However, the computation of Bayesian posteriors is typically an intractable problem, and has spawned a large literature on approximate Bayesian computation. Here, in the context of chance-constrained optimization, we focus on the question of statistical consistency (in an appropriate sense) of the optimal value, computed using an approximate posterior distribution. To this end, we rigorously prove a frequentist consistency result demonstrating the convergence of the optimal value to the optimal value of a fixed, parameterized constrained optimization problem. We augment this by also establishing a probabilistic rate of convergence of the optimal value. We also prove the convex feasibility of the approximate Bayesian stochastic optimization problem. Finally, we demonstrate the utility of our approach on an optimal staffing problem for an M/M/c queueing model.
Submitted 30 September, 2022; v1 submitted 23 June, 2021;
originally announced June 2021.
-
Tumor Radiogenomics with Bayesian Layered Variable Selection
Authors:
Shariq Mohammed,
Sebastian Kurtek,
Karthik Bharath,
Arvind Rao,
Veerabhadran Baladandayuthapani
Abstract:
We propose a statistical framework to integrate radiological magnetic resonance imaging (MRI) and genomic data to identify the underlying radiogenomic associations in lower grade gliomas (LGG). We devise a novel imaging phenotype by dividing the tumor region into concentric spherical layers that mimics the tumor evolution process. MRI data within each layer is represented by voxel-intensity-based probability density functions which capture the complete information about tumor heterogeneity. Under a Riemannian-geometric framework these densities are mapped to a vector of principal component scores which act as imaging phenotypes. Subsequently, we build Bayesian variable selection models for each layer with the imaging phenotypes as the response and the genomic markers as predictors. Our novel hierarchical prior formulation incorporates the interior-to-exterior structure of the layers, and the correlation between the genomic markers. We employ a computationally efficient Expectation-Maximization-based strategy for estimation. Simulation studies demonstrate the superior performance of our approach compared to other approaches. With a focus on the cancer driver genes in LGG, we discuss some biologically relevant findings. Genes implicated in survival and oncogenesis are identified as being associated with the spherical layers, which could potentially serve as early-stage diagnostic markers for disease monitoring, prior to routine invasive approaches.
Submitted 21 June, 2021;
originally announced June 2021.
-
Honeyboost: Boosting honeypot performance with data fusion and anomaly detection
Authors:
Sevvandi Kandanaarachchi,
Hideya Ochiai,
Asha Rao
Abstract:
With cyber incidents and data breaches becoming increasingly common, being able to predict a cyberattack has never been more crucial. The ability of Network Anomaly Detection Systems (NADS) to identify unusual behavior makes them useful in predicting such attacks. However, NADS often suffer from high false positive rates. In this paper, we introduce a novel framework called Honeyboost that enhances the performance of honeypot-aided NADS. Using data from the LAN Security Monitoring Project, Honeyboost identifies the most anomalous nodes before they access the honeypot, aiding early detection and prediction. Furthermore, using extreme value theory, we achieve the highly desirable low false positive rates.
Honeyboost is an unsupervised method comprising two approaches: horizontal and vertical. The horizontal approach constructs a time series from the communications of each node, with node-level features encapsulating their behavior over time. The vertical approach finds anomalies in each protocol space. Using a window-based model, which is typically used in online scenarios, the horizontal and vertical approaches are combined to identify anomalies and gain useful insights. Experimental results indicate the efficacy of our framework in identifying suspicious activities of nodes.
Submitted 7 September, 2021; v1 submitted 6 May, 2021;
originally announced May 2021.
-
Non-linear Functional Modeling using Neural Networks
Authors:
Aniruddha Rajendra Rao,
Matthew Reimherr
Abstract:
We introduce a new class of non-linear models for functional data based on neural networks. Deep learning has been very successful in non-linear modeling, but there has been little work done in the functional data setting. We propose two variations of our framework: a functional neural network with continuous hidden layers, called the Functional Direct Neural Network (FDNN), and a second version that utilizes basis expansions and continuous hidden layers, called the Functional Basis Neural Network (FBNN). Both are designed explicitly to exploit the structure inherent in functional data. To fit these models we derive a functional gradient based optimization algorithm. The effectiveness of the proposed methods in handling complex functional models is demonstrated by comprehensive simulation studies and real data examples.
△ Less
Submitted 3 May, 2023; v1 submitted 19 April, 2021;
originally announced April 2021.
-
RADIOHEAD: Radiogenomic Analysis Incorporating Tumor Heterogeneity in Imaging Through Densities
Authors:
Shariq Mohammed,
Karthik Bharath,
Sebastian Kurtek,
Arvind Rao,
Veerabhadran Baladandayuthapani
Abstract:
Recent technological advancements have enabled detailed investigation of associations between the molecular architecture and tumor heterogeneity, through multi-source integration of radiological imaging and genomic (radiogenomic) data. In this paper, we integrate and harness radiogenomic data in patients with lower grade gliomas (LGG), a type of brain cancer, in order to develop a regression framework called RADIOHEAD (RADIOgenomic analysis incorporating tumor HEterogeneity in imAging through Densities) to identify radiogenomic associations. Imaging data is represented through voxel intensity probability density functions of tumor sub-regions obtained from multimodal magnetic resonance imaging, and genomic data through molecular signatures in the form of pathway enrichment scores corresponding to their gene expression profiles. Employing a Riemannian-geometric framework for principal component analysis on the set of probability density functions, we map each probability density to a vector of principal component scores, which are then included as predictors in a Bayesian regression model with the pathway enrichment scores as the response. Variable selection compatible with the grouping structure amongst the predictors induced through the tumor sub-regions is carried out under a group spike-and-slab prior. A Bayesian false discovery rate mechanism is then used to infer significant associations based on the posterior distribution of the regression coefficients. Our analyses reveal that several pathways relevant to LGG etiology (such as synaptic transmission, nerve impulse and neurotransmitter pathways) have significant associations with the corresponding imaging-based predictors.
Submitted 7 April, 2021; v1 submitted 1 April, 2021;
originally announced April 2021.
-
Asymptotics of Ridge Regression in Convolutional Models
Authors:
Mojtaba Sahraee-Ardakan,
Tung Mai,
Anup Rao,
Ryan Rossi,
Sundeep Rangan,
Alyson K. Fletcher
Abstract:
Understanding generalization and estimation error of estimators for simple models such as linear and generalized linear models has attracted a lot of attention recently. This is in part due to an interesting observation made in the machine learning community that highly over-parameterized neural networks achieve zero training error, and yet they are able to generalize well over the test samples. This phenomenon is captured by the so-called double descent curve, where the generalization error starts decreasing again after the interpolation threshold. A series of recent works tried to explain this phenomenon for simple models. In this work, we analyze the asymptotics of estimation error in ridge estimators for convolutional linear models. These convolutional inverse problems, also known as deconvolution, naturally arise in different fields such as seismology, imaging, and acoustics among others. Our results hold for a large class of input distributions that include i.i.d. features as a special case. We derive exact formulae for estimation error of ridge estimators that hold in a certain high-dimensional regime. We show the double descent phenomenon in our experiments for convolutional models and show that our theoretical results match the experiments.
Submitted 8 March, 2021;
originally announced March 2021.
-
Machine Unlearning via Algorithmic Stability
Authors:
Enayat Ullah,
Tung Mai,
Anup Rao,
Ryan Rossi,
Raman Arora
Abstract:
We study the problem of machine unlearning and identify a notion of algorithmic stability, Total Variation (TV) stability, which, we argue, is suitable for the goal of exact unlearning. For convex risk minimization problems, we design TV-stable algorithms based on noisy Stochastic Gradient Descent (SGD). Our key contribution is the design of corresponding efficient unlearning algorithms, which are based on constructing a (maximal) coupling of Markov chains for the noisy SGD procedure. To understand the trade-offs between accuracy and unlearning efficiency, we give upper and lower bounds on excess empirical and population risk of TV-stable algorithms for convex risk minimization. Our techniques generalize to arbitrary non-convex functions, and our algorithms are differentially private as well.
Submitted 25 February, 2021;
originally announced February 2021.
-
Fundamental Tradeoffs in Distributionally Adversarial Training
Authors:
Mohammad Mehrabi,
Adel Javanmard,
Ryan A. Rossi,
Anup Rao,
Tung Mai
Abstract:
Adversarial training is among the most effective techniques to improve the robustness of models against adversarial perturbations. However, the full effect of this approach on models is not well understood. For example, while adversarial training can reduce the adversarial risk (prediction error against an adversary), it sometimes increases standard risk (generalization error when there is no adversary). Moreover, such behavior is impacted by various elements of the learning problem, including the size and quality of training data, specific forms of adversarial perturbations in the input, model overparameterization, and the adversary's power, among others. In this paper, we focus on the distribution perturbing adversary framework wherein the adversary can change the test distribution within a neighborhood of the training data distribution. The neighborhood is defined via Wasserstein distance between distributions and the radius of the neighborhood is a measure of the adversary's manipulative power. We study the tradeoff between standard risk and adversarial risk and derive the Pareto-optimal tradeoff, achievable over specific classes of models, in the infinite data limit with features dimension kept fixed. We consider three learning settings: 1) Regression with the class of linear models; 2) Binary classification under the Gaussian mixtures data model, with the class of linear classifiers; 3) Regression with the class of random features model (which can be equivalently represented as a two-layer neural network with random first-layer weights). We show that a tradeoff between standard and adversarial risk is manifested in all three settings. We further characterize the Pareto-optimal tradeoff curves and discuss how a variety of factors, such as features correlation, the adversary's power or the width of the two-layer neural network, would affect this tradeoff.
Submitted 15 January, 2021;
originally announced January 2021.
-
Modern Multiple Imputation with Functional Data
Authors:
Aniruddha Rajendra Rao,
Matthew Reimherr
Abstract:
This work considers the problem of fitting functional models with sparsely and irregularly sampled functional data. It overcomes the limitations of the state-of-the-art methods, which face major challenges in the fitting of more complex non-linear models. Currently, many of these models cannot be consistently estimated unless the number of observed points per curve grows sufficiently quickly with the sample size, whereas we show numerically that a modified approach with more modern multiple imputation methods can produce better estimates in general. We also propose a new imputation approach that combines the ideas of MissForest with Local Linear Forest and compare their performance with PACE and several other multivariate multiple imputation methods. This work is motivated by a longitudinal study on smoking cessation, in which the Electronic Health Records (EHR) from Penn State PaTH to Health allow for the collection of a great deal of data, with highly variable sampling. To illustrate our approach, we explore the relation between relapse and diastolic blood pressure. We also consider a variety of simulation schemes with varying levels of sparsity to validate our methods.
Submitted 24 November, 2020;
originally announced November 2020.
-
A Non-linear Function-on-Function Model for Regression with Time Series Data
Authors:
Qiyao Wang,
Haiyan Wang,
Chetan Gupta,
Aniruddha Rajendra Rao,
Hamed Khorasgani
Abstract:
In the last few decades, building regression models for non-scalar variables, including time series, text, image, and video, has attracted increasing interest from researchers in the data analytics community. In this paper, we focus on a multivariate time series regression problem. Specifically, we aim to learn mathematical mappings from multiple chronologically measured numerical variables within a certain time interval S to multiple numerical variables of interest over time interval T. Prior art, including the multivariate regression model, the Seq2Seq model, and the functional linear models, suffers from several limitations. The first two types of models can only handle regularly observed time series. Besides, the conventional multivariate regression models tend to be biased and inefficient, as they are incapable of encoding the temporal dependencies among observations from the same time series. The sequential learning models explicitly use the same set of parameters along time, which has a negative impact on accuracy. The function-on-function linear model in functional data analysis (a branch of statistics) is insufficient to capture complex correlations among the considered time series and easily suffers from underfitting. In this paper, we propose a general functional mapping that embraces the function-on-function linear model as a special case. We then propose a non-linear function-on-function model using the fully connected neural network to learn the mapping from data, which addresses the aforementioned concerns in the existing approaches. For the proposed model, we describe in detail the corresponding numerical implementation procedures. The effectiveness of the proposed model is demonstrated through the application to two real-world problems.
Submitted 24 November, 2020;
originally announced November 2020.
-
Efficient Balanced Treatment Assignments for Experimentation
Authors:
David Arbour,
Drew Dimmery,
Anup Rao
Abstract:
In this work, we reframe the problem of balanced treatment assignment as optimization of a two-sample test between test and control units. Using this lens we provide an assignment algorithm that is optimal with respect to the minimum spanning tree test of Friedman and Rafsky (1979). This assignment to treatment groups may be performed exactly in polynomial time. We provide a probabilistic interpretation of this process in terms of the most probable element of designs drawn from a determinantal point process which admits a probabilistic interpretation of the design. We provide a novel formulation of estimation as transductive inference and show how the tree structures used in design can also be used in an adjustment estimator. We conclude with a simulation study demonstrating the improved efficacy of our method.
Submitted 21 October, 2020;
originally announced October 2020.
-
Graph Neural Networks with Heterophily
Authors:
Jiong Zhu,
Ryan A. Rossi,
Anup Rao,
Tung Mai,
Nedim Lipka,
Nesreen K. Ahmed,
Danai Koutra
Abstract:
Graph Neural Networks (GNNs) have proven to be useful for many different practical applications. However, many existing GNN models have implicitly assumed homophily among the nodes connected in the graph, and therefore have largely overlooked the important setting of heterophily, where most connected nodes are from different classes. In this work, we propose a novel framework called CPGNN that generalizes GNNs for graphs with either homophily or heterophily. The proposed framework incorporates an interpretable compatibility matrix for modeling the heterophily or homophily level in the graph, which can be learned in an end-to-end fashion, enabling it to go beyond the assumption of strong homophily. Theoretically, we show that replacing the compatibility matrix in our framework with the identity (which represents pure homophily) reduces to GCN. Our extensive experiments demonstrate the effectiveness of our approach in more realistic and challenging experimental settings with significantly less training data compared to previous works: CPGNN variants achieve state-of-the-art results in heterophily settings with or without contextual node features, while maintaining comparable performance in homophily settings.
Submitted 14 June, 2021; v1 submitted 28 September, 2020;
originally announced September 2020.
-
Spatio-Temporal Functional Neural Networks
Authors:
Aniruddha Rajendra Rao,
Qiyao Wang,
Haiyan Wang,
Hamed Khorasgani,
Chetan Gupta
Abstract:
Explosive growth in spatio-temporal data and its wide range of applications have attracted increasing interest from researchers in the statistical and machine learning fields. The spatio-temporal regression problem is of paramount importance from both the methodology development and real-world application perspectives. Given the observed spatially encoded time series covariates and real-valued response data samples, the goal of spatio-temporal regression is to leverage the temporal and spatial dependencies to build a mapping from covariates to response with minimized prediction error. Prior art, including the convolutional Long Short-Term Memory (CovLSTM) and variations of the functional linear models, cannot learn the spatio-temporal information in a simple and efficient format for proper model building. In this work, we propose two novel extensions of the Functional Neural Network (FNN), a temporal regression model whose effectiveness and superior performance over alternative sequential models have been proven by many researchers. The effectiveness of the proposed spatio-temporal FNNs in handling varying spatial correlations is demonstrated in comprehensive simulation studies. The proposed models are then deployed to solve a practical and challenging precipitation prediction problem in the meteorology field.
Submitted 11 September, 2020;
originally announced September 2020.
-
Designing Transportable Experiments
Authors:
My Phan,
David Arbour,
Drew Dimmery,
Anup B. Rao
Abstract:
We consider the problem of designing a randomized experiment on a source population to estimate the Average Treatment Effect (ATE) on a target population. We propose a novel approach which explicitly considers the target when designing the experiment on the source. Under the covariate shift assumption, we design an unbiased importance-weighted estimator for the target population's ATE. To reduce the variance of our estimator, we design a covariate balance condition (Target Balance) between the treatment and control groups based on the target population. We show that Target Balance achieves a higher variance reduction asymptotically than methods that do not consider the target population during the design phase. Our experiments illustrate that Target Balance reduces the variance even for small sample sizes.
Submitted 4 September, 2021; v1 submitted 8 September, 2020;
originally announced September 2020.
-
Sample Efficient Graph-Based Optimization with Noisy Observations
Authors:
Tan Nguyen,
Ali Shameli,
Yasin Abbasi-Yadkori,
Anup Rao,
Branislav Kveton
Abstract:
We study sample complexity of optimizing "hill-climbing friendly" functions defined on a graph under noisy observations. We define a notion of convexity, and we show that a variant of best-arm identification can find a near-optimal solution after a small number of queries that is independent of the size of the graph. For functions that have local minima and are nearly convex, we show a sample complexity for the classical simulated annealing under noisy observations. We show effectiveness of the greedy algorithm with restarts and the simulated annealing on problems of graph-based nearest neighbor classification as well as a web document re-ranking application.
Submitted 4 June, 2020;
originally announced June 2020.
-
Integrative Bayesian models using Post-selective Inference: a case study in Radiogenomics
Authors:
Snigdha Panigrahi,
Shariq Mohammed,
Arvind Rao,
Veerabhadran Baladandayuthapani
Abstract:
Integrative analyses based on statistically relevant associations between genomics and a wealth of intermediary phenotypes (such as imaging) provide vital insights into their clinical relevance in terms of the disease mechanisms. Estimates for uncertainty in the resulting integrative models are however unreliable unless inference accounts for the selection of these associations with accuracy. In this article, we develop selection-aware Bayesian methods which: (i) counteract the impact of model selection bias through a "selection-aware posterior" in a flexible class of integrative Bayesian models post a selection of promising variables via $\ell_1$-regularized algorithms; (ii) strike an inevitable tradeoff between the quality of model selection and inferential power when the same dataset is used for both selection and uncertainty estimation. Central to our methodological development, a carefully constructed conditional likelihood function deployed with a reparameterization mapping provides notably tractable updates when gradient-based MCMC sampling is used for estimating uncertainties from the selection-aware posterior. Applying our methods to a radiogenomic analysis, we successfully recover several important gene pathways and estimate uncertainties for their associations with patient survival times.
Submitted 12 August, 2022; v1 submitted 24 April, 2020;
originally announced April 2020.
-
Model Selection in Contextual Stochastic Bandit Problems
Authors:
Aldo Pacchiano,
My Phan,
Yasin Abbasi-Yadkori,
Anup Rao,
Julian Zimmert,
Tor Lattimore,
Csaba Szepesvari
Abstract:
We study bandit model selection in stochastic environments. Our approach relies on a meta-algorithm that selects between candidate base algorithms. We develop a meta-algorithm-base algorithm abstraction that can work with general classes of base algorithms and different types of adversarial meta-algorithms. Our methods rely on a novel and generic smoothing transformation for bandit algorithms that permits us to obtain optimal $O(\sqrt{T})$ model selection guarantees for stochastic contextual bandit problems as long as the optimal base algorithm satisfies a high probability regret guarantee. We show through a lower bound that even when one of the base algorithms has $O(\log T)$ regret, in general it is impossible to get better than $\Omega(\sqrt{T})$ regret in model selection, even asymptotically. Using our techniques, we address model selection in a variety of problems such as misspecified linear contextual bandits, linear bandits with unknown dimension and reinforcement learning with unknown feature maps. Our algorithm requires knowledge of the optimal base regret to adjust the meta-algorithm learning rate. We show that without such prior knowledge any meta-algorithm can suffer a regret larger than the optimal base regret.
Submitted 4 December, 2022; v1 submitted 3 March, 2020;
originally announced March 2020.
-
Gaussian Process Policy Optimization
Authors:
Ashish Rao,
Bidipta Sarkar,
Tejas Narayanan
Abstract:
We propose a novel actor-critic, model-free reinforcement learning algorithm which employs a Bayesian method of parameter space exploration to solve environments. A Gaussian process is used to learn the expected return of a policy given the policy's parameters. The system is trained by updating the parameters using gradient descent on a new surrogate loss function consisting of the Proximal Policy Optimization 'Clipped' loss function and a bonus term representing the expected improvement acquisition function given by the Gaussian process. This new method is shown to be comparable to, and at times to empirically outperform, current algorithms on environments that simulate robotic locomotion using the MuJoCo physics engine.
Submitted 2 March, 2020;
originally announced March 2020.
-
Variational Bayesian Methods for Stochastically Constrained System Design Problems
Authors:
Prateek Jaiswal,
Harsha Honnappa,
Vinayak A. Rao
Abstract:
We study system design problems stated as parameterized stochastic programs with a chance-constraint set. We adopt a Bayesian approach that requires the computation of a posterior predictive integral which is usually intractable. In addition, for the problem to be a well-defined convex program, we must retain the convexity of the feasible set. Consequently, we propose a variational Bayes-based method to approximately compute the posterior predictive integral that ensures tractability and retains the convexity of the feasible set. Under certain regularity conditions, we also show that the solution set obtained using variational Bayes converges to the true solution set as the number of observations tends to infinity. We also provide bounds on the probability of qualifying a true infeasible point (with respect to the true constraints) as feasible under the VB approximation for a given number of samples.
Submitted 6 January, 2020;
originally announced January 2020.
-
Asymptotic Consistency of Loss-Calibrated Variational Bayes
Authors:
Prateek Jaiswal,
Harsha Honnappa,
Vinayak A. Rao
Abstract:
This paper establishes the asymptotic consistency of the loss-calibrated variational Bayes (LCVB) method. LCVB was proposed in \cite{LaSiGh2011} as a method for approximately computing Bayesian posteriors in a 'loss aware' manner. This methodology is also highly relevant in general data-driven decision-making contexts. Here, we not only establish the asymptotic consistency of the calibrated approximate posterior, but also the asymptotic consistency of decision rules. We also establish the asymptotic consistency of decision rules obtained from a 'naive' variational Bayesian procedure.
Submitted 4 November, 2019;
originally announced November 2019.
-
Higher-Order Ranking and Link Prediction: From Closing Triangles to Closing Higher-Order Motifs
Authors:
Ryan A. Rossi,
Anup Rao,
Sungchul Kim,
Eunyee Koh,
Nesreen K. Ahmed,
Gang Wu
Abstract:
In this paper, we introduce the notion of motif closure and describe higher-order ranking and link prediction methods based on the notion of closing higher-order network motifs. The methods are fast and efficient for real-time ranking and link prediction-based applications such as web search, online advertising, and recommendation. In such applications, real-time performance is critical. The proposed methods do not require any explicit training data, nor do they derive an embedding from the graph data or perform any explicit learning. Existing methods with the above desired properties are all based on closing triangles (common neighbors, Jaccard similarity, and the like). In this work, we investigate higher-order network motifs and develop techniques based on the notion of closing higher-order motifs that move beyond closing simple triangles. All methods described in this work are fast, with a runtime that is sublinear in the number of nodes. The experimental results indicate the importance of closing higher-order motifs for ranking and link prediction applications. Finally, the proposed notion of higher-order motif closure can serve as a basis for studying and developing better ranking and link prediction methods.
Submitted 12 June, 2019;
originally announced June 2019.
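For readers unfamiliar with the triangle-closing baselines named in the abstract above, the following is a minimal sketch of common-neighbors and Jaccard scoring for a candidate link. It is not the paper's higher-order method; the toy graph and the function names are hypothetical and chosen only for illustration.

```python
def common_neighbors(adj, u, v):
    """Number of triangles that adding the edge (u, v) would close."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


# Hypothetical toy graph, stored as node -> set of neighbors.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}

# Score the missing link between "a" and "d".
score_cn = common_neighbors(adj, "a", "d")
score_jac = jaccard(adj, "a", "d")
```

Both scores need only the two neighbor sets of the candidate pair, which is one reason such measures suit real-time ranking.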
-
Asymptotic Consistency of $α$-Rényi-Approximate Posteriors
Authors:
Prateek Jaiswal,
Vinayak A. Rao,
Harsha Honnappa
Abstract:
We study the asymptotic consistency properties of $α$-Rényi approximate posteriors, a class of variational Bayesian methods that approximate an intractable Bayesian posterior with a member of a tractable family of distributions, the member chosen to minimize the $α$-Rényi divergence from the true posterior. Unique to our work is that we consider settings with $α > 1$, resulting in approximations that upper-bound the log-likelihood, and consequently have wider spread than traditional variational approaches that minimize the Kullback-Leibler (KL) divergence from the posterior. Our primary result identifies sufficient conditions under which consistency holds, centering around the existence of a 'good' sequence of distributions in the approximating family that possesses, among other properties, the right rate of convergence to a limit distribution. We further characterize the good sequence by demonstrating that a sequence of distributions that converges too quickly cannot be a good sequence. We also extend our analysis to the setting where $α$ equals one, corresponding to the minimizer of the reverse KL divergence, and to models with local latent variables. We also illustrate the existence of good sequences with a number of examples. Our results complement a growing body of work focused on the frequentist properties of variational Bayesian methods.
Submitted 14 August, 2020; v1 submitted 5 February, 2019;
originally announced February 2019.
-
Regression Analyses of Distributions using Quantile Functional Regression
Authors:
Hojin Yang,
Veerabhadran Baladandayuthapani,
Arvind U. K. Rao,
Jeffrey S. Morris
Abstract:
Radiomics involves the study of tumor images to identify quantitative markers explaining cancer heterogeneity. The predominant approach is to extract hundreds to thousands of image features, including histogram features comprised of summaries of the marginal distribution of pixel intensities, which leads to multiple testing problems and can miss out on insights not contained in the selected features. In this paper, we present methods to model the entire marginal distribution of pixel intensities via the quantile function as functional data, regressed on a set of demographic, clinical, and genetic predictors. We call this approach quantile functional regression, regressing subject-specific marginal distributions across repeated measurements on a set of covariates, allowing us to assess which covariates are associated with the distribution in a global sense, as well as to identify distributional features characterizing these differences, including mean, variance, skewness, and various upper and lower quantiles. To account for smoothness in the quantile functions, we introduce custom basis functions we call quantlets that are sparse, regularized, near-lossless, and empirically defined, adapting to the features of a given data set. We fit this model using a Bayesian framework that uses nonlinear shrinkage of quantlet coefficients to regularize the functional regression coefficients and provides fully Bayesian inference after fitting via Markov chain Monte Carlo. We demonstrate the benefit of the basis space modeling through simulation studies, and apply the method to a magnetic resonance imaging (MRI)-based radiomic dataset from glioblastoma multiforme to relate imaging-based quantile functions to demographic, clinical, and genetic predictors, finding specific differences in tumor pixel intensity distribution between males and females and between tumors with and without DDIT3 mutations.
Submitted 4 October, 2018;
originally announced October 2018.
-
New Insights into Bootstrapping for Bandits
Authors:
Sharan Vaswani,
Branislav Kveton,
Zheng Wen,
Anup Rao,
Mark Schmidt,
Yasin Abbasi-Yadkori
Abstract:
We investigate the use of bootstrapping in the bandit setting. We first show that the commonly used non-parametric bootstrapping (NPB) procedure can be provably inefficient and establish a near-linear lower bound on the regret incurred by it under the bandit model with Bernoulli rewards. We show that NPB with an appropriate amount of forced exploration can result in sub-linear albeit sub-optimal regret. As an alternative to NPB, we propose a weighted bootstrapping (WB) procedure. For Bernoulli rewards, WB with multiplicative exponential weights is mathematically equivalent to Thompson sampling (TS) and results in near-optimal regret bounds. Similarly, in the bandit setting with Gaussian rewards, we show that WB with additive Gaussian weights achieves near-optimal regret. Beyond these special cases, we show that WB leads to better empirical performance than TS for several reward distributions bounded on $[0,1]$. For the contextual bandit setting, we give practical guidelines that make bootstrapping simple and efficient to implement and result in good empirical performance on real-world datasets.
Submitted 24 May, 2018;
originally announced May 2018.
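As a rough illustration of the non-parametric bootstrapping idea mentioned in the abstract above, the sketch below resamples each arm's reward history with replacement and pulls the arm with the largest resampled mean. It is a simplified assumption about how such a procedure can look, not the exact NPB or WB algorithm analysed in the paper; `npb_choose_arm`, `run`, and the pull-each-arm-once rule are hypothetical choices.

```python
import random


def npb_choose_arm(histories, rng):
    """Pick the arm whose bootstrap-resampled mean reward is largest."""
    best_arm, best_mean = None, float("-inf")
    for arm, rewards in enumerate(histories):
        if not rewards:
            return arm  # assumption: pull every arm once before bootstrapping
        # Resample the arm's own history with replacement.
        sample = [rng.choice(rewards) for _ in rewards]
        mean = sum(sample) / len(sample)
        if mean > best_mean:
            best_arm, best_mean = arm, mean
    return best_arm


def run(true_means, steps, seed=0):
    """Simulate a Bernoulli bandit and return the per-arm reward histories."""
    rng = random.Random(seed)
    histories = [[] for _ in true_means]
    for _ in range(steps):
        arm = npb_choose_arm(histories, rng)
        reward = 1 if rng.random() < true_means[arm] else 0
        histories[arm].append(reward)
    return histories


histories = run([0.2, 0.8], steps=200)
```

In this sketch, an arm whose history contains only zeros always resamples to a mean of zero, so its bootstrap estimate never rises above that of any other arm.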
-
HONE: Higher-Order Network Embeddings
Authors:
Ryan A. Rossi,
Nesreen K. Ahmed,
Eunyee Koh,
Sungchul Kim,
Anup Rao,
Yasin Abbasi Yadkori
Abstract:
This paper describes a general framework for learning Higher-Order Network Embeddings (HONE) from graph data based on network motifs. The HONE framework is highly expressive and flexible with many interchangeable components. The experimental results demonstrate the effectiveness of learning higher-order network representations. In all cases, HONE outperforms recent embedding methods that are unable to capture higher-order structures with a mean relative gain in AUC of $19\%$ (and up to $75\%$ gain) across a wide variety of networks and embedding methods.
Submitted 29 May, 2018; v1 submitted 28 January, 2018;
originally announced January 2018.
-
Stochastic Low-Rank Bandits
Authors:
Branislav Kveton,
Csaba Szepesvari,
Anup Rao,
Zheng Wen,
Yasin Abbasi-Yadkori,
S. Muthukrishnan
Abstract:
Many problems in computer vision and recommender systems involve low-rank matrices. In this work, we study the problem of finding the maximum entry of a stochastic low-rank matrix from sequential observations. At each step, a learning agent chooses pairs of row and column arms, and receives the noisy product of their latent values as a reward. The main challenge is that the latent values are unobserved. We identify a class of non-negative matrices whose maximum entry can be found statistically efficiently and propose an algorithm for finding them, which we call LowRankElim. We derive an $O((K + L) \operatorname{poly}(d) Δ^{-1} \log n)$ upper bound on its $n$-step regret, where $K$ is the number of rows, $L$ is the number of columns, $d$ is the rank of the matrix, and $Δ$ is the minimum gap. The bound depends on other problem-specific constants that clearly do not depend on $K L$. To the best of our knowledge, this is the first such result in the literature.
Submitted 13 December, 2017;
originally announced December 2017.
-
Radiologic Image-based Statistical Shape Analysis of Brain Tumors
Authors:
Karthik Bharath,
Sebastian Kurtek,
Arvind Rao,
Veerabhadran Baladandayuthapani
Abstract:
We propose a curve-based Riemannian-geometric approach for general shape-based statistical analyses of tumors obtained from radiologic images. A key component of the framework is a suitable metric that (1) enables comparisons of tumor shapes, (2) provides tools for computing descriptive statistics and implementing principal component analysis on the space of tumor shapes, and (3) allows for a rich class of continuous deformations of a tumor shape. The utility of the framework is illustrated through specific statistical tasks on a dataset of radiologic images of patients diagnosed with glioblastoma multiforme, a malignant brain tumor with poor prognosis. In particular, our analysis discovers two patient clusters with very different survival, subtype and genomic characteristics. Furthermore, it is demonstrated that adding tumor shape information into survival models containing clinical and genomic variables results in a significant increase in predictive power.
Submitted 3 February, 2017;
originally announced February 2017.
-
Going off the Grid: Iterative Model Selection for Biclustered Matrix Completion
Authors:
Eric Chi,
Liuiyi Hu,
Arvind K. Saibaba,
Arvind U. K. Rao
Abstract:
We consider the problem of performing matrix completion with side information on row-by-row and column-by-column similarities. We build upon recent proposals for matrix estimation with smoothness constraints with respect to row and column graphs. We present a novel iterative procedure for directly minimizing an information criterion in order to select an appropriate amount of row and column smoothing, namely to perform model selection. We also discuss how to exploit the special structure of the problem to scale up the estimation and model selection procedure via the Hutchinson estimator. We present simulation results and an application to predicting associations in imaging-genomics studies.
Submitted 19 October, 2016; v1 submitted 17 October, 2016;
originally announced October 2016.
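The Hutchinson estimator referenced in the abstract above approximates the trace of a matrix from matrix-vector products alone, using the identity $E[z^T A z] = \operatorname{tr}(A)$ for random sign vectors $z$. The sketch below shows the generic estimator only; how the paper applies it to its information criterion is not reproduced here, and the matrix `A` and helper names are hypothetical.

```python
import random


def hutchinson_trace(matvec, n, num_probes, rng):
    """Estimate tr(A) using only products A @ z with random sign vectors z."""
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # Rademacher probe
        az = matvec(z)
        total += sum(zi * ai for zi, ai in zip(z, az))  # z^T A z
    return total / num_probes


# Hypothetical example: a small symmetric matrix accessed only through matvec.
A = [[2.0, 0.5, 0.0],
     [0.5, 3.0, 0.1],
     [0.0, 0.1, 4.0]]


def matvec(z):
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]


estimate = hutchinson_trace(matvec, n=3, num_probes=2000, rng=random.Random(0))
```

The estimate fluctuates around the true trace of 9.0, with noise that shrinks as `num_probes` grows; for a diagonal matrix every probe returns the exact trace because each squared entry of `z` equals one.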
-
Agnostic Estimation of Mean and Covariance
Authors:
Kevin A. Lai,
Anup B. Rao,
Santosh Vempala
Abstract:
We consider the problem of estimating the mean and covariance of a distribution from iid samples in $\mathbb{R}^n$, in the presence of an $η$ fraction of malicious noise; this is in contrast to much recent work where the noise itself is assumed to be from a distribution of known type. The agnostic problem includes many interesting special cases, e.g., learning the parameters of a single Gaussian (or finding the best-fit Gaussian) when an $η$ fraction of the data is adversarially corrupted, agnostically learning a mixture of Gaussians, agnostic ICA, etc. We present polynomial-time algorithms to estimate the mean and covariance with error guarantees in terms of information-theoretic lower bounds. As a corollary, we also obtain an agnostic algorithm for Singular Value Decomposition.
Submitted 14 August, 2016; v1 submitted 23 April, 2016;
originally announced April 2016.
-
Inferring network structure in non-normal and mixed discrete-continuous genomic data
Authors:
Anindya Bhadra,
Arvind Rao,
Veerabhadran Baladandayuthapani
Abstract:
Inferring dependence structure through undirected graphs is crucial for uncovering the major modes of multivariate interaction among high-dimensional genomic markers that are potentially associated with cancer. Traditionally, conditional independence has been studied using sparse Gaussian graphical models for continuous data and sparse Ising models for discrete data. However, there are two clear situations when these approaches are inadequate. The first occurs when the data are continuous but display non-normal marginal behavior such as heavy tails or skewness, rendering an assumption of normality inappropriate. The second occurs when a part of the data is ordinal or discrete (e.g., presence or absence of a mutation) and the other part is continuous (e.g., expression levels of genes or proteins). In this case, the existing Bayesian approaches typically employ a latent variable framework for the discrete part that precludes inferring conditional independence among the data that are actually observed. The current article overcomes these two challenges in a unified framework using Gaussian scale mixtures. Our framework is able to handle continuous data that are not normal and data that are of mixed continuous and discrete nature, while still being able to infer a sparse conditional sign independence structure among the observed data. Extensive performance comparison in simulations with alternative techniques and an analysis of a real cancer genomics data set demonstrate the effectiveness of the proposed approach.
Submitted 1 April, 2016;
originally announced April 2016.
-
Statistical Tests for Large Tree-structured Data
Authors:
Karthik Bharath,
Prabhanjan Kambadur,
Dipak K. Dey,
Arvind Rao,
Veerabhadran Baladandayuthapani
Abstract:
We develop a general statistical framework for the analysis and inference of large tree-structured data, with a focus on developing asymptotic goodness-of-fit tests. We first propose a consistent statistical model for binary trees, from which we develop a class of invariant tests. Using the model for binary trees, we then construct tests for general trees by using the distributional properties of the Continuum Random Tree, which arises as the invariant limit for a broad class of models for tree-structured data based on conditioned Galton--Watson processes. The test statistics for the goodness-of-fit tests are simple to compute and are asymptotically distributed as $χ^2$ and $F$ random variables. We illustrate our methods on an important application of detecting tumor heterogeneity in brain cancer. We use a novel approach with tree-based representations of magnetic resonance images and employ the developed tests to ascertain tumor heterogeneity between two groups of patients.
Submitted 20 September, 2016; v1 submitted 10 April, 2014;
originally announced April 2014.
-
Understanding Theoretically The Impact of Reporting of Disease Cases in Epidemiology
Authors:
Arni S. R. Srinivasa Rao
Abstract:
In conducting preliminary analysis during an epidemic, data on reported disease cases offer key information for guiding the direction of the in-depth analysis. Models for growth and transmission dynamics are heavily dependent on preliminary analysis results. When a particular disease case is reported more than once, or alternatively is never reported or detected in the population, multiple reporting or under-reporting may be present in the population. In this work, a theoretical approach for studying reporting error in epidemiology is explored. The upper bound for the error that arises due to multiple reporting is higher than that which arises due to under-reporting. Numerical examples are provided to support the arguments. This article mainly treats reporting error as deterministic; a stochastic model for the same could also be explored.
Submitted 28 February, 2012; v1 submitted 13 February, 2012;
originally announced February 2012.
-
Biometric Cards for Indian Population: Role of Mathematical Models in Assisting and Planning
Authors:
Arni S. R. Srinivasa Rao
Abstract:
Mathematical models could be helpful in assisting the Indian Government's new initiative of issuing biometric cards to its citizens. In this note, we look into the role of mathematical models in estimating the missing, non-enumerated population numbers and in estimating the annual numbers of cards required by age, gender, and region in India. The linkage between the National Population Register and biometric cards is also highlighted. See technical Appendices. There are other scientific issues, namely, electronic, data storage management, identity verification, etc., which we do not address in this paper.
Submitted 28 April, 2011; v1 submitted 10 April, 2011;
originally announced April 2011.