-
Bayesian Inference for High-dimensional Time Series with a Directed Acyclic Graphical Structure
Authors:
Arkaprava Roy,
Anindya Roy,
Subhashis Ghosal
Abstract:
In multivariate time series analysis, understanding the underlying causal relationships among variables is often of interest for various applications. Directed acyclic graphs (DAGs) provide a powerful framework for representing causal dependencies. This paper proposes a novel Bayesian approach for modeling multivariate time series where conditional independencies and causal structure are encoded by a DAG. The proposed model allows structural properties such as stationarity to be easily accommodated. Given the application, we further extend the model for matrix-variate time series. We take a Bayesian approach to inference, and a ``projection-posterior'' based efficient computational algorithm is developed. The posterior convergence properties of the proposed method are established along with two identifiability results for the unrestricted structural equation models. The utility of the proposed method is demonstrated through simulation studies and real data analysis.
Submitted 11 April, 2025; v1 submitted 30 March, 2025;
originally announced March 2025.
-
Relational Graph in Vector Autoregression: A Case Study on the Effect of the Great Recession on Connectivity of Economic Indicators
Authors:
Arkaprava Roy,
Anindya Roy,
Subhashis Ghosal
Abstract:
Under a high-dimensional vector autoregressive (VAR) model, we propose a way of efficiently estimating both the stationary graph structure between the nodal time series and their temporal dynamics. The framework is then used to make inferences on the change in interdependencies between several economic indicators due to the impact of the Great Recession, the financial crisis that lasted from 2007 through 2009. There are several key advantages of the proposed framework; (1) it develops a reparametrized VAR likelihood that can be used in general high-dimensional VAR problems, (2) it strictly maintains causality of the estimated process, making inference on stationary features more meaningful and (3) it is computationally efficient due to the reduced rank structure of the parameterization. We apply the methodology to the seasonally adjusted quarterly economic indicators available in the FRED-QD database of the Federal Reserve. The analysis essentially confirms much of the prevailing knowledge about the impact of the Great Recession on different economic indicators. At the same time, it provides deeper insight into the nature and extent of the impact on the interplay of the different indicators. We also contribute to the theory of Bayesian VAR by showing the consistency of the posterior under sparse priors for the parameters of the reduced rank formulation of the VAR process.
Submitted 29 March, 2025; v1 submitted 29 October, 2024;
originally announced October 2024.
-
Quantum Reinforcement Learning in Non-Abelian Environments: Unveiling Novel Formulations and Quantum Advantage Exploration
Authors:
Shubhayan Ghosal
Abstract:
This paper delves into recent advancements in Quantum Reinforcement Learning (QRL), particularly focusing on non-commutative environments, which represent uncharted territory in this field. Our research endeavors to redefine the boundaries of decision-making by introducing formulations and strategies that harness the inherent properties of quantum systems.
At the core of our investigation is the characterization of the agent's state space within a Hilbert space ($\mathcal{H}$). Here, quantum states emerge as complex superpositions of classical states, introducing non-commutative quantum actions governed by unitary operators and necessitating a reimagining of state transitions. Complementing this framework is a refined reward function, rooted in quantum mechanics as a Hermitian operator on $\mathcal{H}$. This reward function serves as the foundation for the agent's decision-making process. By leveraging the quantum Bellman equation, we establish a methodology for maximizing expected cumulative reward over an infinite horizon, considering the entangled dynamics of quantum systems. We also connect the quantum Bellman equation to the degree of non-commutativity of the environment, as is evident in pure algebra.
We design a quantum advantage function. This ingeniously designed function exploits latent quantum parallelism inherent in the system, enhancing the agent's decision-making capabilities and paving the way for exploration of quantum advantage in uncharted territories. Furthermore, we address the significant challenge of quantum exploration directly, recognizing the limitations of traditional strategies in this complex environment.
Submitted 11 April, 2024;
originally announced June 2024.
-
Bayesian Learning of Relational Graph in Semiparametric High-dimensional Time Series
Authors:
Arkaprava Roy,
Anindya Roy,
Subhashis Ghosal
Abstract:
Time series data arising in many applications nowadays are high-dimensional. A large number of parameters describe features of these time series. We propose a novel approach to modeling a high-dimensional time series through several independent univariate time series, which are then orthogonally rotated and sparsely linearly transformed. With this approach, any specified intrinsic relations among component time series given by a graphical structure can be maintained at all time snapshots. We call the resulting process an Orthogonally-rotated Univariate Time series (OUT). Key structural properties of time series such as stationarity and causality can be easily accommodated in the OUT model. For Bayesian inference, we put suitable prior distributions on the spectral densities of the independent latent time series, the orthogonal rotation matrix, and the common precision matrix of the component time series at every time point. A likelihood is constructed using the Whittle approximation for univariate latent time series. An efficient Markov Chain Monte Carlo (MCMC) algorithm is developed for posterior computation. We study the convergence of the pseudo-posterior distribution based on the Whittle likelihood for the model's parameters upon developing a new general posterior convergence theorem for pseudo-posteriors. We find that the posterior contraction rate for independent observations essentially prevails in the OUT model under very mild conditions on the temporal dependence described in terms of the smoothness of the corresponding spectral densities. Through a simulation study, we compare the accuracy of estimating the parameters and identifying the graphical structure with other approaches. We apply the proposed methodology to analyze a dataset on different industrial components of the US gross domestic product between 2010 and 2019 and predict future observations.
Submitted 20 August, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
Bayesian Inference for Multivariate Monotone Densities
Authors:
Kang Wang,
Subhashis Ghosal
Abstract:
We consider a nonparametric Bayesian approach to estimation and testing for a multivariate monotone density. Instead of following the conventional Bayesian route of putting a prior distribution complying with the monotonicity restriction, we put a prior on the step heights through binning and a Dirichlet distribution. An arbitrary piece-wise constant probability density is converted to a monotone one by a projection map, taking its $\mathbb{L}_1$-projection onto the space of monotone functions, which is subsequently normalized to integrate to one. We construct consistent Bayesian tests to test multivariate monotonicity of a probability density based on the $\mathbb{L}_1$-distance to the class of monotone functions. The test is shown to have a size going to zero and high power against alternatives sufficiently separated from the null hypothesis. To obtain a Bayesian credible interval for the value of the density function at an interior point with guaranteed asymptotic frequentist coverage, we consider a posterior quantile interval of an induced map transforming the function value to its value optimized over certain blocks. The limiting coverage is explicitly calculated and is seen to be higher than the credibility level used in the construction. By exploring the asymptotic relationship between the coverage and the credibility, we show that a desired asymptotic coverage can be obtained exactly by starting with an appropriate credibility level.
Submitted 8 June, 2023;
originally announced June 2023.
-
Bayesian Inference for $k$-Monotone Densities with Applications to Multiple Testing
Authors:
Kang Wang,
Subhashis Ghosal
Abstract:
Shape restriction, like monotonicity or convexity, imposed on a function of interest, such as a regression or density function, allows for its estimation without smoothness assumptions. The concept of $k$-monotonicity encompasses a family of shape restrictions, including decreasing and convex decreasing as special cases corresponding to $k=1$ and $k=2$. We consider Bayesian approaches to estimate a $k$-monotone density. By utilizing a kernel mixture representation and putting a Dirichlet process or a finite mixture prior on the mixing distribution, we show that the posterior contraction rate in the Hellinger distance is $(n/\log n)^{- k/(2k + 1)}$ for a $k$-monotone density, which is minimax optimal up to a polylogarithmic factor. When the true $k$-monotone density is a finite $J_0$-component mixture of the kernel, the contraction rate improves to the nearly parametric rate $\sqrt{(J_0 \log n)/n}$. Moreover, by putting a prior on $k$, we show that the same rates hold even when the best value of $k$ is unknown. A specific application in modeling the density of $p$-values in a large-scale multiple testing problem is considered. Simulation studies are conducted to evaluate the performance of the proposed method.
Submitted 8 June, 2023;
originally announced June 2023.
-
The envelope of a complex Gaussian random variable
Authors:
Sattwik Ghosal,
Ranjan Maitra
Abstract:
The envelope of an elliptical Gaussian complex vector, or equivalently, the amplitude or norm of a bivariate normal random vector has application in many weather and signal processing contexts. We explicitly characterize its distribution in the general case through its probability density, cumulative distribution and moment generating function. Moments and limiting distributions are also derived. These derivations are exploited to also characterize the special cases where the bivariate Gaussian mean vector and covariance matrix have a simpler structure, providing new additional insights in many cases. Simulations illustrate the benefits of using our formulae over Monte Carlo methods. We also use our derivations to get a better initial characterization of the distribution of the observed values in structural Magnetic Resonance Imaging datasets, and of wind speed.
Submitted 10 June, 2025; v1 submitted 4 May, 2023;
originally announced May 2023.
-
Posterior Contraction and Testing for Multivariate Isotonic Regression
Authors:
Kang Wang,
Subhashis Ghosal
Abstract:
We consider the nonparametric regression problem with multiple predictors and an additive error, where the regression function is assumed to be coordinatewise nondecreasing. We propose a Bayesian approach to make an inference on the multivariate monotone regression function, obtain the posterior contraction rate, and construct a universally consistent Bayesian testing procedure for multivariate monotonicity. To facilitate posterior analysis, we set aside the shape restrictions temporarily, and endow a prior on blockwise constant regression functions with heights independently normally distributed. The unknown variance of the error term is either estimated by the marginal maximum likelihood estimate or is equipped with an inverse-gamma prior. Then the unrestricted block heights are a posteriori also independently normally distributed given the error variance, by conjugacy. To comply with the shape restrictions, we project samples from the unrestricted posterior onto the class of multivariate monotone functions, inducing the "projection-posterior distribution", to be used for making an inference. Under an $\mathbb{L}_1$-metric, we show that the projection-posterior based on $n$ independent samples contracts around the true monotone regression function at the optimal rate $n^{-1/(2+d)}$. Then we construct a Bayesian test for multivariate monotonicity based on the posterior probability of a shrinking neighborhood of the class of multivariate monotone functions. We show that the test is universally consistent, that is, the level of the Bayesian test goes to zero, and the power at any fixed alternative goes to one. Moreover, we show that for a smooth alternative function, power goes to one as long as its distance to the class of multivariate monotone functions is at least of the order of the estimation error for a smooth function.
Submitted 22 November, 2022;
originally announced November 2022.
-
Coverage of Credible Intervals in Bayesian Multivariate Isotonic Regression
Authors:
Kang Wang,
Subhashis Ghosal
Abstract:
We consider the nonparametric multivariate isotonic regression problem, where the regression function is assumed to be nondecreasing with respect to each predictor. Our goal is to construct a Bayesian credible interval for the function value at a given interior point with assured limiting frequentist coverage. We put a prior on unrestricted step-functions, but make inference using the induced posterior measure by an "immersion map" from the space of unrestricted functions to that of multivariate monotone functions. This allows maintaining the natural conjugacy for posterior sampling. A natural immersion map to use is a projection via a distance, but in the present context, a block isotonization map is found to be more useful. The approach of using the induced "immersion posterior" measure instead of the original posterior to make inference provides a useful extension of the Bayesian paradigm, particularly helpful when the model space is restricted by some complex relations. We establish a key weak convergence result for the posterior distribution of the function at a point in terms of some functional of a multi-indexed Gaussian process that leads to an expression for the limiting coverage of the Bayesian credible interval. Analogous to a recent result for univariate monotone functions, we find that the limiting coverage is slightly higher than the credibility, the opposite of a phenomenon observed in smoothing problems. Interestingly, the relation between credibility and limiting coverage does not involve any unknown parameter. Hence by a recalibration procedure, we can get a predetermined asymptotic coverage by choosing a suitable credibility level smaller than the targeted coverage, and thus also shorten the credible intervals.
Submitted 22 November, 2022;
originally announced November 2022.
-
When degree of roughness is a neighborhood over locally solid Riesz spaces
Authors:
Sanjoy Ghosal,
Sourav Mandal
Abstract:
In this paper we introduce the notion of rough weighted $\mathcal{I}_τ$-limit points set and weighted $\mathcal{I}_τ$-cluster points set in a locally solid Riesz space which are more generalized version of rough weighted $\mathcal{I}$-limit points set and weighted $\mathcal{I}$-cluster points set in a $θ$-metric space respectively. Successively to compare with the following important results of Fridy [Proc. Amer. Math. Soc. {118} (4) (1993), 1187-1192] and Das [Topology Appl. {159} (10-11) (2012), 2621-2626], respectively be stated as \begin{description}
\item[(i)] Any number sequence $x=\{x_{n}\}_{n\in \mathbb{N}},$ the statistical cluster points set of $x$ is closed,
\item[(ii)] In a topological space the $\mathcal{I}$-cluster points set is closed, \end{description}
we show that in general, the weighted $\mathcal{I}_τ$-cluster points set in a locally solid Riesz space may not be closed. The resulting summability method unfollows some previous results in the direction of research works of Aytar [Numer. Funct. Anal. Optim. {29} (3-4) (2008) 291-303], D$\ddot{\mbox{u}}$ndar [Numer. Funct. Anal. Optim. {37} (4) (2016) 480-491], Ghosal [Math. Slovaca {70} (3) (2020) 667-680] and Savaş, Et [Period. Math. Hungar. 71 (2015) 135-145].
Submitted 28 June, 2021;
originally announced June 2021.
-
Optimal Bayesian Smoothing of Functional Observations over a Large Graph
Authors:
Arkaprava Roy,
Shubhashis Ghosal
Abstract:
In modern contexts, some types of data are observed in high-resolution, essentially continuously in time. Such data units are best described as taking values in a space of functions. Subject units carrying the observations may have intrinsic relations among themselves, and are best described by the nodes of a large graph. It is often sensible to think that the underlying signals in these functional observations vary smoothly over the graph, in that neighboring nodes have similar underlying signals. This qualitative information allows borrowing of strength over neighboring nodes and consequently leads to more accurate inference. In this paper, we consider a model with Gaussian functional observations and adopt a Bayesian approach to smoothing over the nodes of the graph. We characterize the minimax rate of estimation in terms of the regularity of the signals and their variation across nodes quantified in terms of the graph Laplacian. We show that an appropriate prior constructed from the graph Laplacian can attain the minimax bound, while using a mixture prior, the minimax rate up to a logarithmic factor can be attained simultaneously for all possible values of functional and graphical smoothness. We also show that in the fixed smoothness setting, an optimal sized credible region has arbitrarily high frequentist coverage. A simulation experiment demonstrates that the method performs better than potential competing methods like the random forest. The method is also applied to a dataset on daily temperatures measured at several weather stations in the US state of North Carolina.
Submitted 19 July, 2021; v1 submitted 20 April, 2021;
originally announced April 2021.
-
Bayesian inference in high-dimensional models
Authors:
Sayantan Banerjee,
Ismaël Castillo,
Subhashis Ghosal
Abstract:
Models with dimension more than the available sample size are now commonly used in various applications. A sensible inference is possible using a lower-dimensional structure. In regression problems with a large number of predictors, the model is often assumed to be sparse, with only a few predictors active. Interdependence between a large number of variables is succinctly described by a graphical model, where variables are represented by nodes on a graph and an edge between two nodes is used to indicate their conditional dependence given other variables. Many procedures for making inferences in the high-dimensional setting, typically using penalty functions to induce sparsity in the solution obtained by minimizing a loss function, were developed. Bayesian methods have been proposed for such problems more recently, where the prior takes care of the sparsity structure. These methods have the natural ability to also automatically quantify the uncertainty of the inference through the posterior distribution. Theoretical studies of Bayesian procedures in high-dimension have been carried out recently. Questions that arise are, whether the posterior distribution contracts near the true value of the parameter at the minimax optimal rate, whether the correct lower-dimensional structure is discovered with high posterior probability, and whether a credible region has adequate frequentist coverage. In this paper, we review these properties of Bayesian and related methods for several high-dimensional models such as many normal means problem, linear regression, generalized linear models, Gaussian and non-Gaussian graphical models. Effective computational approaches are also discussed.
Submitted 12 January, 2021;
originally announced January 2021.
-
Unified Bayesian theory of sparse linear regression with nuisance parameters
Authors:
Seonghyun Jeong,
Subhashis Ghosal
Abstract:
We study frequentist asymptotic properties of Bayesian procedures for high-dimensional Gaussian sparse regression when unknown nuisance parameters are involved. Nuisance parameters can be finite-, high-, or infinite-dimensional. A mixture of point masses at zero and continuous distributions is used for the prior distribution on sparse regression coefficients, and appropriate prior distributions are used for nuisance parameters. The optimal posterior contraction of sparse regression coefficients, hampered by the presence of nuisance parameters, is also examined and discussed. It is shown that the procedure yields strong model selection consistency. A Bernstein-von Mises-type theorem for sparse regression coefficients is also obtained for uncertainty quantification through credible sets with guaranteed frequentist coverage. Asymptotic properties of numerous examples are investigated using the theories developed in this study.
Submitted 17 February, 2021; v1 submitted 24 August, 2020;
originally announced August 2020.
-
Convergence Rates for Bayesian Estimation and Testing in Monotone Regression
Authors:
Moumita Chakraborty,
Subhashis Ghosal
Abstract:
Shape restrictions such as monotonicity on functions often arise naturally in statistical modeling.
We consider a Bayesian approach to the problem of estimation of a monotone regression function and testing for monotonicity. We construct a prior distribution using piecewise constant functions. For estimation, a prior imposing monotonicity of the heights of these steps is sensible, but the resulting posterior is harder to analyze theoretically. We consider a ``projection-posterior'' approach, where a conjugate normal prior is used, but the monotonicity constraint is imposed on posterior samples by a projection map on the space of monotone functions. We show that the resulting posterior contracts at the optimal rate $n^{-1/3}$ under the $L_1$-metric and at a nearly optimal rate under the empirical $L_p$-metrics for $0<p\le 2$. The projection-posterior approach is also computationally more convenient. We also construct a Bayesian test for the hypothesis of monotonicity using the posterior probability of a shrinking neighborhood of the set of monotone functions. We show that the resulting test has a universal consistency property and obtain the separation rate which ensures that the resulting power function approaches one.
Submitted 3 August, 2020;
originally announced August 2020.
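
The "projection-posterior" idea in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: predictors lie in [0, 1), the error variance `sigma2` is treated as known, each step height gets an independent N(0, `tau2`) prior, and the projection is the unweighted least-squares (pool-adjacent-violators) map. The names `pava` and `projection_posterior` are hypothetical.

```python
import numpy as np

def pava(y):
    """Unweighted least-squares projection of y onto nondecreasing sequences."""
    blocks = []  # each block: [mean, number of pooled points]
    for yi in np.asarray(y, dtype=float):
        blocks.append([yi, 1])
        # Pool adjacent blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, n2 = blocks.pop()
            m1, n1 = blocks.pop()
            blocks.append([(n1 * m1 + n2 * m2) / (n1 + n2), n1 + n2])
    return np.concatenate([np.full(n, m) for m, n in blocks])

def projection_posterior(x, y, J, sigma2=1.0, tau2=100.0, draws=200, seed=0):
    """Draw step heights from the unrestricted conjugate posterior, then project."""
    rng = np.random.default_rng(seed)
    # Assign each x in [0, 1) to one of J equal-width bins.
    bins = np.minimum((np.asarray(x) * J).astype(int), J - 1)
    counts = np.bincount(bins, minlength=J)
    sums = np.bincount(bins, weights=y, minlength=J)
    # Conjugate normal update for each height (assumed prior N(0, tau2)).
    post_var = 1.0 / (counts / sigma2 + 1.0 / tau2)
    post_mean = post_var * sums / sigma2
    samples = rng.normal(post_mean, np.sqrt(post_var), size=(draws, J))
    # Impose monotonicity on each posterior draw by projection.
    return np.array([pava(s) for s in samples])
```

Each row of the returned array is one monotone step function. Pointwise quantiles across rows would give credible bands under these assumptions.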
-
Bayesian nonparametric tests for multivariate locations
Authors:
Indrabati Bhattacharya,
Subhashis Ghosal
Abstract:
In this paper, we propose novel, fully Bayesian non-parametric tests for one-sample and two-sample multivariate location problems. We model the underlying distribution using a Dirichlet process prior, and develop a testing procedure based on the posterior credible region for the spatial median functional of the distribution. For the one-sample problem, we fail to reject the null hypothesis if the credible set contains the null value. For the two-sample problem, we form a credible set for the difference of the spatial medians of the two samples and we fail to reject the null hypothesis of equality if the credible set contains zero. We derive the local asymptotic power of the tests under shrinking alternatives, and also present a simulation study to compare the finite-sample performance of our testing procedures with existing parametric and non-parametric tests.
Submitted 1 August, 2021; v1 submitted 1 July, 2020;
originally announced July 2020.
-
Bayesian Inference on Multivariate Medians and Quantiles
Authors:
Indrabati Bhattacharya,
Subhashis Ghosal
Abstract:
In this paper, we consider Bayesian inference on a class of multivariate median and multivariate quantile functionals of a joint distribution using a Dirichlet process prior. Since, unlike for univariate quantiles, the exact posterior distributions of the multivariate median and multivariate quantiles are not obtainable explicitly, we study these distributions asymptotically. We derive a Bernstein-von Mises theorem for the multivariate $\ell_1$-median with respect to a general $\ell_p$-norm, which in particular shows that its posterior concentrates around its true value at the $n^{-1/2}$-rate and its credible sets have asymptotically correct frequentist coverage. An asymptotic normality result for the empirical multivariate median with a general $\ell_p$-norm is also derived in the course of the proof, which extends the results from the case $p=2$ in the literature to a general $p$. The technique involves approximating the posterior Dirichlet process by a Bayesian bootstrap process and deriving a conditional Donsker theorem. We also obtain analogous results for an affine equivariant version of the multivariate $\ell_1$-median based on an adaptive transformation and re-transformation technique. The results are extended to a joint distribution of multivariate quantiles. The accuracy of the asymptotic result is confirmed by a simulation study. We also use the results to obtain Bayesian credible regions for multivariate medians for Fisher's iris data, which consists of four features measured for each of three plant species.
Submitted 22 September, 2019;
originally announced September 2019.
-
Bayesian Linear Regression for Multivariate Responses Under Group Sparsity
Authors:
Bo Ning,
Seonghyun Jeong,
Subhashis Ghosal
Abstract:
We study frequentist properties of a Bayesian high-dimensional multivariate linear regression model with correlated responses. The predictors are separated into many groups and the group structure is pre-determined. Two features of the model are unique: (i) group sparsity is imposed on the predictors, and (ii) the covariance matrix is unknown and its dimensions can also be high. We choose a product of independent spike-and-slab priors on the regression coefficients and a new prior on the covariance matrix based on its eigendecomposition. Each spike-and-slab prior is a mixture of a point mass at zero and a multivariate density involving an $\ell_{2,1}$-norm. We first obtain the posterior contraction rate and bounds on the effective dimension of the model with high posterior probability. We then show that the multivariate regression coefficients can be recovered under certain compatibility conditions. Finally, we quantify the uncertainty for the regression coefficients with frequentist validity through a Bernstein-von Mises type theorem. The result leads to selection consistency for the Bayesian method. We derive the posterior contraction rate using the general theory by constructing a suitable test from first principles using moment bounds for certain likelihood ratios. This leads to posterior concentration around the truth with respect to the average Rényi divergence of order 1/2. This technique of obtaining the required tests for posterior contraction rate could be useful in many other problems.
Submitted 11 June, 2019; v1 submitted 9 July, 2018;
originally announced July 2018.
-
Posterior Contraction and Credible Sets for Filaments of Regression Functions
Authors:
Wei Li,
Subhashis Ghosal
Abstract:
A filament consists of local maximizers of a smooth function $f$ when moving in a certain direction. A filamentary structure is an important feature of the shape of an object and is also considered as an important lower dimensional characterization of multivariate data. There have been some recent theoretical studies of filaments in the nonparametric kernel density estimation context. This paper supplements the current literature in two ways. First, we provide a Bayesian approach to the filament estimation in regression context and study the posterior contraction rates using a finite random series of B-splines basis. Compared with the kernel-estimation method, this has a theoretical advantage as the bias can be better controlled when the function is smoother, which allows obtaining better rates. Assuming that $f: \mathbb{R}^2 \mapsto \mathbb{R}$ belongs to an isotropic Hölder class of order $α\geq 4$, with the optimal choice of smoothing parameters, the posterior contraction rates for the filament points on some appropriately defined integral curves and for the Hausdorff distance of the filament are both $(n/\log n)^{(2-α)/(2(1+α))}$. Secondly, we provide a way to construct a credible set with sufficient frequentist coverage for the filaments. We demonstrate the success of our proposed method in simulations and one application to earthquake data.
Submitted 24 March, 2020; v1 submitted 10 March, 2018;
originally announced March 2018.
-
Bayesian mode and maximum estimation and accelerated rates of contraction
Authors:
William Weimin Yoo,
Subhashis Ghosal
Abstract:
We study the problem of estimating the mode and maximum of an unknown regression function in the presence of noise. We adopt the Bayesian approach by using tensor-product B-splines and endowing the coefficients with Gaussian priors. In the usual fixed-in-advance sampling plan, we establish posterior contraction rates for mode and maximum and show that they coincide with the minimax rates for this problem. To quantify estimation uncertainty, we construct credible sets for these two quantities that have high coverage probabilities with optimal sizes. If one is allowed to collect data sequentially, we further propose a Bayesian two-stage estimation procedure, where a second stage posterior is built based on samples collected within a credible set constructed from a first stage posterior. Under appropriate conditions on the radius of this credible set, we can accelerate optimal contraction rates from the fixed-in-advance setting to the minimax sequential rates. A simulation experiment shows that our Bayesian two-stage procedure outperforms the single-stage procedure and also slightly improves upon a non-Bayesian two-stage procedure.
Submitted 15 March, 2018; v1 submitted 12 August, 2016;
originally announced August 2016.
-
A notion of $αβ$-statistical convergence of order $γ$ in probability
Authors:
Pratulananda Das,
Sanjoy Ghosal,
Vatan Karakaya,
Sumit Som
Abstract:
A sequence of real numbers $\{x_{n}\}_{n\in \mathbb{N}}$ is said to be $αβ$-statistically convergent of order $γ$ (where $0<γ\leq 1$) to a real number $x$ \cite{a} if for every $δ>0,$ $$\underset{n\rightarrow \infty} {\lim} \frac{1}{(β_{n} - α_{n} + 1)^γ}~ |\{k \in [α_n,β_n] : |x_{k}-x|\geq δ\}|=0,$$ where $\{α_{n}\}_{n\in \mathbb{N}}$ and $\{β_{n}\}_{n\in \mathbb{N}}$ are two sequences of positive real numbers such that $\{α_{n}\}_{n\in \mathbb{N}}$ and $\{β_{n}\}_{n\in \mathbb{N}}$ are both non-decreasing, $β_{n}\geq α_{n}$ $\forall ~n\in \mathbb{N},$ and $(β_{n}-α_{n})\rightarrow \infty$ as $n\rightarrow \infty.$ In this paper we study a related concept of convergence in which the value $|x_{k}-x|$ is replaced by $P(|X_{k}-X|\geq \varepsilon)$ and $E(|X_{k}-X|^{r})$ respectively (where $X, X_k$ are random variables for each $k\in \mathbb{N}$, $\varepsilon>0$, $P$ denotes the probability, and $E$ denotes the expectation), and we call them $αβ$-statistical convergence of order $γ$ in probability and $αβ$-statistical convergence of order $γ$ in $r^{\mbox{th}}$ expectation respectively. The results are applied to build the probability distribution for $αβ$-strong $p$-Ces$\grave{\mbox{a}}$ro summability of order $γ$ in probability and $αβ$-statistical convergence of order $γ$ in distribution. Our main objective is to interpret the relational behavior of the above-mentioned four convergences.
Submitted 20 May, 2016;
originally announced May 2016.
-
Statistical convergence of order $α$ in probability
Authors:
Pratulananda Das,
Sanjoy Ghosal,
Sumit Som
Abstract:
In this paper, different types of convergence of a sequence of random variables in probability, namely, statistical convergence of order $α$ in probability, strong $p$-Ces$\grave{\mbox{a}}$ro summability of order $α$ in probability, lacunary statistical convergence or $S_θ$-convergence of order $α$ in probability, and ${N_θ}$-convergence of order $α$ in probability, are introduced and some of their basic properties are studied.
Submitted 18 May, 2016;
originally announced May 2016.
-
Discussion of "Frequentist coverage of adaptive nonparametric Bayesian credible sets"
Authors:
Subhashis Ghosal
Abstract:
Discussion of "Frequentist coverage of adaptive nonparametric Bayesian credible sets" by Szabó, van der Vaart and van Zanten [arXiv:1310.4489v5].
Submitted 7 September, 2015;
originally announced September 2015.
-
Bayesian Detection of Image Boundaries
Authors:
Meng Li,
Subhashis Ghosal
Abstract:
Detecting the boundary of an image based on noisy observations is a fundamental problem of image processing and image segmentation. For a $d$-dimensional image ($d = 2, 3, \ldots$), the boundary can often be described by a closed smooth $(d - 1)$-dimensional manifold. In this paper, we propose a nonparametric Bayesian approach based on priors indexed by $\mathbb{S}^{d - 1}$, the unit sphere in $\mathbb{R}^d$. We derive optimal posterior contraction rates using Gaussian processes or finite random series priors using basis functions such as trigonometric polynomials for 2-dimensional images and spherical harmonics for 3-dimensional images. For 2-dimensional images, we show that a rescaled squared exponential Gaussian process on $\mathbb{S}^1$ achieves four goals: guaranteed geometric restriction, (nearly) minimax optimal rates adaptive to the smoothness level, convenience for joint inference, and computational efficiency. We conduct an extensive study of its reproducing kernel Hilbert space, which may be of interest in its own right and can also be used in other contexts. Simulations confirm excellent performance of the proposed method and indicate its robustness under model misspecification, at least under the simulated settings.
Submitted 24 May, 2016; v1 submitted 24 August, 2015;
originally announced August 2015.
-
Bayesian inference for higher order ordinary differential equation models
Authors:
Prithwish Bhaumik,
Subhashis Ghosal
Abstract:
Often the regression function appearing in fields such as economics, engineering and biomedical sciences obeys a system of higher order ordinary differential equations (ODEs). The equations are usually not analytically solvable. We are interested in inference on the unknown parameters appearing in the equations. A significant amount of work has been done on parameter estimation in first order ODE models. Bhaumik and Ghosal (2014a) considered a two-step Bayesian approach by putting a finite random series prior on the regression function using a B-spline basis. The posterior distribution of the parameter vector is induced from that of the regression function. Although this approach is computationally fast, the Bayes estimator is not asymptotically efficient. Bhaumik and Ghosal (2014b) remedied this by directly considering the distance between the function in the nonparametric model and a Runge-Kutta (RK4) approximate solution of the ODE while inducing the posterior distribution on the parameter. They also studied the direct Bayesian method obtained from the approximate likelihood obtained by the RK4 method. In this paper we extend these ideas to the higher order ODE model and establish Bernstein-von Mises theorems for the posterior distribution of the parameter vector for each method with $n^{-1/2}$ contraction rate.
Submitted 16 May, 2015;
originally announced May 2015.
-
Supremum Norm Posterior Contraction and Credible Sets for Nonparametric Multivariate Regression
Authors:
William Weimin Yoo,
Subhashis Ghosal
Abstract:
In the setting of nonparametric multivariate regression with unknown error variance, we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a prior for f, and the error variance is either estimated using the empirical Bayes approach or is endowed with a suitable prior in a hierarchical Bayes approach. We establish pointwise, L2 and supremum norm posterior contraction rates for f and its mixed partial derivatives, and show that they coincide with the minimax rates. Our results cover even the anisotropic situation, where the true regression function may have different smoothness in different directions. Using the convergence bounds, we show that pointwise, L2 and supremum norm credible sets for f and its mixed partial derivatives have guaranteed frequentist coverage with optimal size. New results on tensor products of B-splines are also obtained in the course.
Submitted 23 September, 2015; v1 submitted 24 November, 2014;
originally announced November 2014.
-
Efficient Bayesian estimation and uncertainty quantification in ordinary differential equation models
Authors:
Prithwish Bhaumik,
Subhashis Ghosal
Abstract:
Often the regression function is specified by a system of ordinary differential equations (ODEs) involving some unknown parameters. Typically analytical solution of the ODEs is not available, and hence likelihood evaluation at many parameter values by numerical solution of equations may be computationally prohibitive. Bhaumik and Ghosal (2015) considered a Bayesian two-step approach by embedding the model in a larger nonparametric regression model, where a prior is put through a random series based on B-spline basis functions. A posterior on the parameter is induced from the regression function by minimizing an integrated weighted squared distance between the derivative of the regression function and the derivative suggested by the ODEs. Although this approach is computationally fast, the Bayes estimator is not asymptotically efficient. In this paper we suggest a modification of the two-step method by directly considering the distance between the function in the nonparametric model and that obtained from a four stage Runge-Kutta (RK4) method. We also study the asymptotic behavior of the posterior distribution of the parameter based on an approximate likelihood obtained from an RK4 numerical solution of the ODEs. We establish a Bernstein-von Mises theorem for both methods which assures that Bayesian uncertainty quantification matches with the frequentist one and the Bayes estimator is asymptotically efficient.
Submitted 22 February, 2016; v1 submitted 5 November, 2014;
originally announced November 2014.
-
Bayesian two-step estimation in differential equation models
Authors:
Prithwish Bhaumik,
Subhashis Ghosal
Abstract:
Ordinary differential equations (ODEs) are used to model dynamic systems appearing in engineering, physics, biomedical sciences and many other fields. These equations contain unknown parameters, say $\bmθ$ of physical significance which have to be estimated from the noisy data. Often there is no closed form analytic solution of the equations and hence we cannot use the usual non-linear least squares technique to estimate the unknown parameters. There is a two-step approach to solve this problem, where the first step involves fitting the data nonparametrically. In the second step the parameter is estimated by minimizing the distance between the nonparametrically estimated derivative and the derivative suggested by the system of ODEs. The statistical aspects of this approach have been studied under the frequentist framework. We consider this two-step estimation under the Bayesian framework. The response variable is allowed to be multidimensional and the true mean function of it is not assumed to be in the model. We induce a prior on the regression function using a random series based on the B-spline basis functions. We establish the Bernstein-von Mises theorem for the posterior distribution of the parameter of interest. Interestingly, even though the posterior distribution of the regression function based on splines converges at a rate slower than $n^{-1/2}$, the parameter vector $\bmθ$ is nevertheless estimated at $n^{-1/2}$ rate.
Submitted 4 November, 2014;
originally announced November 2014.
-
Adaptive Bayesian density regression for high-dimensional data
Authors:
Weining Shen,
Subhashis Ghosal
Abstract:
Density regression provides a flexible strategy for modeling the distribution of a response variable $Y$ given predictors $\mathbf{X}=(X_1,\ldots,X_p)$ by treating the conditional density of $Y$ given $\mathbf{X}$ as a completely unknown function and allowing its shape to change with the value of $\mathbf{X}$. The number of predictors $p$ may be very large, possibly much larger than the number of observations $n$, but the conditional density is assumed to depend only on a much smaller number of predictors, which are unknown. In addition to estimation, the goal is also to select the important predictors which actually affect the true conditional density. We consider a nonparametric Bayesian approach to density regression by constructing a random series prior based on tensor products of spline functions. The proposed prior also incorporates the issue of variable selection. We show that the posterior distribution of the conditional density contracts adaptively at the truth nearly at the optimal oracle rate, determined by the unknown sparsity and smoothness levels, even in the ultra high-dimensional settings where $p$ increases exponentially with $n$. The result is also extended to the anisotropic case where the degree of smoothness can vary in different directions, and both random and deterministic predictors are considered. We also propose a technique to calculate posterior moments of the conditional density function without requiring Markov chain Monte Carlo methods.
Submitted 6 January, 2016; v1 submitted 11 March, 2014;
originally announced March 2014.
-
Adaptive Bayesian procedures using random series priors
Authors:
Weining Shen,
Subhashis Ghosal
Abstract:
We consider a prior for nonparametric Bayesian estimation which uses finite random series with a random number of terms. The prior is constructed through distributions on the number of basis functions and the associated coefficients. We derive a general result on adaptive posterior convergence rates for all smoothness levels of the function in the true model by constructing an appropriate "sieve" and applying the general theory of posterior convergence rates. We apply this general result on several statistical problems such as signal processing, density estimation, various nonparametric regressions, classification, spectral density estimation, functional regression etc. The prior can be viewed as an alternative to the commonly used Gaussian process prior, but properties of the posterior distribution can be analyzed by relatively simpler techniques and in many cases allows a simpler approach to computation without using Markov chain Monte-Carlo (MCMC) methods. A simulation study is conducted to show that the accuracy of the Bayesian estimators based on the random series prior and the Gaussian process prior are comparable. We apply the method on two interesting data sets on functional regression.
Submitted 7 February, 2015; v1 submitted 3 March, 2014;
originally announced March 2014.
-
Bayesian estimation in differential equation models
Authors:
Prithwish Bhaumik,
Subhashis Ghosal
Abstract:
Ordinary differential equations (ODEs) are used to model dynamic systems appearing in engineering, physics, biomedical sciences and many other fields. These equations contain unknown parameters, say $θ$ of physical significance which have to be estimated from the noisy data. Often there is no closed form analytic solution of the equations and hence we cannot use the usual non-linear least squares technique to estimate the unknown parameters. There is a two step approach to solve this problem, where the first step involves fitting the data nonparametrically. In the second step the parameter is estimated by minimizing the distance between the nonparametrically estimated derivative and the derivative suggested by the system of ODEs. The statistical aspects of this approach have been studied under the frequentist framework. We consider this two step estimation under the Bayesian framework. The response variable is allowed to be multidimensional and the true mean function of it is not assumed to be in the model. We induce a prior on the regression function using a random series based on the B-spline basis functions. We establish the Bernstein-von Mises theorem for the posterior distribution of the parameter of interest. Interestingly, even though the posterior distribution of the regression function based on splines converges at a rate slower than $n^{-1/2}$, the parameter vector $θ$ is nevertheless estimated at $n^{-1/2}$ rate.
Submitted 3 March, 2014;
originally announced March 2014.
-
Bayesian estimation of a sparse precision matrix
Authors:
Sayantan Banerjee,
Subhashis Ghosal
Abstract:
We consider the problem of estimating a sparse precision matrix of a multivariate Gaussian distribution, including the case where the dimension $p$ is large. Gaussian graphical models provide an important tool in describing conditional independence through presence or absence of the edges in the underlying graph. A popular non-Bayesian method of estimating a graphical structure is given by the graphical lasso. In this paper, we consider a Bayesian approach to the problem. We use priors which put a mixture of a point mass at zero and certain absolutely continuous distribution on off-diagonal elements of the precision matrix. Hence the resulting posterior distribution can be used for graphical structure learning. The posterior convergence rate of the precision matrix is obtained. The posterior distribution on the model space is extremely cumbersome to compute. We propose a fast computational method for approximating the posterior probabilities of various graphs using the Laplace approximation approach by expanding the posterior density around the posterior mode, which is the graphical lasso by our choice of the prior distribution. We also provide estimates of the accuracy in the approximation.
Submitted 6 April, 2014; v1 submitted 6 September, 2013;
originally announced September 2013.
-
Optimal two-stage procedures for estimating location and size of the maximum of a multivariate regression function
Authors:
Eduard Belitser,
Subhashis Ghosal,
Harry van Zanten
Abstract:
We propose a two-stage procedure for estimating the location $\boldsμ$ and size M of the maximum of a smooth d-variate regression function f. In the first stage, a preliminary estimator of $\boldsμ$ obtained from a standard nonparametric smoothing method is used. At the second stage, we "zoom-in" near the vicinity of the preliminary estimator and make further observations at some design points in that vicinity. We fit an appropriate polynomial regression model to estimate the location and size of the maximum. We establish that, under suitable smoothness conditions and appropriate choice of the zooming, the second stage estimators have better convergence rates than the corresponding first stage estimators of $\boldsμ$ and M. More specifically, for $α$-smooth regression functions, the optimal nonparametric rates $n^{-(α-1)/(2α+d)}$ and $n^{-α/(2α+d)}$ at the first stage can be improved to $n^{-(α-1)/(2α)}$ and $n^{-1/2}$, respectively, for $α>1+\sqrt{1+d/2}$. These rates are optimal in the class of all possible sequential estimators. Interestingly, the two-stage procedure resolves "the curse of the dimensionality" problem to some extent, as the dimension d does not control the second stage convergence rates, provided that the function class is sufficiently smooth. We consider a multi-stage generalization of our procedure that attains the optimal rate for any smoothness level $α>2$ starting with a preliminary estimator with any power-law rate at the first stage.
Submitted 19 February, 2013;
originally announced February 2013.
-
Posterior convergence rates for estimating large precision matrices using graphical models
Authors:
Sayantan Banerjee,
Subhashis Ghosal
Abstract:
We consider Bayesian estimation of a $p\times p$ precision matrix, when $p$ can be much larger than the available sample size $n$. It is well known that consistent estimation in such ultra-high dimensional situations requires regularization such as banding, tapering or thresholding. We consider a banding structure in the model and induce a prior distribution on a banded precision matrix through a Gaussian graphical model, where an edge is present only when two vertices are within a given distance. For a proper choice of the order of graph, we obtain the convergence rate of the posterior distribution and Bayes estimators based on the graphical model in the $L_{\infty}$-operator norm uniformly over a class of precision matrices, even if the true precision matrix may not have a banded structure. Along the way to the proof, we also compute the convergence rate of the maximum likelihood estimator (MLE) under the same set of conditions, which is of independent interest. The graphical model based MLE and Bayes estimators are automatically positive definite, which is a desirable property not possessed by some other estimators in the literature. We also conduct a simulation study to compare finite sample performance of the Bayes estimators and the MLE based on the graphical model with that obtained by using a Cholesky decomposition of the precision matrix. Finally, we discuss a practical method of choosing the order of the graphical model using the marginal likelihood function.
Submitted 6 November, 2014; v1 submitted 11 February, 2013;
originally announced February 2013.
-
MCMC-free adaptive Bayesian procedures using random series prior
Authors:
Weining Shen,
Subhashis Ghosal
Abstract:
We consider priors for several nonparametric Bayesian models which use finite random series with a random number of terms. The prior is constructed through distributions on the number of basis functions and the associated coefficients. We derive a general result on the construction of an appropriate sieve and obtain adaptive posterior contraction rates for all smoothness levels of the function in the true model. We apply this general result on several statistical problems such as signal processing, density estimation, nonparametric additive regression, classification, spectral density estimation, functional regression etc. The prior can be viewed as an alternative to commonly used Gaussian process prior, but can be analyzed by relatively simpler techniques and in many cases allows a simpler approach to computation without using Markov chain Monte-Carlo (MCMC) methods. A simulation study was conducted to show that the performance of the random series prior is comparable to that of a Gaussian process prior.
Submitted 7 February, 2015; v1 submitted 18 April, 2012;
originally announced April 2012.
-
Adaptive Bayesian multivariate density estimation with Dirichlet mixtures
Authors:
Weining Shen,
Surya T. Tokdar,
Subhashis Ghosal
Abstract:
We show that rate-adaptive multivariate density estimation can be performed using Bayesian methods based on Dirichlet mixtures of normal kernels with a prior distribution on the kernel's covariance matrix parameter. We derive sufficient conditions on the prior specification that guarantee convergence to a true density at a rate that is minimax optimal for the smoothness class to which the true density belongs. No prior knowledge of smoothness is assumed. The sufficient conditions are shown to hold for the Dirichlet location mixture of normals prior with a Gaussian base measure and an inverse-Wishart prior on the covariance matrix parameter. Locally Hölder smoothness classes and their anisotropic extensions are considered. Our study involves several technical novelties, including sharp approximation of finitely differentiable multivariate densities by normal mixtures and a new sieve on the space of such densities.
Submitted 18 March, 2013; v1 submitted 29 September, 2011;
originally announced September 2011.
-
Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh
Authors:
Bertrand Clarke,
Subhashis Ghosal
Abstract:
Jayanta Kumar Ghosh is one of the most extraordinary professors in the field of Statistics. His research in numerous areas, especially asymptotics, has been groundbreaking, influential throughout the world, and widely recognized through awards and other honors. His leadership in Statistics as Director of the Indian Statistical Institute and President of the International Statistical Institute, among other eminent positions, has been likewise outstanding. In recognition of Jayanta's enormous impact, this volume is an effort to honor him by drawing together contributions to the main areas in which he has worked and continues to work. The papers naturally fall into five categories. First, sequential estimation was Jayanta's starting point. Thus, beginning with that topic, there are two papers, one classical by Hall and Ding leading to a variant on p-values, and one Bayesian by Berger and Sun extending reference priors to stopping time problems. Second, there are five papers in the general area of prior specification. Much of Jayanta's earlier work involved group families as does Sweeting's paper here for instance. There are also two papers dwelling on the link between fuzzy sets and priors, by Meeden and by Delampady and Angers. Equally daring is the work by Mukerjee with data dependent priors and the pleasing confluence of several prior selection criteria found by Ghosh, Santra and Kim. Jayanta himself studied a variety of prior selection criteria including probability matching priors and reference priors.
Submitted 27 June, 2008;
originally announced June 2008.
-
J. K. Ghosh's contribution to statistics: A brief outline
Authors:
Bertrand Clarke,
Subhashis Ghosal
Abstract:
Professor Jayanta Kumar Ghosh has contributed massively to various areas of Statistics over the last five decades. Here, we survey some of his most important contributions. In roughly chronological order, we discuss his major results in the areas of sequential analysis, foundations, asymptotics, and Bayesian inference. It is seen that he progressed from thinking about data points, to thinking about data summarization, to the limiting cases of data summarization as they relate to parameter estimation, and then to more general aspects of modeling including prior and model selection.
Submitted 20 May, 2008;
originally announced May 2008.
-
Posterior consistency of Dirichlet mixtures of beta densities in estimating positive false discovery rates
Authors:
Subhashis Ghosal,
Anindya Roy,
Yongqiang Tang
Abstract:
In recent years, multiple hypothesis testing has come to the forefront of statistical research, ostensibly in relation to applications in genomics and some other emerging fields. The false discovery rate (FDR) and its variants provide very important notions of errors in this context comparable to the role of error probabilities in classical testing problems. Accurate estimation of positive FDR (pFDR), a variant of the FDR, is essential in assessing and controlling this measure. In a recent paper, the authors proposed a model-based nonparametric Bayesian method of estimation of the pFDR function. In particular, the density of p-values was modeled as a mixture of decreasing beta densities and an appropriate Dirichlet process was considered as a prior on the mixing measure. The resulting procedure was shown to work well in simulations. In this paper, we provide some theoretical results in support of the beta mixture model for the density of p-values, and show that, under appropriate conditions, the resulting posterior is consistent as the number of hypotheses grows to infinity.
Submitted 15 May, 2008;
originally announced May 2008.
-
Nonparametric Bayesian model selection and averaging
Authors:
Subhashis Ghosal,
Jüri Lember,
Aad van der Vaart
Abstract:
We consider nonparametric Bayesian estimation of a probability density $p$ based on a random sample of size $n$ from this density using a hierarchical prior. The prior consists, for instance, of prior weights on the regularity of the unknown density combined with priors that are appropriate given that the density has this regularity. More generally, the hierarchy consists of prior weights on an abstract model index and a prior on a density model for each model index. We present a general theorem on the rate of contraction of the resulting posterior distribution as $n\to \infty$, which gives conditions under which the rate of contraction is the one attached to the model that best approximates the true density of the observations. This shows that, for instance, the posterior distribution can adapt to the smoothness of the underlying density. We also study the posterior distribution of the model index, and find that under the same conditions the posterior distribution gives negligible weight to models that are bigger than the optimal one, and thus selects the optimal model or smaller models that also approximate the true density well. We apply these results to log spline density models, where we show that the prior weights on the regularity index interact with the priors on the models, making the exact rates depend in a complicated way on the priors, but also that the rate is fairly robust to specification of the prior weights.
Submitted 1 February, 2008;
originally announced February 2008.
-
Kullback Leibler property of kernel mixture priors in Bayesian density estimation
Authors:
Yuefeng Wu,
Subhashis Ghosal
Abstract:
Positivity of the prior probability of a Kullback-Leibler neighborhood around the true density, commonly known as the Kullback-Leibler property, plays a fundamental role in posterior consistency. A popular prior for Bayesian estimation is given by a Dirichlet mixture, where the kernels are chosen depending on the sample space and the class of densities to be estimated. The Kullback-Leibler property of the Dirichlet mixture prior has been shown for some special kernels like the normal density or Bernstein polynomial, under appropriate conditions. In this paper, we obtain easily verifiable sufficient conditions, under which a prior obtained by mixing a general kernel possesses the Kullback-Leibler property. We study a wide variety of kernels used in practice, including the normal, $t$, histogram, gamma, Weibull densities and so on, and show that the Kullback-Leibler property holds if some easily verifiable conditions are satisfied at the true density. This gives a catalog of conditions required for the Kullback-Leibler property, which can be readily used in applications.
Submitted 8 May, 2008; v1 submitted 15 October, 2007;
originally announced October 2007.
-
Posterior convergence rates of Dirichlet mixtures at smooth densities
Authors:
Subhashis Ghosal,
Aad van der Vaart
Abstract:
We study the rates of convergence of the posterior distribution for Bayesian density estimation with Dirichlet mixtures of normal distributions as the prior. The true density is assumed to be twice continuously differentiable. The bandwidth is given a sequence of priors which is obtained by scaling a single prior by an appropriate order. In order to handle this problem, we derive a new general rate theorem by considering a countable covering of the parameter space whose prior probabilities satisfy a summability condition together with certain individual bounds on the Hellinger metric entropy. We apply this new general theorem on posterior convergence rates by computing bounds for Hellinger (bracketing) entropy numbers for the involved class of densities, the error in the approximation of a smooth density by normal mixtures and the concentration rate of the prior. The best obtainable rate of convergence of the posterior turns out to be equivalent to the well-known frequentist rate for integrated mean squared error $n^{-2/5}$ up to a logarithmic factor.
Submitted 14 August, 2007;
originally announced August 2007.
-
Convergence rates of posterior distributions for noniid observations
Authors:
Subhashis Ghosal,
Aad van der Vaart
Abstract:
We consider the asymptotic behavior of posterior distributions and Bayes estimators based on observations which are required to be neither independent nor identically distributed. We give general results on the rate of convergence of the posterior measure relative to distances derived from a testing criterion. We then specialize our results to independent, nonidentically distributed observations, Markov processes, stationary Gaussian time series and the white noise model. We apply our general results to several examples of infinite-dimensional statistical models including nonparametric regression with normal errors, binary regression, Poisson regression, an interval censoring model, Whittle estimation of the spectral density of a time series and a nonlinear autoregressive model.
Submitted 3 August, 2007;
originally announced August 2007.
-
Posterior consistency of Gaussian process prior for nonparametric binary regression
Authors:
Subhashis Ghosal,
Anindya Roy
Abstract:
Consider binary observations whose response probability is an unknown smooth function of a set of covariates. Suppose that a prior on the response probability function is induced by a Gaussian process mapped to the unit interval through a link function. In this paper we study consistency of the resulting posterior distribution. If the covariance kernel has derivatives up to a desired order and the bandwidth parameter of the kernel is allowed to take arbitrarily small values, we show that the posterior distribution is consistent in the $L_1$-distance. As an auxiliary result to our proofs, we show that, under certain conditions, a Gaussian process assigns positive probabilities to the uniform neighborhoods of a continuous function. This result may be of independent interest in the literature for small ball probabilities of Gaussian processes.
Submitted 23 February, 2007;
originally announced February 2007.