-
Asymptotic optimality theory of confidence intervals of the mean
Authors:
Vikas Deep,
Achal Bassamboo,
Sandeep Juneja
Abstract:
We address the classical problem of constructing confidence intervals (CIs) for the mean of a distribution, given \(N\) i.i.d. samples, such that the CI contains the true mean with probability at least \(1 - δ\), where \(δ\in (0,1)\). We characterize three distinct learning regimes based on the minimum achievable limiting width of any CI as the sample size \(N_δ \to \infty\) and \(δ\to 0\). In the first regime, where \(N_δ\) grows slower than \(\log(1/δ)\), the limiting width of any CI equals the width of the distribution's support, precluding meaningful inference. In the second regime, where \(N_δ\) scales as \(\log(1/δ)\), we precisely characterize the minimum limiting width, which depends on the scaling constant. In the third regime, where \(N_δ\) grows faster than \(\log(1/δ)\), complete learning is achievable, and the limiting width of the CI collapses to zero, converging to the true mean. We demonstrate that CIs derived from concentration inequalities based on Kullback--Leibler (KL) divergences achieve asymptotically optimal performance, attaining the minimum limiting width in both sufficient and complete learning regimes for distributions in two families: single-parameter exponential and bounded support. Additionally, these results extend to one-sided CIs, with the width notion adjusted appropriately. Finally, we generalize our findings to settings with random per-sample costs, motivated by practical applications such as stochastic simulators and cloud service selection. Instead of a fixed sample size, we consider a cost budget \(C_δ\), identifying analogous learning regimes and characterizing the optimal CI construction policy.
Submitted 31 January, 2025;
originally announced January 2025.
-
Discriminative Learning via Adaptive Questioning
Authors:
Achal Bassamboo,
Vikas Deep,
Sandeep Juneja,
Assaf Zeevi
Abstract:
We consider the problem of designing an adaptive sequence of questions that optimally classify a candidate's ability into one of several categories or discriminative grades. A candidate's ability is modeled as an unknown parameter, which, together with the difficulty of the question asked, determines the likelihood with which s/he is able to answer a question correctly. The learning algorithm is only able to observe these noisy responses to its queries. We consider this problem from a fixed confidence-based $δ$-correct framework, that in our setting seeks to arrive at the correct ability discrimination at the fastest possible rate while guaranteeing that the probability of error is less than a pre-specified and small $δ$. In this setting we develop lower bounds on any sequential questioning strategy and develop geometrical insights into the problem structure both from primal and dual formulation. In addition, we arrive at algorithms that essentially match these lower bounds. Our key conclusions are that, asymptotically, any candidate needs to be asked questions at most at two (candidate ability-specific) levels, although, in a reasonably general framework, questions need to be asked only at a single level. Further, and interestingly, the problem structure facilitates endogenous exploration, so there is no need for a separately designed exploration stage in the algorithm.
Submitted 11 April, 2020;
originally announced April 2020.
-
Optimal $δ$-Correct Best-Arm Selection for Heavy-Tailed Distributions
Authors:
Shubhada Agrawal,
Sandeep Juneja,
Peter Glynn
Abstract:
Given a finite set of unknown distributions or arms that can be sampled, we consider the problem of identifying the one with the maximum mean using a $δ$-correct algorithm (an adaptive, sequential algorithm that restricts the probability of error to a specified $δ$) that has minimum sample complexity. Lower bounds for $δ$-correct algorithms are well known. $δ$-correct algorithms that match the lower bound asymptotically as $δ$ reduces to zero have been previously developed when arm distributions are restricted to a single parameter exponential family. In this paper, we first observe a negative result that some restrictions are essential, as otherwise, under a $δ$-correct algorithm, distributions with unbounded support would require an infinite number of samples in expectation. We then propose a $δ$-correct algorithm that matches the lower bound as $δ$ reduces to zero under the mild restriction that a known bound on the expectation of $(1+ε)^{th}$ moment of the underlying random variables exists, for $ε> 0$. We also propose batch processing and identify near-optimal batch sizes to speed up the proposed algorithm substantially. The best-arm problem has many learning applications, including recommendation systems and product selection. It is also a well-studied classic problem in the simulation community.
Submitted 24 November, 2023; v1 submitted 24 August, 2019;
originally announced August 2019.
-
Unbiased Estimation of the Reciprocal Mean for Non-negative Random Variables
Authors:
Sarat Moka,
Dirk P. Kroese,
Sandeep Juneja
Abstract:
Many simulation problems require the estimation of a ratio of two expectations. In recent years Monte Carlo estimators have been proposed that can estimate such ratios without bias. We investigate the theoretical properties of such estimators for the estimation of $β = 1/\mathbb{E}\, Z$, where $Z \geq 0$. The estimator, $\widehat β(w)$, is of the form $w/f_w(N) \prod_{i=1}^N (1 - w\, Z_i)$, where $w < 2β$ and $N$ is any random variable with probability mass function $f_w$ on the positive integers. For a fixed $w$, the optimal choice for $f_w$ is well understood, but less so the choice of $w$. We study the properties of $\widehat β(w)$ as a function of $w$ and show that its expected time variance product decreases as $w$ decreases, even though the cost of constructing the estimator increases. We also show that the estimator is asymptotically equivalent to the maximum likelihood (biased) ratio estimator and establish practical confidence intervals.
Submitted 3 July, 2019;
originally announced July 2019.
-
Random Fixed Points, Limits and Systemic risk
Authors:
Veeraruna Kavitha,
Indrajit Saha,
Sandeep Juneja
Abstract:
We consider vector fixed point (FP) equations in large dimensional spaces involving random variables, and study their realization-wise solutions. We have an underlying directed random graph that defines the connections between various components of the FP equations. Existence of an edge between nodes i, j implies the i th FP equation depends on the j th component. We consider a special case where any component of the FP equation depends upon an appropriate aggregate of that of the random neighbor components. We obtain finite dimensional limit FP equations (in a much smaller dimensional space), whose solutions approximate the solution of the random FP equations for almost all realizations, in the asymptotic limit (as the number of components increases). Our techniques are different from the traditional mean-field methods, which deal with stochastic FP equations in the space of distributions to describe the stationary distributions of the systems. In contrast, our focus is on realization-wise FP solutions. We apply the results to study systemic risk in a large financial heterogeneous network with many small institutions and one big institution, and demonstrate some interesting phenomena.
Submitted 8 December, 2021; v1 submitted 13 September, 2018;
originally announced September 2018.
-
Path-ZVA: general, efficient and automated importance sampling for highly reliable Markovian systems
Authors:
Daniel Reijsbergen,
Pieter-Tjerk de Boer,
Werner Scheinhardt,
Sandeep Juneja
Abstract:
We introduce Path-ZVA: an efficient simulation technique for estimating the probability of reaching a rare goal state before a regeneration state in a (discrete-time) Markov chain. Standard Monte Carlo simulation techniques do not work well for rare events, so we use importance sampling; i.e., we change the probability measure governing the Markov chain such that transitions 'towards' the goal state become more likely. To do this we need an idea of distance to the goal state, so some level of knowledge of the Markov chain is required. In this paper, we use graph analysis to obtain this knowledge. In particular, we focus on knowledge of the shortest paths (in terms of 'rare' transitions) to the goal state. We show that only a subset of the (possibly huge) state space needs to be considered. This is effective when the high dependability of the system is primarily due to high component reliability, but less so when it is due to high redundancies. For several models we compare our results to well-known importance sampling methods from the literature and demonstrate the large potential gains of our method.
Submitted 28 June, 2018;
originally announced June 2018.
-
Rejection and Importance Sampling based Perfect Simulation for Gibbs hard-sphere models
Authors:
S. B. Moka,
S. Juneja,
M. R. H. Mandjes
Abstract:
Coupling from the past (CFTP) methods have been used to generate perfect samples from finite Gibbs hard-sphere models, an important class of spatial point processes, which is a set of spheres with the centers on a bounded region that are distributed as a homogeneous Poisson point process (PPP) conditioned that spheres do not overlap with each other. We propose an alternative importance sampling based rejection methodology for the perfect sampling of these models. We analyze the asymptotic expected running time complexity of the proposed method when the intensity of the reference PPP increases to infinity while the (expected) sphere radius decreases to zero at varying rates. We further compare the performance of the proposed method analytically and numerically with a naive rejection algorithm and popular dominated CFTP algorithms. Our analysis relies upon identifying large deviations decay rates of the non-overlapping probability of spheres whose centers are distributed as a homogeneous PPP.
Submitted 3 March, 2021; v1 submitted 29 April, 2017;
originally announced May 2017.
-
Exact and efficient simulation of tail probabilities of heavy-tailed infinite series
Authors:
Henrik Hult,
Sandeep Juneja,
Karthyek Murthy
Abstract:
We develop an efficient simulation algorithm for computing the tail probabilities of the infinite series $S = \sum_{n \geq 1} a_n X_n$ when random variables $X_n$ are heavy-tailed. As $S$ is the sum of infinitely many random variables, any simulation algorithm that stops after simulating only fixed, finitely many random variables is likely to introduce a bias. We overcome this challenge by rewriting the tail probability of interest as a sum of a random number of telescoping terms, and subsequently developing conditional Monte Carlo based low variance simulation estimators for each telescoping term. The resulting algorithm is proved to result in estimators that a) have no bias, and b) require only a fixed, finite number of replications irrespective of how rare the tail probability of interest is. Thus, by combining a traditional variance reduction technique such as conditional Monte Carlo with more recent use of auxiliary randomization to remove bias in a multi-level type representation, we develop an efficient and unbiased simulation algorithm for tail probabilities of $S$. These have many applications including in analysis of financial time-series and stochastic recurrence equations arising in models in actuarial risk and population biology.
Submitted 6 September, 2016;
originally announced September 2016.
-
Selecting the best system and multi-armed bandits
Authors:
Peter Glynn,
Sandeep Juneja
Abstract:
Consider the problem of finding a population or a probability distribution amongst many with the largest mean when these means are unknown but population samples can be simulated or otherwise generated. Typically, by selecting the population with the largest sample mean, it can be shown that the false selection probability decays at an exponential rate. Lately, researchers have sought algorithms that guarantee that this probability is restricted to a small $δ$ in order $\log(1/δ)$ computational time by estimating the associated large deviations rate function via simulation. We show that such guarantees are misleading when populations have unbounded support, even when these may be light-tailed. Specifically, we show that any policy that identifies the correct population with probability at least $1-δ$ for each problem instance requires an infinite number of samples in expectation in making such a determination in any problem instance. This suggests that some restrictions on populations are essential to devise $O(\log(1/δ))$ algorithms with $1 - δ$ correctness guarantees. We note that under restrictions on population moments, such methods are easily designed, and that sequential methods from the stochastic multi-armed bandit literature can be adapted to devise such algorithms.
Submitted 10 September, 2018; v1 submitted 16 July, 2015;
originally announced July 2015.
-
Regenerative Simulation for Queueing Networks with Exponential or Heavier Tail Arrival Distributions
Authors:
Sarat Babu Moka,
Sandeep Juneja
Abstract:
Multiclass open queueing networks find wide applications in communication, computer and fabrication networks. Often one is interested in steady-state performance measures associated with these networks. Conceptually, under mild conditions, a regenerative structure exists in multiclass networks, making them amenable to regenerative simulation for estimating the steady-state performance measures. However, typically, identification of a regenerative structure in these networks is difficult. A well known exception is when all the interarrival times are exponentially distributed, where the instants corresponding to customer arrivals to an empty network constitute a regenerative structure. In this paper, we consider networks where the interarrival times are generally distributed but have exponential or heavier tails. We show that these distributions can be decomposed into a mixture of sums of independent random variables such that at least one of the components is exponentially distributed. This allows an easily implementable embedded regenerative structure in the Markov process. We show that under mild conditions on the network primitives, the regenerative mean and standard deviation estimators are consistent and satisfy a joint central limit theorem useful for constructing asymptotically valid confidence intervals. We also show that amongst all such interarrival time decompositions, the one with the largest mean exponential component minimizes the asymptotic variance of the standard deviation estimator.
Submitted 23 July, 2013; v1 submitted 20 July, 2013;
originally announced July 2013.
-
State-independent Importance Sampling for Random Walks with Regularly Varying Increments
Authors:
Karthyek R. A. Murthy,
Sandeep Juneja,
Jose Blanchet
Abstract:
We develop importance sampling based efficient simulation techniques for three commonly encountered rare event probabilities associated with random walks having i.i.d. regularly varying increments; namely, 1) the large deviation probabilities, 2) the level crossing probabilities, and 3) the level crossing probabilities within a regenerative cycle. Exponential twisting based state-independent methods, which are effective in efficiently estimating these probabilities for light-tailed increments are not applicable when the increments are heavy-tailed. To address the latter case, more complex and elegant state-dependent efficient simulation algorithms have been developed in the literature over the last few years. We propose that by suitably decomposing these rare event probabilities into a dominant and further residual components, simpler state-independent importance sampling algorithms can be devised for each component resulting in composite unbiased estimators with desirable efficiency properties. When the increments have infinite variance, there is an added complexity in estimating the level crossing probabilities as even the well known zero-variance measures have an infinite expected termination time. We adapt our algorithms so that this expectation is finite while the estimators remain strongly efficient. Numerically, the proposed estimators perform at least as well, and sometimes substantially better than the existing state-dependent estimators in the literature.
Submitted 27 September, 2014; v1 submitted 15 June, 2012;
originally announced June 2012.
-
Efficient simulation of density and probability of large deviations of sum of random vectors using saddle point representations
Authors:
Santanu Dey,
Sandeep Juneja,
Ankush Agarwal
Abstract:
We consider the problem of efficient simulation estimation of the density function at the tails, and the probability of large deviations for a sum of independent, identically distributed, light-tailed and non-lattice random vectors. The latter problem, besides being of independent interest, also forms a building block for more complex rare event problems that arise, for instance, in queueing and financial credit risk modelling. It has been extensively studied in the literature, where state-independent exponential twisting based importance sampling has been shown to be asymptotically efficient and a more nuanced state-dependent exponential twisting has been shown to have a stronger bounded relative error property. We exploit the saddle-point based representations that exist for these rare quantities, which rely on inverting the characteristic functions of the underlying random vectors. These representations reduce the rare event estimation problem to evaluating certain integrals, which may via importance sampling be represented as expectations. Further, it is easy to identify and approximate the zero-variance importance sampling distribution to estimate these integrals. We identify such importance sampling measures and show that they possess the asymptotically vanishing relative error property that is stronger than the bounded relative error property. To illustrate the broader applicability of the proposed methodology, we extend it to similarly efficiently estimate the practically important expected overshoot of sums of i.i.d. random variables.
Submitted 16 October, 2012; v1 submitted 5 March, 2012;
originally announced March 2012.