-
Model Predictive Control is Almost Optimal for Restless Bandit
Authors:
Nicolas Gast,
Dheeraj Narasimha
Abstract:
We consider the discrete time infinite horizon average reward restless markovian bandit (RMAB) problem. We propose a \emph{model predictive control} based non-stationary policy with a rolling computational horizon $τ$. At each time-slot, this policy solves a $τ$ horizon linear program whose first control value is kept as a control for the RMAB. Our solution requires minimal assumptions and quantif…
▽ More
We consider the discrete time infinite horizon average reward restless markovian bandit (RMAB) problem. We propose a \emph{model predictive control} based non-stationary policy with a rolling computational horizon $τ$. At each time-slot, this policy solves a $τ$ horizon linear program whose first control value is kept as a control for the RMAB. Our solution requires minimal assumptions and quantifies the loss in optimality in terms of $τ$ and the number of arms, $N$. We show that its sub-optimality gap is $O(1/\sqrt{N})$ in general, and $\exp(-Ω(N))$ under a local-stability condition. Our proof is based on a framework from dynamic control known as \emph{dissipativity}. Our solution easy to implement and performs very well in practice when compared to the state of the art. Further, both our solution and our proof methodology can easily be generalized to more general constrained MDP settings and should thus, be of great interest to the burgeoning RMAB community.
△ Less
Submitted 5 June, 2025; v1 submitted 8 October, 2024;
originally announced October 2024.
-
Prophet Inequalities: Competing with the Top $\ell$ Items is Easy
Authors:
Mathieu Molina,
Nicolas Gast,
Patrick Loiseau,
Vianney Perchet
Abstract:
We explore a prophet inequality problem, where the values of a sequence of items are drawn i.i.d. from some distribution, and an online decision maker must select one item irrevocably. We establish that $\mathrm{CR}_{\ell}$ the worst-case competitive ratio between the expected optimal performance of an online decision maker compared to that of a prophet who uses the average of the top $\ell$ items…
▽ More
We explore a prophet inequality problem, where the values of a sequence of items are drawn i.i.d. from some distribution, and an online decision maker must select one item irrevocably. We establish that $\mathrm{CR}_{\ell}$ the worst-case competitive ratio between the expected optimal performance of an online decision maker compared to that of a prophet who uses the average of the top $\ell$ items is exactly the solution to an integral equation. This quantity $\mathrm{CR}_{\ell}$ is larger than $1-e^{-\ell}$. This implies that the bound converges exponentially fast to $1$ as $\ell$ grows. In particular for $\ell=2$, $\mathrm{CR}_{2} \approx 0.966$ which is much closer to $1$ than the classical bound of $0.745$ for $\ell=1$. Additionally, we prove asymptotic lower bounds for the competitive ratio of a more general scenario, where the decision maker is permitted to select $k$ items. This subsumes the $k$ multi-unit i.i.d. prophet problem and provides the current best asymptotic guarantees, as well as enables broader understanding in the more general framework. Finally, we prove a tight asymptotic competitive ratio when only static threshold policies are allowed.
△ Less
Submitted 10 January, 2025; v1 submitted 14 August, 2024;
originally announced August 2024.
-
Computing the Bias of Constant-step Stochastic Approximation with Markovian Noise
Authors:
Sebastian Allmeier,
Nicolas Gast
Abstract:
We study stochastic approximation algorithms with Markovian noise and constant step-size $α$. We develop a method based on infinitesimal generator comparisons to study the bias of the algorithm, which is the expected difference between $θ_n$ -- the value at iteration $n$ -- and $θ^*$ -- the unique equilibrium of the corresponding ODE. We show that, under some smoothness conditions, this bias is of…
▽ More
We study stochastic approximation algorithms with Markovian noise and constant step-size $α$. We develop a method based on infinitesimal generator comparisons to study the bias of the algorithm, which is the expected difference between $θ_n$ -- the value at iteration $n$ -- and $θ^*$ -- the unique equilibrium of the corresponding ODE. We show that, under some smoothness conditions, this bias is of order $O(α)$. Furthermore, we show that the time-averaged bias is equal to $αV + O(α^2)$, where $V$ is a constant characterized by a Lyapunov equation, showing that $\mathbb{E}[\barθ_n] \approx θ^*+Vα+ O(α^2)$, where $\barθ_n=(1/n)\sum_{k=1}^nθ_k$ is the Polyak-Ruppert average. We also show that $\barθ_n$ converges with high probability around $θ^*+αV$. We illustrate how to combine this with Richardson-Romberg extrapolation to derive an iterative scheme with a bias of order $O(α^2)$.
△ Less
Submitted 25 October, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Approximations to Study the Impact of the Service Discipline in Systems with Redundancy
Authors:
Nicolas Gast,
Benny van Houdt
Abstract:
As job redundancy has been recognized as an effective means to improve performance of large-scale computer systems, queueing systems with redundancy have been studied by various authors. Existing results include methods to compute the queue length distribution and response time but only when the service discipline is First-Come-First-Served (FCFS). For other service disciplines, such as Processor…
▽ More
As job redundancy has been recognized as an effective means to improve performance of large-scale computer systems, queueing systems with redundancy have been studied by various authors. Existing results include methods to compute the queue length distribution and response time but only when the service discipline is First-Come-First-Served (FCFS). For other service disciplines, such as Processor Sharing (PS), or Last-Come-First-Served (LCFS), only the stability conditions are known. In this paper we develop the first methods to approximate the queue length distribution in a queueing system with redundancy under various service disciplines. We focus on a system with exponential job sizes, i.i.d. copies, and a large number of servers. We first derive a mean field approximation that is independent of the scheduling policy. In order to study the impact of service discipline, we then derive refinements of this approximation to specific scheduling policies. In the case of Processor Sharing, we provide a pair and a triplet approximation. The pair approximation can be regarded as a refinement of the classic mean field approximation and takes the service discipline into account, while the triplet approximation further refines the pair approximation. We also develop a pair approximation for three other service disciplines: First-Come-First-Served, Limited Processor Sharing and Last-Come-First-Served. We present numerical evidence that shows that all the approximations presented in the paper are highly accurate, but that none of them are asymptotically exact (as the number of servers goes to infinity). This makes these approximations suitable to study the impact of the service discipline on the queue length distribution. Our results show that FCFS yields the shortest queue length, and that the differences are more substantial at higher loads.
△ Less
Submitted 15 January, 2024;
originally announced January 2024.
-
Trading-off price for data quality to achieve fair online allocation
Authors:
Mathieu Molina,
Nicolas Gast,
Patrick Loiseau,
Vianney Perchet
Abstract:
We consider the problem of online allocation subject to a long-term fairness penalty. Contrary to existing works, however, we do not assume that the decision-maker observes the protected attributes -- which is often unrealistic in practice. Instead they can purchase data that help estimate them from sources of different quality; and hence reduce the fairness penalty at some cost. We model this pro…
▽ More
We consider the problem of online allocation subject to a long-term fairness penalty. Contrary to existing works, however, we do not assume that the decision-maker observes the protected attributes -- which is often unrealistic in practice. Instead they can purchase data that help estimate them from sources of different quality; and hence reduce the fairness penalty at some cost. We model this problem as a multi-armed bandit problem where each arm corresponds to the choice of a data source, coupled with the online allocation problem. We propose an algorithm that jointly solves both problems and show that it has a regret bounded by $\mathcal{O}(\sqrt{T})$. A key difficulty is that the rewards received by selecting a source are correlated by the fairness penalty, which leads to a need for randomization (despite a stochastic setting). Our algorithm takes into account contextual information available before the source selection, and can adapt to many different fairness notions. We also show that in some instances, the estimates used can be learned on the fly.
△ Less
Submitted 4 December, 2023; v1 submitted 23 June, 2023;
originally announced June 2023.
-
Decentralized model-free reinforcement learning in stochastic games with average-reward objective
Authors:
Romain Cravic,
Nicolas Gast,
Bruno Gaujal
Abstract:
We propose the first model-free algorithm that achieves low regret performance for decentralized learning in two-player zero-sum tabular stochastic games with infinite-horizon average-reward objective. In decentralized learning, the learning agent controls only one player and tries to achieve low regret performances against an arbitrary opponent. This contrasts with centralized learning where the…
▽ More
We propose the first model-free algorithm that achieves low regret performance for decentralized learning in two-player zero-sum tabular stochastic games with infinite-horizon average-reward objective. In decentralized learning, the learning agent controls only one player and tries to achieve low regret performances against an arbitrary opponent. This contrasts with centralized learning where the agent tries to approximate the Nash equilibrium by controlling both players. In our infinite-horizon undiscounted setting, additional structure assumptions is needed to provide good behaviors of learning processes : here we assume for every strategy of the opponent, the agent has a way to go from any state to any other. This assumption is the analogous to the "communicating" assumption in the MDP setting. We show that our Decentralized Optimistic Nash Q-Learning (DONQ-learning) algorithm achieves both sublinear high probability regret of order $T^{3/4}$ and sublinear expected regret of order $T^{2/3}$. Moreover, our algorithm enjoys a low computational complexity and low memory space requirement compared to the previous works of (Wei et al. 2017) and (Jafarnia-Jahromi et al. 2021) in the same setting.
△ Less
Submitted 13 January, 2023;
originally announced January 2023.
-
Bias and Refinement of Multiscale Mean Field Models
Authors:
Sebastian Allmeier,
Nicolas Gast
Abstract:
Mean field approximation is a powerful technique which has been used in many settings to study large-scale stochastic systems. In the case of two-timescale systems, the approximation is obtained by a combination of scaling arguments and the use of the averaging principle. This paper analyzes the approximation error of this `average' mean field model for a two-timescale model…
▽ More
Mean field approximation is a powerful technique which has been used in many settings to study large-scale stochastic systems. In the case of two-timescale systems, the approximation is obtained by a combination of scaling arguments and the use of the averaging principle. This paper analyzes the approximation error of this `average' mean field model for a two-timescale model $(\boldsymbol{X}, \boldsymbol{Y})$, where the slow component $\boldsymbol{X}$ describes a population of interacting particles which is fully coupled with a rapidly changing environment $\boldsymbol{Y}$. The model is parametrized by a scaling factor $N$, e.g. the population size, which as $N$ gets large decreases the jump size of the slow component in contrast to the unchanged dynamics of the fast component. We show that under relatively mild conditions, the `average' mean field approximation has a bias of order $O(1/N)$ compared to $\mathbb{E}[\boldsymbol{X}]$. This holds true under any continuous performance metric in the transient regime, as well as for the steady-state if the model is exponentially stable. To go one step further, we derive a bias correction term for the steady-state, from which we define a new approximation called the refined `average' mean field approximation whose bias is of order $O(1/N^2)$. This refined `average' mean field approximation allows computing an accurate approximation even for small scaling factors, i.e., $N\approx 10 -50$. We illustrate the developed framework and accuracy results through an application to a random access CSMA model.
△ Less
Submitted 23 January, 2023; v1 submitted 21 November, 2022;
originally announced November 2022.
-
Fairness in Selection Problems with Strategic Candidates
Authors:
Vitalii Emelianov,
Nicolas Gast,
Patrick Loiseau
Abstract:
To better understand discriminations and the effect of affirmative actions in selection problems (e.g., college admission or hiring), a recent line of research proposed a model based on differential variance. This model assumes that the decision-maker has a noisy estimate of each candidate's quality and puts forward the difference in the noise variances between different demographic groups as a ke…
▽ More
To better understand discriminations and the effect of affirmative actions in selection problems (e.g., college admission or hiring), a recent line of research proposed a model based on differential variance. This model assumes that the decision-maker has a noisy estimate of each candidate's quality and puts forward the difference in the noise variances between different demographic groups as a key factor to explain discrimination. The literature on differential variance, however, does not consider the strategic behavior of candidates who can react to the selection procedure to improve their outcome, which is well-known to happen in many domains.
In this paper, we study how the strategic aspect affects fairness in selection problems. We propose to model selection problems with strategic candidates as a contest game: A population of rational candidates compete by choosing an effort level to increase their quality. They incur a cost-of-effort but get a (random) quality whose expectation equals the chosen effort. A Bayesian decision-maker observes a noisy estimate of the quality of each candidate (with differential variance) and selects the fraction $α$ of best candidates based on their posterior expected quality; each selected candidate receives a reward $S$. We characterize the (unique) equilibrium of this game in the different parameters' regimes, both when the decision-maker is unconstrained and when they are constrained to respect the fairness notion of demographic parity. Our results reveal important impacts of the strategic behavior on the discrimination observed at equilibrium and allow us to understand the effect of imposing demographic parity in this context. In particular, we find that, in many cases, the results contrast with the non-strategic setting.
△ Less
Submitted 24 May, 2022;
originally announced May 2022.
-
Testing Indexability and Computing Whittle and Gittins Index in Subcubic Time
Authors:
Nicolas Gast,
Bruno Gaujal,
Kimang Khun
Abstract:
Whittle index is a generalization of Gittins index that provides very efficient allocation rules for restless multi-armed bandits. In this work, we develop an algorithm to test the indexability and compute the Whittle indices of any finite-state restless bandit arm. This algorithm works in the discounted and non-discounted cases, and can compute Gittins index. Our algorithm builds on three tools:…
▽ More
Whittle index is a generalization of Gittins index that provides very efficient allocation rules for restless multi-armed bandits. In this work, we develop an algorithm to test the indexability and compute the Whittle indices of any finite-state restless bandit arm. This algorithm works in the discounted and non-discounted cases, and can compute Gittins index. Our algorithm builds on three tools: (1) a careful characterization of Whittle index that allows one to compute recursively the kth smallest index from the $(k - 1)$th smallest, and to test indexability, (2) the use of the Sherman-Morrison formula to make this recursive computation efficient, and (3) a sporadic use of the fastest matrix inversion and multiplication methods to obtain a subcubic complexity. We show that an efficient use of the Sherman-Morrison formula leads to an algorithm that computes Whittle index in $(2/3)n^3 + o(n^3)$ arithmetic operations, where $n$ is the number of states of the arm. The careful use of fast matrix multiplication leads to the first subcubic algorithm to compute Whittle or Gittins index: By using the current fastest matrix multiplication, the theoretical complexity of our algorithm is O(n^2.5286 ). We also develop an efficient implementation of our algorithm that can compute indices of Markov chains with several thousands of states in less than a few seconds.
△ Less
Submitted 22 June, 2023; v1 submitted 10 March, 2022;
originally announced March 2022.
-
On Fair Selection in the Presence of Implicit and Differential Variance
Authors:
Vitalii Emelianov,
Nicolas Gast,
Krishna P. Gummadi,
Patrick Loiseau
Abstract:
Discrimination in selection problems such as hiring or college admission is often explained by implicit bias from the decision maker against disadvantaged demographic groups. In this paper, we consider a model where the decision maker receives a noisy estimate of each candidate's quality, whose variance depends on the candidate's group -- we argue that such differential variance is a key feature o…
▽ More
Discrimination in selection problems such as hiring or college admission is often explained by implicit bias from the decision maker against disadvantaged demographic groups. In this paper, we consider a model where the decision maker receives a noisy estimate of each candidate's quality, whose variance depends on the candidate's group -- we argue that such differential variance is a key feature of many selection problems. We analyze two notable settings: in the first, the noise variances are unknown to the decision maker who simply picks the candidates with the highest estimated quality independently of their group; in the second, the variances are known and the decision maker picks candidates having the highest expected quality given the noisy estimate. We show that both baseline decision makers yield discrimination, although in opposite directions: the first leads to underrepresentation of the low-variance group while the second leads to underrepresentation of the high-variance group. We study the effect on the selection utility of imposing a fairness mechanism that we term the $γ$-rule (it is an extension of the classical four-fifths rule and it also includes demographic parity). In the first setting (with unknown variances), we prove that under mild conditions, imposing the $γ$-rule increases the selection utility -- here there is no trade-off between fairness and utility. In the second setting (with known variances), imposing the $γ$-rule decreases the utility but we prove a bound on the utility loss due to the fairness mechanism.
△ Less
Submitted 10 December, 2021;
originally announced December 2021.
-
Mean Field and Refined Mean Field Approximations for Heterogeneous Systems: It Works!
Authors:
Sebastian Allmeier,
Nicolas Gast
Abstract:
Mean field approximation is a powerful technique to study the performance of large stochastic systems represented as $n$ interacting objects. Applications include load balancing models, epidemic spreading, cache replacement policies, or large-scale data centers. Mean field approximation is asymptotically exact for systems composed of $n$ homogeneous objects under mild conditions. In this paper, we…
▽ More
Mean field approximation is a powerful technique to study the performance of large stochastic systems represented as $n$ interacting objects. Applications include load balancing models, epidemic spreading, cache replacement policies, or large-scale data centers. Mean field approximation is asymptotically exact for systems composed of $n$ homogeneous objects under mild conditions. In this paper, we study what happens when objects are heterogeneous. This can represent servers with different speeds or contents with different popularities. We define an interaction model that allows obtaining asymptotic convergence results for stochastic systems with heterogeneous object behavior, and show that the error of the mean field approximation is of order $O(1/n)$. More importantly, we show how to adapt the refined mean field approximation, developed by Gast et al. 2019, and show that the error of this approximation is reduced to $O(1/n^2)$. To illustrate the applicability of our result, we present two examples. The first addresses a list-based cache replacement model RANDOM($m$), which is an extension of the RANDOM policy. The second is a heterogeneous supermarket model. These examples show that the proposed approximations are computationally tractable and very accurate. They also show that for moderate system sizes ($n\approx30$) the refined mean field approximation tends to be more accurate than simulations for any reasonable simulation time.
△ Less
Submitted 2 November, 2021;
originally announced November 2021.
-
Asymptotic Degradation of Linear Regression Estimates With Strategic Data Sources
Authors:
Benjamin Roussillon,
Nicolas Gast,
Patrick Loiseau,
Panayotis Mertikopoulos
Abstract:
We consider the problem of linear regression from strategic data sources with a public good component, i.e., when data is provided by strategic agents who seek to minimize an individual provision cost for increasing their data's precision while benefiting from the model's overall precision. In contrast to previous works, our model tackles the case where there is uncertainty on the attributes chara…
▽ More
We consider the problem of linear regression from strategic data sources with a public good component, i.e., when data is provided by strategic agents who seek to minimize an individual provision cost for increasing their data's precision while benefiting from the model's overall precision. In contrast to previous works, our model tackles the case where there is uncertainty on the attributes characterizing the agents' data -- a critical aspect of the problem when the number of agents is large. We provide a characterization of the game's equilibrium, which reveals an interesting connection with optimal design. Subsequently, we focus on the asymptotic behavior of the covariance of the linear regression parameters estimated via generalized least squares as the number of data sources becomes large. We provide upper and lower bounds for this covariance matrix and we show that, when the agents' provision costs are superlinear, the model's covariance converges to zero but at a slower rate relative to virtually all learning problems with exogenous data. On the other hand, if the agents' provision costs are linear, this covariance fails to converge. This shows that even the basic property of consistency of generalized least squares estimators is compromised when the data sources are strategic.
△ Less
Submitted 11 March, 2022; v1 submitted 28 June, 2021;
originally announced June 2021.
-
Reinforcement Learning for Markovian Bandits: Is Posterior Sampling more Scalable than Optimism?
Authors:
Nicolas Gast,
Bruno Gaujal,
Kimang Khun
Abstract:
We study learning algorithms for the classical Markovian bandit problem with discount. We explain how to adapt PSRL [24] and UCRL2 [2] to exploit the problem structure. These variants are called MB-PSRL and MB-UCRL2. While the regret bound and runtime of vanilla implementations of PSRL and UCRL2 are exponential in the number of bandits, we show that the episodic regret of MB-PSRL and MB-UCRL2 is…
▽ More
We study learning algorithms for the classical Markovian bandit problem with discount. We explain how to adapt PSRL [24] and UCRL2 [2] to exploit the problem structure. These variants are called MB-PSRL and MB-UCRL2. While the regret bound and runtime of vanilla implementations of PSRL and UCRL2 are exponential in the number of bandits, we show that the episodic regret of MB-PSRL and MB-UCRL2 is $\tilde{O}(S\sqrt{nK})$ where $K$ is the number of episodes, $n$ is the number of bandits and $S$ is the number of states of each bandit (the exact bound in S, n and K is given in the paper). Up to a factor $\sqrt S$, this matches the lower bound of $Ω(\sqrt{SnK})$ that we also derive in the paper. MB-PSRL is also computationally efficient: its runtime is linear in the number of bandits. We further show that this linear runtime cannot be achieved by adapting classical non-Bayesian algorithms such as UCRL2 or UCBVI to Markovian bandit problems. Finally, we perform numerical experiments that confirm that MB-PSRL outperforms other existing algorithms in practice, both in terms of regret and of computation time.
△ Less
Submitted 3 May, 2022; v1 submitted 16 June, 2021;
originally announced June 2021.
-
Exponential Convergence Rate for the Asymptotic Optimality of Whittle Index Policy
Authors:
Nicolas Gast,
Bruno Gaujal,
Chen Yan
Abstract:
We evaluate the performance of Whittle index policy for restless Markovian bandits, when the number of bandits grows. It is proven in [30] that this performance is asymptotically optimal if the bandits are indexable and the associated deterministic system has a global attractor fixed point. In this paper we show that, under the same conditions, the convergence rate is exponential in the number of…
▽ More
We evaluate the performance of Whittle index policy for restless Markovian bandits, when the number of bandits grows. It is proven in [30] that this performance is asymptotically optimal if the bandits are indexable and the associated deterministic system has a global attractor fixed point. In this paper we show that, under the same conditions, the convergence rate is exponential in the number of bandits, unless the fixed point is singular (to be defined later). Our proof is based on the nature of the deterministic equation governing the stochastic system: We show that it is a piecewise affine continuous dynamical system inside the simplex of the empirical measure of the bandits. Using simulations and numerical solvers, we also investigate the cases where the conditions for the exponential rate theorem are violated, notably when attracting limit cycles appear, or when the fixed point is singular. We illustrate our theorem on a Markovian fading channel model, which has been well studied in the literature. Finally, we extend our synchronous model results to the asynchronous model.
△ Less
Submitted 16 December, 2020;
originally announced December 2020.
-
On Fair Selection in the Presence of Implicit Variance
Authors:
Vitalii Emelianov,
Nicolas Gast,
Krishna P. Gummadi,
Patrick Loiseau
Abstract:
Quota-based fairness mechanisms like the so-called Rooney rule or four-fifths rule are used in selection problems such as hiring or college admission to reduce inequalities based on sensitive demographic attributes. These mechanisms are often viewed as introducing a trade-off between selection fairness and utility. In recent work, however, Kleinberg and Raghavan showed that, in the presence of imp…
▽ More
Quota-based fairness mechanisms like the so-called Rooney rule or four-fifths rule are used in selection problems such as hiring or college admission to reduce inequalities based on sensitive demographic attributes. These mechanisms are often viewed as introducing a trade-off between selection fairness and utility. In recent work, however, Kleinberg and Raghavan showed that, in the presence of implicit bias in estimating candidates' quality, the Rooney rule can increase the utility of the selection process.
We argue that even in the absence of implicit bias, the estimates of candidates' quality from different groups may differ in another fundamental way, namely, in their variance. We term this phenomenon implicit variance and we ask: can fairness mechanisms be beneficial to the utility of a selection process in the presence of implicit variance (even in the absence of implicit bias)? To answer this question, we propose a simple model in which candidates have a true latent quality that is drawn from a group-independent normal distribution. To make the selection, a decision maker receives an unbiased estimate of the quality of each candidate, with normal noise, but whose variance depends on the candidate's group. We then compare the utility obtained by imposing a fairness mechanism that we term $γ$-rule (it includes demographic parity and the four-fifths rule as special cases), to that of a group-oblivious selection algorithm that picks the candidates with the highest estimated quality independently of their group. Our main result shows that the demographic parity mechanism always increases the selection utility, while any $γ$-rule weakly increases it. We extend our model to a two-stage selection process where the true quality is observed at the second stage. We discuss multiple extensions of our results, in particular to different distributions of the true latent quality.
△ Less
Submitted 24 June, 2020;
originally announced June 2020.
-
Refined Mean Field Analysis of the Gossip Shuffle Protocol -- extended version --
Authors:
Nicolas Gast,
Diego Latella,
Mieke Massink
Abstract:
Gossip protocols form the basis of many smart collective adaptive systems. They are a class of fully decentralised, simple but robust protocols for the distribution of information throughout large scale networks with hundreds or thousands of nodes. Mean field analysis methods have made it possible to approximate and analyse performance aspects of such large scale protocols in an efficient way. Tak…
▽ More
Gossip protocols form the basis of many smart collective adaptive systems. They are a class of fully decentralised, simple but robust protocols for the distribution of information throughout large scale networks with hundreds or thousands of nodes. Mean field analysis methods have made it possible to approximate and analyse performance aspects of such large scale protocols in an efficient way. Taking the gossip shuffle protocol as a benchmark, we evaluate a recently developed refined mean field approach. We illustrate the gain in accuracy this can provide for the analysis of medium size models analysing two key performance measures. We also show that refined mean field analysis requires special attention to correctly capture the coordination aspects of the gossip shuffle protocol.
△ Less
Submitted 16 April, 2020;
originally announced April 2020.
-
The Price of Local Fairness in Multistage Selection
Authors:
Vitalii Emelianov,
George Arvanitakis,
Nicolas Gast,
Krishna Gummadi,
Patrick Loiseau
Abstract:
The rise of algorithmic decision making led to active researches on how to define and guarantee fairness, mostly focusing on one-shot decision making. In several important applications such as hiring, however, decisions are made in multiple stage with additional information at each stage. In such cases, fairness issues remain poorly understood.
In this paper we study fairness in $k$-stage select…
▽ More
The rise of algorithmic decision making led to active researches on how to define and guarantee fairness, mostly focusing on one-shot decision making. In several important applications such as hiring, however, decisions are made in multiple stage with additional information at each stage. In such cases, fairness issues remain poorly understood.
In this paper we study fairness in $k$-stage selection problems where additional features are observed at every stage. We first introduce two fairness notions, local (per stage) and global (final stage) fairness, that extend the classical fairness notions to the $k$-stage setting. We propose a simple model based on a probabilistic formulation and show that the locally and globally fair selections that maximize precision can be computed via a linear program. We then define the price of local fairness to measure the loss of precision induced by local constraints; and investigate theoretically and empirically this quantity. In particular, our experiments show that the price of local fairness is generally smaller when the sensitive attribute is observed at the first stage; but globally fair selections are more locally fair when the sensitive attribute is observed at the second stage---hence in both cases it is often possible to have a selection that has a small price of local fairness and is close to locally fair.
△ Less
Submitted 15 June, 2019;
originally announced June 2019.
-
A refined mean field approximation of synchronous discrete-time population models
Authors:
Nicolas Gast,
Diego Latella,
Mieke Massink
Abstract:
Mean field approximation is a popular method to study the behaviour of stochastic models composed of a large number of interacting objects. When the objects are asynchronous, the mean field approximation of a population model can be expressed as an ordinary differential equation. When the objects are (clock-) synchronous the mean field approximation is a discrete time dynamical system. We focus on…
▽ More
Mean field approximation is a popular method to study the behaviour of stochastic models composed of a large number of interacting objects. When the objects are asynchronous, the mean field approximation of a population model can be expressed as an ordinary differential equation. When the objects are (clock-) synchronous the mean field approximation is a discrete time dynamical system. We focus on the latter.We study the accuracy of mean field approximation when this approximation is a discrete-time dynamical system. We extend a result that was shown for the continuous time case and we prove that expected performance indicators estimated by mean field approximation are $O(1/N)$-accurate. We provide simple expressions to effectively compute the asymptotic error of mean field approximation, for finite time-horizon and steady-state, and we use this computed error to propose what we call a \emph{refined} mean field approximation. We show, by using a few numerical examples, that this technique improves the quality of approximation compared to the classical mean field approximation, especially for relatively small population sizes.
△ Less
Submitted 20 July, 2018;
originally announced July 2018.
-
A new analysis of Work Stealing with latency
Authors:
Nicolas Gast,
Mohammed Khatiri,
Denis Trystram,
Frederic Wagner
Abstract:
We study in this paper the impact of communication latency on the classical Work Stealing load balancing algorithm. Our paper extends the reference model in which we introduce a latency parameter. By using a theoretical analysis and simulation, we study the overall impact of this latency on the Makespan (maximum completion time). We derive a new expression of the expected running time of a bag of…
▽ More
We study in this paper the impact of communication latency on the classical Work Stealing load balancing algorithm. Our paper extends the reference model in which we introduce a latency parameter. By using a theoretical analysis and simulation, we study the overall impact of this latency on the Makespan (maximum completion time). We derive a new expression of the expected running time of a bag of tasks scheduled by Work Stealing. This expression enables us to predict under which conditions a given run will yield acceptable performance. For instance, we can easily calibrate the maximal number of processors to use for a given work/platform combination. All our results are validated through simulation on a wide range of parameters.
△ Less
Submitted 2 May, 2018;
originally announced May 2018.
-
Linear Regression from Strategic Data Sources
Authors:
Nicolas Gast,
Stratis Ioannidis,
Patrick Loiseau,
Benjamin Roussillon
Abstract:
Linear regression is a fundamental building block of statistical data analysis. It amounts to estimating the parameters of a linear model that maps input features to corresponding outputs. In the classical setting where the precision of each data point is fixed, the famous Aitken/Gauss-Markov theorem in statistics states that generalized least squares (GLS) is a so-called "Best Linear Unbiased Est…
▽ More
Linear regression is a fundamental building block of statistical data analysis. It amounts to estimating the parameters of a linear model that maps input features to corresponding outputs. In the classical setting where the precision of each data point is fixed, the famous Aitken/Gauss-Markov theorem in statistics states that generalized least squares (GLS) is a so-called "Best Linear Unbiased Estimator" (BLUE). In modern data science, however, one often faces strategic data sources, namely, individuals who incur a cost for providing high-precision data.
In this paper, we study a setting in which features are public but individuals choose the precision of the outputs they reveal to an analyst. We assume that the analyst performs linear regression on this dataset, and individuals benefit from the outcome of this estimation. We model this scenario as a game where individuals minimize a cost comprising two components: (a) an (agent-specific) disclosure cost for providing high-precision data; and (b) a (global) estimation cost representing the inaccuracy in the linear model estimate. In this game, the linear model estimate is a public good that benefits all individuals. We establish that this game has a unique non-trivial Nash equilibrium. We study the efficiency of this equilibrium and we prove tight bounds on the price of stability for a large class of disclosure and estimation costs. Finally, we study the estimator accuracy achieved at equilibrium. We show that, in general, Aitken's theorem does not hold under strategic data sources, though it does hold if individuals have identical disclosure costs (up to a multiplicative factor). When individuals have non-identical costs, we derive a bound on the improvement of the equilibrium estimation cost that can be achieved by deviating from GLS, under mild assumptions on the disclosure cost functions.
△ Less
Submitted 12 December, 2019; v1 submitted 30 September, 2013;
originally announced September 2013.
-
Decentralized List Scheduling
Authors:
Marc Tchiboukdjian,
Nicolas Gast,
Denis Trystram
Abstract:
Classical list scheduling is a very popular and efficient technique for scheduling jobs in parallel and distributed platforms. It is inherently centralized. However, with the increasing number of processors, the cost for managing a single centralized list becomes too prohibitive. A suitable approach to reduce the contention is to distribute the list among the computational units: each processor ha…
▽ More
Classical list scheduling is a very popular and efficient technique for scheduling jobs in parallel and distributed platforms. It is inherently centralized. However, with the increasing number of processors, the cost for managing a single centralized list becomes too prohibitive. A suitable approach to reduce the contention is to distribute the list among the computational units: each processor has only a local view of the work to execute. Thus, the scheduler is no longer greedy and standard performance guarantees are lost.
The objective of this work is to study the extra cost that must be paid when the list is distributed among the computational units. We first present a general methodology for computing the expected makespan based on the analysis of an adequate potential function which represents the load unbalance between the local lists. We obtain an equation on the evolution of the potential by computing its expected decrease in one step of the schedule. Our main theorem shows how to solve such equations to bound the makespan. Then, we apply this method to several scheduling problems, namely, for unit independent tasks, for weighted independent tasks and for tasks with precendence constraints. More precisely, we prove that the time for scheduling a global workload W composed of independent unit tasks on m processors is equal to W/m plus an additional term proportional to log_2 W. We provide a lower bound which shows that this is optimal up to a constant. This result is extended to the case of weighted independent tasks. In the last setting, precedence task graphs, our analysis leads to an improvement on the bound of Arora et al. We finally provide some experiments using a simulator. The distribution of the makespan is shown to fit existing probability laws. The additive term is shown by simulation to be around 3 \log_2 W confirming the tightness of our analysis.
△ Less
Submitted 19 July, 2011;
originally announced July 2011.
-
Computing hitting times via fluid approximation: application to the coupon collector problem
Authors:
Nicolas Gast
Abstract:
In this paper, we show how to use stochastic approximation to compute hitting time of a stochastic process, based on the study of the time for a fluid approximation of this process to be at distance 1/N of its fixed point.
This approach is developed to study a generalized version of the coupon collector problem. The system is composed by N independent identical Markov chains. At each time step,…
▽ More
In this paper, we show how to use stochastic approximation to compute hitting time of a stochastic process, based on the study of the time for a fluid approximation of this process to be at distance 1/N of its fixed point.
This approach is developed to study a generalized version of the coupon collector problem. The system is composed by N independent identical Markov chains. At each time step, one Markov chain is picked at random and performs one transition. We show that the time at which all chains have hit the same state is bounded by a N log N + b N log log N + O(N) where a and b are two constants depending on eigenvalues of the Markov chain.
△ Less
Submitted 18 July, 2011;
originally announced July 2011.
-
Mean field for Markov Decision Processes: from Discrete to Continuous Optimization
Authors:
Nicolas Gast,
Bruno Gaujal,
Jean-Yves Le Boudec
Abstract:
We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODE). We show that the optimal reward of such a Markov Decision Process, satisfying a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov Decision Proce…
▽ More
We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODE). We show that the optimal reward of such a Markov Decision Process, satisfying a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov Decision Process. We give bounds on the difference of the rewards, and a constructive algorithm for deriving an approximating solution to the Markov Decision Process from a solution of the HJB equations. We illustrate the method on three examples pertaining respectively to investment strategies, population dynamics control and scheduling in queues are developed. They are used to illustrate and justify the construction of the controlled ODE and to show the gain obtained by solving a continuous HJB equation rather than a large discrete Bellman equation.
△ Less
Submitted 19 May, 2011; v1 submitted 14 April, 2010;
originally announced April 2010.
-
A Mean Field Approach for Optimization in Particles Systems and Applications
Authors:
Nicolas Gast,
Bruno Gaujal
Abstract:
This paper investigates the limit behavior of Markov Decision Processes (MDPs) made of independent particles evolving in a common environment, when the number of particles goes to infinity. In the finite horizon case or with a discounted cost and an infinite horizon, we show that when the number of particles becomes large, the optimal cost of the system converges almost surely to the optimal cos…
▽ More
This paper investigates the limit behavior of Markov Decision Processes (MDPs) made of independent particles evolving in a common environment, when the number of particles goes to infinity. In the finite horizon case or with a discounted cost and an infinite horizon, we show that when the number of particles becomes large, the optimal cost of the system converges almost surely to the optimal cost of a discrete deterministic system (the ``optimal mean field''). Convergence also holds for optimal policies. We further provide insights on the speed of convergence by proving several central limits theorems for the cost and the state of the Markov decision process with explicit formulas for the variance of the limit Gaussian laws. Then, our framework is applied to a brokering problem in grid computing. The optimal policy for the limit deterministic system is computed explicitly. Several simulations with growing numbers of processors are reported. They compare the performance of the optimal policy of the limit system used in the finite case with classical policies (such as Join the Shortest Queue) by measuring its asymptotic gain as well as the threshold above which it starts outperforming classical policies.
△ Less
Submitted 10 June, 2009; v1 submitted 13 March, 2009;
originally announced March 2009.
-
Distributing Labels on Infinite Trees
Authors:
Nicolas Gast,
Bruno Gaujal
Abstract:
Sturmian words are infinite binary words with many equivalent definitions: They have a minimal factor complexity among all aperiodic sequences; they are balanced sequences (the labels 0 and 1 are as evenly distributed as possible) and they can be constructed using a mechanical definition. All this properties make them good candidates for being extremal points in scheduling problems over two proc…
▽ More
Sturmian words are infinite binary words with many equivalent definitions: They have a minimal factor complexity among all aperiodic sequences; they are balanced sequences (the labels 0 and 1 are as evenly distributed as possible) and they can be constructed using a mechanical definition. All this properties make them good candidates for being extremal points in scheduling problems over two processors. In this paper, we consider the problem of generalizing Sturmian words to trees. The problem is to evenly distribute labels 0 and 1 over infinite trees. We show that (strongly) balanced trees exist and can also be constructed using a mechanical process as long as the tree is irrational. Such trees also have a minimal factor complexity. Therefore they bring the hope that extremal scheduling properties of Sturmian words can be extended to such trees, as least partially. Such possible extensions are illustrated by one such example.
△ Less
Submitted 11 September, 2008;
originally announced September 2008.