Showing 1–5 of 5 results for author: Pajic, M

  1. arXiv:2404.10728

    cs.LG stat.ML

    Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning

    Authors: Hao-Lun Hsu, Weixin Wang, Miroslav Pajic, Pan Xu

    Abstract: We present the first study on provably efficient randomized exploration in cooperative multi-agent reinforcement learning (MARL). We propose a unified algorithm framework for randomized exploration in parallel Markov Decision Processes (MDPs), and two Thompson Sampling (TS)-type algorithms, CoopTS-PHE and CoopTS-LMC, incorporating the perturbed-history exploration (PHE) strategy and the Langevin M…

    Submitted 3 March, 2025; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: 66 pages, 14 figures, 6 tables. Hao-Lun Hsu and Weixin Wang contributed equally to this work. Published in Proc. of the 38th Conference on Advances in Neural Information Processing Systems (NeurIPS 2024)

  2. arXiv:2312.05794

    math.ST cs.LG eess.SY math.PR stat.ML

    Spectral Statistics of the Sample Covariance Matrix for High Dimensional Linear Gaussians

    Authors: Muhammad Abdullah Naeem, Miroslav Pajic

    Abstract: Performance of ordinary least squares (OLS) method for the \emph{estimation of high dimensional stable state transition matrix} $A$ (i.e., spectral radius $\rho(A)<1$) from a single noisy observed trajectory of the linear time invariant (LTI)\footnote{Linear Gaussian (LG) in Markov chain literature} system $X_{-}:(x_0,x_1, \ldots,x_{N-1})$ satisfying \begin{equation} x_{t+1}=Ax_{t}+w_{t}, \hspace{10pt…

    Submitted 10 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:2310.10523

  3. arXiv:2306.11697

    stat.ME cs.LG stat.ML

    Treatment Effects in Extreme Regimes

    Authors: Ahmed Aloui, Ali Hasan, Yuting Ng, Miroslav Pajic, Vahid Tarokh

    Abstract: Understanding treatment effects in extreme regimes is important for characterizing risks associated with different interventions. This is hindered by the unavailability of counterfactual outcomes and the rarity and difficulty of collecting extreme data in practice. To address this issue, we propose a new framework based on extreme value theory for estimating treatment effects in extreme regimes. W…

    Submitted 22 May, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

  4. arXiv:2301.12056

    cs.LG cs.AI stat.ML

    Variational Latent Branching Model for Off-Policy Evaluation

    Authors: Qitong Gao, Ge Gao, Min Chi, Miroslav Pajic

    Abstract: Model-based methods have recently shown great potential for off-policy evaluation (OPE); offline trajectories induced by behavioral policies are fitted to transitions of Markov decision processes (MDPs), which are used to roll out simulated trajectories and estimate the performance of policies. Model-based OPE methods face two key challenges. First, as offline trajectories are usually fixed, they t…

    Submitted 3 February, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: Accepted to ICLR 2023

  5. arXiv:2205.12448

    stat.ML cs.IT cs.LG eess.SY

    Transportation-Inequalities, Lyapunov Stability and Sampling for Dynamical Systems on Continuous State Space

    Authors: Muhammad Abdullah Naeem, Miroslav Pajic

    Abstract: We study the concentration phenomenon for discrete-time random dynamical systems with an unbounded state space. We develop a heuristic approach towards obtaining exponential concentration inequalities for dynamical systems using an entirely functional analytic framework. We also show that the existence of an exponential-type Lyapunov function, compared to the purely deterministic setting, not only implie…

    Submitted 7 December, 2022; v1 submitted 24 May, 2022; originally announced May 2022.