Quantizer Design for Finite Model Approximations, Model Learning, and Quantized Q-Learning for MDPs with Unbounded Spaces

Bicer, Osman; Kara, Ali D.; Yuksel, Serdar

Mathematics > Optimization and Control

arXiv:2510.04355 (math)

[Submitted on 5 Oct 2025 (v1), last revised 14 Oct 2025 (this version, v2)]

Title:Quantizer Design for Finite Model Approximations, Model Learning, and Quantized Q-Learning for MDPs with Unbounded Spaces

Authors:Osman Bicer, Ali D. Kara, Serdar Yuksel

View PDF HTML (experimental)

Abstract:In this paper, for Markov decision processes (MDPs) with unbounded state spaces we present refined upper bounds presented in [Kara et. al. JMLR'23] on finite model approximation errors via optimizing the quantizers used for finite model approximations. We also consider implications on quantizer design for quantized Q-learning and empirical model learning, and the performance of policies obtained via Q-learning where the quantized state is treated as the state itself. We highlight the distinctions between planning, where approximating MDPs can be independently designed, and learning (either via Q-learning or empirical model learning), where approximating MDPs are restricted to be defined by invariant measures of Markov chains under exploration policies, leading to significant subtleties on quantizer design performance, even though asymptotic near optimality can be established under both setups. In particular, under Lyapunov growth conditions, we obtain explicit upper bounds which decay to zero as the number of bins approaches infinity

Subjects:	Optimization and Control (math.OC); Machine Learning (cs.LG)
Cite as:	arXiv:2510.04355 [math.OC]
	(or arXiv:2510.04355v2 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2510.04355

Submission history

From: Ali Devran Kara [view email]
[v1] Sun, 5 Oct 2025 20:39:52 UTC (41 KB)
[v2] Tue, 14 Oct 2025 16:14:06 UTC (40 KB)

Mathematics > Optimization and Control

Title:Quantizer Design for Finite Model Approximations, Model Learning, and Quantized Q-Learning for MDPs with Unbounded Spaces

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Mathematics > Optimization and Control

Title:Quantizer Design for Finite Model Approximations, Model Learning, and Quantized Q-Learning for MDPs with Unbounded Spaces

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators