A relaxed technical assumption for posterior sampling-based reinforcement learning for control of unknown linear systems

Gagrani, Mukul; Sudhakara, Sagar; Mahajan, Aditya; Nayyar, Ashutosh; Ouyang, Yi

arXiv:2108.08502 (eess)

[Submitted on 19 Aug 2021 (v1), last revised 20 Sep 2022 (this version, v2)]

Abstract: We revisit the Thompson sampling algorithm to control an unknown linear quadratic (LQ) system recently proposed by Ouyang et al. (arXiv:1709.04047). The regret bound of the algorithm was derived under a technical assumption on the induced norm of the closed-loop system. In this technical note, we show that by making a minor modification to the algorithm (in particular, ensuring that an episode does not end too soon), this technical assumption on the induced norm can be replaced by a milder assumption in terms of the spectral radius of the closed-loop system. The modified algorithm has the same Bayesian regret of $\tilde{\mathcal{O}}(\sqrt{T})$, where $T$ is the time horizon and the $\tilde{\mathcal{O}}(\cdot)$ notation hides logarithmic terms in $T$.
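To illustrate why a spectral-radius condition is milder than an induced-norm condition, the following is a minimal Python sketch. It does not reproduce the paper's algorithm or proof. The matrix `closed_loop` and the helper names are hypothetical and chosen only for illustration. The sketch shows a 2x2 matrix whose spectral radius is below 1 while its induced 2-norm exceeds 1, and then searches for the first power of the matrix whose induced 2-norm drops below 1 (such a power always exists when the spectral radius is below 1, by Gelfand's formula).

```python
import cmath
import math


def spectral_radius_2x2(m):
    """Largest eigenvalue magnitude of a 2x2 matrix (closed form)."""
    (a, b), (c, d) = m
    tr = a + d
    det = a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))


def induced_2_norm_2x2(m):
    """Largest singular value of a 2x2 matrix, via the Gram matrix M^T M."""
    (a, b), (c, d) = m
    g11 = a * a + c * c
    g12 = a * b + c * d
    g22 = b * b + d * d
    tr = g11 + g22
    det = g11 * g22 - g12 * g12
    lam_max = (tr + math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2
    return math.sqrt(lam_max)


def matmul_2x2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


# Hypothetical stable closed-loop matrix: both eigenvalues equal 0.5,
# but the large off-diagonal entry makes a single step expansive.
closed_loop = [[0.5, 2.0], [0.0, 0.5]]

rho = spectral_radius_2x2(closed_loop)
norm = induced_2_norm_2x2(closed_loop)

# Find the smallest k such that the induced 2-norm of closed_loop^k is below 1.
power = closed_loop
k = 1
while induced_2_norm_2x2(power) >= 1.0:
    power = matmul_2x2(power, closed_loop)
    k += 1

print(f"spectral radius = {rho:.3f}, induced 2-norm = {norm:.3f}")
print(f"first power with induced 2-norm below 1: k = {k}")
```

For this example the matrix satisfies the spectral-radius condition but violates the induced-norm condition, and contraction in norm only appears after several steps.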

Subjects:	Systems and Control (eess.SY); Artificial Intelligence (cs.AI); Optimization and Control (math.OC)
Cite as:	arXiv:2108.08502 [eess.SY]
	(or arXiv:2108.08502v2 [eess.SY] for this version)
	https://doi.org/10.48550/arXiv.2108.08502
Journal reference:	Proc 2022 IEEE Conference on Decision and Control

Submission history

From: Aditya Mahajan [view email]
[v1] Thu, 19 Aug 2021 05:25:28 UTC (161 KB)
[v2] Tue, 20 Sep 2022 02:07:54 UTC (74 KB)
