A Supplementary Condition for the Convergence of the Control Policy during Adaptive Dynamic Programming

Bao, Xuefeng; Mao, Zhi-Hong; Sharma, Nitin

arXiv:1803.07743 (math)

This paper has been withdrawn by X Bao

[Submitted on 21 Mar 2018 (v1), last revised 23 May 2018 (this version, v4)]

Abstract: Reinforcement-learning-based adaptive/approximate dynamic programming (ADP) is a powerful technique for determining an approximately optimal controller for a dynamical system. These methods bypass the need to analytically solve the nonlinear Hamilton-Jacobi-Bellman equation, whose solution is often too difficult to obtain but is needed to determine the optimal control policy. ADP methods usually employ a policy iteration algorithm that evaluates and improves a value function at every step to find the optimal control policy. Previous work on ADP has lacked a stronger condition that ensures the convergence of the policy iteration algorithm. This paper provides a sufficient but not necessary condition that guarantees the convergence of an ADP algorithm. This condition may provide a more solid theoretical framework for ADP-based control algorithm design for nonlinear dynamical systems.
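
For readers unfamiliar with the evaluate-then-improve loop mentioned in the abstract, the following is a minimal sketch of policy iteration on a scalar linear-quadratic problem. It is a standard textbook special case, not the paper's algorithm or its (withdrawn) convergence condition. The function name, the parameters `a`, `b`, `q`, `r`, `k0`, and the linear dynamics are all assumptions made purely for illustration; the paper itself concerns nonlinear systems.

```python
def policy_iteration_scalar_lqr(a, b, q, r, k0, iters=50, tol=1e-12):
    """Policy iteration for x[t+1] = a*x[t] + b*u[t], stage cost q*x^2 + r*u^2.

    Hypothetical illustration only. Policies are linear feedbacks u = -k*x,
    and k0 must be stabilizing, i.e. |a - b*k0| < 1.
    Returns (k, p): the final gain and the cost coefficient p from the last
    evaluation step (the evaluated policy has cost p * x^2).
    """
    k = k0
    p = None
    for _ in range(iters):
        closed_loop = a - b * k
        if abs(closed_loop) >= 1.0:
            raise ValueError("policy is not stabilizing; its cost is infinite")
        # Policy evaluation: p solves p = q + r*k^2 + closed_loop^2 * p.
        p = (q + r * k * k) / (1.0 - closed_loop ** 2)
        # Policy improvement: minimize q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u,
        # which gives u = -k_new * x with the gain below.
        k_new = a * b * p / (r + b * b * p)
        if abs(k_new - k) < tol:
            return k_new, p
        k = k_new
    return k, p


if __name__ == "__main__":
    # Example values chosen for illustration: a = b = q = r = 1, start at k0 = 1.
    gain, cost = policy_iteration_scalar_lqr(1.0, 1.0, 1.0, 1.0, k0=1.0)
    print(gain, cost)
```

With these example values the iterates should approach the solution of the scalar discrete-time Riccati equation, p = (1 + sqrt(5))/2, with gain k = p/(1 + p). The stabilizing initial policy is what makes each evaluation step well defined in this sketch.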

Comments:	My derivation in this paper was wrong; it was misled by another paper. I would like to withdraw it, as it may mislead other readers. Sorry for that.
Subjects:	Optimization and Control (math.OC)
Cite as:	arXiv:1803.07743 [math.OC]
	(or arXiv:1803.07743v4 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.1803.07743

Submission history

From: X Bao
[v1] Wed, 21 Mar 2018 04:16:14 UTC (9 KB)
[v2] Mon, 2 Apr 2018 01:06:05 UTC (9 KB)
[v3] Fri, 18 May 2018 20:00:05 UTC (1 KB) (withdrawn)
[v4] Wed, 23 May 2018 00:25:14 UTC (1 KB) (withdrawn)
