Showing 1–28 of 28 results for author: Osinenko, P

  1. arXiv:2506.06053  [pdf, ps, other]

    math.DS eess.SY math.OC

    Some remarks on stochastic converse Lyapunov theorems

    Authors: Pavel Osinenko, Grigory Yaremenko

    Abstract: In this brief note, we investigate some constructions of Lyapunov functions for stochastic discrete-time stabilizable dynamical systems, in other words, controlled Markov chains. The main question here is whether a Lyapunov function in some statistical sense exists if the respective controlled Markov chain admits a stabilizing policy. We demonstrate some constructions extending on the classical re…

    Submitted 6 June, 2025; originally announced June 2025.

  2. arXiv:2505.12354  [pdf, ps, other]

    cs.LG cs.AI cs.RO eess.SY math.OC

    A universal policy wrapper with guarantees

    Authors: Anton Bolychev, Georgiy Malaniya, Grigory Yaremenko, Anastasia Krasnaya, Pavel Osinenko

    Abstract: We introduce a universal policy wrapper for reinforcement learning agents that ensures formal goal-reaching guarantees. In contrast to standard reinforcement learning algorithms that excel in performance but lack rigorous safety assurances, our wrapper selectively switches between a high-performing base policy -- derived from any existing RL method -- and a fallback policy with known convergence p…

    Submitted 18 May, 2025; originally announced May 2025.

  3. arXiv:2505.12350  [pdf, ps, other]

    cs.LG cs.AI cs.RO eess.SY math.OC

    Multi-CALF: A Policy Combination Approach with Statistical Guarantees

    Authors: Georgiy Malaniya, Anton Bolychev, Grigory Yaremenko, Anastasia Krasnaya, Pavel Osinenko

    Abstract: We introduce Multi-CALF, an algorithm that intelligently combines reinforcement learning policies based on their relative value improvements. Our approach integrates a standard RL policy with a theoretically-backed alternative policy, inheriting formal stability guarantees while often achieving better performance than either policy individually. We prove that our combined policy converges to a spe…

    Submitted 18 May, 2025; originally announced May 2025.

  4. arXiv:2505.06561  [pdf, ps, other]

    cs.RO cs.AI math.OC

    Quadrupedal Robot Skateboard Mounting via Reverse Curriculum Learning

    Authors: Danil Belov, Artem Erkhov, Elizaveta Pestova, Ilya Osokin, Dzmitry Tsetserukou, Pavel Osinenko

    Abstract: The aim of this work is to enable quadrupedal robots to mount skateboards using Reverse Curriculum Reinforcement Learning. Although prior work has demonstrated skateboarding for quadrupeds that are already positioned on the board, the initial mounting phase still poses a significant challenge. A goal-oriented methodology was adopted, beginning with the terminal phases of the task and progressively…

    Submitted 10 May, 2025; originally announced May 2025.

  5. arXiv:2501.08688  [pdf, other]

    eess.SY math.OC

    Some remarks on practical stabilization via CLF-based control under measurement noise

    Authors: Patrick Schmidt, Pavel Osinenko, Stefan Streif

    Abstract: Practical stabilization of input-affine systems in the presence of measurement errors and input constraints is considered in this brief note. Assuming that a Lyapunov function and a stabilizing control exist for an input-affine system, the required measurement accuracy at each point of the state space is computed. This is done via the Lyapunov function-based decay condition, which describes along…

    Submitted 15 January, 2025; originally announced January 2025.

    Comments: 14 pages, 8 figures, DOI 10.1109/ACCESS.2024.3521048

  6. arXiv:2501.02267  [pdf, ps, other]

    math.OC cs.AI eess.SY

    Towards a constructive framework for control theory

    Authors: Pavel Osinenko

    Abstract: This work presents a framework for control theory based on constructive analysis to account for the discrepancy between mathematical results and their implementation in a computer, also referred to as computational uncertainty. In control engineering, the latter is usually either neglected or considered submerged into some other type of uncertainty, such as system noise, and addressed within robust co…

    Submitted 4 January, 2025; originally announced January 2025.

    Comments: Published under: https://ieeexplore.ieee.org/document/9419858

    Journal ref: in IEEE Control Systems Letters, vol. 6, pp. 379-384, 2022

  7. arXiv:2409.14867  [pdf, other]

    cs.RO cs.AI math.DS math.OC

    A novel agent with formal goal-reaching guarantees: an experimental study with a mobile robot

    Authors: Grigory Yaremenko, Dmitrii Dobriborsci, Roman Zashchitin, Ruben Contreras Maestre, Ngoc Quoc Huy Hoang, Pavel Osinenko

    Abstract: Reinforcement Learning (RL) has been shown to be effective and convenient for a number of tasks in robotics. However, it requires the exploration of a sufficiently large number of state-action pairs, many of which may be unsafe or unimportant. For instance, online model-free learning can be hazardous and inefficient in the absence of guarantees that a certain set of desired states will be reached…

    Submitted 23 September, 2024; originally announced September 2024.

  8. arXiv:2409.09869  [pdf, other]

    cs.RO cs.AI math.OC

    Critic as Lyapunov function (CALF): a model-free, stability-ensuring agent

    Authors: Pavel Osinenko, Grigory Yaremenko, Roman Zashchitin, Anton Bolychev, Sinan Ibrahim, Dmitrii Dobriborsci

    Abstract: This work presents and showcases a novel reinforcement learning agent called Critic As Lyapunov Function (CALF), which is model-free and ensures online stabilization of the environment, in other words, of the dynamical system. Online means that in each learning episode, the said environment is stabilized. This, as demonstrated in a case study with a mobile robot simulator, greatly improves the overall learning p…

    Submitted 15 September, 2024; originally announced September 2024.

    Comments: IEEE Conference on Decision and Control. Accepted for publication in proceedings of the conference

  9. arXiv:2405.18118  [pdf, other]

    cs.AI eess.SY math.DS

    An agent design with goal reaching guarantees for enhancement of learning

    Authors: Pavel Osinenko, Grigory Yaremenko, Georgiy Malaniya, Anton Bolychev, Alexander Gepperth

    Abstract: Reinforcement learning is commonly concerned with problems of maximizing accumulated rewards in Markov decision processes. Oftentimes, a certain goal state or a subset of the state space attains maximal reward. In such a case, the environment may be considered solved when the goal is reached. Whereas numerous techniques, learning or non-learning based, exist for solving environments, doing so optim…

    Submitted 21 August, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  10. On constructive extractability of measurable selectors of set-valued maps

    Authors: Pavel Osinenko, Stefan Streif

    Abstract: This paper investigates the possibility of constructive extraction of measurable selectors from set-valued maps, which commonly arise in viability theory, optimal control, discontinuous systems, etc. For instance, existence of solutions to certain differential inclusions often requires iterative extraction of measurable selectors. Next, optimal controls are in general non-unique which naturally…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Published in IEEE Transactions on Automatic Control

    Journal ref: IEEE Transactions on Automatic Control, 66.8 (2021), pp. 3757-3764

  11. arXiv:2207.08730  [pdf, other]

    eess.SY cs.AI cs.LG math.DS

    A framework for online, stabilizing reinforcement learning

    Authors: Grigory Yaremenko, Georgiy Malaniya, Pavel Osinenko

    Abstract: Online reinforcement learning is concerned with training an agent on-the-fly via dynamic interaction with the environment. Here, due to the specifics of the application, it is not generally possible to perform long pre-training, as is commonly done in off-line, model-free approaches, which are akin to dynamic programming. Such applications may be found more frequently in industry, rather than i…

    Submitted 16 November, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

  12. arXiv:2205.13409  [pdf, other]

    math.OC cs.RO eess.SY math.DS

    On stochastic stabilization via non-smooth control Lyapunov functions

    Authors: Pavel Osinenko, Grigory Yaremenko, Georgiy Malaniya

    Abstract: A control Lyapunov function is a central tool in stabilization. It generalizes an abstract energy function -- a Lyapunov function -- to the case of controlled systems. It is a known fact that most control Lyapunov functions are non-smooth -- as is the case in non-holonomic systems, like wheeled robots and cars. Frameworks for stabilization using non-smooth control Lyapunov functions exist, like Dini…

    Submitted 7 November, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted for publication in IEEE Transactions on Automatic Control

  13. arXiv:2111.12316  [pdf, ps, other]

    math.DS cs.LG eess.SY

    A note on stabilizing reinforcement learning

    Authors: Pavel Osinenko, Grigory Yaremenko, Ilya Osokin

    Abstract: Reinforcement learning is a general methodology of adaptive optimal control that has attracted much attention in various fields ranging from the video game industry to robot manipulators. Despite its remarkable performance demonstrations, plain reinforcement learning controllers do not guarantee stability, which compromises their applicability in industry. To provide such guarantees, measures have to b…

    Submitted 11 June, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

  14. arXiv:2110.02129  [pdf, other]

    math.OC cond-mat.stat-mech cs.AI cs.LG math.DS

    A study of first-passage time minimization via Q-learning in heated gridworlds

    Authors: M. A. Larchenko, P. Osinenko, G. Yaremenko, V. V. Palyulin

    Abstract: Optimization of first-passage times is required in applications ranging from nanobot navigation to market trading. In such settings, one often encounters unevenly distributed noise levels across the environment. We extensively study how a learning agent fares in 1- and 2-dimensional heated gridworlds with an uneven temperature distribution. The results show certain bias effects in agents trained…

    Submitted 5 October, 2021; originally announced October 2021.

  15. arXiv:2108.10392  [pdf, other]

    cs.RO eess.SY math.DS

    A generalized stacked reinforcement learning method for sampled systems

    Authors: Pavel Osinenko, Dmitrii Dobriborsci, Grigory Yaremenko, Georgiy Malaniya

    Abstract: A common setting of reinforcement learning (RL) is a Markov decision process (MDP) in which the environment is a stochastic discrete-time dynamical system. Whereas MDPs are suitable in such applications as video games or puzzles, physical systems are time-continuous. A general variant of RL is of digital format, where updates of the value (or cost) and policy are performed at discrete moments in t…

    Submitted 28 November, 2022; v1 submitted 23 August, 2021; originally announced August 2021.

  16. arXiv:2108.04857  [pdf, other]

    cs.RO math.DS

    An experimental study of two predictive reinforcement learning methods and comparison with model-predictive control

    Authors: Dmitrii Dobriborsci, Pavel Osinenko

    Abstract: Reinforcement learning (RL) has been successfully used in various simulations and computer games. Industry-related applications, such as autonomous mobile robot motion control, remain somewhat challenging for RL to date, though. This paper presents an experimental evaluation of predictive RL controllers for optimal mobile robot motion control. As a baseline for comparison, model-predictive control…

    Submitted 23 August, 2021; v1 submitted 10 August, 2021; originally announced August 2021.

  17. arXiv:2108.04802  [pdf, other]

    math.DS eess.SY

    Effects of sampling and horizon in predictive reinforcement learning

    Authors: Pavel Osinenko, Dmitrii Dobriborsci

    Abstract: Plain reinforcement learning (RL) may be prone to loss of convergence, constraint violation, unexpected performance, etc. Commonly, RL agents undergo extensive learning stages to achieve acceptable functionality. This is in contrast to classical control algorithms which are typically model-based. A direction of research is the fusion of RL with such algorithms, especially model-predictive control…

    Submitted 23 August, 2021; v1 submitted 10 August, 2021; originally announced August 2021.

  18. On stochastic stabilization of sampled systems

    Authors: Pavel Osinenko, Grigory Yaremenko

    Abstract: This paper addresses stochastic stabilization in the case where implementation of control policies is digital, i.e., when the dynamical system is treated as continuous, whereas the control actions are held constant in predefined time steps. In such a setup, special attention should be paid to the sample-to-sample behavior of the involved Lyapunov function. This paper extends on the stochastic stability…

    Submitted 7 November, 2022; v1 submitted 15 May, 2021; originally announced May 2021.

    Comments: 6 pages, no figures. Accepted for IEEE CDC 2021

    Journal ref: In 2021 60th IEEE Conference on Decision and Control (CDC) (pp. 5326-5331). IEEE 2021

  19. On inf-convolution-based robust practical stabilization under computational uncertainty

    Authors: Patrick Schmidt, Pavel Osinenko, Stefan Streif

    Abstract: This work is concerned with practical stabilization of nonlinear systems by means of inf-convolution-based sample-and-hold control. It is a fairly general stabilization technique based on a generic non-smooth control Lyapunov function (CLF) and robust to actuator uncertainty, measurement noise, etc. The stabilization technique itself involves computation of descent directions of the CLF. It turns…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: Accepted for publication in IEEE TRANSACTIONS ON AUTOMATIC CONTROL; 8 pages, 3 figures

  20. Stacked adaptive dynamic programming with unknown system model

    Authors: Pavel Osinenko, Thomas Göhrt, Grigory Devadze, Stefan Streif

    Abstract: Adaptive dynamic programming is a collective term for a variety of approaches to infinite-horizon optimal control. Common to all approaches is approximation of the infinite-horizon cost function based on the dynamic programming philosophy. Typically, they also require knowledge of a dynamical model of the system. In the current work, application of adaptive dynamic programming to a system whose dynami…

    Submitted 8 July, 2020; originally announced July 2020.

    Journal ref: IFAC-PapersOnLine, 50(1), 4150-4155 (2017)

  21. arXiv:2006.14034  [pdf, other]

    math.OC math.DS

    A reinforcement learning method with closed-loop stability guarantee

    Authors: Pavel Osinenko, Lukas Beckenbach, Thomas Göhrt, Stefan Streif

    Abstract: Reinforcement learning (RL) in the context of control systems offers wide possibilities of controller adaptation. Given an infinite-horizon cost function, the so-called critic of RL approximates it with a neural net and sends this information to the controller (called "actor"). However, the issue of closed-loop stability under an RL-method is still not fully addressed. Since the critic delivers me…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: Submitted to IFAC 2020

  22. arXiv:2006.14013  [pdf, other]

    math.OC math.DS

    Nonsmooth stabilization and its computational aspects

    Authors: Pavel Osinenko, Patrick Schmidt, Stefan Streif

    Abstract: This work has the goal of briefly surveying some key stabilization techniques for general nonlinear systems, for which, as it is well known, a smooth control Lyapunov function may fail to exist. A general overview of the situation with smooth and nonsmooth stabilization is provided, followed by a concise summary of basic tools and techniques, including general stabilization, sliding-mode control a…

    Submitted 2 July, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: Submitted to IFAC 2020

    MSC Class: 34H15

  23. A method of online traction parameter identification and mapping

    Authors: Alexander Kobelski, Pavel Osinenko, Stefan Streif

    Abstract: Fuel consumption of heavy-duty vehicles such as tractors, bulldozers, etc. is comparatively high due to their scope of operation. The operation settings are usually fixed and not tuned to the environmental factors, such as ground conditions. Yet exactly the ground-to-propelling-unit properties are decisive in energy efficiency. Optimizing the latter would require a means of identifying those propertie…

    Submitted 23 April, 2021; v1 submitted 12 June, 2020; originally announced June 2020.

    Comments: Accepted for publication at the IFAC WC 2020

    Journal ref: IFAC-PapersOnLine Volume 53, Issue 2, 2020, Pages 13933-13938

  24. arXiv:1906.02580  [pdf, other]

    math.OC eess.SY math.DS

    Model predictive control with stage cost shaping inspired by reinforcement learning

    Authors: Lukas Beckenbach, Pavel Osinenko, Stefan Streif

    Abstract: This work presents a suboptimality study of a particular model predictive control with a stage cost shaping based on the ideas of reinforcement learning. The focus of the suboptimality study is to derive quantities relating the infinite-horizon cost function under the said variant of model predictive control to the respective infinite-horizon value function. The basis control scheme involves usual…

    Submitted 27 April, 2020; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: 2 figures

    MSC Class: 93C10; 93C40; 93C55

  25. Practical sample-and-hold stabilization of nonlinear systems under approximate optimizers

    Authors: Pavel Osinenko, Lukas Beckenbach, Stefan Streif

    Abstract: It is a known fact that not all controllable systems can be asymptotically stabilized by a continuous static feedback. Several approaches have been developed throughout the last decades, including time-varying, dynamical and even discontinuous feedbacks. In the latter case, the sample-and-hold framework is widely used, in which the control input is held constant during sampling periods. Consequent…

    Submitted 22 June, 2018; v1 submitted 6 March, 2018; originally announced March 2018.

    Comments: 6 pages, 2 figures

    MSC Class: 34H15

    Journal ref: IEEE Control Systems Letters, vol. 2, issue 4, pp. 569-574, October 2018

  26. Analysis of extremum value theorems for function spaces in optimal control under numerical uncertainty

    Authors: Pavel Osinenko, Stefan Streif

    Abstract: The extremum value theorem for function spaces plays a central role in optimal control. It is known that computation of optimal control actions and policies is often prone to numerical errors which may be related to computability issues. The current work addresses a version of the extremum value theorem for function spaces under explicit consideration of numerical uncertainties. It is shown that…

    Submitted 22 June, 2018; v1 submitted 18 September, 2017; originally announced September 2017.

    Comments: 28 pages

    Journal ref: IMA Journal of Mathematical Control and Information (2018)

  27. arXiv:1609.00965  [pdf, ps, other]

    math.MG

    A note on Brehm's extension theorem

    Authors: Pavel Osinenko

    Abstract: Brehm's extension theorem states that a non-expansive map on a finite subset of a Euclidean space can be extended to a piecewise-linear map on the entire space. In this note, it is verified that the proof of the theorem is constructive provided that the finite subset consists of points with rational coordinates. Additionally, the initial non-expansive map needs to send points with rational coordin…

    Submitted 3 October, 2016; v1 submitted 4 September, 2016; originally announced September 2016.

    Comments: 4 pages, minor corrections made

    MSC Class: 51F99; 03F60

  28. arXiv:1607.04108  [pdf, ps, other]

    math.OC

    A note on constructive treatment of eigenvectors

    Authors: Pavel Osinenko, Grigory Devadze, Stefan Streif

    Abstract: The eigenvalue problem plays a central role in linear algebra and its applications in control and optimization methods. In particular, many matrix decompositions rely upon computation of eigenvalue-eigenvector pairs, such as diagonal or Jordan normal forms. Unfortunately, numerical algorithms computing eigenvectors are prone to errors. Due to uncomputability of eigenpairs, perturbation theory and…

    Submitted 14 July, 2016; originally announced July 2016.

    Comments: 12 pages

    MSC Class: 15A18; 93F60; 03F65 ACM Class: G.1.3; I.2.8