
Showing 1–50 of 146 results for author: Tsetserukou, D

  1. arXiv:2505.18876  [pdf, ps, other]

    cs.RO

    DiffusionRL: Efficient Training of Diffusion Policies for Robotic Grasping Using RL-Adapted Large-Scale Datasets

    Authors: Maria Makarova, Qian Liu, Dzmitry Tsetserukou

    Abstract: Diffusion models have been successfully applied in areas such as image, video, and audio generation. Recent works show their promise for sequential decision-making and dexterous manipulation, leveraging their ability to model complex action distributions. However, challenges persist due to data limitations and scenario-specific adaptation needs. In this paper, we address these challenges by pr…

    Submitted 24 May, 2025; originally announced May 2025.

    Comments: Submitted to CoRL 2025

  2. arXiv:2505.07236  [pdf, other]

    cs.RO cs.AI

    UAV-CodeAgents: Scalable UAV Mission Planning via Multi-Agent ReAct and Vision-Language Reasoning

    Authors: Oleg Sautenkov, Yasheerah Yaqoot, Muhammad Ahsan Mustafa, Faryal Batool, Jeffrin Sam, Artem Lykov, Chih-Yung Wen, Dzmitry Tsetserukou

    Abstract: We present UAV-CodeAgents, a scalable multi-agent framework for autonomous UAV mission generation, built on large language and vision-language models (LLMs/VLMs). The system leverages the ReAct (Reason + Act) paradigm to interpret satellite imagery, ground high-level natural language instructions, and collaboratively generate UAV trajectories with minimal human supervision. A core component is a v…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: Submitted

  3. arXiv:2505.06561  [pdf, ps, other]

    cs.RO cs.AI math.OC

    Quadrupedal Robot Skateboard Mounting via Reverse Curriculum Learning

    Authors: Danil Belov, Artem Erkhov, Elizaveta Pestova, Ilya Osokin, Dzmitry Tsetserukou, Pavel Osinenko

    Abstract: The aim of this work is to enable quadrupedal robots to mount skateboards using Reverse Curriculum Reinforcement Learning. Although prior work has demonstrated skateboarding for quadrupeds that are already positioned on the board, the initial mounting phase still poses a significant challenge. A goal-oriented methodology was adopted, beginning with the terminal phases of the task and progressively…

    Submitted 10 May, 2025; originally announced May 2025.

  4. arXiv:2505.03931  [pdf, other]

    cs.RO

    NMPC-Lander: Nonlinear MPC with Barrier Function for UAV Landing on a Mobile Platform

    Authors: Amber Batool, Faryal Batool, Roohan Ahmed Khan, Muhammad Ahsan Mustafa, Aleksey Fedoseev, Dzmitry Tsetserukou

    Abstract: Quadcopters are versatile aerial robots gaining popularity in numerous critical applications. However, their operational effectiveness is constrained by limited battery life and restricted flight range. To address these challenges, autonomous drone landing on stationary or mobile charging and battery-swapping stations has become an essential capability. In this study, we present NMPC-Lander, a nov…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: This manuscript has been submitted to the IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2025

  5. arXiv:2505.02582  [pdf, ps, other]

    cs.HC

    FlyHaptics: Flying Multi-contact Haptic Interface

    Authors: Luis Moreno, Miguel Altamirano Cabrera, Muhammad Haris Khan, Issatay Tokmurziyev, Yara Mahmoud, Valerii Serpiva, Dzmitry Tsetserukou

    Abstract: This work presents FlyHaptics, an aerial haptic interface tracked via a Vicon optical motion capture system and built around six five-bar linkage assemblies enclosed in a lightweight protective cage. We predefined five static tactile patterns - each characterized by distinct combinations of linkage contact points and vibration intensities - and evaluated them in a grounded pilot study, where parti…

    Submitted 5 May, 2025; originally announced May 2025.

  6. arXiv:2505.02569  [pdf, other]

    cs.RO cs.HC

    HapticVLM: VLM-Driven Texture Recognition Aimed at Intelligent Haptic Interaction

    Authors: Muhammad Haris Khan, Miguel Altamirano Cabrera, Dmitrii Iarchuk, Yara Mahmoud, Daria Trinitatova, Issatay Tokmurziyev, Dzmitry Tsetserukou

    Abstract: This paper introduces HapticVLM, a novel multimodal system that integrates vision-language reasoning with deep convolutional networks to enable real-time haptic feedback. HapticVLM leverages a ConvNeXt-based material recognition module to generate robust visual embeddings for accurate identification of object materials, while a state-of-the-art Vision-Language Model (Qwen2-VL-2B-Instruct) infers a…

    Submitted 5 May, 2025; originally announced May 2025.

    Comments: Submitted to IEEE conf

  7. arXiv:2504.16914  [pdf, other]

    cs.RO

    MorphoNavi: Aerial-Ground Robot Navigation with Object Oriented Mapping in Digital Twin

    Authors: Sausar Karaf, Mikhail Martynov, Oleg Sautenkov, Zhanibek Darush, Dzmitry Tsetserukou

    Abstract: This paper presents a novel mapping approach for a universal aerial-ground robotic system utilizing a single monocular camera. The proposed system is capable of detecting a diverse range of objects and estimating their positions without requiring fine-tuning for specific environments. The system's performance was evaluated through a simulated search-and-rescue scenario, where the MorphoGear robot…

    Submitted 23 April, 2025; originally announced April 2025.

  8. arXiv:2504.09510  [pdf, ps, other]

    cs.RO cs.ET

    Towards Intuitive Drone Operation Using a Handheld Motion Controller

    Authors: Daria Trinitatova, Sofia Shevelo, Dzmitry Tsetserukou

    Abstract: We present an intuitive human-drone interaction system that utilizes a gesture-based motion controller to enhance the drone operation experience in real and simulated environments. The handheld motion controller enables natural control of the drone through the movements of the operator's hand, thumb, and index finger: the trigger press manages the throttle, the tilt of the hand adjusts pitch and r…

    Submitted 13 April, 2025; originally announced April 2025.

    Comments: HRI'25: Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction, 5 pages, 5 figures

  9. arXiv:2504.07939  [pdf, other]

    cs.RO

    Echo: An Open-Source, Low-Cost Teleoperation System with Force Feedback for Dataset Collection in Robot Learning

    Authors: Artem Bazhenov, Sergei Satsevich, Sergei Egorov, Farit Khabibullin, Dzmitry Tsetserukou

    Abstract: In this article, we propose Echo, a novel joint-matching teleoperation system designed to enhance the collection of datasets for manual and bimanual tasks. Our system is specifically tailored for controlling the UR manipulator and features a custom controller with force feedback and adjustable sensitivity modes, enabling precise and intuitive operation. Additionally, Echo integrates a user-friendl…

    Submitted 10 April, 2025; originally announced April 2025.

  10. arXiv:2503.16475  [pdf, other]

    cs.HC cs.RO

    LLM-Glasses: GenAI-driven Glasses with Haptic Feedback for Navigation of Visually Impaired People

    Authors: Issatay Tokmurziyev, Miguel Altamirano Cabrera, Muhammad Haris Khan, Yara Mahmoud, Luis Moreno, Dzmitry Tsetserukou

    Abstract: We present LLM-Glasses, a wearable navigation system designed to assist visually impaired individuals by combining haptic feedback, YOLO-World object detection, and GPT-4o-driven reasoning. The system delivers real-time tactile guidance via temple-mounted actuators, enabling intuitive and independent navigation. Three user studies were conducted to evaluate its effectiveness: (1) a haptic pattern…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: Submitted to IEEE/RSJ IROS 2025

  11. arXiv:2503.15895  [pdf, other]

    cs.RO

    CONTHER: Human-Like Contextual Robot Learning via Hindsight Experience Replay and Transformers without Expert Demonstrations

    Authors: Maria Makarova, Qian Liu, Dzmitry Tsetserukou

    Abstract: This paper presents CONTHER, a novel reinforcement learning algorithm designed to efficiently and rapidly train robotic agents for goal-oriented manipulation tasks and obstacle avoidance. The algorithm uses a modified replay buffer inspired by the Hindsight Experience Replay (HER) approach to artificially populate experience with successful trajectories, effectively addressing the problem of spars…

    Submitted 20 March, 2025; originally announced March 2025.

    Comments: Submitted to IROS 2025

  12. arXiv:2503.07662  [pdf, other]

    cs.MA cs.RO

    HIPPO-MAT: Decentralized Task Allocation Using GraphSAGE and Multi-Agent Deep Reinforcement Learning

    Authors: Lavanya Ratnabala, Robinroy Peter, Aleksey Fedoseev, Dzmitry Tsetserukou

    Abstract: This paper tackles decentralized continuous task allocation in heterogeneous multi-agent systems. We present a novel framework, HIPPO-MAT, that integrates graph neural networks (GNN) employing a GraphSAGE architecture to compute independent embeddings on each agent with an Independent Proximal Policy Optimization (IPPO) approach for multi-agent deep reinforcement learning. In our system, unmanned ae…

    Submitted 8 March, 2025; originally announced March 2025.

    Comments: arXiv admin note: text overlap with arXiv:2502.02311

  13. arXiv:2503.07376  [pdf, other]

    eess.SY cs.RO

    AttentionSwarm: Reinforcement Learning with Attention Control Barier Function for Crazyflie Drones in Dynamic Environments

    Authors: Grik Tadevosyan, Valerii Serpiva, Aleksey Fedoseev, Roohan Ahmed Khan, Demetros Aschu, Faryal Batool, Nickolay Efanov, Artem Mikhaylov, Dzmitry Tsetserukou

    Abstract: We introduce AttentionSwarm, a novel benchmark designed to evaluate safe and efficient swarm control across three challenging environments: a landing environment with obstacles, a competitive drone game setting, and a dynamic drone racing scenario. Central to our approach is the Attention Model Based Control Barrier Function (CBF) framework, which integrates attention mechanisms with safety-critic…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: 6 pages, 6 figures

  14. arXiv:2503.02723  [pdf, other]

    cs.RO

    ImpedanceGPT: VLM-driven Impedance Control of Swarm of Mini-drones for Intelligent Navigation in Dynamic Environment

    Authors: Faryal Batool, Malaika Zafar, Yasheerah Yaqoot, Roohan Ahmed Khan, Muhammad Haris Khan, Aleksey Fedoseev, Dzmitry Tsetserukou

    Abstract: Swarm robotics plays a crucial role in enabling autonomous operations in dynamic and unpredictable environments. However, a major challenge remains ensuring safe and efficient navigation in environments filled with both dynamic living (e.g., humans) and dynamic inanimate (e.g., non-living objects) obstacles. In this paper, we propose ImpedanceGPT, a novel system that combines a Vision-Language Mode…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: Submitted to IROS 2025

  15. arXiv:2503.02572  [pdf, other]

    cs.RO cs.AI

    RaceVLA: VLA-based Racing Drone Navigation with Human-like Behaviour

    Authors: Valerii Serpiva, Artem Lykov, Artyom Myshlyaev, Muhammad Haris Khan, Ali Alridha Abdulkarim, Oleg Sautenkov, Dzmitry Tsetserukou

    Abstract: RaceVLA presents an innovative approach for autonomous racing drone navigation by leveraging Visual-Language-Action (VLA) to emulate human-like behavior. This research explores the integration of advanced algorithms that enable drones to adapt their navigation strategies based on real-time environmental feedback, mimicking the decision-making processes of human pilots. The model, fine-tuned on a c…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: 6 pages, 6 figures. Submitted to IROS 2025

  16. arXiv:2503.02465  [pdf, other]

    cs.RO eess.SY

    UAV-VLRR: Vision-Language Informed NMPC for Rapid Response in UAV Search and Rescue

    Authors: Yasheerah Yaqoot, Muhammad Ahsan Mustafa, Oleg Sautenkov, Artem Lykov, Valerii Serpiva, Dzmitry Tsetserukou

    Abstract: Emergency search and rescue (SAR) operations often require rapid and precise target identification in complex environments where traditional manual drone control is inefficient. In order to address these scenarios, a rapid SAR system, UAV-VLRR (Vision-Language-Rapid-Response), is developed in this research. This system consists of two aspects: 1) A multimodal system which harnesses the power of Vi…

    Submitted 13 May, 2025; v1 submitted 4 March, 2025; originally announced March 2025.

    Comments: UAV-VLRR

  17. arXiv:2503.02454  [pdf, other]

    cs.RO

    UAV-VLPA*: A Vision-Language-Path-Action System for Optimal Route Generation on a Large Scales

    Authors: Oleg Sautenkov, Aibek Akhmetkazy, Yasheerah Yaqoot, Muhammad Ahsan Mustafa, Grik Tadevosyan, Artem Lykov, Dzmitry Tsetserukou

    Abstract: The UAV-VLPA* (Visual-Language-Planning-and-Action) system represents a cutting-edge advancement in aerial robotics, designed to enhance communication and operational efficiency for unmanned aerial vehicles (UAVs). By integrating advanced planning capabilities, the system addresses the Traveling Salesman Problem (TSP) to optimize flight paths, reducing the total trajectory length by 18.5% compare…

    Submitted 14 May, 2025; v1 submitted 4 March, 2025; originally announced March 2025.

    Comments: arXiv admin note: text overlap with arXiv:2501.05014

  18. arXiv:2503.01378  [pdf, other]

    cs.RO

    CognitiveDrone: A VLA Model and Evaluation Benchmark for Real-Time Cognitive Task Solving and Reasoning in UAVs

    Authors: Artem Lykov, Valerii Serpiva, Muhammad Haris Khan, Oleg Sautenkov, Artyom Myshlyaev, Grik Tadevosyan, Yasheerah Yaqoot, Dzmitry Tsetserukou

    Abstract: This paper introduces CognitiveDrone, a novel Vision-Language-Action (VLA) model tailored for complex Unmanned Aerial Vehicle (UAV) tasks that demand advanced cognitive abilities. Trained on a dataset comprising over 8,000 simulated flight trajectories across three key categories (Human Recognition, Symbol Understanding, and Reasoning), the model generates real-time 4D action commands based on firs…

    Submitted 3 March, 2025; originally announced March 2025.

    Comments: Paper submitted to the IEEE conference

  19. arXiv:2502.20108  [pdf, other]

    cs.CV cs.RO

    VDT-Auto: End-to-end Autonomous Driving with VLM-Guided Diffusion Transformers

    Authors: Ziang Guo, Konstantin Gubernatorov, Selamawit Asfaw, Zakhar Yagudin, Dzmitry Tsetserukou

    Abstract: In autonomous driving, dynamic environments and corner cases pose significant challenges to the robustness of the ego vehicle's decision-making. To address these challenges, commencing with the representation of state-action mapping in the end-to-end autonomous driving paradigm, we introduce a novel pipeline, VDT-Auto. Leveraging the advancement of the state understanding of Visual Language Model (VLM)…

    Submitted 1 March, 2025; v1 submitted 27 February, 2025; originally announced February 2025.

    Comments: Submitted paper

  20. arXiv:2502.17034  [pdf, other]

    cs.RO cs.NE

    Evolution 6.0: Evolving Robotic Capabilities Through Generative Design

    Authors: Muhammad Haris Khan, Artyom Myshlyaev, Artem Lykov, Miguel Altamirano Cabrera, Dzmitry Tsetserukou

    Abstract: We propose a new concept, Evolution 6.0, which represents the evolution of robotics driven by Generative AI. When a robot lacks the necessary tools to accomplish a task requested by a human, it autonomously designs the required instruments and learns how to use them to achieve the goal. Evolution 6.0 is an autonomous robotic system powered by Vision-Language Models (VLMs), Vision-Language Action (…

    Submitted 4 April, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

    Comments: Submitted to IROS

  21. arXiv:2502.06725  [pdf, other]

    cs.RO

    AgilePilot: DRL-Based Drone Agent for Real-Time Motion Planning in Dynamic Environments by Leveraging Object Detection

    Authors: Roohan Ahmed Khan, Valerii Serpiva, Demetros Aschalew, Aleksey Fedoseev, Dzmitry Tsetserukou

    Abstract: Autonomous drone navigation in dynamic environments remains a critical challenge, especially when dealing with unpredictable scenarios including fast-moving objects with rapidly changing goal positions. While traditional planners and classical optimisation methods have been extensively used to address this dynamic problem, they often face real-time, unpredictable changes that ultimately lead to s…

    Submitted 21 April, 2025; v1 submitted 10 February, 2025; originally announced February 2025.

    Comments: Manuscript has been accepted at 2025 INTERNATIONAL CONFERENCE ON UNMANNED AIRCRAFT SYSTEMS (ICUAS)

  22. arXiv:2502.06722  [pdf, other]

    cs.RO

    HetSwarm: Cooperative Navigation of Heterogeneous Swarm in Dynamic and Dense Environments through Impedance-based Guidance

    Authors: Malaika Zafar, Roohan Ahmed Khan, Aleksey Fedoseev, Kumar Katyayan Jaiswal, Dzmitry Tsetserukou

    Abstract: With the growing demand for efficient logistics and warehouse management, unmanned aerial vehicles (UAVs) are emerging as a valuable complement to automated guided vehicles (AGVs). UAVs enhance efficiency by navigating dense environments and operating at varying altitudes. However, their limited flight time, battery life, and payload capacity necessitate a supporting ground station. To address the…

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: Manuscript has been submitted to ICUAS-2025

  23. arXiv:2502.02311  [pdf, other]

    cs.RO cs.LG cs.MA

    MAGNNET: Multi-Agent Graph Neural Network-based Efficient Task Allocation for Autonomous Vehicles with Deep Reinforcement Learning

    Authors: Lavanya Ratnabala, Aleksey Fedoseev, Robinroy Peter, Dzmitry Tsetserukou

    Abstract: This paper addresses the challenge of decentralized task allocation within heterogeneous multi-agent systems operating under communication constraints. We introduce a novel framework that integrates graph neural networks (GNNs) with a centralized training and decentralized execution (CTDE) paradigm, further enhanced by a tailored Proximal Policy Optimization (PPO) algorithm for multi-agent deep re…

    Submitted 20 February, 2025; v1 submitted 4 February, 2025; originally announced February 2025.

    Comments: Submitted to IEEE Intelligent Vehicle Symposium (2025)

  24. SafeSwarm: Decentralized Safe RL for the Swarm of Drones Landing in Dense Crowds

    Authors: Grik Tadevosyan, Maksim Osipenko, Demetros Aschu, Aleksey Fedoseev, Valerii Serpiva, Oleg Sautenkov, Sausar Karaf, Dzmitry Tsetserukou

    Abstract: This paper introduces a safe swarm of drones capable of performing landings in crowded environments robustly by relying on Reinforcement Learning techniques combined with Safe Learning. The developed system allows us to teach the swarm of drones with different dynamics to land on moving landing pads in an environment while avoiding collisions with obstacles and between agents. The safe barrier n…

    Submitted 13 January, 2025; originally announced January 2025.

    Report number: 1665--1669

    Journal ref: Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction

  25. arXiv:2501.07299  [pdf, other]

    cs.RO

    ViewVR: Visual Feedback Modes to Achieve Quality of VR-based Telemanipulation

    Authors: A. Erkhov, A. Bazhenov, S. Satsevich, D. Belov, F. Khabibullin, S. Egorov, M. Gromakov, M. Altamirano Cabrera, D. Tsetserukou

    Abstract: The paper focuses on an immersive teleoperation system that enhances the operator's ability to actively perceive the robot's surroundings. A consumer-grade HTC Vive VR system was used to synchronize the operator's hand and head movements with a UR3 robot and a custom-built robotic head with two degrees of freedom (2-DoF). The system's usability, manipulation efficiency, and intuitiveness of control we…

    Submitted 13 January, 2025; originally announced January 2025.

  26. arXiv:2501.07295  [pdf, other]

    cs.RO

    GestLLM: Advanced Hand Gesture Interpretation via Large Language Models for Human-Robot Interaction

    Authors: Oleg Kobzarev, Artem Lykov, Dzmitry Tsetserukou

    Abstract: This paper introduces GestLLM, an advanced system for human-robot interaction that enables intuitive robot control through hand gestures. Unlike conventional systems, which rely on a limited set of predefined gestures, GestLLM leverages large language models and feature extraction via MediaPipe to interpret a diverse range of gestures. This integration addresses key limitations in existing systems…

    Submitted 14 January, 2025; v1 submitted 13 January, 2025; originally announced January 2025.

  27. arXiv:2501.07255  [pdf, other]

    cs.RO

    GazeGrasp: DNN-Driven Robotic Grasping with Wearable Eye-Gaze Interface

    Authors: Issatay Tokmurziyev, Miguel Altamirano Cabrera, Luis Moreno, Muhammad Haris Khan, Dzmitry Tsetserukou

    Abstract: We present GazeGrasp, a gaze-based manipulation system enabling individuals with motor impairments to control collaborative robots using eye-gaze. The system employs an ESP32 CAM for eye tracking, MediaPipe for gaze detection, and YOLOv8 for object localization, integrated with a Universal Robot UR10 for manipulation tasks. After user-specific calibration, the system allows intuitive object select…

    Submitted 14 January, 2025; v1 submitted 13 January, 2025; originally announced January 2025.

    Comments: Accepted to: IEEE/ACM International Conference on Human-Robot Interaction (HRI 2025)

  28. arXiv:2501.06919  [pdf, other]

    cs.RO

    Shake-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Manipulations and Liquid Mixing

    Authors: Muhammad Haris Khan, Selamawit Asfaw, Dmitrii Iarchuk, Miguel Altamirano Cabrera, Luis Moreno, Issatay Tokmurziyev, Dzmitry Tsetserukou

    Abstract: This paper introduces Shake-VLA, a Vision-Language-Action (VLA) model-based system designed to enable bimanual robotic manipulation for automated cocktail preparation. The system integrates a vision module for detecting ingredient bottles and reading labels, a speech-to-text module for interpreting user commands, and a language model to generate task-specific robotic instructions. Force Torque (FT…

    Submitted 12 January, 2025; originally announced January 2025.

    Comments: Accepted to IEEE/ACM HRI 2025

  29. arXiv:2501.05014  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    UAV-VLA: Vision-Language-Action System for Large Scale Aerial Mission Generation

    Authors: Oleg Sautenkov, Yasheerah Yaqoot, Artem Lykov, Muhammad Ahsan Mustafa, Grik Tadevosyan, Aibek Akhmetkazy, Miguel Altamirano Cabrera, Mikhail Martynov, Sausar Karaf, Dzmitry Tsetserukou

    Abstract: The UAV-VLA (Visual-Language-Action) system is a tool designed to facilitate communication with aerial robots. By integrating satellite imagery processing with the Visual Language Model (VLM) and the powerful capabilities of GPT, UAV-VLA enables users to generate general flight paths-and-action plans through simple text requests. This system leverages the rich contextual information provided by sa…

    Submitted 13 May, 2025; v1 submitted 9 January, 2025; originally announced January 2025.

    Comments: HRI 2025

  30. arXiv:2411.18295  [pdf, other]

    cs.RO

    Optimizing energy consumption for legged robot by adapting equilibrium position and stiffness of a parallel torsion spring

    Authors: Danil Belov, Artem Erkhov, Farit Khabibullin, Elisaveta Pestova, Sergei Satsevich, Ilya Osokin, Pavel Osinenko, Dzmitry Tsetserukou

    Abstract: This paper is dedicated to the development of a novel adaptive torsion spring mechanism for optimizing energy consumption in legged robots. By adjusting the equilibrium position and stiffness of the spring, the system improves energy efficiency during cyclic movements, such as walking and jumping. The adaptive compliance mechanism, consisting of a torsion spring combined with a worm gear driven by…

    Submitted 27 November, 2024; originally announced November 2024.

  31. arXiv:2411.05107  [pdf, other]

    cs.RO

    MissionGPT: Mission Planner for Mobile Robot based on Robotics Transformer Model

    Authors: Vladimir Berman, Artem Bazhenov, Dzmitry Tsetserukou

    Abstract: This paper presents a novel approach to building mission planners based on neural networks with Transformer architecture and Large Language Models (LLMs). This approach demonstrates the possibility of setting a task for a mobile robot and its successful execution without the use of perception algorithms, based only on the data coming from the camera. In this work, a success rate of more than 50%…

    Submitted 7 November, 2024; originally announced November 2024.

  32. arXiv:2410.16943  [pdf, other]

    cs.RO

    FlightAR: AR Flight Assistance Interface with Multiple Video Streams and Object Detection Aimed at Immersive Drone Control

    Authors: Oleg Sautenkov, Selamawit Asfaw, Yasheerah Yaqoot, Muhammad Ahsan Mustafa, Aleksey Fedoseev, Daria Trinitatova, Dzmitry Tsetserukou

    Abstract: The swift advancement of unmanned aerial vehicle (UAV) technologies necessitates new standards for developing human-drone interaction (HDI) interfaces. Most interfaces for HDI, especially first-person view (FPV) goggles, limit the operator's ability to obtain information from the environment. This paper presents a novel interface, FlightAR, that integrates augmented reality (AR) overlays of UAV fi…

    Submitted 22 October, 2024; originally announced October 2024.

    Comments: Manuscript accepted in IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2024)

  33. arXiv:2410.16202  [pdf, other]

    cs.HC

    Musinger: Communication of Music over a Distance with Wearable Haptic Display and Touch Sensitive Surface

    Authors: Miguel Altamirano Cabrera, Muhammad Haris Khan, Ali Alabbas, Luis Moreno, Issatay Tokmurziyev, Dzmitry Tsetserukou

    Abstract: This study explores the integration of auditory and tactile experiences in musical haptics, focusing on enhancing sensory dimensions of music through touch. Addressing the gap in translating auditory signals to meaningful tactile feedback, our research introduces a novel method involving a touch-sensitive recorder and a wearable haptic display that captures musical interactions via force sensors a…

    Submitted 21 October, 2024; originally announced October 2024.

    Comments: This paper has been accepted for publication at ROBIO 2024 conference

  34. SwarmPath: Drone Swarm Navigation through Cluttered Environments Leveraging Artificial Potential Field and Impedance Control

    Authors: Roohan Ahmed Khan, Malaika Zafar, Amber Batool, Aleksey Fedoseev, Dzmitry Tsetserukou

    Abstract: In the area of multi-drone systems, navigating through dynamic environments from start to goal while providing collision-free trajectory and efficient path planning is a significant challenge. To solve this problem, we propose a novel SwarmPath technology that involves the integration of Artificial Potential Field (APF) with Impedance Controller. The proposed approach provides a solution based on…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: Manuscript accepted in IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2024)

  35. arXiv:2410.07801  [pdf, other]

    cs.RO cs.CV cs.SE eess.SY

    LucidGrasp: Robotic Framework for Autonomous Manipulation of Laboratory Equipment with Different Degrees of Transparency via 6D Pose Estimation

    Authors: Maria Makarova, Daria Trinitatova, Qian Liu, Dzmitry Tsetserukou

    Abstract: Many modern robotic systems operate autonomously; however, they often lack the ability to accurately analyze the environment and adapt to changing external conditions, while teleoperation systems often require special operator skills. In the field of laboratory automation, the number of automated processes is growing; however, such systems are usually developed to perform specific tasks. In addition…

    Submitted 31 October, 2024; v1 submitted 10 October, 2024; originally announced October 2024.

    Comments: Accepted to the 2024 IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2024), 6 pages, 8 figures

  36. arXiv:2410.05405  [pdf, other]

    cs.RO

    SharpSLAM: 3D Object-Oriented Visual SLAM with Deblurring for Agile Drones

    Authors: Denis Davletshin, Iana Zhura, Vladislav Cheremnykh, Mikhail Rybiyanov, Aleksey Fedoseev, Dzmitry Tsetserukou

    Abstract: The paper focuses on an algorithm for improving the quality of 3D reconstruction and segmentation in DSP-SLAM by enhancing the RGB image quality. The SharpSLAM algorithm we developed aims to decrease the influence of high dynamic motion on visual object-oriented SLAM through image deblurring, improving all aspects of object-oriented SLAM, including localization, mapping, and object reconstruction.…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: Manuscript accepted to IEEE Telepresence 2024

  37. arXiv:2409.15838  [pdf, other]

    cs.RO

    TiltXter: CNN-based Electro-tactile Rendering of Tilt Angle for Telemanipulation of Pasteur Pipettes

    Authors: Miguel Altamirano Cabrera, Jonathan Tirado, Aleksey Fedoseev, Oleg Sautenkov, Vladimir Poliakov, Pavel Kopanev, Dzmitry Tsetserukou

    Abstract: The shape of deformable objects can change drastically during grasping by robotic grippers, causing an ambiguous perception of their alignment and hence resulting in errors in robot positioning and telemanipulation. Rendering clear tactile patterns is fundamental to increasing users' precision and dexterity through tactile haptic feedback during telemanipulation. Therefore, different methods have…

    Submitted 24 September, 2024; originally announced September 2024.

    Comments: Manuscript accepted to IEEE Telepresence 2024. arXiv admin note: text overlap with arXiv:2204.03521 by other authors

  38. arXiv:2409.12667  [pdf, other]

    cs.RO cs.CV

    METDrive: Multi-modal End-to-end Autonomous Driving with Temporal Guidance

    Authors: Ziang Guo, Xinhao Lin, Zakhar Yagudin, Artem Lykov, Yong Wang, Yanqiang Li, Dzmitry Tsetserukou

    Abstract: Multi-modal end-to-end autonomous driving has shown promising advancements in recent work. By embedding more modalities into end-to-end networks, the system's understanding of both static and dynamic aspects of the driving environment is enhanced, thereby improving the safety of autonomous driving. In this paper, we introduce METDrive, an end-to-end system that leverages temporal guidance from the…

    Submitted 14 May, 2025; v1 submitted 19 September, 2024; originally announced September 2024.

    Comments: Accepted by ICRA

  39. arXiv:2409.10106  [pdf, other]

    cs.RO cs.AI

    Industry 6.0: New Generation of Industry driven by Generative AI and Swarm of Heterogeneous Robots

    Authors: Artem Lykov, Miguel Altamirano Cabrera, Mikhail Konenkov, Valerii Serpiva, Koffivi Fidèle Gbagbe, Ali Alabbas, Aleksey Fedoseev, Luis Moreno, Muhammad Haris Khan, Ziang Guo, Dzmitry Tsetserukou

    Abstract: This paper presents the concept of Industry 6.0, introducing the world's first fully automated production system that autonomously handles the entire product design and manufacturing process based on user-provided natural language descriptions. By leveraging generative AI, the system automates critical aspects of production, including product blueprint design, component manufacturing, logistics, a…

    Submitted 16 September, 2024; originally announced September 2024.

    Comments: submitted to IEEE conf

  40. arXiv:2409.00766  [pdf, other]

    cs.RO cs.MA

    Dynamic Subgoal based Path Formation and Task Allocation: A NeuroFleets Approach to Scalable Swarm Robotics

    Authors: Robinroy Peter, Lavanya Ratnabala, Eugene Yugarajah Andrew Charles, Dzmitry Tsetserukou

    Abstract: This paper addresses the challenges of exploration and navigation in unknown environments from the perspective of evolutionary swarm robotics. A key focus is on path formation, which is essential for enabling cooperative swarm robots to navigate effectively. We designed the task allocation and path formation process based on a finite state machine, ensuring systematic decision-making and efficient…

    Submitted 1 September, 2024; originally announced September 2024.

    Comments: arXiv admin note: text overlap with arXiv:2312.16606

  41. arXiv:2407.15622  [pdf, other]

    cs.RO

    HyperSurf: Quadruped Robot Leg Capable of Surface Recognition with GRU and Real-to-Sim Transferring

    Authors: Sergei Satsevich, Yaroslav Savotin, Danil Belov, Elizaveta Pestova, Artem Erhov, Batyr Khabibullin, Artem Bazhenov, Vyacheslav Kovalev, Aleksey Fedoseev, Dzmitry Tsetserukou

    Abstract: This paper introduces a system of data collection acceleration and real-to-sim transferring for surface recognition on a quadruped robot. The system features a mechanical single-leg setup capable of stepping on various easily interchangeable surfaces. Additionally, it incorporates a GRU-based Surface Recognition System, inspired by the system detailed in the Dog-Surf paper. This setup facilitates…

    Submitted 19 August, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: IEEE SMC 2024

  42. arXiv:2407.10865  [pdf, other]

    cs.RO

    AirNeRF: 3D Reconstruction of Human with Drone and NeRF for Future Communication Systems

    Authors: Alexey Kotcov, Maria Dronova, Vladislav Cheremnykh, Sausar Karaf, Dzmitry Tsetserukou

    Abstract: In the rapidly evolving landscape of digital content creation, the demand for fast, convenient, and autonomous methods of crafting detailed 3D reconstructions of humans has grown significantly. Addressing this pressing need, our AirNeRF system presents an innovative pathway to the creation of a realistic 3D human avatar. Our approach leverages Neural Radiance Fields (NeRF) with an automated drone-…

    Submitted 15 July, 2024; originally announced July 2024.

  43. arXiv:2407.09841  [pdf, other]

    cs.RO

    OmniRace: 6D Hand Pose Estimation for Intuitive Guidance of Racing Drone

    Authors: Valerii Serpiva, Aleksey Fedoseev, Sausar Karaf, Ali Alridha Abdulkarim, Dzmitry Tsetserukou

    Abstract: This paper presents the OmniRace approach to controlling a racing drone with 6-degree-of-freedom (DoF) hand pose estimation and gesture recognition. To our knowledge, it is the first-ever technology that allows for low-level control of high-speed drones using gestures. OmniRace employs a gesture interface based on computer vision and a deep neural network to estimate a 6-DoF hand pose. The advance…

    Submitted 21 October, 2024; v1 submitted 13 July, 2024; originally announced July 2024.

  44. arXiv:2407.09625  [pdf, other]

    cs.RO

    MorphoMove: Bi-Modal Path Planner with MPC-based Path Follower for Multi-Limb Morphogenetic UAV

    Authors: Muhammad Ahsan Mustafa, Yasheerah Yaqoot, Mikhail Martynov, Sausar Karaf, Dzmitry Tsetserukou

    Abstract: This paper discusses developments for a multi-limb morphogenetic UAV, MorphoGear, that is capable of both aerial flight and ground locomotion. A hybrid path planning algorithm based on the A* strategy has been developed, enabling seamless transition between air-to-ground navigation modes, thereby enhancing the robot's mobility in complex environments. Moreover, precise path following is achieved durin…

    Submitted 21 August, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

    Comments: Accepted in IEEE International Conference on Systems, Man, and Cybernetics (SMC 2024)

  45. arXiv:2407.09459  [pdf, other]

    cs.RO

    GazeRace: Revolutionizing Remote Piloting with Eye-Gaze Control

    Authors: Issatay Tokmurziyev, Valerii Serpiva, Alexey Fedoseev, Miguel Altamirano Cabrera, Dzmitry Tsetserukou

    Abstract: This paper presents GazeRace, a novel system that leverages eye-tracking technology for intuitive drone control. Using the MediaPipe library, the system translates eye movements into precise drone commands, enabling effective remote piloting. In testing, GazeRace demonstrated an 18% reduction in drone trajectory length while maintaining competitive speed with traditional controls. The results sugg…

    Submitted 21 August, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

    Comments: Accepted in: IEEE International Conference on Systems, Man, and Cybernetics (SMC 2024)

  46. arXiv:2406.16164  [pdf, other]

    cs.RO cs.MA

    TornadoDrone: Bio-inspired DRL-based Drone Landing on 6D Platform with Wind Force Disturbances

    Authors: Robinroy Peter, Lavanya Ratnabala, Demetros Aschu, Aleksey Fedoseev, Dzmitry Tsetserukou

    Abstract: Autonomous drone navigation faces a critical challenge in achieving accurate landings on dynamic platforms, especially under unpredictable conditions such as wind turbulence. Our research introduces TornadoDrone, a novel Deep Reinforcement Learning (DRL) model that adopts bio-inspired mechanisms to adapt to wind forces, mirroring the natural adaptability seen in birds. This model, unlike tradition…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE. arXiv admin note: substantial text overlap with arXiv:2403.06572

  47. arXiv:2406.04159  [pdf, other]

    cs.RO cs.MA

    MARLander: A Local Path Planning for Drone Swarms using Multiagent Deep Reinforcement Learning

    Authors: Demetros Aschu, Robinroy Peter, Sausar Karaf, Aleksey Fedoseev, Dzmitry Tsetserukou

    Abstract: Achieving safe and precise landings for a swarm of drones poses a significant challenge, primarily attributed to conventional control and planning methods. This paper presents the implementation of multi-agent deep reinforcement learning (MADRL) techniques for the precise landing of a drone swarm at relocated target locations. The system is trained in a realistic simulated environment with a maxim…

    Submitted 6 June, 2024; originally announced June 2024.

  48. arXiv:2405.11682  [pdf, other]

    cs.CV cs.RO

    FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention

    Authors: Ziang Guo, Zakhar Yagudin, Selamawit Asfaw, Artem Lykov, Dzmitry Tsetserukou

    Abstract: Camera, LiDAR and radar are common perception sensors for autonomous driving tasks. Robust prediction of 3D object detection is optimally based on the fusion of these sensors. To exploit their abilities wisely remains a challenge because each of these sensors has its own characteristics. In this paper, we propose FADet, a multi-sensor 3D detection network, which specifically studies the characteri…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: Submitted to IEEE

  49. arXiv:2405.11537  [pdf, other]

    cs.RO cs.AI cs.ET

    VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications

    Authors: Mikhail Konenkov, Artem Lykov, Daria Trinitatova, Dzmitry Tsetserukou

    Abstract: The advent of immersive Virtual Reality applications has transformed various domains, yet their integration with advanced artificial intelligence technologies like Visual Language Models remains underexplored. This study introduces a pioneering approach utilizing VLMs within VR environments to enhance user interaction and task efficiency. Leveraging the Unity engine and a custom-developed VLM, our…

    Submitted 3 August, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Comments: Updated version

  50. arXiv:2405.09310  [pdf, other]

    cs.RO

    GrainGrasp: Dexterous Grasp Generation with Fine-grained Contact Guidance

    Authors: Fuqiang Zhao, Dzmitry Tsetserukou, Qian Liu

    Abstract: One goal of dexterous robotic grasping is to allow robots to handle objects with the same level of flexibility and adaptability as humans. However, it remains a challenging task to generate an optimal grasping strategy for dexterous hands, especially when it comes to delicate manipulation and accurate adjustment of the desired grasping poses for objects of varying shapes and sizes. In this paper, we…

    Submitted 15 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    Comments: This paper is accepted by the ICRA2024