-
Learning Symbolic Persistent Macro-Actions for POMDP Solving Over Time
Authors:
Celeste Veronese,
Daniele Meli,
Alessandro Farinelli
Abstract:
This paper proposes an integration of temporal logical reasoning and Partially Observable Markov Decision Processes (POMDPs) to achieve interpretable decision-making under uncertainty with macro-actions. Our method leverages a fragment of Linear Temporal Logic (LTL) based on Event Calculus (EC) to generate \emph{persistent} (i.e., constant) macro-actions, which guide Monte Carlo Tree Search (MCTS)…
▽ More
This paper proposes an integration of temporal logical reasoning and Partially Observable Markov Decision Processes (POMDPs) to achieve interpretable decision-making under uncertainty with macro-actions. Our method leverages a fragment of Linear Temporal Logic (LTL) based on Event Calculus (EC) to generate \emph{persistent} (i.e., constant) macro-actions, which guide Monte Carlo Tree Search (MCTS)-based POMDP solvers over a time horizon, significantly reducing inference time while ensuring robust performance. Such macro-actions are learnt via Inductive Logic Programming (ILP) from a few traces of execution (belief-action pairs), thus eliminating the need for manually designed heuristics and requiring only the specification of the POMDP transition model. In the Pocman and Rocksample benchmark scenarios, our learned macro-actions demonstrate increased expressiveness and generality when compared to time-independent heuristics, indeed offering substantial computational efficiency improvements.
△ Less
Submitted 6 May, 2025;
originally announced May 2025.
-
Depth-Constrained ASV Navigation with Deep RL and Limited Sensing
Authors:
Amirhossein Zhalehmehrabi,
Daniele Meli,
Francesco Dal Santo,
Francesco Trotti,
Alessandro Farinelli
Abstract:
Autonomous Surface Vehicles (ASVs) play a crucial role in maritime operations, yet their navigation in shallow-water environments remains challenging due to dynamic disturbances and depth constraints. Traditional navigation strategies struggle with limited sensor information, making safe and efficient operation difficult. In this paper, we propose a reinforcement learning (RL) framework for ASV na…
▽ More
Autonomous Surface Vehicles (ASVs) play a crucial role in maritime operations, yet their navigation in shallow-water environments remains challenging due to dynamic disturbances and depth constraints. Traditional navigation strategies struggle with limited sensor information, making safe and efficient operation difficult. In this paper, we propose a reinforcement learning (RL) framework for ASV navigation under depth constraints, where the vehicle must reach a target while avoiding unsafe areas with only a single depth measurement per timestep from a downward-facing Single Beam Echosounder (SBES). To enhance environmental awareness, we integrate Gaussian Process (GP) regression into the RL framework, enabling the agent to progressively estimate a bathymetric depth map from sparse sonar readings. This approach improves decision-making by providing a richer representation of the environment. Furthermore, we demonstrate effective sim-to-real transfer, ensuring that trained policies generalize well to real-world aquatic conditions. Experimental results validate our method's capability to improve ASV navigation performance while maintaining safety in challenging shallow-water environments.
△ Less
Submitted 2 June, 2025; v1 submitted 25 April, 2025;
originally announced April 2025.
-
Monte Carlo Tree Search with Velocity Obstacles for safe and efficient motion planning in dynamic environments
Authors:
Lorenzo Bonanni,
Daniele Meli,
Alberto Castellini,
Alessandro Farinelli
Abstract:
Online motion planning is a challenging problem for intelligent robots moving in dense environments with dynamic obstacles, e.g., crowds. In this work, we propose a novel approach for optimal and safe online motion planning with minimal information about dynamic obstacles. Specifically, our approach requires only the current position of the obstacles and their maximum speed, but it does not need a…
▽ More
Online motion planning is a challenging problem for intelligent robots moving in dense environments with dynamic obstacles, e.g., crowds. In this work, we propose a novel approach for optimal and safe online motion planning with minimal information about dynamic obstacles. Specifically, our approach requires only the current position of the obstacles and their maximum speed, but it does not need any information about their exact trajectories or dynamic model. The proposed methodology combines Monte Carlo Tree Search (MCTS), for online optimal planning via model simulations, with Velocity Obstacles (VO), for obstacle avoidance. We perform experiments in a cluttered simulated environment with walls, and up to 40 dynamic obstacles moving with random velocities and directions. With an ablation study, we show the key contribution of VO in scaling up the efficiency of MCTS, selecting the safest and most rewarding actions in the tree of simulations. Moreover, we show the superiority of our methodology with respect to state-of-the-art planners, including Non-linear Model Predictive Control (NMPC), in terms of improved collision rate, computational and task performance.
△ Less
Submitted 16 January, 2025;
originally announced January 2025.
-
Inductive Learning of Robot Task Knowledge from Raw Data and Online Expert Feedback
Authors:
Daniele Meli,
Paolo Fiorini
Abstract:
The increasing level of autonomy of robots poses challenges of trust and social acceptance, especially in human-robot interaction scenarios. This requires an interpretable implementation of robotic cognitive capabilities, possibly based on formal methods as logics for the definition of task specifications. However, prior knowledge is often unavailable in complex realistic scenarios.
In this pape…
▽ More
The increasing level of autonomy of robots poses challenges of trust and social acceptance, especially in human-robot interaction scenarios. This requires an interpretable implementation of robotic cognitive capabilities, possibly based on formal methods as logics for the definition of task specifications. However, prior knowledge is often unavailable in complex realistic scenarios.
In this paper, we propose an offline algorithm based on inductive logic programming from noisy examples to extract task specifications (i.e., action preconditions, constraints and effects) directly from raw data of few heterogeneous (i.e., not repetitive) robotic executions. Our algorithm leverages on the output of any unsupervised action identification algorithm from video-kinematic recordings. Combining it with the definition of very basic, almost task-agnostic, commonsense concepts about the environment, which contribute to the interpretability of our methodology, we are able to learn logical axioms encoding preconditions of actions, as well as their effects in the event calculus paradigm. Since the quality of learned specifications depends mainly on the accuracy of the action identification algorithm, we also propose an online framework for incremental refinement of task knowledge from user feedback, guaranteeing safe execution. Results in a standard manipulation task and benchmark for user training in the safety-critical surgical robotic scenario, show the robustness, data- and time-efficiency of our methodology, with promising results towards the scalability in more complex domains.
△ Less
Submitted 13 January, 2025;
originally announced January 2025.
-
Online inductive learning from answer sets for efficient reinforcement learning exploration
Authors:
Celeste Veronese,
Daniele Meli,
Alessandro Farinelli
Abstract:
This paper presents a novel approach combining inductive logic programming with reinforcement learning to improve training performance and explainability. We exploit inductive learning of answer set programs from noisy examples to learn a set of logical rules representing an explainable approximation of the agent policy at each batch of experience. We then perform answer set reasoning on the learn…
▽ More
This paper presents a novel approach combining inductive logic programming with reinforcement learning to improve training performance and explainability. We exploit inductive learning of answer set programs from noisy examples to learn a set of logical rules representing an explainable approximation of the agent policy at each batch of experience. We then perform answer set reasoning on the learned rules to guide the exploration of the learning agent at the next batch, without requiring inefficient reward shaping and preserving optimality with soft bias. The entire procedure is conducted during the online execution of the reinforcement learning algorithm. We preliminarily validate the efficacy of our approach by integrating it into the Q-learning algorithm for the Pac-Man scenario in two maps of increasing complexity. Our methodology produces a significant boost in the discounted return achieved by the agent, even in the first batches of training. Moreover, inductive learning does not compromise the computational time required by Q-learning and learned rules quickly converge to an explanation of the agent policy.
△ Less
Submitted 13 January, 2025;
originally announced January 2025.
-
Explainable Online Unsupervised Anomaly Detection for Cyber-Physical Systems via Causal Discovery from Time Series
Authors:
Daniele Meli
Abstract:
Online unsupervised detection of anomalies is crucial to guarantee the correct operation of cyber-physical systems and the safety of humans interacting with them. State-of-the-art approaches based on deep learning via neural networks achieve outstanding performance at anomaly recognition, evaluating the discrepancy between a normal model of the system (with no anomalies) and the real-time stream o…
▽ More
Online unsupervised detection of anomalies is crucial to guarantee the correct operation of cyber-physical systems and the safety of humans interacting with them. State-of-the-art approaches based on deep learning via neural networks achieve outstanding performance at anomaly recognition, evaluating the discrepancy between a normal model of the system (with no anomalies) and the real-time stream of sensor time series. However, large training data and time are typically required, and explainability is still a challenge to identify the root of the anomaly and implement predictive maintainance. In this paper, we use causal discovery to learn a normal causal graph of the system, and we evaluate the persistency of causal links during real-time acquisition of sensor data to promptly detect anomalies. On two benchmark anomaly detection datasets, we show that our method has higher training efficiency, outperforms the accuracy of state-of-the-art neural architectures and correctly identifies the sources of >10 different anomalies. The code is at https://github.com/Isla-lab/causal_anomaly_detection.
△ Less
Submitted 28 July, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Planning and Inverse Kinematics of Hyper-Redundant Manipulators with VO-FABRIK
Authors:
Cristian Morasso,
Daniele Meli,
Yann Divet,
Salvatore Sessa,
Alessandro Farinelli
Abstract:
Hyper-redundant Robotic Manipulators (HRMs) offer great dexterity and flexibility of operation, but solving Inverse Kinematics (IK) is challenging. In this work, we introduce VO-FABRIK, an algorithm combining Forward and Backward Reaching Inverse Kinematics (FABRIK) for repeatable deterministic IK computation, and an approach inspired from velocity obstacles to perform path planning under collisio…
▽ More
Hyper-redundant Robotic Manipulators (HRMs) offer great dexterity and flexibility of operation, but solving Inverse Kinematics (IK) is challenging. In this work, we introduce VO-FABRIK, an algorithm combining Forward and Backward Reaching Inverse Kinematics (FABRIK) for repeatable deterministic IK computation, and an approach inspired from velocity obstacles to perform path planning under collision and joint limits constraints. We show preliminary results on an industrial HRM with 19 actuated joints. Our algorithm achieves good performance where a state-of-the-art IK solver fails.
△ Less
Submitted 8 March, 2024;
originally announced March 2024.
-
Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach
Authors:
Daniele Meli,
Alberto Castellini,
Alessandro Farinelli
Abstract:
Partially Observable Markov Decision Processes (POMDPs) are a powerful framework for planning under uncertainty. They allow to model state uncertainty as a belief probability distribution. Approximate solvers based on Monte Carlo sampling show great success to relax the computational demand and perform online planning. However, scaling to complex realistic domains with many actions and long planni…
▽ More
Partially Observable Markov Decision Processes (POMDPs) are a powerful framework for planning under uncertainty. They allow to model state uncertainty as a belief probability distribution. Approximate solvers based on Monte Carlo sampling show great success to relax the computational demand and perform online planning. However, scaling to complex realistic domains with many actions and long planning horizons is still a major challenge, and a key point to achieve good performance is guiding the action-selection process with domain-dependent policy heuristics which are tailored for the specific application domain. We propose to learn high-quality heuristics from POMDP traces of executions generated by any solver. We convert the belief-action pairs to a logical semantics, and exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications, which are then used as online heuristics. We evaluate thoroughly our methodology on two notoriously challenging POMDP problems, involving large action spaces and long planning horizons, namely, rocksample and pocman. Considering different state-of-the-art online POMDP solvers, including POMCP, DESPOT and AdaOPS, we show that learned heuristics expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics within lower computational time. Moreover, they well generalize to more challenging scenarios not experienced in the training phase (e.g., increasing rocks and grid size in rocksample, incrementing the size of the map and the aggressivity of ghosts in pocman).
△ Less
Submitted 29 February, 2024;
originally announced February 2024.
-
Learning Logic Specifications for Soft Policy Guidance in POMCP
Authors:
Giulio Mazzi,
Daniele Meli,
Alberto Castellini,
Alessandro Farinelli
Abstract:
Partially Observable Monte Carlo Planning (POMCP) is an efficient solver for Partially Observable Markov Decision Processes (POMDPs). It allows scaling to large state spaces by computing an approximation of the optimal policy locally and online, using a Monte Carlo Tree Search based strategy. However, POMCP suffers from sparse reward function, namely, rewards achieved only when the final goal is r…
▽ More
Partially Observable Monte Carlo Planning (POMCP) is an efficient solver for Partially Observable Markov Decision Processes (POMDPs). It allows scaling to large state spaces by computing an approximation of the optimal policy locally and online, using a Monte Carlo Tree Search based strategy. However, POMCP suffers from sparse reward function, namely, rewards achieved only when the final goal is reached, particularly in environments with large state spaces and long horizons. Recently, logic specifications have been integrated into POMCP to guide exploration and to satisfy safety requirements. However, such policy-related rules require manual definition by domain experts, especially in real-world scenarios. In this paper, we use inductive logic programming to learn logic specifications from traces of POMCP executions, i.e., sets of belief-action pairs generated by the planner. Specifically, we learn rules expressed in the paradigm of answer set programming. We then integrate them inside POMCP to provide soft policy bias toward promising actions. In the context of two benchmark scenarios, rocksample and battery, we show that the integration of learned rules from small task instances can improve performance with fewer Monte Carlo simulations and in larger task instances. We make our modified version of POMCP publicly available at https://github.com/GiuMaz/pomcp_clingo.git.
△ Less
Submitted 16 March, 2023;
originally announced March 2023.
-
Logic programming for deliberative robotic task planning
Authors:
Daniele Meli,
Hirenkumar Nakawala,
Paolo Fiorini
Abstract:
Over the last decade, the use of robots in production and daily life has increased. With increasingly complex tasks and interaction in different environments including humans, robots are required a higher level of autonomy for efficient deliberation. Task planning is a key element of deliberation. It combines elementary operations into a structured plan to satisfy a prescribed goal, given specific…
▽ More
Over the last decade, the use of robots in production and daily life has increased. With increasingly complex tasks and interaction in different environments including humans, robots are required a higher level of autonomy for efficient deliberation. Task planning is a key element of deliberation. It combines elementary operations into a structured plan to satisfy a prescribed goal, given specifications on the robot and the environment. In this manuscript, we present a survey on recent advances in the application of logic programming to the problem of task planning. Logic programming offers several advantages compared to other approaches, including greater expressivity and interpretability which may aid in the development of safe and reliable robots. We analyze different planners and their suitability for specific robotic applications, based on expressivity in domain representation, computational efficiency and software implementation. In this way, we support the robotic designer in choosing the best tool for his application.
△ Less
Submitted 18 January, 2023;
originally announced January 2023.
-
Deliberation in autonomous robotic surgery: a framework for handling anatomical uncertainty
Authors:
Eleonora Tagliabue,
Daniele Meli,
Diego Dall'Alba,
Paolo Fiorini
Abstract:
Autonomous robotic surgery requires deliberation, i.e. the ability to plan and execute a task adapting to uncertain and dynamic environments. Uncertainty in the surgical domain is mainly related to the partial pre-operative knowledge about patient-specific anatomical properties. In this paper, we introduce a logic-based framework for surgical tasks with deliberative functions of monitoring and lea…
▽ More
Autonomous robotic surgery requires deliberation, i.e. the ability to plan and execute a task adapting to uncertain and dynamic environments. Uncertainty in the surgical domain is mainly related to the partial pre-operative knowledge about patient-specific anatomical properties. In this paper, we introduce a logic-based framework for surgical tasks with deliberative functions of monitoring and learning. The DEliberative Framework for Robot-Assisted Surgery (DEFRAS) estimates a pre-operative patient-specific plan, and executes it while continuously measuring the applied force obtained from a biomechanical pre-operative model. Monitoring module compares this model with the actual situation reconstructed from sensors. In case of significant mismatch, the learning module is invoked to update the model, thus improving the estimate of the exerted force. DEFRAS is validated both in simulated and real environment with da Vinci Research Kit executing soft tissue retraction. Compared with state-of-the art related works, the success rate of the task is improved while minimizing the interaction with the tissue to prevent unintentional damage.
△ Less
Submitted 10 March, 2022;
originally announced March 2022.
-
Autonomous tissue retraction with a biomechanically informed logic based framework
Authors:
D. Meli,
E. Tagliabue,
D. Dall'Alba,
P. Fiorini
Abstract:
Autonomy in robot-assisted surgery is essential to reduce surgeons' cognitive load and eventually improve the overall surgical outcome. A key requirement for autonomy in a safety-critical scenario as surgery lies in the generation of interpretable plans that rely on expert knowledge. Moreover, the Autonomous Robotic Surgical System (ARSS) must be able to reason on the dynamic and unpredictable ana…
▽ More
Autonomy in robot-assisted surgery is essential to reduce surgeons' cognitive load and eventually improve the overall surgical outcome. A key requirement for autonomy in a safety-critical scenario as surgery lies in the generation of interpretable plans that rely on expert knowledge. Moreover, the Autonomous Robotic Surgical System (ARSS) must be able to reason on the dynamic and unpredictable anatomical environment, and quickly adapt the surgical plan in case of unexpected situations. In this paper, we present a modular Framework for Robot-Assisted Surgery (FRAS) in deformable anatomical environments. Our framework integrates a logic module for task-level interpretable reasoning, a biomechanical simulation that complements data from real sensors, and a situation awareness module for context interpretation. The framework performance is evaluated on simulated soft tissue retraction, a common surgical task to remove the tissue hiding a region of interest. Results show that the framework has the adaptability required to successfully accomplish the task, handling dynamic environmental conditions and possible failures, while guaranteeing the computational efficiency required in a real surgical scenario. The framework is made publicly available.
△ Less
Submitted 28 September, 2021; v1 submitted 6 September, 2021;
originally announced September 2021.
-
Unsupervised identification of surgical robotic actions from small non homogeneous datasets
Authors:
Daniele Meli,
Paolo Fiorini
Abstract:
Robot-assisted surgery is an established clinical practice. The automatic identification of surgical actions is needed for a range of applications, including performance assessment of trainees and surgical process modeling for autonomous execution and monitoring. However, supervised action identification is not feasible, due to the burden of manually annotating recordings of potentially complex an…
▽ More
Robot-assisted surgery is an established clinical practice. The automatic identification of surgical actions is needed for a range of applications, including performance assessment of trainees and surgical process modeling for autonomous execution and monitoring. However, supervised action identification is not feasible, due to the burden of manually annotating recordings of potentially complex and long surgical executions. Moreover, often few example executions of a surgical procedure can be recorded. This paper proposes a novel fast algorithm for unsupervised identification of surgical actions in a standard surgical training task, the ring transfer, executed with da Vinci Research Kit. Exploiting kinematic and semantic visual features automatically extracted from a very limited dataset of executions, we are able to significantly outperform state-of-the-art results on a dataset of non-expert executions (58\% vs. 24\% F1-score), and improve performance in the presence of noise, short actions and non-homogeneous workflows, i.e. non repetitive action sequences.
△ Less
Submitted 11 October, 2021; v1 submitted 18 May, 2021;
originally announced May 2021.
-
Improving rigid 3D calibration for robotic surgery
Authors:
Andrea Roberti,
Nicola Piccinelli,
Daniele Meli,
Riccardo Muradore,
Paolo Fiorini
Abstract:
Autonomy is the frontier of research in robotic surgery and its aim is to improve the quality of surgical procedures in the next future. One fundamental requirement for autonomy is advanced perception capability through vision sensors. In this paper, we propose a novel calibration technique for a surgical scenario with da Vinci robot. Calibration of the camera and the robot is necessary for precis…
▽ More
Autonomy is the frontier of research in robotic surgery and its aim is to improve the quality of surgical procedures in the next future. One fundamental requirement for autonomy is advanced perception capability through vision sensors. In this paper, we propose a novel calibration technique for a surgical scenario with da Vinci robot. Calibration of the camera and the robot is necessary for precise positioning of the tools in order to emulate the high performance surgeons. Our calibration technique is tailored for RGB-D camera. Different tests performed on relevant use cases for surgery prove that we significantly improve precision and accuracy with respect to the state of the art solutions for similar devices on a surgical-size setup. Moreover, our calibration method can be easily extended to standard surgical endoscope to prompt its use in real surgical scenario.
△ Less
Submitted 16 July, 2020;
originally announced July 2020.
-
Dynamic Movement Primitives: Volumetric Obstacle Avoidance Using Dynamic Potential Functions
Authors:
Michele Ginesi,
Daniele Meli,
Andrea Roberti,
Nicola Sansonetto,
Paolo Fiorini
Abstract:
Obstacle avoidance for DMPs is still a challenging problem. In our previous work, we proposed a framework for obstacle avoidance based on superquadric potential functions to represent volumes. In this work, we extend our previous work to include the velocity of the trajectory in the definition of the potential. Our formulations guarantee smoother behavior with respect to state-of-the-art point-lik…
▽ More
Obstacle avoidance for DMPs is still a challenging problem. In our previous work, we proposed a framework for obstacle avoidance based on superquadric potential functions to represent volumes. In this work, we extend our previous work to include the velocity of the trajectory in the definition of the potential. Our formulations guarantee smoother behavior with respect to state-of-the-art point-like methods. Moreover, our new formulation allows to obtain a smoother behavior in proximity of the obstacle than when using a static (i.e. velocity independent) potential. We validate our framework for obstacle avoidance in a simulated multi-robot scenario and with different real robots: a pick-and-place task for an industrial manipulator and a surgical robot to show scalability; and navigation with a mobile robot in dynamic environment.
△ Less
Submitted 1 July, 2020;
originally announced July 2020.
-
Autonomous task planning and situation awareness in robotic surgery
Authors:
Michele Ginesi,
Daniele Meli,
Andrea Roberti,
Nicola Sansonetto,
Paolo Fiorini
Abstract:
The use of robots in minimally invasive surgery has improved the quality of standard surgical procedures. So far, only the automation of simple surgical actions has been investigated by researchers, while the execution of structured tasks requiring reasoning on the environment and the choice among multiple actions is still managed by human surgeons. In this paper, we propose a framework to impleme…
▽ More
The use of robots in minimally invasive surgery has improved the quality of standard surgical procedures. So far, only the automation of simple surgical actions has been investigated by researchers, while the execution of structured tasks requiring reasoning on the environment and the choice among multiple actions is still managed by human surgeons. In this paper, we propose a framework to implement surgical task automation. The framework consists of a task-level reasoning module based on answer set programming, a low-level motion planning module based on dynamic movement primitives, and a situation awareness module. The logic-based reasoning module generates explainable plans and is able to recover from failure conditions, which are identified and explained by the situation awareness module interfacing to a human supervisor, for enhanced safety. Dynamic Movement Primitives allow to replicate the dexterity of surgeons and to adapt to obstacles and changes in the environment. The framework is validated on different versions of the standard surgical training peg-and-ring task.
△ Less
Submitted 19 April, 2020;
originally announced April 2020.