Optimal Motion Scaling for Delayed Telesurgery

Authors: Jason Lim, Florian Richter, Zih-Yun Chiu, Jaeyon Lee, Ethan Quist, Nathan Fisher, Jonathan Chambers, Steven Hong, Michael C. Yip

Abstract: Robotic teleoperation over long communication distances poses challenges due to delays in commands and feedback from network latency. One simple yet effective strategy to reduce errors and increase performance under delay is to downscale the relative motion between the operating surgeon and the robot. The question remains as to what is the optimal scaling factor, and how this value changes depending on the level of latency as well as operator tendencies. We present user studies investigating the relationship between latency, scaling factor, and performance. The results of our studies demonstrate a statistically significant difference in performance between users and across scaling factors for certain levels of delay. These findings indicate that the optimal scaling factor for a given level of delay is specific to each user, motivating the need for personalized models for optimal performance. We present techniques to model the user-specific mapping of latency level to scaling factor for optimal performance, leading to an efficient and effective solution to optimizing performance of robotic teleoperation and specifically telesurgery under large communication delay.

Submitted 26 June, 2025; originally announced June 2025.

Comments: Accepted to IROS 2025
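
As a rough illustration of the motion scaling idea in the abstract above, the sketch below scales an operator's hand displacement before it is applied to the robot and picks the scaling factor from a per-user table of latency to scale. The function names, the table values, and the use of linear interpolation are assumptions made for illustration only; the abstract does not say how the authors model the user-specific mapping.

```python
from bisect import bisect_left


def scale_motion(delta_xyz, scale):
    """Scale an operator displacement (x, y, z) before sending it to the robot."""
    return tuple(scale * d for d in delta_xyz)


def scale_for_latency(latency_s, table):
    """Interpolate a scaling factor from (latency_seconds, scale) pairs.

    `table` is a hypothetical per-user calibration, sorted by latency.
    Latencies outside the table are clamped to the nearest entry.
    """
    latencies = [lat for lat, _ in table]
    if latency_s <= latencies[0]:
        return table[0][1]
    if latency_s >= latencies[-1]:
        return table[-1][1]
    i = bisect_left(latencies, latency_s)
    (l0, s0), (l1, s1) = table[i - 1], table[i]
    w = (latency_s - l0) / (l1 - l0)
    return s0 + w * (s1 - s0)


# Hypothetical calibration for one user: more delay, smaller scale.
user_table = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.25)]
robot_delta = scale_motion((0.02, 0.0, -0.01), scale_for_latency(0.5, user_table))
```

A separate table per operator reflects the abstract's finding that the best scaling factor at a given delay differs between users.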

arXiv:2503.05953

Differentiable Rendering-based Pose Estimation for Surgical Robotic Instruments

Authors: Zekai Liang, Zih-Yun Chiu, Florian Richter, Michael C. Yip

Abstract: Robot pose estimation is a challenging and crucial task for vision-based surgical robotic automation. Typical robotic calibration approaches, however, are not applicable to surgical robots, such as the da Vinci Research Kit (dVRK), due to joint angle measurement errors from cable-drives and the partially visible kinematic chain. Hence, previous works in surgical robotic automation used tracking algorithms to estimate the pose of the surgical tool in real-time and compensate for the joint angle errors. However, a big limitation of these previous tracking works is the initialization step, which relied on only keypoints and SolvePnP. In this work, we fully explore the potential of geometric primitives beyond just keypoints, namely cylinders, with differentiable rendering, and construct a versatile pose matching pipeline in a novel pose hypothesis space. We demonstrate the state-of-the-art performance of our single-shot calibration method with both calibration consistency and real surgical tasks. As a result, this marker-less calibration approach proves to be a robust and generalizable initialization step for surgical tool tracking.

Submitted 7 March, 2025; originally announced March 2025.

arXiv:2409.15651

SurgIRL: Towards Life-Long Learning for Surgical Automation by Incremental Reinforcement Learning

Authors: Yun-Jie Ho, Zih-Yun Chiu, Yuheng Zhi, Michael C. Yip

Abstract: Surgical automation holds immense potential to improve the outcome and accessibility of surgery. Recent studies use reinforcement learning to learn policies that automate different surgical tasks. However, these policies are developed independently and are limited in their reusability when the task changes, making it more time-consuming when robots learn to solve multiple tasks. Inspired by how human surgeons build their expertise, we train surgical automation policies through Surgical Incremental Reinforcement Learning (SurgIRL). SurgIRL aims to (1) acquire new skills by referring to external policies (knowledge) and (2) accumulate and reuse these skills to solve multiple unseen tasks incrementally (incremental learning). Our SurgIRL framework includes three major components. We first define an expandable knowledge set containing heterogeneous policies that can be helpful for surgical tasks. Then, we propose Knowledge Inclusive Attention Network with mAximum Coverage Exploration (KIAN-ACE), which improves learning efficiency by maximizing the coverage of the knowledge set during the exploration process. Finally, we develop incremental learning pipelines based on KIAN-ACE to accumulate and reuse learned knowledge and solve multiple surgical tasks sequentially. Our simulation experiments show that KIAN-ACE efficiently learns to automate ten surgical tasks separately or incrementally. We also evaluate our learned policies on the da Vinci Research Kit (dVRK) and demonstrate successful sim-to-real transfers.

Submitted 23 September, 2024; originally announced September 2024.

arXiv:2404.00123

SURESTEP: An Uncertainty-Aware Trajectory Optimization Framework to Enhance Visual Tool Tracking for Robust Surgical Automation

Authors: Nikhil U. Shinde, Zih-Yun Chiu, Florian Richter, Jason Lim, Yuheng Zhi, Sylvia Herbert, Michael C. Yip

Abstract: Inaccurate tool localization is one of the main reasons for failures in automating surgical tasks. Imprecise robot kinematics and noisy observations caused by the poor visual acuity of an endoscopic camera make tool tracking challenging. Previous works in surgical automation adopt environment-specific setups or hard-coded strategies instead of explicitly considering motion and observation uncertainty of tool tracking in their policies. In this work, we present SURESTEP, an uncertainty-aware trajectory optimization framework for robust surgical automation. We model the uncertainty of tool tracking with the components motivated by the sources of noise in typical surgical scenes. Using a Gaussian assumption to propagate our uncertainty models through a given tool trajectory, SURESTEP provides a general framework that minimizes the upper bound on the entropy of the final estimated tool distribution. We compare SURESTEP with a baseline method on a real-world suture needle regrasping task under challenging environmental conditions, such as poor lighting and a moving endoscopic camera. The results over 60 regrasps on the da Vinci Research Kit (dVRK) demonstrate that our optimized trajectories significantly outperform the un-optimized baseline.

Submitted 29 March, 2024; originally announced April 2024.
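
To make the entropy objective mentioned in the abstract above more concrete, the sketch below propagates a scalar Gaussian variance along a trajectory with a Kalman-style predict and update step, then scores candidate trajectories by the entropy of the final estimate. This is a deliberately simplified stand-in and not the SURESTEP formulation: the scalar state, the noise values, and the candidate names are all assumptions, and it evaluates the entropy directly rather than an upper bound on it.

```python
import math


def gaussian_entropy(variances):
    """Differential entropy (in nats) of a Gaussian with diagonal covariance."""
    return 0.5 * sum(math.log(2.0 * math.pi * math.e * v) for v in variances)


def propagate_variance(var0, motion_noise, obs_noise_per_step):
    """Run a scalar Kalman-style variance recursion along a trajectory.

    Each step adds motion noise (predict), then fuses one observation whose
    noise variance depends on where the trajectory places the tool (update).
    """
    var = var0
    for r in obs_noise_per_step:
        var = var + motion_noise   # predict: uncertainty grows with motion
        var = var * r / (var + r)  # update: an observation shrinks it
    return var


# Hypothetical candidates: per-step observation noise along each trajectory,
# e.g. a well-lit path yields less noisy detections than a poorly lit one.
candidates = {
    "well_lit_path": [0.5, 0.5, 0.5],
    "poorly_lit_path": [2.0, 2.0, 2.0],
}
final_entropy = {
    name: gaussian_entropy([propagate_variance(1.0, 0.1, obs)])
    for name, obs in candidates.items()
}
best = min(final_entropy, key=final_entropy.get)
```

Because the entropy of a Gaussian increases monotonically with its variance, choosing the trajectory with the lowest final entropy is equivalent here to choosing the one with the smallest final variance.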

arXiv:2403.04971

Robust Surgical Tool Tracking with Pixel-based Probabilities for Projected Geometric Primitives

Authors: Christopher D'Ambrosia, Florian Richter, Zih-Yun Chiu, Nikhil Shinde, Fei Liu, Henrik I. Christensen, Michael C. Yip

Abstract: Controlling robotic manipulators via visual feedback requires a known coordinate frame transformation between the robot and the camera. Uncertainties in mechanical systems as well as camera calibration create errors in this coordinate frame transformation. These errors result in poor localization of robotic manipulators and create a significant challenge for applications that rely on precise interactions between manipulators and the environment. In this work, we estimate the camera-to-base transform and joint angle measurement errors for surgical robotic tools using an image-based insertion-shaft detection algorithm and probabilistic models. We apply our proposed approach in both a structured environment and an unstructured environment and measure the results to demonstrate the efficacy of our methods.

Submitted 7 March, 2024; originally announced March 2024.

arXiv:2309.15265

Finding Biomechanically Safe Trajectories for Robot Manipulation of the Human Body in a Search and Rescue Scenario

Authors: Elizabeth Peiros, Zih-Yun Chiu, Yuheng Zhi, Nikhil Shinde, Michael C. Yip

Abstract: There has been increasing awareness of the difficulties in reaching and extracting people from mass casualty scenarios, such as those arising from natural disasters. While platforms have been designed to consider reaching casualties and even carrying them out of harm's way, the challenge of repositioning a casualty from its found configuration to one suitable for extraction has not been explicitly explored. Furthermore, this planning problem needs to incorporate biomechanical safety considerations for the casualty. Thus, we present a first solution to biomechanically safe trajectory generation for repositioning limbs of unconscious human casualties. We describe biomechanical safety as mathematical constraints, mechanical descriptions of the dynamics for the robot-human coupled system, and the planning and trajectory optimization process that considers this coupled and constrained system. We finally evaluate our approach over several variations of the problem and demonstrate it on a real robot and human subject. This work provides a crucial part of search and rescue that can be used in conjunction with past and present works involving robots and vision systems designed for search and rescue.

Submitted 26 September, 2023; originally announced September 2023.

arXiv:2210.16674

Semantic-SuPer: A Semantic-aware Surgical Perception Framework for Endoscopic Tissue Identification, Reconstruction, and Tracking

Authors: Shan Lin, Albert J. Miao, Jingpei Lu, Shunkai Yu, Zih-Yun Chiu, Florian Richter, Michael C. Yip

Abstract: Accurate and robust tracking and reconstruction of the surgical scene is a critical enabling technology toward autonomous robotic surgery. Existing algorithms for 3D perception in surgery mainly rely on geometric information, while we propose to also leverage semantic information inferred from the endoscopic video using image segmentation algorithms. In this paper, we present a novel, comprehensive surgical perception framework, Semantic-SuPer, that integrates geometric and semantic information to facilitate data association, 3D reconstruction, and tracking of endoscopic scenes, benefiting downstream tasks like surgical navigation. The proposed framework is demonstrated on challenging endoscopic data with deforming tissue, showing its advantages over our baseline and several other state-of-the-art approaches. Our code and dataset are available at https://github.com/ucsdarclab/Python-SuPer.

Submitted 20 February, 2023; v1 submitted 29 October, 2022; originally announced October 2022.

Comments: IEEE International Conference on Robotics and Automation (ICRA) 2023

arXiv:2210.11973

Real-Time Constrained 6D Object-Pose Tracking of An In-Hand Suture Needle for Minimally Invasive Robotic Surgery

Authors: Zih-Yun Chiu, Florian Richter, Michael C. Yip

Abstract: Autonomous suturing has been a long-sought-after goal for surgical robotics. Outside of staged environments, accurate localization of suture needles is a critical foundation for automating various suture needle manipulation tasks in the real world. When localizing a needle held by a gripper, previous work usually tracks them separately without considering their relationship. Because of the significant errors that can arise in the stereo-triangulation of objects and instruments, their reconstructions may often not be consistent. This can lead to unrealistic tool-needle grasp reconstructions that are infeasible. Instead, an obvious strategy to improve localization would be to leverage constraints that arise from contact, thereby constraining reconstructions of objects and instruments into a jointly feasible space. In this work, we consider feasible grasping constraints when tracking the 6D pose of an in-hand suture needle. We propose a reparameterization trick to define a new state space for describing a needle pose, where grasp constraints can be easily defined and satisfied. Our proposed state space and feasible grasping constraints are then incorporated into Bayesian filters for real-time needle localization. In the experiments, we show that our constrained methods outperform previous unconstrained/constrained tracking approaches and demonstrate the importance of incorporating feasible grasping constraints into automating suture needle manipulation tasks.

Submitted 21 October, 2022; originally announced October 2022.

arXiv:2210.03729

Flexible Attention-Based Multi-Policy Fusion for Efficient Deep Reinforcement Learning

Authors: Zih-Yun Chiu, Yi-Lin Tuan, William Yang Wang, Michael C. Yip

Abstract: Reinforcement learning (RL) agents have long sought to approach the efficiency of human learning. Humans are great observers who can learn by aggregating external knowledge from various sources, including observations from others' policies of attempting a task. Prior studies in RL have incorporated external knowledge policies to help agents improve sample efficiency. However, it remains non-trivial to perform arbitrary combinations and replacements of those policies, an essential feature for generalization and transferability. In this work, we present Knowledge-Grounded RL (KGRL), an RL paradigm fusing multiple knowledge policies and aiming for human-like efficiency and flexibility. We propose a new actor architecture for KGRL, Knowledge-Inclusive Attention Network (KIAN), which allows free knowledge rearrangement due to embedding-based attentive action prediction. KIAN also addresses entropy imbalance, a problem arising in maximum entropy KGRL that hinders an agent from efficiently exploring the environment, through a new design of policy distributions. The experimental results demonstrate that KIAN outperforms alternative methods incorporating external knowledge policies and achieves efficient and flexible learning. Our implementation is available at https://github.com/Pascalson/KGRL.git

Submitted 9 October, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

Comments: NeurIPS 2023

arXiv:2210.03728

Dynamic Latent Separation for Deep Learning

Authors: Yi-Lin Tuan, Zih-Yun Chiu, William Yang Wang

Abstract: A core problem in machine learning is to learn expressive latent variables for model prediction on complex data that involves multiple sub-components in a flexible and interpretable fashion. Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications. The key idea is to dynamically distance data samples in the latent space and thus enhance the output diversity. Our dynamic latent separation method, inspired by atomic physics, relies on the jointly learned structures of each data sample, which also reveal the importance of each sub-component for distinguishing data samples. This approach, atom modeling, requires no supervision of the latent space and allows us to learn extra partially interpretable representations besides the original goal of a model. We empirically demonstrate that the algorithm also enhances the performance of small to larger-scale models in various classification and generation problems.

Submitted 11 February, 2024; v1 submitted 7 October, 2022; originally announced October 2022.

arXiv:2109.12722

Markerless Suture Needle 6D Pose Tracking with Robust Uncertainty Estimation for Autonomous Minimally Invasive Robotic Surgery

Authors: Zih-Yun Chiu, Albert Z Liao, Florian Richter, Bjorn Johnson, Michael C. Yip

Abstract: Suture needle localization is necessary for autonomous suturing. Previous approaches in autonomous suturing often relied on fiducial markers rather than markerless detection schemes for localizing a suture needle due to the inconsistency of markerless detections. However, fiducial markers are not practical for real-world applications and can often be occluded by environmental factors in surgery (e.g., blood). Therefore, in this work, we present a robust tracking approach for estimating the 6D pose of a suture needle when using inconsistent detections. We define observation models based on suture needles' geometry that capture the uncertainty of the detections and fuse them temporally in a probabilistic fashion. In our experiments, we compare different permutations of the observation models in the suture needle localization task to show their effectiveness. Our proposed method outperforms previous approaches in localizing a suture needle. We also demonstrate the proposed tracking method in an autonomous suture needle regrasping task and ex vivo environments.

Submitted 4 April, 2022; v1 submitted 26 September, 2021; originally announced September 2021.

arXiv:2108.02128

Parallelized Reverse Curriculum Generation

Authors: Zih-Yun Chiu, Yi-Lin Tuan, Hung-yi Lee, Li-Chen Fu

Abstract: For reinforcement learning (RL), it is challenging for an agent to master a task that requires a specific series of actions due to sparse rewards. To solve this problem, reverse curriculum generation (RCG) provides a reverse expansion approach that automatically generates a curriculum for the agent to learn. More specifically, RCG adapts the initial state distribution from the neighborhood of a goal to a distance as training proceeds. However, the initial state distribution generated for each iteration might be biased, thus making the policy overfit or slowing down the reverse expansion rate. While training RCG for actor-critic (AC) based RL algorithms, this poor generalization and slow convergence might be induced by the tight coupling between an AC pair. Therefore, we propose a parallelized approach that simultaneously trains multiple AC pairs and periodically exchanges their critics. We empirically demonstrate that this proposed approach can improve RCG in performance and convergence, and it can also be applied to other AC based RL algorithms with adapted initial state distribution.

Submitted 4 August, 2021; originally announced August 2021.
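
The periodic critic exchange described in the abstract above can be sketched as a small training loop. The abstract does not say which exchange scheme the authors use, so the cyclic shift below is an assumption, as are the function names and the `train_step` callback, which stands in for one actor-critic update and returns the updated pair.

```python
def exchange_critics(pairs):
    """Cyclically shift critics: actor i receives the critic from pair i - 1."""
    actors = [actor for actor, _ in pairs]
    critics = [critic for _, critic in pairs]
    rotated = critics[-1:] + critics[:-1]
    return list(zip(actors, rotated))


def train_parallel(pairs, num_iters, exchange_every, train_step):
    """Train several actor-critic pairs, swapping critics at a fixed period.

    `train_step(actor, critic)` is a hypothetical callback that performs one
    update and returns the updated (actor, critic) tuple.
    """
    for it in range(1, num_iters + 1):
        pairs = [train_step(actor, critic) for actor, critic in pairs]
        if it % exchange_every == 0:
            pairs = exchange_critics(pairs)
    return pairs
```

The pairs are updated sequentially here for brevity; a real implementation would presumably run them as parallel workers, which is what the abstract's "simultaneously" suggests.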

arXiv:2011.04813

Bimanual Regrasping for Suture Needles using Reinforcement Learning for Rapid Motion Planning

Authors: Zih-Yun Chiu, Florian Richter, Emily K. Funk, Ryan K. Orosco, Michael C. Yip

Abstract: Regrasping a suture needle is an important yet time-consuming process in suturing. To bring efficiency into regrasping, prior work either designs a task-specific mechanism or guides the gripper toward some specific pick-up point for proper grasping of a needle. Yet, these methods are usually not deployable when the working space is changed. Therefore, in this work, we present rapid trajectory generation for bimanual needle regrasping via reinforcement learning (RL). Demonstrations from a sampling-based motion planning algorithm are incorporated to speed up the learning. In addition, we propose the ego-centric state and action spaces for this bimanual planning problem, where the reference frames are on the end-effectors instead of some fixed frame. Thus, the learned policy can be directly applied to any feasible robot configuration. Our experiments in simulation show that the success rate of a single pass is 97%, and the planning time is 0.0212s on average, which outperforms other widely used motion planning algorithms. For the real-world experiments, the success rate is 73.3% if the needle pose is reconstructed from an RGB image, with a planning time of 0.0846s and a run time of 5.1454s. If the needle pose is known beforehand, the success rate becomes 90.5%, with a planning time of 0.0807s and a run time of 2.8801s.

Submitted 23 May, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

Showing 1–13 of 13 results for author: Chiu, Z