-
Towards Generalist Robot Learning from Internet Video: A Survey
Authors:
Robert McCarthy,
Daniel C. H. Tan,
Dominik Schmidt,
Fernando Acero,
Nathan Herr,
Yilun Du,
Thomas G. Thuruthel,
Zhibin Li
Abstract:
Scaling deep learning to massive, diverse internet data has yielded remarkably general capabilities in visual and natural language understanding and generation. However, data remains scarce and challenging to collect in robotics, so robot learning has struggled to obtain similarly general capabilities. Promising Learning from Videos (LfV) methods aim to address the robotics data bottleneck by augmenting traditional robot data with large-scale internet video data. This video data offers broad foundational information regarding physical behaviour and the underlying physics of the world, and thus can be highly informative for a generalist robot.
In this survey, we present a thorough overview of the emerging field of LfV. We outline fundamental concepts, including the benefits and challenges of LfV. We provide a comprehensive review of current methods for extracting knowledge from large-scale internet video, addressing key challenges in LfV, and boosting downstream robot and reinforcement learning via the use of video data. The survey concludes with a critical discussion of challenges and opportunities in LfV. Here, we advocate for scalable foundation model approaches that can leverage the full range of available internet video to improve the learning of robot policies and dynamics models. We hope this survey can inform and catalyse further LfV research, driving progress towards the development of general-purpose robots.
Submitted 12 November, 2024; v1 submitted 30 April, 2024;
originally announced April 2024.
-
Value Functions are Control Barrier Functions: Verification of Safe Policies using Control Theory
Authors:
Daniel C. H. Tan,
Fernando Acero,
Robert McCarthy,
Dimitrios Kanoulas,
Zhibin Li
Abstract:
Guaranteeing safe behaviour of reinforcement learning (RL) policies poses significant challenges for safety-critical applications, despite RL's generality and scalability. To address this, we propose a new approach to apply verification methods from control theory to learned value functions. By analyzing task structures for safety preservation, we formalize original theorems that establish links between value functions and control barrier functions. Further, we propose novel metrics for verifying value functions in safe control tasks and practical implementation details to improve learning. Our work presents a novel method for certificate learning, which unlocks a diversity of verification techniques from control theory for RL policies, and marks a significant step towards a formal framework for the general, scalable, and verifiable design of RL-based control systems. Code and videos are available at https://rl-cbf.github.io/
Submitted 5 December, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Perceptive Locomotion with Controllable Pace and Natural Gait Transitions Over Uneven Terrains
Authors:
Daniel Chee Hian Tan,
Jenny Zhang,
Michael Chuah,
Zhibin Li
Abstract:
This work developed a learning framework for perceptive legged locomotion that combines visual feedback, proprioceptive information, and active gait regulation of foot-ground contacts. The perception requires only one forward-facing camera to obtain the heightmap, and the active regulation of gait pace and traveling velocity is realized through our formulation of CPG-based high-level imitation of foot-ground contacts. Through this framework, an end-user can command task-level inputs to control walking speed and gait frequency according to the terrain being traversed, which enables more reliable negotiation of encountered obstacles. The results demonstrated that the learned perceptive locomotion policy followed task-level control inputs with the intended behaviors, and was robust in the presence of unseen terrains and external force perturbations. A video demonstration can be found at https://youtu.be/OTzlWzDfAe8, and the codebase at https://github.com/jennyzzt/perceptual-locomotion.
Submitted 30 January, 2023; v1 submitted 25 January, 2023;
originally announced January 2023.
-
Using Deep Learning with Large Aggregated Datasets for COVID-19 Classification from Cough
Authors:
Esin Darici Haritaoglu,
Nicholas Rasmussen,
Daniel C. H. Tan,
Jennifer Ranjani J.,
Jaclyn Xiao,
Gunvant Chaudhari,
Akanksha Rajput,
Praveen Govindan,
Christian Canham,
Wei Chen,
Minami Yamaura,
Laura Gomezjurado,
Aaron Broukhim,
Amil Khanzada,
Mert Pilanci
Abstract:
The COVID-19 pandemic has been one of the most devastating events in recent history, claiming the lives of more than 5 million people worldwide. Even with the worldwide distribution of vaccines, there is an apparent need for affordable, reliable, and accessible screening techniques to serve parts of the world that do not have access to Western medicine. Artificial Intelligence can provide a solution utilizing cough sounds as a primary screening mode for COVID-19 diagnosis. This paper presents multiple models that have achieved relatively respectable performance on the largest evaluation dataset currently presented in academic literature. Through investigation of a self-supervised learning model (area under the ROC curve, AUC = 0.807) and a convolutional neural network (CNN) model (AUC = 0.802), we observe the possibility of model bias with limited datasets. Moreover, we observe that performance increases with training data size, showing the need for the worldwide collection of data to help combat the COVID-19 pandemic with non-traditional means.
Submitted 29 March, 2022; v1 submitted 5 January, 2022;
originally announced January 2022.