Showing 1–22 of 22 results for author: Dasari, S

Searching in archive cs.
  1. arXiv:2503.20020  [pdf, other]

    cs.RO

    Gemini Robotics: Bringing AI into the Physical World

    Authors: Gemini Robotics Team, Saminda Abeyruwan, Joshua Ainslie, Jean-Baptiste Alayrac, Montserrat Gonzalez Arenas, Travis Armstrong, Ashwin Balakrishna, Robert Baruch, Maria Bauza, Michiel Blokzijl, Steven Bohez, Konstantinos Bousmalis, Anthony Brohan, Thomas Buschmann, Arunkumar Byravan, Serkan Cabi, Ken Caluwaerts, Federico Casarini, Oscar Chang, Jose Enrique Chen, Xi Chen, Hao-Tien Lewis Chiang, Krzysztof Choromanski, David D'Ambrosio, Sudeep Dasari, et al. (93 additional authors not shown)

    Abstract: Recent advancements in large multimodal models have led to the emergence of remarkable generalist capabilities in digital domains, yet their translation to physical agents such as robots remains a significant challenge. This report introduces a new family of AI models purposefully designed for robotics and built upon the foundation of Gemini 2.0. We present Gemini Robotics, an advanced Vision-Lang…

    Submitted 25 March, 2025; originally announced March 2025.

  2. arXiv:2503.01238  [pdf, other]

    cs.RO cs.AI cs.LG

    A Taxonomy for Evaluating Generalist Robot Policies

    Authors: Jensen Gao, Suneel Belkhale, Sudeep Dasari, Ashwin Balakrishna, Dhruv Shah, Dorsa Sadigh

    Abstract: Machine learning for robotics promises to unlock generalization to novel tasks and environments. Guided by this promise, many recent works have focused on scaling up robot data collection and developing larger, more expressive policies to achieve this. But how do we measure progress towards this goal of policy generalization in practice? Evaluating and quantifying generalization is the Wild West o…

    Submitted 3 March, 2025; originally announced March 2025.

    Comments: 25 pages

  3. arXiv:2502.04786  [pdf, other]

    cs.CR cs.AI

    Enhancing SQL Injection Detection and Prevention Using Generative Models

    Authors: Naga Sai Dasari, Atta Badii, Armin Moin, Ahmed Ashlam

    Abstract: SQL Injection (SQLi) continues to pose a significant threat to the security of web applications, enabling attackers to manipulate databases and access sensitive information without authorisation. Although advancements have been made in detection techniques, traditional signature-based methods still struggle to identify sophisticated SQL injection attacks that evade predefined patterns. As SQLi att…

    Submitted 7 February, 2025; originally announced February 2025.

    Comments: 13 pages, 22 Figures, 1 Table

  4. arXiv:2410.10088  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    The Ingredients for Robotic Diffusion Transformers

    Authors: Sudeep Dasari, Oier Mees, Sebastian Zhao, Mohan Kumar Srirama, Sergey Levine

    Abstract: In recent years roboticists have achieved remarkable progress in solving increasingly general tasks on dexterous robotic hardware by leveraging high capacity Transformer network architectures and generative diffusion models. Unfortunately, combining these two orthogonal improvements has proven surprisingly difficult, since there is no clear and well-understood process for making important design c…

    Submitted 13 October, 2024; originally announced October 2024.

  5. arXiv:2408.11812  [pdf, other]

    cs.RO cs.LG

    Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation

    Authors: Ria Doshi, Homer Walke, Oier Mees, Sudeep Dasari, Sergey Levine

    Abstract: Modern machine learning systems rely on large datasets to attain broad generalization, and this often poses a challenge in robot learning, where each robotic platform and task might have only a small dataset. By training a single policy across many different kinds of robots, a robot learning method can leverage much broader and more diverse datasets, which in turn can lead to better generalization…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: Project website at https://crossformer-model.github.io/

  6. arXiv:2407.18911  [pdf, other]

    cs.RO cs.CV

    HRP: Human Affordances for Robotic Pre-Training

    Authors: Mohan Kumar Srirama, Sudeep Dasari, Shikhar Bahl, Abhinav Gupta

    Abstract: In order to *generalize* to various tasks in the wild, robotic agents will need a suitable representation (i.e., vision network) that enables the robot to predict optimal actions given high dimensional vision inputs. However, learning such a representation requires an extreme amount of diverse training data, which is prohibitively expensive to collect on a real robot. How can we overcome this prob…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: Accepted to Robotics: Science and Systems 2024

  7. arXiv:2405.12213  [pdf, other]

    cs.RO cs.LG

    Octo: An Open-Source Generalist Robot Policy

    Authors: Octo Model Team, Dibya Ghosh, Homer Walke, Karl Pertsch, Kevin Black, Oier Mees, Sudeep Dasari, Joey Hejna, Tobias Kreiman, Charles Xu, Jianlan Luo, You Liang Tan, Lawrence Yunliang Chen, Pannag Sanketi, Quan Vuong, Ted Xiao, Dorsa Sadigh, Chelsea Finn, Sergey Levine

    Abstract: Large policies pretrained on diverse robot datasets have the potential to transform robotic learning: instead of training new policies from scratch, such generalist robot policies may be finetuned with only a little in-domain data, yet generalize broadly. However, to be widely applicable across a range of robotic learning scenarios, environments, and tasks, such policies need to handle diverse sen…

    Submitted 26 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: Project website: https://octo-models.github.io

  8. arXiv:2403.12945  [pdf, other]

    cs.RO

    DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

    Authors: Alexander Khazatsky, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany, Mohan Kumar Srirama, Lawrence Yunliang Chen, Kirsty Ellis, Peter David Fagan, Joey Hejna, Masha Itkina, Marion Lepert, Yecheng Jason Ma, Patrick Tree Miller, Jimmy Wu, Suneel Belkhale, Shivin Dass, Huy Ha, Arhan Jain, Abraham Lee, Youngwoon Lee, Marius Memmel, Sungjae Park, et al. (76 additional authors not shown)

    Abstract: The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a resu…

    Submitted 22 April, 2025; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: Project website: https://droid-dataset.github.io/

  9. arXiv:2310.09289  [pdf, other]

    cs.RO cs.CV

    An Unbiased Look at Datasets for Visuo-Motor Pre-Training

    Authors: Sudeep Dasari, Mohan Kumar Srirama, Unnat Jain, Abhinav Gupta

    Abstract: Visual representation learning holds great promise for robotics, but is severely hampered by the scarcity and homogeneity of robotics datasets. Recent works address this problem by pre-training visual representations on large-scale but out-of-domain data (e.g., videos of egocentric interactions) and then transferring them to target robotics tasks. While the field is heavily focused on developing be…

    Submitted 13 October, 2023; originally announced October 2023.

    Comments: Accepted to CoRL 2023

  10. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (269 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 14 May, 2025; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  11. arXiv:2309.03130  [pdf, other]

    cs.RO cs.AI

    MyoDex: A Generalizable Prior for Dexterous Manipulation

    Authors: Vittorio Caggiano, Sudeep Dasari, Vikash Kumar

    Abstract: Human dexterity is a hallmark of motor control. Our hands can rapidly synthesize new behaviors despite the complexity (multi-articular and multi-joints, with 23 joints controlled by more than 40 muscles) of musculoskeletal sensory-motor circuits. In this work, we take inspiration from how human dexterity builds on a diversity of prior experiences, instead of being acquired through a single task. M…

    Submitted 6 September, 2023; originally announced September 2023.

    Comments: Accepted to the 40th International Conference on Machine Learning (2023)

  12. arXiv:2303.08135  [pdf, other]

    cs.RO

    Manipulate by Seeing: Creating Manipulation Controllers from Pre-Trained Representations

    Authors: Jianren Wang, Sudeep Dasari, Mohan Kumar Srirama, Shubham Tulsiani, Abhinav Gupta

    Abstract: The field of visual representation learning has seen explosive growth in the past years, but its benefits in robotics have been surprisingly limited so far. Prior work uses generic visual representations as a basis to learn (task-specific) robot action policies (e.g., via behavior cloning). While the visual representations do accelerate learning, they are primarily used to encode visual observatio…

    Submitted 15 August, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: Oral Presentation at the International Conference on Computer Vision (ICCV), 2023

  13. arXiv:2209.11221  [pdf, other]

    cs.RO cs.AI

    Learning Dexterous Manipulation from Exemplar Object Trajectories and Pre-Grasps

    Authors: Sudeep Dasari, Abhinav Gupta, Vikash Kumar

    Abstract: Learning diverse dexterous manipulation behaviors with assorted objects remains an open grand challenge. While policy learning methods offer a powerful avenue to attack this problem, they require extensive per-task engineering and algorithmic tuning. This paper seeks to escape these constraints, by developing a Pre-Grasp informed Dexterous Manipulation (PGDM) framework that generates diverse dexte…

    Submitted 12 February, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

    Comments: An abridged version of this paper was presented in IEEE International Conference on Robotics and Automation (ICRA) 2023

  14. arXiv:2203.08098  [pdf, other]

    cs.RO

    RB2: Robotic Manipulation Benchmarking with a Twist

    Authors: Sudeep Dasari, Jianren Wang, Joyce Hong, Shikhar Bahl, Yixin Lin, Austin Wang, Abitha Thankaraj, Karanbir Chahal, Berk Calli, Saurabh Gupta, David Held, Lerrel Pinto, Deepak Pathak, Vikash Kumar, Abhinav Gupta

    Abstract: Benchmarks offer a scientific way to compare algorithms using objective performance metrics. Good benchmarks have two features: (a) they should be widely useful for many research groups; (b) and they should produce reproducible findings. In robotic manipulation research, there is a trade-off between reproducibility and broad accessibility. If the benchmark is kept restrictive (fixed hardware, obje…

    Submitted 30 October, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: accepted at the NeurIPS 2021 Datasets and Benchmarks Track

  15. arXiv:2101.10085  [pdf]

    cs.CY

    Unified Citizen Identity System Using Blockchain

    Authors: Sri Sai Abhishake Gopal Dasari

    Abstract: The citizenship identities of a nation's occupants enable the state to identify and authenticate them unquestionably. These documents help individuals in recognizing themselves and to profit from the rights and advantages given to them by the legislature or the constitution of the land. There are problems in the traditional way of issuance of these identities and many hurdles that impede people fro…

    Submitted 17 January, 2021; originally announced January 2021.

    Comments: We have worked on this method for Indian Governance, yet it can be customized for all countries

  16. arXiv:2012.15373  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    Model-Based Visual Planning with Self-Supervised Functional Distances

    Authors: Stephen Tian, Suraj Nair, Frederik Ebert, Sudeep Dasari, Benjamin Eysenbach, Chelsea Finn, Sergey Levine

    Abstract: A generalist robot must be able to complete a variety of tasks in its environment. One appealing way to specify each task is in terms of a goal observation. However, learning goal-reaching policies with reinforcement learning remains a challenging problem, particularly when hand-engineered reward functions are not available. Learned dynamics models are a promising approach for learning about the e…

    Submitted 30 December, 2020; originally announced December 2020.

  17. arXiv:2011.05970  [pdf, other]

    cs.LG cs.CV cs.RO

    Transformers for One-Shot Visual Imitation

    Authors: Sudeep Dasari, Abhinav Gupta

    Abstract: Humans are able to seamlessly visually imitate others, by inferring their intentions and using past experience to achieve the same end goal. In other words, we can parse complex semantic knowledge from raw video and efficiently translate that into concrete motor control. Is it possible to give a robot this same capability? Prior research in robot imitation learning has created agents which can acq…

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: For code and project video please check our website: https://oneshotfeatures.github.io/

  18. arXiv:1910.11215  [pdf, other]

    cs.RO cs.CV cs.LG

    RoboNet: Large-Scale Multi-Robot Learning

    Authors: Sudeep Dasari, Frederik Ebert, Stephen Tian, Suraj Nair, Bernadette Bucher, Karl Schmeckpeper, Siddharth Singh, Sergey Levine, Chelsea Finn

    Abstract: Robot learning has emerged as a promising tool for taming the complexity and diversity of the real world. Methods based on high-capacity models, such as deep networks, hold the promise of providing effective generalization to a wide range of open-world environments. However, these same methods typically require large amounts of diverse training data to generalize effectively. In contrast, most rob…

    Submitted 2 January, 2020; v1 submitted 24 October, 2019; originally announced October 2019.

    Comments: accepted at the Conference on Robot Learning (CoRL) 2019

  19. arXiv:1910.06302  [pdf, other]

    eess.IV cs.CV cs.LG

    Finding New Diagnostic Information for Detecting Glaucoma using Neural Networks

    Authors: Erfan Noury, Suria S. Mannil, Robert T. Chang, An Ran Ran, Carol Y. Cheung, Suman S. Thapa, Harsha L. Rao, Srilakshmi Dasari, Mohammed Riyazuddin, Dolly Chang, Sriharsha Nagaraj, Clement C. Tham, Reza Zadeh

    Abstract: We describe a new approach to automated Glaucoma detection in 3D Spectral Domain Optical Coherence Tomography (OCT) optic nerve scans. First, we gathered a unique and diverse multi-ethnic dataset of OCT scans consisting of glaucoma and non-glaucomatous cases obtained from four tertiary care eye hospitals located in four different countries. Using this longitudinal data, we achieved state-of-the-ar…

    Submitted 2 September, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

    Comments: 28 pages, 12 figures, 15 tables, title changed, new authors added

  20. arXiv:1812.00568  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control

    Authors: Frederik Ebert, Chelsea Finn, Sudeep Dasari, Annie Xie, Alex Lee, Sergey Levine

    Abstract: Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains. We present a deep RL method that is practical for real-world robotics tasks, such as robotic manipulation, and generalizes effectively to never-before-seen tasks…

    Submitted 3 December, 2018; originally announced December 2018.

  21. arXiv:1810.03043  [pdf, other]

    cs.RO cs.AI cs.CV

    Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning

    Authors: Frederik Ebert, Sudeep Dasari, Alex X. Lee, Sergey Levine, Chelsea Finn

    Abstract: Prediction is an appealing objective for self-supervised learning of behavioral skills, particularly for autonomous robots. However, effectively utilizing predictive models for control, especially with raw image inputs, poses a number of major challenges. How should the predictions be used? What happens when they are inaccurate? In this paper, we tackle these questions by proposing a method for le…

    Submitted 6 October, 2018; originally announced October 2018.

    Comments: accepted at the Conference on Robot Learning (CoRL) 2018

  22. arXiv:1802.01557  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

    Authors: Tianhe Yu, Chelsea Finn, Annie Xie, Sudeep Dasari, Tianhao Zhang, Pieter Abbeel, Sergey Levine

    Abstract: Humans and animals are capable of learning a new behavior by observing others perform the skill just once. We consider the problem of allowing a robot to do the same -- learning from raw video pixels of a human, even when there is substantial domain shift in the perspective, environment, and embodiment between the robot and the observed human. Prior approaches to this problem have hand-specified…

    Submitted 5 February, 2018; originally announced February 2018.

    Comments: First two authors contributed equally. Video available at https://sites.google.com/view/daml