Search | arXiv e-print repository

Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop

Authors: Justin Kerr, Kush Hari, Ethan Weber, Chung Min Kim, Brent Yi, Tyler Bonnen, Ken Goldberg, Angjoo Kanazawa

Abstract: Humans do not passively observe the visual world -- we actively look in order to act. Motivated by this principle, we introduce EyeRobot, a robotic system with gaze behavior that emerges from the need to complete real-world tasks. We develop a mechanical eyeball that can freely rotate to observe its surroundings and train a gaze policy to control it using reinforcement learning. We accomplish this by first collecting teleoperated demonstrations paired with a 360 camera. This data is imported into a simulation environment that supports rendering arbitrary eyeball viewpoints, allowing episode rollouts of eye gaze on top of robot demonstrations. We then introduce a BC-RL loop to train the hand and eye jointly: the hand (BC) agent is trained from rendered eye observations, and the eye (RL) agent is rewarded when the hand produces correct action predictions. In this way, hand-eye coordination emerges as the eye looks towards regions which allow the hand to complete the task. EyeRobot implements a foveal-inspired policy architecture allowing high resolution with a small compute budget, which we find also leads to the emergence of more stable fixation as well as improved ability to track objects and ignore distractors. We evaluate EyeRobot on five panoramic workspace manipulation tasks requiring manipulation in an arc surrounding the robot arm. Our experiments suggest EyeRobot exhibits hand-eye coordination behaviors which effectively facilitate manipulation over large workspaces with a single camera. See project site for videos: https://www.eyerobot.net/

Submitted 12 June, 2025; originally announced June 2025.

Comments: Project page: https://www.eyerobot.net/

arXiv:2505.15558 [pdf, ps, other]

Robo-DM: Data Management For Large Robot Datasets

Authors: Kaiyuan Chen, Letian Fu, David Huang, Yanxiang Zhang, Lawrence Yunliang Chen, Huang Huang, Kush Hari, Ashwin Balakrishna, Ted Xiao, Pannag R Sanketi, John Kubiatowicz, Ken Goldberg

Abstract: Recent results suggest that very large datasets of teleoperated robot demonstrations can be used to train transformer-based models that have the potential to generalize to new scenes, robots, and tasks. However, curating, distributing, and loading large datasets of robot trajectories, which typically consist of video, textual, and numerical modalities - including streams from multiple cameras - remains challenging. We propose Robo-DM, an efficient open-source cloud-based data management toolkit for collecting, sharing, and learning with robot data. With Robo-DM, robot datasets are stored in a self-contained format with Extensible Binary Meta Language (EBML). Robo-DM can significantly reduce the size of robot trajectory data, transfer costs, and data load time during training. Compared to the RLDS format used in OXE datasets, Robo-DM's compression saves space by up to 70x (lossy) and 3.5x (lossless). Robo-DM also accelerates data retrieval by load-balancing video decoding with memory-mapped decoding caches. Compared to LeRobot, a framework that also uses lossy video compression, Robo-DM is up to 50x faster when decoding sequentially. We physically evaluate a model trained by Robo-DM with lossy compression, a pick-and-place task, and In-Context Robot Transformer. Robo-DM uses 75x compression of the original dataset and does not suffer reduction in downstream task accuracy.

Submitted 21 May, 2025; originally announced May 2025.

Comments: Best paper finalist of IEEE ICRA 2025

arXiv:2504.14857 [pdf, other]

SuFIA-BC: Generating High Quality Demonstration Data for Visuomotor Policy Learning in Surgical Subtasks

Authors: Masoud Moghani, Nigel Nelson, Mohamed Ghanem, Andres Diaz-Pinto, Kush Hari, Mahdi Azizian, Ken Goldberg, Sean Huver, Animesh Garg

Abstract: Behavior cloning facilitates the learning of dexterous manipulation skills, yet the complexity of surgical environments, the difficulty and expense of obtaining patient data, and robot calibration errors present unique challenges for surgical robot learning. We provide an enhanced surgical digital twin with photorealistic human anatomical organs, integrated into a comprehensive simulator designed to generate high-quality synthetic data to solve fundamental tasks in surgical autonomy. We present SuFIA-BC: visual Behavior Cloning policies for Surgical First Interactive Autonomy Assistants. We investigate visual observation spaces including multi-view cameras and 3D visual representations extracted from a single endoscopic camera view. Through systematic evaluation, we find that the diverse set of photorealistic surgical tasks introduced in this work enables a comprehensive evaluation of prospective behavior cloning models for the unique challenges posed by surgical environments. We observe that current state-of-the-art behavior cloning techniques struggle to solve the contact-rich and complex tasks evaluated in this work, regardless of their underlying perception or control architectures. These findings highlight the importance of customizing perception pipelines and control architectures, as well as curating larger-scale synthetic datasets that meet the specific demands of surgical tasks. Project website: https://orbit-surgical.github.io/sufia-bc/

Submitted 21 April, 2025; originally announced April 2025.

arXiv:2503.05189 [pdf, other]

Persistent Object Gaussian Splat (POGS) for Tracking Human and Robot Manipulation of Irregularly Shaped Objects

Authors: Justin Yu, Kush Hari, Karim El-Refai, Arnav Dalal, Justin Kerr, Chung Min Kim, Richard Cheng, Muhammad Zubair Irshad, Ken Goldberg

Abstract: Tracking and manipulating irregularly-shaped, previously unseen objects in dynamic environments is important for robotic applications in manufacturing, assembly, and logistics. Recently introduced Gaussian Splats efficiently model object geometry, but lack persistent state estimation for task-oriented manipulation. We present Persistent Object Gaussian Splat (POGS), a system that embeds semantics, self-supervised visual features, and object grouping features into a compact representation that can be continuously updated to estimate the pose of scanned objects. POGS updates object states without requiring expensive rescanning or prior CAD models of objects. After an initial multi-view scene capture and training phase, POGS uses a single stereo camera to integrate depth estimates along with self-supervised vision encoder features for object pose estimation. POGS supports grasping, reorientation, and natural language-driven manipulation by refining object pose estimates, facilitating sequential object reset operations with human-induced object perturbations and tool servoing, where robots recover tool pose despite tool perturbations of up to 30°. POGS achieves up to 12 consecutive successful object resets and recovers from 80% of in-grasp tool perturbations.

Submitted 7 March, 2025; originally announced March 2025.

Comments: Accepted to ICRA 2025

arXiv:2412.13484 [pdf, other]

Curriculum Learning for Cross-Lingual Data-to-Text Generation With Noisy Data

Authors: Kancharla Aditya Hari, Manish Gupta, Vasudeva Varma

Abstract: Curriculum learning has been used to improve the quality of text generation systems by ordering the training samples according to a particular schedule in various tasks. In the context of data-to-text generation (DTG), previous studies used various difficulty criteria to order the training samples for monolingual DTG. These criteria, however, do not generalize to the crosslingual variant of the problem and do not account for noisy data. We explore multiple criteria that can be used for improving the performance of cross-lingual DTG systems with noisy data using two curriculum schedules. Using the alignment score criterion for ordering samples and an annealing schedule to train the model, we show an increase in BLEU score by up to 4 points, and improvements in faithfulness and coverage of generations by 5-15% on average across 11 Indian languages and English in 2 separate datasets. We make code and data publicly available.

Submitted 17 December, 2024; originally announced December 2024.

arXiv:2412.05408 [pdf, other]

FogROS2-FT: Fault Tolerant Cloud Robotics

Authors: Kaiyuan Chen, Kush Hari, Trinity Chung, Michael Wang, Nan Tian, Christian Juette, Jeffrey Ichnowski, Liu Ren, John Kubiatowicz, Ion Stoica, Ken Goldberg

Abstract: Cloud robotics enables robots to offload complex computational tasks to cloud servers for performance and ease of management. However, cloud compute can be costly, cloud services can suffer occasional downtime, and connectivity between the robot and cloud can be prone to variations in network Quality-of-Service (QoS). We present FogROS2-FT (Fault Tolerant) to mitigate these issues by introducing a multi-cloud extension that automatically replicates independent stateless robotic services, routes requests to these replicas, and directs the first response back. With replication, robots can still benefit from cloud computations even when a cloud service provider is down or there is low QoS. Additionally, many cloud computing providers offer low-cost spot computing instances that may shut down unpredictably. Normally, these low-cost instances would be inappropriate for cloud robotics, but the fault-tolerant nature of FogROS2-FT allows them to be used reliably. We demonstrate FogROS2-FT fault tolerance capabilities in 3 cloud-robotics scenarios in simulation (visual object detection, semantic segmentation, motion planning) and 1 physical robot experiment (scan-pick-and-place). Running on the same hardware specification, FogROS2-FT achieves motion planning with up to 2.2x cost reduction and up to a 5.53x reduction on 99 Percentile (P99) long-tail latency. FogROS2-FT reduces the P99 long-tail latency of object detection and semantic segmentation by 2.0x and 2.1x, respectively, under network slowdown and resource contention.
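The replicate-and-take-the-first-response idea in this abstract can be illustrated with a minimal Python sketch. This is not the FogROS2-FT implementation: `call_replica`, `first_response`, the replica names, and the simulated delays are all hypothetical, and a sleeping thread stands in for a remote cloud service. The pattern only makes sense for stateless services, since every replica processes the same request.

```python
import concurrent.futures as cf
import time


def call_replica(name, delay, payload):
    """Hypothetical stand-in for a stateless cloud service call."""
    time.sleep(delay)          # simulated network and compute latency
    return name, payload * 2   # every replica computes the same answer


def first_response(replicas, payload):
    """Send the same request to all replicas and return the earliest reply."""
    pool = cf.ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(call_replica, name, delay, payload)
               for name, delay in replicas]
    # Block only until one replica has answered.
    done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not wait for the slower replicas
    return next(iter(done)).result()


if __name__ == "__main__":
    replicas = [("slow-cloud", 0.3), ("fast-cloud", 0.01)]
    print(first_response(replicas, 21))
```

Because the caller returns as soon as any replica replies, a slow or unavailable replica adds cost but does not add latency, which is the trade-off the abstract describes.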

Submitted 6 December, 2024; originally announced December 2024.

Comments: IEEE/RSJ International Conference on Intelligent Robots and Systems 2024 Best Paper Finalist

arXiv:2409.18108 [pdf, other]

Language-Embedded Gaussian Splats (LEGS): Incrementally Building Room-Scale Representations with a Mobile Robot

Authors: Justin Yu, Kush Hari, Kishore Srinivas, Karim El-Refai, Adam Rashid, Chung Min Kim, Justin Kerr, Richard Cheng, Muhammad Zubair Irshad, Ashwin Balakrishna, Thomas Kollar, Ken Goldberg

Abstract: Building semantic 3D maps is valuable for searching for objects of interest in offices, warehouses, stores, and homes. We present a mapping system that incrementally builds a Language-Embedded Gaussian Splat (LEGS): a detailed 3D scene representation that encodes both appearance and semantics in a unified representation. LEGS is trained online as a robot traverses its environment to enable localization of open-vocabulary object queries. We evaluate LEGS on 4 room-scale scenes where we query for objects in the scene to assess how LEGS can capture semantic meaning. We compare LEGS to LERF and find that while both systems have comparable object query success rates, LEGS trains over 3.5x faster than LERF. Results suggest that a multi-camera setup and incremental bundle adjustment can boost visual reconstruction quality in constrained robot trajectories, and suggest LEGS can localize open-vocabulary and long-tail object queries with up to 66% accuracy.

Submitted 26 September, 2024; originally announced September 2024.

arXiv:2409.07457 [pdf, other]

LSST: Learned Single-Shot Trajectory and Reconstruction Network for MR Imaging

Authors: Hemant Kumar Aggarwal, Sudhanya Chatterjee, Dattesh Shanbhag, Uday Patil, K. V. S. Hari

Abstract: Single-shot magnetic resonance (MR) imaging acquires the entire k-space data in a single shot and it has various applications in whole-body imaging. However, the long acquisition time for the entire k-space in single-shot fast spin echo (SSFSE) MR imaging poses a challenge, as it introduces T2-blur in the acquired images. This study aims to enhance the reconstruction quality of SSFSE MR images by (a) optimizing the trajectory for measuring the k-space, (b) acquiring fewer samples to speed up the acquisition process, and (c) reducing the impact of T2-blur. The proposed method adheres to physics constraints due to maximum gradient strength and slew-rate available while optimizing the trajectory within an end-to-end learning framework. Experiments were conducted on publicly available fastMRI multichannel dataset with 8-fold and 16-fold acceleration factors. An experienced radiologist's evaluation on a five-point Likert scale indicates improvements in the reconstruction quality as the ACL fibers are sharper than comparative methods.

Submitted 8 August, 2024; originally announced September 2024.

arXiv:2404.16027 [pdf, other]

ORBIT-Surgical: An Open-Simulation Framework for Learning Surgical Augmented Dexterity

Authors: Qinxi Yu, Masoud Moghani, Karthik Dharmarajan, Vincent Schorp, William Chung-Ho Panitch, Jingzhou Liu, Kush Hari, Huang Huang, Mayank Mittal, Ken Goldberg, Animesh Garg

Abstract: Physics-based simulations have accelerated progress in robot learning for driving, manipulation, and locomotion. Yet, a fast, accurate, and robust surgical simulation environment remains a challenge. In this paper, we present ORBIT-Surgical, a physics-based surgical robot simulation framework with photorealistic rendering in NVIDIA Omniverse. We provide 14 benchmark surgical tasks for the da Vinci Research Kit (dVRK) and Smart Tissue Autonomous Robot (STAR) which represent common subtasks in surgical training. ORBIT-Surgical leverages GPU parallelization to train reinforcement learning and imitation learning algorithms to facilitate study of robot learning to augment human surgical skills. ORBIT-Surgical also facilitates realistic synthetic data generation for active perception tasks. We demonstrate ORBIT-Surgical sim-to-real transfer of learned policies onto a physical dVRK robot. Project website: orbit-surgical.github.io

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.05151 [pdf, other]

STITCH: Augmented Dexterity for Suture Throws Including Thread Coordination and Handoffs

Authors: Kush Hari, Hansoul Kim, Will Panitch, Kishore Srinivas, Vincent Schorp, Karthik Dharmarajan, Shreya Ganti, Tara Sadjadpour, Ken Goldberg

Abstract: We present STITCH: an augmented dexterity pipeline that performs Suture Throws Including Thread Coordination and Handoffs. STITCH iteratively performs needle insertion, thread sweeping, needle extraction, suture cinching, needle handover, and needle pose correction with failure recovery policies. We introduce a novel visual 6D needle pose estimation framework using a stereo camera pair and new suturing motion primitives. We compare STITCH to baselines, including a proprioception-only and a policy without visual servoing. In physical experiments across 15 trials, STITCH achieves an average of 2.93 sutures without human intervention and 4.47 sutures with human intervention. See https://sites.google.com/berkeley.edu/stitch for code and supplemental materials.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2403.10494 [pdf, other]

Lifelong LERF: Local 3D Semantic Inventory Monitoring Using FogROS2

Authors: Adam Rashid, Chung Min Kim, Justin Kerr, Letian Fu, Kush Hari, Ayah Ahmad, Kaiyuan Chen, Huang Huang, Marcus Gualtieri, Michael Wang, Christian Juette, Nan Tian, Liu Ren, Ken Goldberg

Abstract: Inventory monitoring in homes, factories, and retail stores relies on maintaining data despite objects being swapped, added, removed, or moved. We introduce Lifelong LERF, a method that allows a mobile robot with minimal compute to jointly optimize a dense language and geometric representation of its surroundings. Lifelong LERF maintains this representation over time by detecting semantic changes and selectively updating these regions of the environment, avoiding the need to exhaustively remap. Human users can query inventory by providing natural language queries and receiving a 3D heatmap of potential object locations. To manage the computational load, we use Fog-ROS2, a cloud robotics platform, to offload resource-intensive tasks. Lifelong LERF obtains poses from a monocular RGBD SLAM backend, and uses these poses to progressively optimize a Language Embedded Radiance Field (LERF) for semantic monitoring. Experiments with 3-5 objects arranged on a tabletop and a Turtlebot with a RealSense camera suggest that Lifelong LERF can persistently adapt to changes in objects with up to 91% accuracy.

Submitted 15 March, 2024; originally announced March 2024.

Comments: See project webpage at: https://sites.google.com/berkeley.edu/lifelonglerf/home

arXiv:2402.19249 [pdf, other]

Mirage: Cross-Embodiment Zero-Shot Policy Transfer with Cross-Painting

Authors: Lawrence Yunliang Chen, Kush Hari, Karthik Dharmarajan, Chenfeng Xu, Quan Vuong, Ken Goldberg

Abstract: The ability to reuse collected data and transfer trained policies between robots could alleviate the burden of additional data collection and training. While existing approaches such as pretraining plus finetuning and co-training show promise, they do not generalize to robots unseen in training. Focusing on common robot arms with similar workspaces and 2-jaw grippers, we investigate the feasibility of zero-shot transfer. Through simulation studies on 8 manipulation tasks, we find that state-based Cartesian control policies can successfully zero-shot transfer to a target robot after accounting for forward dynamics. To address robot visual disparities for vision-based policies, we introduce Mirage, which uses "cross-painting"--masking out the unseen target robot and inpainting the seen source robot--during execution in real time so that it appears to the policy as if the trained source robot were performing the task. Mirage applies to both first-person and third-person camera views and policies that take in both states and images as inputs or only images as inputs. Despite its simplicity, our extensive simulation and physical experiments provide strong evidence that Mirage can successfully zero-shot transfer between different robot arms and grippers with only minimal performance degradation on a variety of manipulation tasks such as picking, stacking, and assembly, significantly outperforming a generalist policy. Project website: https://robot-mirage.github.io/

Submitted 8 September, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: RSS 2024. Project page: https://robot-mirage.github.io/

arXiv:2311.05782 [pdf, other]

MPGemmFI: A Fault Injection Technique for Mixed Precision GEMM in ML Applications

Authors: Bo Fang, Xinyi Li, Harvey Dam, Cheng Tan, Siva Kumar Sastry Hari, Timothy Tsai, Ignacio Laguna, Dingwen Tao, Ganesh Gopalakrishnan, Prashant Nair, Kevin Barker, Ang Li

Abstract: Emerging deep learning workloads urgently need fast general matrix multiplication (GEMM). To meet such demand, one of the critical features of machine-learning-specific accelerators such as NVIDIA Tensor Cores, AMD Matrix Cores, and Google TPUs is the support of mixed-precision enabled GEMM. For DNN models, lower-precision FP data formats and computation offer acceptable correctness but significant performance, area, and memory footprint improvement. While promising, the mixed-precision computation on error resilience remains unexplored. To this end, we develop a fault injection framework that systematically injects fault into the mixed-precision computation results. We investigate how the faults affect the accuracy of machine learning applications. Based on the error resilience characteristics, we offer lightweight error detection and correction solutions that significantly improve the overall model accuracy if the models experience hardware faults. The solutions can be efficiently integrated into the accelerator's pipelines.

Submitted 9 November, 2023; originally announced November 2023.

arXiv:2311.05600 [pdf, other]

FogROS2-Config: Optimizing Latency and Cost for Multi-Cloud Robot Applications

Authors: Kaiyuan Chen, Kush Hari, Rohil Khare, Charlotte Le, Trinity Chung, Jaimyn Drake, Jeffrey Ichnowski, John Kubiatowicz, Ken Goldberg

Abstract: Cloud service providers offer a dynamically changing set of over 50,000 distinct cloud server options. To help roboticists make cost-effective decisions, we present FogROS2-Config, an open toolkit that takes ROS2 nodes as input and automatically runs relevant benchmarks to quickly return a menu of cloud compute services that trade off latency and cost. Because it is infeasible to try every hardware configuration, FogROS2-Config quickly samples and tests a small set of edge case servers. We evaluate FogROS2-Config on three robotics application tasks: visual SLAM, grasp planning, and motion planning. FogROS2-Config can reduce the cost by up to 20x. By comparing with a Pareto frontier for cost and latency by running the application task on feasible server configurations, we evaluate cost and latency models and confirm that FogROS2-Config selects efficient hardware configurations to balance cost and latency.

Submitted 13 May, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

Comments: Published 2024 IEEE International Conference on Robotics and Automation (ICRA), Former name: FogROS2-Sky

arXiv:2310.17274 [pdf, other]

cuRobo: Parallelized Collision-Free Minimum-Jerk Robot Motion Generation

Authors: Balakumar Sundaralingam, Siva Kumar Sastry Hari, Adam Fishman, Caelan Garrett, Karl Van Wyk, Valts Blukis, Alexander Millane, Helen Oleynikova, Ankur Handa, Fabio Ramos, Nathan Ratliff, Dieter Fox

Abstract: This paper explores the problem of collision-free motion generation for manipulators by formulating it as a global motion optimization problem. We develop a parallel optimization technique to solve this problem and demonstrate its effectiveness on massively parallel GPUs. We show that combining simple optimization techniques with many parallel seeds leads to solving difficult motion generation problems within 50ms on average, 60x faster than state-of-the-art (SOTA) trajectory optimization methods. We achieve SOTA performance by combining L-BFGS step direction estimation with a novel parallel noisy line search scheme and a particle-based optimization solver. To further aid trajectory optimization, we develop a parallel geometric planner that plans within 20ms and also introduce a collision-free IK solver that can solve over 7000 queries/s. We package our contributions into a state of the art GPU accelerated motion generation library, cuRobo and release it to enrich the robotics community. Additional details are available at https://curobo.org

Submitted 3 November, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

Comments: revised technical report, 62 pages, Website: https://curobo.org

arXiv:2310.07854 [pdf, other]

VaPr: Variable-Precision Tensors to Accelerate Robot Motion Planning

Authors: Yu-Shun Hsiao, Siva Kumar Sastry Hari, Balakumar Sundaralingam, Jason Yik, Thierry Tambe, Charbel Sakr, Stephen W. Keckler, Vijay Janapa Reddi

Abstract: High-dimensional motion generation requires numerical precision for smooth, collision-free solutions. Typically, double-precision or single-precision floating-point (FP) formats are utilized. Using these for big tensors imposes a strain on the memory bandwidth provided by the devices and alters the memory footprint, hence limiting their applicability to low-power edge devices needed for mobile robots. The uniform application of reduced precision can be advantageous but severely degrades solutions. Using decreased precision data types for important tensors, we propose to accelerate motion generation by removing memory bottlenecks. We propose variable-precision (VaPr) search optimization to determine the appropriate precision for large tensors from a vast search space of approximately 4 million unique combinations for FP data types across the tensors. To obtain the efficiency gains, we exploit existing platform support for an out-of-the-box GPU speedup and evaluate prospective precision converter units for GPU types that are not currently supported. Our experimental results on 800 planning problems for the Franka Panda robot on the MotionBenchmaker dataset across 8 environments show that a 4-bit FP format is sufficient for the largest set of tensors in the motion generation stack. With the software-only solution, VaPr achieves 6.3% and 6.3% speedups on average for a significant portion of motion generation over the SOTA solution (CuRobo) on Jetson Orin and RTX2080 Ti GPU, respectively, and 9.9%, 17.7% speedups with the FP converter.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 7 pages, 5 figures, 8 tables, to be published in 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

arXiv:2310.03841 [pdf, other]

ALBERTA: ALgorithm-Based Error Resilience in Transformer Architectures

Authors: Haoxuan Liu, Vasu Singh, Michał Filipiuk, Siva Kumar Sastry Hari

Abstract: Vision Transformers are being increasingly deployed in safety-critical applications that demand high reliability. It is crucial to ensure the correctness of their execution in spite of potential errors such as transient hardware errors. We propose a novel algorithm-based resilience framework called ALBERTA that allows us to perform end-to-end resilience analysis and protection of transformer-based architectures. First, our work develops an efficient process of computing and ranking the resilience of transformer layers. We find that due to the large size of transformer models, applying traditional network redundancy to a subset of the most vulnerable layers provides high error coverage albeit with impractically high overhead. We address this shortcoming by providing a software-directed, checksum-based error detection technique aimed at protecting the most vulnerable general matrix multiply (GEMM) layers in the transformer models that use either floating-point or integer arithmetic. Results show that our approach achieves over 99% coverage for errors that result in a mismatch with less than 0.2% and 0.01% computation and memory overheads, respectively. Lastly, we present the applicability of our framework in various modern GPU architectures under different numerical precisions. We introduce an efficient self-correction mechanism for resolving erroneous detection with an average of less than 2% overhead per error.

Submitted 5 February, 2024; v1 submitted 5 October, 2023; originally announced October 2023.
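
The checksum-based GEMM protection named in the ALBERTA abstract above can be illustrated with a classic algorithm-based checksum: for C = A·B, the column sums of C must equal (column sums of A)·B. The sketch below is only an illustration under stated assumptions (small integer matrices, exact comparison, hypothetical helper names such as `checksum_ok`); it is not the authors' implementation, and a floating-point variant would need a tolerance instead of equality.

```python
def matmul(a, b):
    """Plain nested-list matrix product (illustrative, not optimized)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def column_sums(m):
    """Sum each column of a nested-list matrix."""
    return [sum(col) for col in zip(*m)]


def checksum_ok(a, b, c):
    """Check C against the identity (1^T A) B == 1^T C.

    The left side costs one vector-matrix product, far less than
    recomputing the full GEMM.
    """
    expected = matmul([column_sums(a)], b)[0]
    return expected == column_sums(c)


# Assumed example data (not from the paper).
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul(A, B)

faulty = [row[:] for row in C]
faulty[0][1] += 1  # emulate a single corrupted output element
```

Here `checksum_ok(A, B, C)` holds for the clean product, while the corrupted copy `faulty` fails the check because one column sum no longer matches.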

arXiv:2308.08917 [pdf, other]

Unfolding for Joint Channel Estimation and Symbol Detection in MIMO Communication Systems

Authors: Swati Bhattacharya, K. V. S. Hari, Yonina C. Eldar

Abstract: This paper proposes a Joint Channel Estimation and Symbol Detection (JED) scheme for Multiple-Input Multiple-Output (MIMO) wireless communication systems. Our proposed method for JED using Alternating Direction Method of Multipliers (JED-ADMM) and its model-based neural network version JED using Unfolded ADMM (JED-U-ADMM) markedly improve the symbol detection performance over JED using Alternating Minimization (JED-AM) for a range of MIMO antenna configurations. Both proposed algorithms exploit the non-smooth constraint that occurs as a result of the Quadrature Amplitude Modulation (QAM) data symbols to effectively improve the performance using the ADMM iterations. The proposed unfolded network JED-U-ADMM consists of a few trainable parameters and requires a small training set. We show the efficacy of the proposed methods for both uncorrelated and correlated MIMO channels. For certain configurations, the gain in SNR for a desired BER of $10^{-2}$ for the proposed JED-ADMM and JED-U-ADMM is up to $4$ dB and is also accompanied by a significant reduction in computational complexity of up to $75\%$, depending on the MIMO configuration, as compared to the complexity of JED-AM.

Submitted 21 August, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

Comments: 14 pages, 19 figures, submitted to IEEE Transactions on Signal Processing

arXiv:2306.14131 [pdf, other]

Safety-Critical Scenario Generation Via Reinforcement Learning Based Editing

Authors: Haolan Liu, Liangjun Zhang, Siva Kumar Sastry Hari, Jishen Zhao

Abstract: Generating safety-critical scenarios is essential for testing and verifying the safety of autonomous vehicles. Traditional optimization techniques suffer from the curse of dimensionality and limit the search space to fixed parameter spaces. To address these challenges, we propose a deep reinforcement learning approach that generates scenarios by sequential editing, such as adding new agents or modifying the trajectories of the existing agents. Our framework employs a reward function consisting of both risk and plausibility objectives. The plausibility objective leverages generative models, such as a variational autoencoder, to learn the likelihood of the generated parameters from the training datasets; it penalizes the generation of unlikely scenarios. Our approach overcomes the dimensionality challenge and explores a wide range of safety-critical scenarios. Our evaluation demonstrates that the proposed method generates safety-critical scenarios of higher quality compared with previous approaches.

Submitted 6 March, 2024; v1 submitted 25 June, 2023; originally announced June 2023.

arXiv:2301.09219 [pdf, other]

Applied Deep Learning to Identify and Localize Polyps from Endoscopic Images

Authors: Chandana Raju, Sumedh Vilas Datar, Kushala Hari, Kavin Vijay, Suma Ningappa

Abstract: Deep learning based neural networks have gained popularity for a variety of biomedical imaging applications. In the last few years several works have shown the use of these methods for colon cancer detection and the early results have been promising. These methods can potentially be utilized to assist doctors and may help in identifying the number of lesions or abnormalities in a diagnosis session. From our literature survey we found that there is a lack of publicly available labeled data. Thus, as part of this work, we have aimed at open sourcing a dataset which contains annotations of polyps and ulcers. This is the first dataset from India containing polyp and ulcer images. The dataset can be used for detection and classification tasks. We also evaluated our dataset with several popular deep learning object detection models that are trained on large publicly available datasets and found empirically that a model trained on one dataset works well on our dataset, whose data was captured with a different acquisition device.

Submitted 22 January, 2023; originally announced January 2023.

arXiv:2301.04595 [pdf, other]

Circuit simulation using explicit methods

Authors: Mahesh B. Patil, V. V. S. Pavan Kumar Hari

Abstract: Use of explicit methods for simulating electrical circuits, especially for power electronics applications, is described. Application of the forward Euler method to a half-wave rectifier is discussed, and the limitations of a fixed-step method are pointed out. Implementation of the Runge-Kutta-Fehlberg (RKF) method, which allows variable time steps, for the half-wave rectifier circuit is discussed, and its advantages pointed out. Formulation of circuit equations for the purpose of simulation using the RKF method is described for some more examples. Stability and accuracy issues related to power electronic circuits are brought out, and mechanisms to address them are presented. Future plans related to this work are described.

Submitted 11 January, 2023; originally announced January 2023.

Comments: 13 pages, 22 figures
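
The fixed-step limitation that the abstract above attributes to forward Euler on a half-wave rectifier can be reproduced with a very small model. The sketch below is not taken from the paper: the circuit (sinusoidal source, series resistance `rs`, ideal diode, parallel RC load), every component value, and the function name `simulate_rectifier` are assumptions chosen for illustration. While the diode conducts, the capacitor voltage obeys a linear equation with rate (1/rs + 1/r_load)/cap, about 1010 per second here, so forward Euler is only stable for steps below roughly 2/1010 s, about 1.98 ms.

```python
import math


def simulate_rectifier(h, t_end=0.1, vm=10.0, freq=50.0,
                       rs=1.0, r_load=100.0, cap=1e-3):
    """Forward Euler on cap * dv/dt = i_d - v / r_load.

    i_d is the current through an ideal diode in series with rs:
    (vs - v) / rs when the source exceeds the capacitor voltage, else 0.
    Returns the list of capacitor voltages, one per time point.
    """
    n_steps = int(round(t_end / h))
    v = 0.0
    trace = [v]
    for k in range(n_steps):
        t = k * h
        vs = vm * math.sin(2.0 * math.pi * freq * t)
        i_d = (vs - v) / rs if vs > v else 0.0
        v = v + h * (i_d - v / r_load) / cap  # explicit (forward Euler) update
        trace.append(v)
    return trace


fine = simulate_rectifier(h=1e-5)      # step well inside the stability limit
coarse = simulate_rectifier(h=2.5e-3)  # step beyond the stability limit
```

With the small step the capacitor voltage stays between 0 V and the 10 V source peak, whereas the coarse step overshoots the source peak, which is physically impossible for this circuit. A variable-step method such as the RKF scheme mentioned in the abstract avoids having to pick one step size for both the fast conducting interval and the slow discharge interval.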

arXiv:2210.02628 [pdf, other]

Cooperative Coverage with a Leader and a Wingmate in Communication-Constrained Environments

Authors: Sai Krishna Kanth Hari, Sivakumar Rathinam, Swaroop Darbha, David W. Casbeer

Abstract: We consider a mission framework in which two unmanned vehicles (UVs), a leader and a wingmate, are required to provide cooperative coverage of an environment while being within a short communication range. This framework finds applications in underwater and/or military domains, where certain constraints are imposed on communication by either the application or the environment. An important objective of missions within this framework is to minimize the total travel and communication costs of the leader-wingmate duo. In this paper, we propose and formulate the problem of finding routes for the UVs that minimize the sum of their travel and communication costs as a network optimization problem of the form of a binary program (BP). The BP is computationally expensive, with the time required to compute optimal solutions increasing rapidly with the problem size. To address this challenge, here, we propose two algorithms, an approximation algorithm and a heuristic algorithm, to solve large-scale instances of the problem swiftly. We demonstrate the effectiveness and the scalability of these algorithms through an analysis of extensive numerical simulations performed over 500 instances, with the number of targets in the instances ranging from 6 to 100.

Submitted 5 October, 2022; originally announced October 2022.

arXiv:2205.03347 [pdf, other]

Zhuyi: Perception Processing Rate Estimation for Safety in Autonomous Vehicles

Authors: Yu-Shun Hsiao, Siva Kumar Sastry Hari, Michał Filipiuk, Timothy Tsai, Michael B. Sullivan, Vijay Janapa Reddi, Vasu Singh, Stephen W. Keckler

Abstract: The processing requirement of autonomous vehicles (AVs) for high-accuracy perception in complex scenarios can exceed the resources offered by the in-vehicle computer, degrading safety and comfort. This paper proposes a sensor frame processing rate (FPR) estimation model, Zhuyi, that quantifies the minimum safe FPR continuously in a driving scenario. Zhuyi can be employed post-deployment as an online safety check and to prioritize work. Experiments conducted using a multi-camera state-of-the-art industry AV system show that Zhuyi's estimated FPRs are conservative, yet the system can maintain safety by processing only 36% or fewer frames compared to a default 30-FPR system in the tested scenarios.

Submitted 6 May, 2022; originally announced May 2022.

Comments: 2022 Design Automation Conference (DAC), July 10-14, 2022, San Francisco

arXiv:2204.12924 [pdf, other]

An open-source simulation package for power electronics education

Authors: Mahesh B. Patil, V. V. S. Pavan Kumar Hari, Ruchita D. Korgaonkar, Kumar Appaiah

Abstract: Extension of the open-source simulation package GSEIM for power electronics applications is presented. Recent developments in GSEIM, including those oriented specifically towards power electronic circuits, are described. Some examples of electrical element templates, which form a part of the GSEIM library, are discussed. Representative simulation examples in power electronics are presented to bring out important features of the simulator. Advantages of GSEIM for educational purposes are discussed. Finally, plans regarding future developments in GSEIM are presented.

Submitted 25 April, 2022; originally announced April 2022.

Comments: 8 pages, 11 figures

arXiv:2105.10586 [pdf, other]

Bounds on Optimal Revisit Times in Persistent Monitoring Missions with a Distinct & Remote Service Station

Authors: Sai Krishna Kanth Hari, Sivakumar Rathinam, Swaroop Darbha, Krishna Kalyanam, Satyanarayana Gupta Manyam, David Casbeer

Abstract: Persistent monitoring missions require an up-to-date knowledge of the changing state of the underlying environment. UAVs can be gainfully employed to continually visit a set of targets representing tasks (and locations) in the environment and collect data therein for long time periods. The enduring nature of these missions requires the UAV to be regularly recharged at a service station. In this paper, we consider the case in which the service station is not co-located with any of the targets. Efficient monitoring requires the revisit time, defined as the maximum of the time elapsed between successive revisits to targets, to be minimized. Here, we consider the problem of determining UAV routes that lead to the minimum revisit time. The problem is NP-hard, and its computational difficulty increases with the fuel capacity of the UAV. We develop an algorithm to construct near-optimal solutions to the problem quickly, when the fuel capacity exceeds a threshold. We also develop lower bounds to the optimal revisit time and use these bounds to demonstrate (through numerical simulations) that the constructed solutions are, on average, at most 0.01% away from the optimum.

Submitted 21 May, 2021; originally announced May 2021.

Comments: Submitted to IEEE TRO

arXiv:2103.07403 [pdf, other]

Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles

Authors: Zahra Ghodsi, Siva Kumar Sastry Hari, Iuri Frosio, Timothy Tsai, Alejandro Troccoli, Stephen W. Keckler, Siddharth Garg, Anima Anandkumar

Abstract: Extracting interesting scenarios from real-world data as well as generating failure cases is important for the development and testing of autonomous systems. We propose efficient mechanisms to both characterize and generate testing scenarios using a state-of-the-art driving simulator. For any scenario, our method generates a set of possible driving paths and identifies all the possible safe driving trajectories that can be taken starting at different times, to compute metrics that quantify the complexity of the scenario. We use our method to characterize real driving data from the Next Generation Simulation (NGSIM) project, as well as adversarial scenarios generated in simulation. We rank the scenarios by defining metrics based on the complexity of avoiding accidents and provide insights into how the AV could have minimized the probability of incurring an accident. We demonstrate a strong correlation between the proposed metrics and human intuition.

Submitted 12 March, 2021; originally announced March 2021.

arXiv:2006.04984 [pdf, other]

Making Convolutions Resilient via Algorithm-Based Error Detection Techniques

Authors: Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, Stephen W. Keckler

Abstract: The ability of Convolutional Neural Networks (CNNs) to accurately process real-time telemetry has boosted their use in safety-critical and high-performance computing systems. As such systems require high levels of resilience to errors, CNNs must execute correctly in the presence of hardware faults. Full duplication provides the needed assurance but incurs a prohibitive 100% overhead. Algorithmic techniques are known to offer low-cost solutions, but the practical feasibility and performance of such techniques have never been studied for CNN deployment platforms (e.g., TensorFlow or TensorRT on GPUs). In this paper, we focus on algorithmically verifying Convolutions, which are the most resource-demanding operations in CNNs. We use checksums to verify convolutions, adding a small amount of redundancy, far less than full-duplication. We first identify the challenges that arise in employing Algorithm-Based Error Detection (ABED) for Convolutions in optimized inference platforms that fuse multiple network layers and use reduced-precision operations, and demonstrate how to overcome them. We propose and evaluate variations of ABED techniques that offer implementation complexity, runtime overhead, and coverage trade-offs. Results show that ABED can detect all transient hardware errors that might otherwise corrupt output and does so while incurring low runtime overheads (6-23%), offering at least 1.6X throughput to workloads compared to full duplication.

Submitted 8 June, 2020; originally announced June 2020.

arXiv:2005.01445 [pdf, other]

Estimating Silent Data Corruption Rates Using a Two-Level Model

Authors: Siva Kumar Sastry Hari, Paolo Rech, Timothy Tsai, Mark Stephenson, Arslan Zulfiqar, Michael Sullivan, Philip Shirvani, Paul Racunas, Joel Emer, Stephen W. Keckler

Abstract: High-performance and safety-critical system architects must accurately evaluate the application-level silent data corruption (SDC) rates of processors to soft errors. Such an evaluation requires error propagation all the way from particle strikes on low-level state up to the program output. Existing approaches that rely on low-level simulations with fault injection cannot evaluate full applications because of their slow speeds, while application-level accelerated fault testing in accelerated particle beams is often impractical. We present a new two-level methodology for application resilience evaluation that overcomes these challenges. The proposed approach decomposes application failure rate estimation into (1) identifying how particle strikes in low-level unprotected state manifest at the architecture-level, and (2) measuring how such architecture-level manifestations propagate to the program output. We demonstrate the effectiveness of this approach on GPU architectures. We also show that using just one of the two steps can overestimate SDC rates and produce different trends; the composition of the two is needed for accurate reliability modeling.

Submitted 27 April, 2020; originally announced May 2020.

arXiv:2002.09786 [pdf, other]

HarDNN: Feature Map Vulnerability Evaluation in CNNs

Authors: Abdulrahman Mahmoud, Siva Kumar Sastry Hari, Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, Pavlo Molchanov, Michael B. Sullivan, Timothy Tsai, Stephen W. Keckler

Abstract: As Convolutional Neural Networks (CNNs) are increasingly being employed in safety-critical applications, it is important that they behave reliably in the face of hardware errors. Transient hardware errors may percolate undesirable state during execution, resulting in software-manifested errors which can adversely affect high-level decision making. This paper presents HarDNN, a software-directed approach to identify vulnerable computations during a CNN inference and selectively protect them based on their propensity towards corrupting the inference output in the presence of a hardware error. We show that HarDNN can accurately estimate relative vulnerability of a feature map (fmap) in CNNs using a statistical error injection campaign, and explore heuristics for fast vulnerability assessment. Based on these results, we analyze the tradeoff between error coverage and computational overhead that the system designers can use to employ selective protection. Results show that the improvement in resilience for the added computation is superlinear with HarDNN. For example, HarDNN improves SqueezeNet's resilience by 10x with just 30% additional computations.

Submitted 25 February, 2020; v1 submitted 22 February, 2020; originally announced February 2020.

Comments: 14 pages, 5 figures, a short version accepted for publication in First Workshop on Secure and Resilient Autonomy (SARA) co-located with MLSys2020

arXiv:1912.00146 [pdf]

doi 10.1007/978-981-15-2612-1_70

Secure Wireless Internet of Things Communication using Virtual Private Networks

Authors: Ishaan Lodha, Lakshana Kolur, K. Sree Hari, Honnavalli Prasad

Abstract: The Internet of Things (IoT) is an exploding market as well as an important focus area for research. Security is a major issue for IoT products and solutions, with several massive problems that are still commonplace in the field. In this paper, we have successfully minimized the risk of data eavesdropping and tampering over the network by securing these communications using the concept of tunneling. We have implemented this by connecting a router to the internet via a Virtual Private Network (VPN) while using PPTP and L2TP as the underlying protocols for the VPN and exploring their cost benefits, compatibility and, most importantly, their feasibility. The main purpose of our paper is to try to secure IoT networks without adversely affecting the selling point of IoT.

Submitted 30 November, 2019; originally announced December 2019.

Comments: 8 pages

arXiv:1907.01692 [pdf, other]

An Approximation Algorithm for a Task Allocation, Sequencing and Scheduling Problem involving a Human-Robot Team

Authors: Sai Krishna Hari, Abhishek Nayak, Sivakumar Rathinam

Abstract: This article presents an approximation algorithm for a task allocation, sequencing and scheduling problem involving a team of human operators and robots. Specifically, we present an algorithm with an approximation ratio as a function of the number of human operators ($m$) and the number of robots ($k$) in the team. The approximation ratios are $\frac{7}{2} -\frac{5}{2k}$, $\frac{5}{2} -\frac{1}{k}$ and $\frac{7}{2} -\frac{1}{k}$ when $m=1$, $m\geq k\geq 2$ and $k>m\geq 2$ respectively. We also present computational results to corroborate the performance of the proposed approximation algorithm.

Submitted 11 September, 2019; v1 submitted 2 July, 2019; originally announced July 2019.
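
The three approximation ratios quoted in the abstract above are easy to tabulate for a given team. The sketch below simply evaluates the stated formulas with exact fractions; the function name `approximation_ratio` is hypothetical, and combinations of m and k that the abstract does not cover (for example m ≥ 2 with k = 1) are rejected rather than guessed.

```python
from fractions import Fraction


def approximation_ratio(m, k):
    """Ratio stated in the abstract for m human operators and k robots."""
    if m < 1 or k < 1:
        raise ValueError("m and k must be positive integers")
    if m == 1:
        return Fraction(7, 2) - Fraction(5, 2 * k)
    if m >= k >= 2:
        return Fraction(5, 2) - Fraction(1, k)
    if k > m >= 2:
        return Fraction(7, 2) - Fraction(1, k)
    raise ValueError("case not covered by the ratios stated in the abstract")


# Example teams (assumed for illustration).
one_operator = approximation_ratio(m=1, k=2)   # 7/2 - 5/4
more_humans = approximation_ratio(m=3, k=2)    # 5/2 - 1/2
more_robots = approximation_ratio(m=2, k=3)    # 7/2 - 1/3
```

For these example teams the ratios come out to 9/4, 2 and 19/6 respectively.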

arXiv:1907.01051 [pdf, other]

ML-based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection

Authors: Saurabh Jha, Subho S. Banerjee, Timothy Tsai, Siva K. S. Hari, Michael B. Sullivan, Zbigniew T. Kalbarczyk, Stephen W. Keckler, Ravishankar K. Iyer

Abstract: The safety and resilience of fully autonomous vehicles (AVs) are of significant concern, as exemplified by several headline-making accidents. While AV development today involves verification, validation, and testing, end-to-end assessment of AV systems under accidental faults in realistic driving scenarios has been largely unexplored. This paper presents DriveFI, a machine learning-based fault injection engine, which can mine situations and faults that maximally impact AV safety, as demonstrated on two industry-grade AV technology stacks (from NVIDIA and Baidu). For example, DriveFI found 561 safety-critical faults in less than 4 hours. In comparison, random injection experiments executed over several weeks could not find any safety-critical faults.

Submitted 1 July, 2019; originally announced July 2019.

Comments: Accepted at 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks

arXiv:1808.02545 [pdf, other]

Persistent Monitoring of Dynamically Changing Environments Using an Unmanned Vehicle

Authors: Sai Krishna Kanth Hari, Sivakumar Rathinam, Swaroop Darbha, Krishnamoorthy Kalyanam, Satyanarayana Gupta Manyam, David Casbeer

Abstract: We consider the problem of planning a closed walk $\mathcal W$ for a UAV to persistently monitor a finite number of stationary targets with equal priorities and dynamically changing properties. A UAV must physically visit the targets in order to monitor them and collect information therein. The frequency of monitoring any given target is specified by a target revisit time, i.e., the maximum allowable time between any two successive visits to the target. The problem considered in this paper is the following: Given $n$ targets and $k \geq n$ allowed visits to them, find an optimal closed walk $\mathcal W^*(k)$ so that every target is visited at least once and the maximum revisit time over all the targets, $\mathcal R(\mathcal W(k))$, is minimized. We prove the following: If $k \geq n^2-n$, $\mathcal R(\mathcal W^*(k))$ (or simply, $\mathcal R^*(k)$) takes only two values: $\mathcal R^*(n)$ when $k$ is an integral multiple of $n$, and $\mathcal R^*(n+1)$ otherwise. This result suggests significant computational savings - one only needs to determine $\mathcal W^*(n)$ and $\mathcal W^*(n+1)$ to construct an optimal solution $\mathcal W^*(k)$. We provide MILP formulations for computing $\mathcal W^*(n)$ and $\mathcal W^*(n+1)$. Furthermore, for any given $k$, we prove that $\mathcal R^*(k) \geq \mathcal R^*(k+n)$.

Submitted 3 June, 2019; v1 submitted 7 August, 2018; originally announced August 2018.
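
The objective in the abstract above, the maximum revisit time of a closed walk, can be computed directly for a candidate walk, and the two-valued result for $k \geq n^2-n$ reduces to a simple lookup. The sketch below assumes a travel-time matrix between targets and uses hypothetical names (`max_revisit_time`, `optimal_revisit_for_large_k`); it evaluates walks but does not search for optimal ones, which the paper does via MILP formulations.

```python
def max_revisit_time(walk, travel_time):
    """Largest gap between successive visits to any target on a closed walk.

    walk is a cyclic sequence of target indices (the last entry returns to
    the first); travel_time[i][j] is the time to go from target i to j.
    """
    hops = len(walk)
    arrival = []
    t = 0
    for idx in range(hops):
        arrival.append(t)
        t += travel_time[walk[idx]][walk[(idx + 1) % hops]]
    total = t  # duration of one full traversal of the walk

    visits = {}
    for target, when in zip(walk, arrival):
        visits.setdefault(target, []).append(when)

    worst = 0
    for times in visits.values():
        gaps = [b - a for a, b in zip(times, times[1:])]
        gaps.append(total - times[-1] + times[0])  # wrap-around gap
        worst = max(worst, max(gaps))
    return worst


def optimal_revisit_for_large_k(k, n, r_star_n, r_star_n_plus_1):
    """Two-valued result from the abstract, valid only for k >= n^2 - n."""
    if k < n * n - n:
        raise ValueError("result is stated only for k >= n^2 - n")
    return r_star_n if k % n == 0 else r_star_n_plus_1


# Assumed example: three targets on a line at positions 0, 1 and 5.
positions = [0, 1, 5]
times = [[abs(a - b) for b in positions] for a in positions]
tour = max_revisit_time([0, 1, 2], times)
walk_with_repeat = max_revisit_time([0, 1, 2, 1], times)
```

In this example both walks take 10 time units per traversal and both have a maximum revisit time of 10, because targets 0 and 2 are each visited only once per traversal.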

arXiv:1504.01705 [pdf, ps, other]

Fusion of Sparse Reconstruction Algorithms for Multiple Measurement Vectors

Authors: Deepa K. G., Sooraj K. Ambat, K. V. S. Hari

Abstract: We consider the recovery of sparse signals that share a common support from multiple measurement vectors. The performance of several algorithms developed for this task depends on parameters like dimension of the sparse signal, dimension of measurement vector, sparsity level, measurement noise. We propose a fusion framework, where several multiple measurement vector reconstruction algorithms participate and the final signal estimate is obtained by combining the signal estimates of the participating algorithms. We present the conditions for achieving a better reconstruction performance than the participating algorithms. Numerical simulations demonstrate that the proposed fusion algorithm often performs better than the participating algorithms.

Submitted 6 April, 2015; originally announced April 2015.

arXiv:1410.6028 [pdf, ps, other]

A Risk Minimization Framework for Channel Estimation in OFDM Systems

Authors: Karthik Upadhya, Chandra Sekhar Seelamantula, K. V. S. Hari

Abstract: We address the problem of channel estimation for cyclic-prefix (CP) Orthogonal Frequency Division Multiplexing (OFDM) systems. We model the channel as a vector of unknown deterministic constants and hence do not require prior knowledge of the channel statistics. Since the mean-square error (MSE) is not computable in practice in such a scenario, we propose a novel technique using Stein's lemma to obtain an unbiased estimate of the mean-square error, namely Stein's unbiased risk estimate (SURE). We obtain an estimate of the channel from noisy observations using linear and nonlinear denoising functions, whose parameters are chosen to minimize SURE. Based on computer simulations, we show that using the SURE-based channel estimate in equalization offers an improvement in signal-to-noise ratio of around 2.25 dB over the maximum-likelihood channel estimate, in practical channel scenarios, without assuming prior knowledge of channel statistics.

Submitted 22 October, 2014; originally announced October 2014.

arXiv:1308.0104 [pdf, ps, other]

doi 10.1109/LSP.2013.2276791

A Fast Eigen Solution for Homogeneous Quadratic Minimization with at most Three Constraints

Authors: Dinesh Dileep Gaurav, K. V. S. Hari

Abstract: We propose an eigenvalue-based technique to solve the Homogeneous Quadratic Constrained Quadratic Programming problem (HQCQP) with at most 3 constraints, which arises in many signal processing problems. Semi-Definite Relaxation (SDR) is the only known approach and is computationally intensive. We study the performance of the proposed fast eigen approach through simulations in the context of MIMO relays and show that the solution converges to the solution obtained using the SDR approach with a significant reduction in complexity.

Submitted 1 August, 2013; originally announced August 2013.

Comments: 15 pages, The same content without appendices is accepted and is to be published in IEEE Signal Processing Letters

arXiv:1304.7434 [pdf, ps, other]

Low Complexity Joint Estimation of Synchronization Impairments in Sparse Channel for MIMO-OFDM System

Authors: Renu Jose, Sooraj K. Ambat, K. V. S. Hari

Abstract: Low complexity joint estimation of synchronization impairments and channel in a single-user MIMO-OFDM system is presented in this letter. Based on a system model that takes into account the channel and the effects of synchronization impairments such as carrier frequency offset, sampling frequency offset, and symbol timing error, a Maximum Likelihood (ML) algorithm for the joint estimation is proposed. To reduce the complexity of the ML grid search, the number of received signal samples used for estimation needs to be reduced. The conventional channel estimation methods using Least-Squares (LS) fail for the reduced-sample under-determined system, which results in poor performance of the joint estimator. The proposed ML algorithm uses a Compressed Sensing (CS) based channel estimation method in a sparse fading scenario, where the received samples used for estimation are fewer than those required for an LS based estimation. The performance of the estimation method is studied through numerical simulations, and it is observed that the CS based joint estimator performs better than the LS based joint estimator.

Submitted 28 April, 2013; originally announced April 2013.

Comments: 7 pages, 4 figures, under review in AEU - International Journal of Electronics and Communications (Elsevier) (paper id-AEUE-D-12-00625)

arXiv:1212.1340

Spatial Modulation in Zero-Padded Single Carrier Communication

Authors: Rakshith Rajashekar, K. V. S. Hari

Abstract: In this paper, we consider the Spatial Modulation (SM) system in a frequency selective channel under a single carrier (SC) communication scenario and propose zero-padding instead of the cyclic prefix considered in the existing literature. We show that the zero-padded single carrier (ZP-SC) SM system offers full multipath diversity under maximum-likelihood (ML) detection, unlike the cyclic prefixed SM system. Further, we show that the order of ML decoding complexity in the proposed ZP-SC SM system is independent of the frame length and depends only on the number of multipath links between the transmitter and the receiver. Thus, we show that zero-padding in the SC SM system has a twofold advantage over cyclic prefixing: 1) it gives full multipath diversity, and 2) it offers relatively low ML decoding complexity. Furthermore, we extend the partial interference cancellation receiver (PIC-R) proposed by Guo and Xia for the decoding of STBCs in order to convert the ZP-SC system into a set of flat-fading subsystems. We show that the transmission of any full rank STBC over these subsystems achieves full transmit, receive as well as multipath diversity under PIC-R. With the aid of this extended PIC-R, we show that the ZP-SC SM system achieves receive and multipath diversity with the same decoding complexity as that of the SM system in a flat-fading scenario.

Submitted 16 January, 2013; v1 submitted 6 December, 2012; originally announced December 2012.

Comments: This paper has been withdrawn by the authors

arXiv:1210.5314 [pdf, ps, other]

Maximum Likelihood Algorithms for Joint Estimation of Synchronization Impairments and Channel in MIMO-OFDM System

Authors: Renu Jose, K. V. S. Hari

Abstract: Maximum Likelihood (ML) algorithms for the joint estimation of synchronization impairments and channel in Multiple Input Multiple Output-Orthogonal Frequency Division Multiplexing (MIMO-OFDM) systems are investigated in this work. A system model that takes into account the effects of carrier frequency offset, sampling frequency offset, symbol timing error, and channel impulse response is formulated. Cramér-Rao Lower Bounds for the estimation of continuous parameters are derived, which show the coupling effect among different impairments and the significance of the joint estimation. We propose an ML algorithm for the estimation of synchronization impairments and channel together, using a grid search method. To reduce the complexity of the joint grid search in the ML algorithm, a Modified ML (MML) algorithm with multiple one-dimensional searches is also proposed. Further, a Stage-wise ML (SML) algorithm using existing algorithms, which estimate fewer parameters, is also proposed. Performance of the estimation algorithms is studied through numerical simulations and it is found that the proposed ML and MML algorithms exhibit better performance than the SML algorithm.

Submitted 27 October, 2012; v1 submitted 19 October, 2012; originally announced October 2012.

Comments: 18 pages, 5 figures, Submitted to IET Communications

arXiv:1210.2502 [pdf, ps, other]

Structured Dispersion Matrices from Space-Time Block Codes for Space-Time Shift Keying

Authors: Rakshith Rajashekar, K. V. S. Hari, L. Hanzo

Abstract: Coherent Space-Time Shift Keying (CSTSK) is a recently developed generalized shift-keying framework for Multiple-Input Multiple-Output systems, which uses a set of Space-Time matrices termed Dispersion Matrices (DM). CSTSK may be combined with a classic signaling set (e.g., QAM, PSK) in order to strike a flexible tradeoff between the achievable diversity and multiplexing gain. One of the key benefits of the CSTSK scheme is its Inter-Channel Interference (ICI) free system that makes single-stream Maximum Likelihood detection possible at low complexity. In the existing CSTSK scheme, DMs are chosen by maximizing the mutual information over a large set of complex valued, Gaussian random matrices through numerical simulations. We refer to them as Capacity-Optimized (CO) DMs. In this contribution we establish a connection between the STSK scheme and the Space-Time Block Codes (STBC) and show that a class of STBCs termed Decomposable Dispersion Codes (DDC) enjoy all the benefits that are specific to the STSK scheme. Two STBCs belonging to this class are proposed, a rate-one code from Field Extensions and a full-rate code from Cyclic Division Algebras, that offer structured DMs with desirable properties such as full diversity and a high coding gain. We show that the DMs derived from these codes are capable of achieving a better performance than CO-DMs, and emphasize the importance of DMs having a higher coding gain than CO-DMs in scenarios having realistic, imperfect channel state information at the receiver.

Submitted 9 October, 2012; originally announced October 2012.

Comments: 30 pages, 9 figures. Ignore the 31st page which has a copy of Fig. 5

arXiv:1209.6017

Power Allocation in Amplify and Forward Relays with a Power Constrained Relay

Authors: Dinesh Dileep Gaurav, K. V. S. Hari

Abstract: We consider a two-hop Multiple-Input Multiple-Output channel with a source, a single Amplify and Forward relay, and the destination. We consider the problem of designing precoders at the source and the relay, and the receiver matrix at the destination. In particular, we address the problem of finding the optimal power allocation scheme at the source which minimizes the source transmit power while satisfying a given Quality of Service requirement at the destination and a power constraint at the relay. We consider two types of receiver at the destination, a Zero Forcing receiver and a Minimum Mean Square Error receiver. Simulation results comparing the performance of the two receivers are provided at the end.

Submitted 8 November, 2012; v1 submitted 26 September, 2012; originally announced September 2012.

Comments: 9 pages, 2 figures, This is to present the new version with updated content

arXiv:1206.6190

Low Complexity Maximum Likelihood Detection in Spatial Modulation Systems

Authors: Rakshith Rajashekar, K. V. S. Hari

Abstract: Spatial Modulation (SM) is a recently developed low-complexity Multiple-Input Multiple-Output scheme that uses antenna indices and a conventional signal set to convey information. It has been shown that the Maximum-Likelihood (ML) detection in an SM system involves joint detection of the transmit antenna index and the transmitted symbol, and hence, the ML search complexity grows linearly with the number of transmit antennas and the size of the signal set. In this paper, we show that the ML search complexity in an SM system becomes independent of the constellation size when the signal set employed is a square- or a rectangular-QAM. Further, we show that Sphere Decoding (SD) algorithms become essential in SM systems only when the number of transmit antennas is large and not necessarily when the employed signal set is large. We propose a novel hard-limiting enabled sphere decoding detector whose complexity is lower than that of the existing detector, and a generalized detection scheme for SM systems with an arbitrary number of transmit antennas. Simulation results support our claims that the proposed detectors are ML-optimal and offer a significantly reduced complexity.

Submitted 16 January, 2013; v1 submitted 27 June, 2012; originally announced June 2012.

Comments: This paper has been withdrawn by the authors

arXiv:1204.5652 [pdf, ps, other]

ML Decoding Complexity Reduction in STBCs Using Time-Orthogonal Pulse Shaping

Authors: Rakshith Rajashekar, K. V. S. Hari

Abstract: Motivated by the recent developments in the Space Shift Keying (SSK) and Spatial Modulation (SM) systems which employ Time-Orthogonal Pulse Shaping (TOPS) filters to achieve transmit diversity gains, we propose TOPS for Space-Time Block Codes (STBC). We show that any STBC whose set of weight matrices partitions into P subsets under the equivalence relation termed as Common Support Relation can be made P-group decodable by properly employing TOPS waveforms across space and time. Furthermore, by considering some of the well known STBCs in the literature we show that the order of their Maximum Likelihood decoding complexity can be greatly reduced by the application of TOPS.

Submitted 25 April, 2012; originally announced April 2012.

Comments: 10 pages

arXiv:1204.4656 [pdf, ps, other]

Fusion of Greedy Pursuits for Compressed Sensing Signal Reconstruction

Authors: Sooraj K. Ambat, Saikat Chatterjee, K. V. S. Hari

Abstract: Greedy Pursuits are very popular in Compressed Sensing for sparse signal recovery. Though many of the Greedy Pursuits possess elegant theoretical guarantees for performance, it is well known that their performance depends on the statistical distribution of the non-zero elements in the sparse signal. In practice, the distribution of the sparse signal may not be known a priori. It is also observed that the performance of Greedy Pursuits degrades as the number of available measurements falls below a method-dependent threshold value. To improve the performance in these situations, we introduce a novel fusion framework for Greedy Pursuits and also propose two algorithms for sparse recovery. Through Monte Carlo simulations we show that the proposed schemes improve sparse signal recovery in clean as well as noisy measurement cases.

Submitted 19 June, 2012; v1 submitted 20 April, 2012; originally announced April 2012.

Comments: Accepted, "20th European Signal Processing Conference 2012 (EUSIPCO 2012)", Bucharest, Romania, 27 Aug 2012

arXiv:1204.4073 [pdf, ps, other]

Modulation Diversity for Spatial Modulation Using Complex Interleaved Orthogonal Design

Authors: Rakshith Rajashekar, K. V. S. Hari

Abstract: In this paper, we propose modulation diversity techniques for Spatial Modulation (SM) systems using the Complex Interleaved Orthogonal Design (CIOD) meant for two transmit antennas. Specifically, we show that by using the CIOD for a two transmit antenna system, the standard SM scheme, where only one transmit antenna is activated in any symbol duration, can achieve a transmit diversity order of two. We show with our simulation results that the proposed schemes offer a transmit diversity order of two, and hence give a better Symbol Error Rate performance than the SM scheme with a transmit diversity order of one.

Submitted 18 April, 2012; originally announced April 2012.

Comments: 7 pages

arXiv:1202.5187

Sphere Decoding for Spatial Modulation Systems with Arbitrary Nt

Authors: Rakshith Rajashekar, K. V. S. Hari

Abstract: Recently, three Sphere Decoding (SD) algorithms were proposed for the Spatial Modulation (SM) scheme which focus on reducing the transmit-, receive-, and both transmit and receive-search spaces at the receiver and were termed as Receiver-centric SD (Rx-SD), Transmitter-centric SD (Tx-SD), and Combined SD (C-SD) detectors, respectively. The Tx-SD detector was proposed for systems with Nt ≤ Nr, where Nt and Nr are the number of transmit and receive antennas of the system. In this paper, we show that the existing Tx-SD detector is not limited to systems with Nt ≤ Nr but can be used with systems with Nr < Nt ≤ 2Nr - 1 as well. We refer to this detector as the Extended Tx-SD (E-Tx-SD) detector. Further, we propose an E-Tx-SD based detection scheme for SM systems with arbitrary Nt by exploiting the Inter-Channel Interference (ICI) free property of the SM systems. We show with our simulation results that the proposed detectors are ML-optimal and offer significantly reduced complexity.

Submitted 16 January, 2013; v1 submitted 23 February, 2012; originally announced February 2012.

Comments: This paper has been withdrawn by the authors

arXiv:0912.2320 [pdf]

Identifying the Importance of Software Reuse in COCOMO81, COCOMOII

Authors: CH. V. M. K. Hari, Prof. Prasad Reddy P. V. G. D, J. N. V. R Swarup Kumar, G. SriRamGanesh

Abstract: Software project management is an interpolation of project planning, project monitoring and project termination. The substratal goals of planning are to scout for the future, to diagnose the attributes that are essentially done for the consummation of the project successfully, animate the scheduling and allocate resources for the attributes. Software cost estimation plays a vital role in preeminent software project decisions such as resource allocation and bidding. This paper articulates the conventional overview of the software cost estimation modus operandi available. The cost and effort estimates of software projects done by the various companies are congregated, the results are segregated with the present cost models and the MRE (Mean Relative Error) is enumerated. We have administered the historical data to the COCOMO 81 and COCOMO II models and identified that the stellar predicament is that no cost model gives the exact estimate of a software project.

Submitted 11 December, 2009; originally announced December 2009.

Journal ref: IJCSE Volume 1 Issue 3 2009 142-147
