Search | arXiv e-print repository

SonarSplat: Novel View Synthesis of Imaging Sonar via Gaussian Splatting

Authors: Advaith V. Sethuraman, Max Rucker, Onur Bagoren, Pou-Chun Kung, Nibarkavi N. B. Amutha, Katherine A. Skinner

Abstract: In this paper, we present SonarSplat, a novel Gaussian splatting framework for imaging sonar that demonstrates realistic novel view synthesis and models acoustic streaking phenomena. Our method represents the scene as a set of 3D Gaussians with acoustic reflectance and saturation properties. We develop a novel method to efficiently rasterize Gaussians to produce a range/azimuth image that is faithful to the acoustic image formation model of imaging sonar. In particular, we develop a novel approach to model azimuth streaking in a Gaussian splatting framework. We evaluate SonarSplat using real-world datasets of sonar images collected from an underwater robotic platform in a controlled test tank and in a real-world river environment. Compared to the state-of-the-art, SonarSplat offers improved image synthesis capabilities (+3.2 dB PSNR) and more accurate 3D reconstruction (52% lower Chamfer Distance). We also demonstrate that SonarSplat can be leveraged for azimuth streak removal.

Submitted 4 May, 2025; v1 submitted 31 March, 2025; originally announced April 2025.

arXiv:2503.01074 [pdf, other]

OceanSim: A GPU-Accelerated Underwater Robot Perception Simulation Framework

Authors: Jingyu Song, Haoyu Ma, Onur Bagoren, Advaith V. Sethuraman, Yiting Zhang, Katherine A. Skinner

Abstract: Underwater simulators offer support for building robust underwater perception solutions. Significant work has recently been done to develop new simulators and to advance the performance of existing underwater simulators. Still, there remains room for improvement on physics-based underwater sensor modeling and rendering efficiency. In this paper, we propose OceanSim, a high-fidelity GPU-accelerated underwater simulator to address this research gap. We propose advanced physics-based rendering techniques to reduce the sim-to-real gap for underwater image simulation. We develop OceanSim to fully leverage the computing advantages of GPUs and achieve real-time imaging sonar rendering and fast synthetic data generation. We evaluate the capabilities and realism of OceanSim using real-world data to provide qualitative and quantitative results. The project page for OceanSim is https://umfieldrobotics.github.io/OceanSim.

Submitted 2 March, 2025; originally announced March 2025.

Comments: 8 pages, 6 figures

arXiv:2411.04963 [pdf, other]

VAIR: Visuo-Acoustic Implicit Representations for Low-Cost, Multi-Modal Transparent Surface Reconstruction in Indoor Scenes

Authors: Advaith V. Sethuraman, Onur Bagoren, Harikrishnan Seetharaman, Dalton Richardson, Joseph Taylor, Katherine A. Skinner

Abstract: Mobile robots operating indoors must be prepared to navigate challenging scenes that contain transparent surfaces. This paper proposes a novel method for the fusion of acoustic and visual sensing modalities through implicit neural representations to enable dense reconstruction of transparent surfaces in indoor scenes. We propose a novel model that leverages generative latent optimization to learn an implicit representation of indoor scenes consisting of transparent surfaces. We demonstrate that we can query the implicit representation to enable volumetric rendering in image space or 3D geometry reconstruction (point clouds or mesh) with transparent surface prediction. We evaluate our method's effectiveness qualitatively and quantitatively on a new dataset collected using a custom, low-cost sensing platform featuring RGB-D cameras and ultrasonic sensors. Our method exhibits significant improvement over state-of-the-art for transparent surface reconstruction.

Submitted 7 November, 2024; originally announced November 2024.

Comments: https://umfieldrobotics.github.io/VAIR_site/

arXiv:2408.01569 [pdf, other]

TURTLMap: Real-time Localization and Dense Mapping of Low-texture Underwater Environments with a Low-cost Unmanned Underwater Vehicle

Authors: Jingyu Song, Onur Bagoren, Razan Andigani, Advaith Venkatramanan Sethuraman, Katherine A. Skinner

Abstract: Significant work has been done on advancing localization and mapping in underwater environments. Still, state-of-the-art methods are challenged by low-texture environments, which are common in underwater settings. This makes it difficult to use existing methods in diverse, real-world scenes. In this paper, we present TURTLMap, a novel solution that focuses on textureless underwater environments through a real-time localization and mapping method. We show that this method is low-cost and capable of tracking the robot accurately while constructing a dense map of a low-textured environment in real time. We evaluate the proposed method using real-world data collected in an indoor water tank with a motion capture system and ground truth reference map. Qualitative and quantitative results validate that the proposed system achieves accurate and robust localization and precise dense mapping, even when subject to wave conditions. The project page for TURTLMap is https://umfieldrobotics.github.io/TURTLMap.

Submitted 9 October, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

Comments: Accepted to IROS 2024

arXiv:2402.01106 [pdf, other]

Learning Which Side to Scan: Multi-View Informed Active Perception with Side Scan Sonar for Autonomous Underwater Vehicles

Authors: Advaith V. Sethuraman, Philip Baldoni, Katherine A. Skinner, James McMahon

Abstract: Autonomous underwater vehicles often perform surveys that capture multiple views of targets in order to provide more information for human operators or automatic target recognition algorithms. In this work, we address the problem of choosing the most informative views that minimize survey time while maximizing classifier accuracy. We introduce a novel active perception framework for multi-view adaptive surveying and reacquisition using side scan sonar imagery. Our framework addresses this challenge by using a graph formulation for the adaptive survey task. We then use Graph Neural Networks (GNNs) to both classify acquired sonar views and to choose the next best view based on the collected data. We evaluate our method using simulated surveys in a high-fidelity side scan sonar simulator. Our results demonstrate that our approach is able to surpass the state-of-the-art in classification accuracy and survey efficiency. This framework is a promising approach for more efficient autonomous missions involving side scan sonar, such as underwater exploration, marine archaeology, and environmental monitoring.

Submitted 13 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

arXiv:2401.14546 [pdf, other]

doi 10.1177/0278364924126685

Machine Learning for Shipwreck Segmentation from Side Scan Sonar Imagery: Dataset and Benchmark

Authors: Advaith V. Sethuraman, Anja Sheppard, Onur Bagoren, Christopher Pinnow, Jamey Anderson, Timothy C. Havens, Katherine A. Skinner

Abstract: Open-source benchmark datasets have been a critical component for advancing machine learning for robot perception in terrestrial applications. Benchmark datasets enable the widespread development of state-of-the-art machine learning methods, which require large datasets for training, validation, and thorough comparison to competing approaches. Underwater environments impose several operational challenges that hinder efforts to collect large benchmark datasets for marine robot perception. Furthermore, a low abundance of targets of interest relative to the size of the search space leads to increased time and cost required to collect useful datasets for a specific task. As a result, there is limited availability of labeled benchmark datasets for underwater applications. We present the AI4Shipwrecks dataset, which consists of 28 distinct shipwrecks totaling 286 high-resolution labeled side scan sonar images to advance the state-of-the-art in autonomous sonar image understanding. We leverage the unique abundance of targets in Thunder Bay National Marine Sanctuary in Lake Huron, MI, to collect and compile a sonar imagery benchmark dataset through surveys with an autonomous underwater vehicle (AUV). We consulted with expert marine archaeologists for the labeling of robotically gathered data. We then leverage this dataset to perform benchmark experiments for comparison of state-of-the-art supervised segmentation methods, and we present insights on opportunities and open challenges for the field.
The dataset and benchmarking tools will be released as an open-source benchmark dataset to spur innovation in machine learning for Great Lakes and ocean exploration. The dataset and accompanying software are available at https://umfieldrobotics.github.io/ai4shipwrecks/.

Submitted 27 August, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

Comments: Project website link: https://umfieldrobotics.github.io/ai4shipwrecks/

Journal ref: The International Journal of Robotics Research. 2024;0(0)

arXiv:2310.01667 [pdf, other]

STARS: Zero-shot Sim-to-Real Transfer for Segmentation of Shipwrecks in Sonar Imagery

Authors: Advaith Venkatramanan Sethuraman, Katherine A. Skinner

Abstract: In this paper, we address the problem of sim-to-real transfer for object segmentation when there is no access to real examples of an object of interest during training, i.e. zero-shot sim-to-real transfer for segmentation. We focus on the application of shipwreck segmentation in side scan sonar imagery. Our novel segmentation network, STARS, addresses this challenge by fusing a predicted deformation field and anomaly volume, allowing it to generalize better to real sonar images and achieve more effective zero-shot sim-to-real transfer for image segmentation. We evaluate the sim-to-real transfer capabilities of our method on a real, expert-labeled side scan sonar dataset of shipwrecks collected from field work surveys with an autonomous underwater vehicle (AUV). STARS is trained entirely in simulation and performs zero-shot shipwreck segmentation with no additional fine-tuning on real data. Our method provides a significant 20% increase in segmentation performance for the targeted shipwreck class compared to the best baseline.

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2209.13091 [pdf, other]

WaterNeRF: Neural Radiance Fields for Underwater Scenes

Authors: Advaith Venkatramanan Sethuraman, Manikandasriram Srinivasan Ramanagopal, Katherine A. Skinner

Abstract: Underwater imaging is a critical task performed by marine robots for a wide range of applications including aquaculture, marine infrastructure inspection, and environmental monitoring. However, water column effects, such as attenuation and backscattering, drastically change the color and quality of imagery captured underwater. Due to varying water conditions and range-dependency of these effects, restoring underwater imagery is a challenging problem. This impacts downstream perception tasks including depth estimation and 3D reconstruction. In this paper, we advance state-of-the-art in neural radiance fields (NeRFs) to enable physics-informed dense depth estimation and color correction. Our proposed method, WaterNeRF, estimates parameters of a physics-based model for underwater image formation, leading to a hybrid data-driven and model-based solution. After determining the scene structure and radiance field, we can produce novel views of degraded as well as corrected underwater images, along with dense depth of the scene. We evaluate the proposed method qualitatively and quantitatively on a real underwater dataset.

Submitted 29 September, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

arXiv:2204.11716 [pdf, other]

Masked Image Modeling Advances 3D Medical Image Analysis

Authors: Zekai Chen, Devansh Agarwal, Kshitij Aggarwal, Wiem Safta, Samit Hirawat, Venkat Sethuraman, Mariann Micsinai Balan, Kevin Brown

Abstract: Recently, masked image modeling (MIM) has gained considerable attention due to its capacity to learn from vast amounts of unlabeled data and has been demonstrated to be effective on a wide variety of vision tasks involving natural images. Meanwhile, the potential of self-supervised learning in modeling 3D medical images is anticipated to be immense due to the high quantities of unlabeled images, and the expense and difficulty of quality labels. However, MIM's applicability to medical images remains uncertain. In this paper, we demonstrate that masked image modeling approaches can also advance 3D medical image analysis in addition to natural images. We study how masked image modeling strategies leverage performance from the viewpoints of 3D medical image segmentation as a representative downstream task: i) when compared to naive contrastive learning, masked image modeling approaches accelerate the convergence of supervised training even faster (1.40×) and ultimately produce a higher dice score; ii) predicting raw voxel values with a high masking ratio and a relatively smaller patch size is a non-trivial self-supervised pretext task for medical image modeling; iii) a lightweight decoder or projection head design for reconstruction is powerful for masked image modeling on 3D medical images, which speeds up training and reduces cost; iv) finally, we also investigate the effectiveness of MIM methods under different practical scenarios where different image resolutions and labeled data ratios are applied.

Submitted 23 August, 2022; v1 submitted 25 April, 2022; originally announced April 2022.

Comments: 8 pages, 6 figures, 9 tables; Accepted by WACV2023

arXiv:2101.10266 [pdf, other]

COVID-19 Outbreak Prediction and Analysis using Self Reported Symptoms

Authors: Rohan Sukumaran, Parth Patwa, T V Sethuraman, Sheshank Shankar, Rishank Kanaparti, Joseph Bae, Yash Mathur, Abhishek Singh, Ayush Chopra, Myungsun Kang, Priya Ramaswamy, Ramesh Raskar

Abstract: It is crucial for policymakers to understand the community prevalence of COVID-19 so combative resources can be effectively allocated and prioritized during the COVID-19 pandemic. Traditionally, community prevalence has been assessed through diagnostic and antibody testing data. However, despite the increasing availability of COVID-19 testing, the required level has not been met in most parts of the globe, introducing a need for an alternative method for communities to determine disease prevalence. This is further complicated by the observation that COVID-19 prevalence and spread vary across space, time, and demographics. In this study, we understand trends in the spread of COVID-19 by utilizing the results of self-reported COVID-19 symptoms surveys as an alternative to COVID-19 testing reports. This allows us to assess community disease prevalence, even in areas with low COVID-19 testing ability. Using individually reported symptom data from various populations, our method predicts the likely percentage of the population that tested positive for COVID-19. We do so with a Mean Absolute Error (MAE) of 1.14 and Mean Relative Error (MRE) of 60.40% with 95% confidence interval as (60.12, 60.67). This implies that our model predicts +/- 1140 cases than the original in a population of 1 million. In addition, we forecast the location-wise percentage of the population testing positive for the next 30 days using self-reported symptoms data from previous days.
The MAE for this method is as low as 0.15 (MRE of 23.61% with 95% confidence interval as (23.6, 13.7)) for New York. We present an analysis of these results, exposing various clinical attributes of interest across different demographics. Lastly, we qualitatively analyze how various policy enactments (testing, curfew) affect the prevalence of COVID-19 in a community.

Submitted 19 June, 2021; v1 submitted 20 December, 2020; originally announced January 2021.

Comments: 15 pages, 16 Figures - Latest version on the Journal of Behavioural Data Science - https://isdsa.org/_media/jbds/v1n1/v1n1p8.pdf

arXiv:0712.2872 [pdf, ps, other]

doi 10.1109/TIT.2009.2012995

Low SNR Capacity of Noncoherent Fading Channels

Authors: Vignesh Sethuraman, Ligong Wang, Bruce Hajek, Amos Lapidoth

Abstract: Discrete-time Rayleigh fading single-input single-output (SISO) and multiple-input multiple-output (MIMO) channels are considered, with no channel state information at the transmitter or the receiver. The fading is assumed to be stationary and correlated in time, but independent from antenna to antenna. Peak-power and average-power constraints are imposed on the transmit antennas. For MIMO channels, these constraints are either imposed on the sum over antennas, or on each individual antenna. For SISO channels and MIMO channels with sum power constraints, the asymptotic capacity as the peak signal-to-noise ratio tends to zero is identified; for MIMO channels with individual power constraints, this asymptotic capacity is obtained for a class of channels called transmit separable channels. The results for MIMO channels with individual power constraints are carried over to SISO channels with delay spread (i.e. frequency selective fading).

Submitted 15 December, 2008; v1 submitted 17 December, 2007; originally announced December 2007.

Comments: submitted to IEEE IT

arXiv:cs/0701078 [pdf, other]

Low SNR Capacity of Fading Channels -- MIMO and Delay Spread

Authors: Vignesh Sethuraman, Ligong Wang, Bruce Hajek, Amos Lapidoth

Abstract: Discrete-time Rayleigh fading multiple-input multiple-output (MIMO) channels are considered, with no channel state information at the transmitter and receiver. The fading is assumed to be correlated in time and independent from antenna to antenna. Peak and average transmit power constraints are imposed, either on the sum over antennas, or on each individual antenna. In both cases, an upper bound and an asymptotic lower bound, as the signal-to-noise ratio approaches zero, on the channel capacity are presented. The limit of normalized capacity is identified under the sum power constraints, and, for a subclass of channels, for individual power constraints. These results carry over to a SISO channel with delay spread (i.e. frequency selective fading).

Submitted 26 April, 2007; v1 submitted 11 January, 2007; originally announced January 2007.

arXiv:cs/0604049 [pdf, ps, other]

Low SNR Capacity of Fading Channels with Peak and Average Power Constraints

Authors: Vignesh Sethuraman, Bruce Hajek

Abstract: Flat-fading channels that are correlated in time are considered under peak and average power constraints. For discrete-time channels, a new upper bound on the capacity per unit time is derived. A low SNR analysis of a full-scattering vector channel is used to derive a complementary lower bound. Together, these bounds allow us to identify the exact scaling of channel capacity for a fixed peak to average ratio, as the average power converges to zero. The upper bound is also asymptotically tight as the average power converges to zero for a fixed peak power. For a continuous time infinite bandwidth channel, Viterbi identified the capacity for M-FSK modulation. Recently, Zhang and Laneman showed that the capacity can be achieved with non-bursty signaling (QPSK). An additional contribution of this paper is to obtain similar results under peak and average power constraints.

Submitted 12 April, 2006; v1 submitted 11 April, 2006; originally announced April 2006.

Comments: 13 pages, version without proofs submitted to ISIT 2006

arXiv:cs/0506052 [pdf, ps, other]

Comments on `Bit Interleaved Coded Modulation'

Authors: Vignesh Sethuraman, Bruce Hajek

Abstract: Caire, Taricco and Biglieri presented a detailed analysis of bit interleaved coded modulation, a simple and popular technique used to improve system performance, especially in the context of fading channels. They derived an upper bound to the probability of error, called the expurgated bound. In this correspondence, the proof of the expurgated bound is shown to be flawed. A new upper bound is also derived. It is not known whether the original expurgated bound is valid for the important special case of square QAM with Gray labeling, but the new bound is very close to, and slightly tighter than, the original bound for a numerical example.

Submitted 22 November, 2005; v1 submitted 13 June, 2005; originally announced June 2005.

Comments: This is the version after review (shorter, better etc)

arXiv:cs/0504085 [pdf, ps, other]

doi 10.1109/TIT.2005.853329

Capacity per Unit Energy of Fading Channels with a Peak Constraint

Authors: Vignesh Sethuraman, Bruce Hajek

Abstract: A discrete-time single-user scalar channel with temporally correlated Rayleigh fading is analyzed. There is no side information at the transmitter or the receiver. A simple expression is given for the capacity per unit energy, in the presence of a peak constraint. The simple formula of Verdu for capacity per unit cost is adapted to a channel with memory, and is used in the proof. In addition to bounding the capacity of a channel with correlated fading, the result gives some insight into the relationship between the correlation in the fading process and the channel capacity. The results are extended to a channel with side information, showing that the capacity per unit energy is one nat per Joule, independently of the peak power constraint. A continuous-time version of the model is also considered. The capacity per unit energy subject to a peak constraint (but no bandwidth constraint) is given by an expression similar to that for discrete time, and is evaluated for Gauss-Markov and Clarke fading channels.

Submitted 10 June, 2005; v1 submitted 18 April, 2005; originally announced April 2005.

Comments: Journal version of paper presented in ISIT 2003 - now accepted for publication in IEEE Transactions on Information Theory

Showing 1–15 of 15 results for author: Sethuraman, V