Search | arXiv e-print repository

doi 10.1109/TGRS.2022.3162420

Iterative, Deep Synthetic Aperture Sonar Image Segmentation

Authors: Yung-Chen Sun, Isaac D. Gerg, Vishal Monga

Abstract: Synthetic aperture sonar (SAS) systems produce high-resolution images of the seabed environment. Moreover, deep learning has demonstrated superior ability in finding robust features for automating imagery analysis. However, the success of deep learning is conditioned on having lots of labeled training data, but obtaining generous pixel-level annotations of SAS imagery is often practically infeasib… ▽ More Synthetic aperture sonar (SAS) systems produce high-resolution images of the seabed environment. Moreover, deep learning has demonstrated superior ability in finding robust features for automating imagery analysis. However, the success of deep learning is conditioned on having lots of labeled training data, but obtaining generous pixel-level annotations of SAS imagery is often practically infeasible. This challenge has thus far limited the adoption of deep learning methods for SAS segmentation. Algorithms exist to segment SAS imagery in an unsupervised manner, but they lack the benefit of state-of-the-art learning methods and the results present significant room for improvement. In view of the above, we propose a new iterative algorithm for unsupervised SAS image segmentation combining superpixel formation, deep learning, and traditional clustering methods. We call our method Iterative Deep Unsupervised Segmentation (IDUS). IDUS is an unsupervised learning framework that can be divided into four main steps: 1) A deep network estimates class assignments. 2) Low-level image features from the deep network are clustered into superpixels. 3) Superpixels are clustered into class assignments (which we call pseudo-labels) using $k$-means. 4) Resulting pseudo-labels are used for loss backpropagation of the deep network prediction. These four steps are performed iteratively until convergence. A comparison of IDUS to current state-of-the-art methods on a realistic benchmark dataset for SAS image segmentation demonstrates the benefits of our proposal even as the IDUS incurs a much lower computational burden during inference (actual labeling of a test image). Finally, we also develop a semi-supervised (SS) extension of IDUS called IDSS and demonstrate experimentally that it can further enhance performance while outperforming supervised alternatives that exploit the same labeled training imagery. △ Less

Submitted 28 March, 2022; originally announced March 2022.

Comments: arXiv admin note: text overlap with arXiv:2107.14563

arXiv:2107.14563 [pdf, other]

Iterative, Deep, and Unsupervised Synthetic Aperture Sonar Image Segmentation

Authors: Yung-Chen Sun, Isaac D. Gerg, Vishal Monga

Abstract: Deep learning has not been routinely employed for semantic segmentation of seabed environment for synthetic aperture sonar (SAS) imagery due to the implicit need of abundant training data such methods necessitate. Abundant training data, specifically pixel-level labels for all images, is usually not available for SAS imagery due to the complex logistics (e.g., diver survey, chase boat, precision p… ▽ More Deep learning has not been routinely employed for semantic segmentation of seabed environment for synthetic aperture sonar (SAS) imagery due to the implicit need of abundant training data such methods necessitate. Abundant training data, specifically pixel-level labels for all images, is usually not available for SAS imagery due to the complex logistics (e.g., diver survey, chase boat, precision position information) needed for obtaining accurate ground-truth. Many hand-crafted feature based algorithms have been proposed to segment SAS in an unsupervised fashion. However, there is still room for improvement as the feature extraction step of these methods is fixed. In this work, we present a new iterative unsupervised algorithm for learning deep features for SAS image segmentation. Our proposed algorithm alternates between clustering superpixels and updating the parameters of a convolutional neural network (CNN) so that the feature extraction for image segmentation can be optimized. We demonstrate the efficacy of our method on a realistic benchmark dataset. Our results show that the performance of our proposed method is considerably better than current state-of-the-art methods in SAS image segmentation. △ Less

Submitted 30 July, 2021; originally announced July 2021.

Comments: IEEE OCEANS 2021

arXiv:2103.10312 [pdf, other]

Real-Time, Deep Synthetic Aperture Sonar (SAS) Autofocus

Authors: Isaac D. Gerg, Vishal Monga

Abstract: Synthetic aperture sonar (SAS) requires precise time-of-flight measurements of the transmitted/received waveform to produce well-focused imagery. It is not uncommon for errors in these measurements to be present resulting in image defocusing. To overcome this, an \emph{autofocus} algorithm is employed as a post-processing step after image reconstruction to improve image focus. A particular class o… ▽ More Synthetic aperture sonar (SAS) requires precise time-of-flight measurements of the transmitted/received waveform to produce well-focused imagery. It is not uncommon for errors in these measurements to be present resulting in image defocusing. To overcome this, an \emph{autofocus} algorithm is employed as a post-processing step after image reconstruction to improve image focus. A particular class of these algorithms can be framed as a sharpness/contrast metric-based optimization. To improve convergence, a hand-crafted weighting function to remove "bad" areas of the image is sometimes applied to the image-under-test before the optimization procedure. Additionally, dozens of iterations are necessary for convergence which is a large compute burden for low size, weight, and power (SWaP) systems. We propose a deep learning technique to overcome these limitations and implicitly learn the weighting function in a data-driven manner. Our proposed method, which we call Deep Autofocus, uses features from the single-look-complex (SLC) to estimate the phase correction which is applied in $k$-space. Furthermore, we train our algorithm on batches of training imagery so that during deployment, only a single iteration of our method is sufficient to autofocus. We show results demonstrating the robustness of our technique by comparing our results to four commonly used image sharpness metrics. Our results demonstrate Deep Autofocus can produce imagery perceptually better than common iterative techniques but at a lower computational cost. We conclude that Deep Autofocus can provide a more favorable cost-quality trade-off than alternatives with significant potential of future research. △ Less

Submitted 1 June, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

Comments: Four pages. Accepted to IGARSS 2021. Fixed Eq 9

arXiv:2101.05888 [pdf, other]

GPU Acceleration for Synthetic Aperture Sonar Image Reconstruction

Authors: Isaac D. Gerg, Daniel C. Brown, Stephen G. Wagner, Daniel Cook, Brian N. O'Donnell, Thomas Benson, Thomas C. Montgomery

Abstract: Synthetic aperture sonar (SAS) image reconstruction, or beamforming as it is often referred to within the SAS community, comprises a class of computationally intensive algorithms for creating coherent high-resolution imagery from successive spatially varying sonar pings. Image reconstruction is usually performed topside because of the large compute burden necessitated by the procedure. Historicall… ▽ More Synthetic aperture sonar (SAS) image reconstruction, or beamforming as it is often referred to within the SAS community, comprises a class of computationally intensive algorithms for creating coherent high-resolution imagery from successive spatially varying sonar pings. Image reconstruction is usually performed topside because of the large compute burden necessitated by the procedure. Historically, image reconstruction required significant assumptions in order to produce real-time imagery within an unmanned underwater vehicle's (UUV's) size, weight, and power (SWaP) constraints. However, these assumptions result in reduced image quality. In this work, we describe ASASIN, the Advanced Synthetic Aperture Sonar Imagining eNgine. ASASIN is a time domain backprojection image reconstruction suite utilizing graphics processing units (GPUs) allowing real-time operation on UUVs without sacrificing image quality. We describe several speedups employed in ASASIN allowing us to achieve this objective. Furthermore, ASASIN's signal processing chain is capable of producing 2D and 3D SAS imagery as we will demonstrate. Finally, we measure ASASIN's performance on a variety of GPUs and create a model capable of predicting performance. We demonstrate our model's usefulness in predicting run-time performance on desktop and embedded GPU hardware. △ Less

Submitted 25 May, 2021; v1 submitted 14 January, 2021; originally announced January 2021.

Comments: 9 pages, 9 figures, submitted to MTS/IEEE OCEANS. Eq 11 fixed

arXiv:2010.15687

Deep Autofocus for Synthetic Aperture Sonar

Authors: Isaac Gerg, Vishal Monga

Abstract: Synthetic aperture sonar (SAS) requires precise positional and environmental information to produce well-focused output during the image reconstruction step. However, errors in these measurements are commonly present resulting in defocused imagery. To overcome these issues, an \emph{autofocus} algorithm is employed as a post-processing step after image reconstruction for the purpose of improving i… ▽ More Synthetic aperture sonar (SAS) requires precise positional and environmental information to produce well-focused output during the image reconstruction step. However, errors in these measurements are commonly present resulting in defocused imagery. To overcome these issues, an \emph{autofocus} algorithm is employed as a post-processing step after image reconstruction for the purpose of improving image quality using the image content itself. These algorithms are usually iterative and metric-based in that they seek to optimize an image sharpness metric. In this letter, we demonstrate the potential of machine learning, specifically deep learning, to address the autofocus problem. We formulate the problem as a self-supervised, phase error estimation task using a deep network we call Deep Autofocus. Our formulation has the advantages of being non-iterative (and thus fast) and not requiring ground truth focused-defocused images pairs as often required by other deblurring deep learning methods. We compare our technique against a set of common sharpness metrics optimized using gradient descent over a real-world dataset. Our results demonstrate Deep Autofocus can produce imagery that is perceptually as good as benchmark iterative techniques but at a substantially lower computational cost. We conclude that our proposed Deep Autofocus can provide a more favorable cost-quality trade-off than state-of-the-art alternatives with significant potential of future research. △ Less

Submitted 30 July, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

Comments: superseded by another work

arXiv:2010.13317 [pdf, other]

Structural Prior Driven Regularized Deep Learning for Sonar Image Classification

Authors: Isaac D. Gerg, Vishal Monga

Abstract: Deep learning has been recently shown to improve performance in the domain of synthetic aperture sonar (SAS) image classification. Given the constant resolution with range of a SAS, it is no surprise that deep learning techniques perform so well. Despite deep learning's recent success, there are still compelling open challenges in reducing the high false alarm rate and enabling success when traini… ▽ More Deep learning has been recently shown to improve performance in the domain of synthetic aperture sonar (SAS) image classification. Given the constant resolution with range of a SAS, it is no surprise that deep learning techniques perform so well. Despite deep learning's recent success, there are still compelling open challenges in reducing the high false alarm rate and enabling success when training imagery is limited, which is a practical challenge that distinguishes the SAS classification problem from standard image classification set-ups where training imagery may be abundant. We address these challenges by exploiting prior knowledge that humans use to grasp the scene. These include unconscious elimination of the image speckle and localization of objects in the scene. We introduce a new deep learning architecture which incorporates these priors with the goal of improving automatic target recognition (ATR) from SAS imagery. Our proposal -- called SPDRDL, Structural Prior Driven Regularized Deep Learning -- incorporates the previously mentioned priors in a multi-task convolutional neural network (CNN) and requires no additional training data when compared to traditional SAS ATR methods. Two structural priors are enforced via regularization terms in the learning of the network: (1) structural similarity prior -- enhanced imagery (often through despeckling) aids human interpretation and is semantically similar to the original imagery and (2) structural scene context priors -- learned features ideally encapsulate target centering information; hence learning may be enhanced via a regularization that encourages fidelity against known ground truth target shifts (relative target position from scene center). Experiments on a challenging real-world dataset reveal that SPDRDL outperforms state-of-the-art deep learning and other competing methods for SAS image classification. △ Less

Submitted 26 October, 2020; originally announced October 2020.

Comments: To appear in TGRS, 2021

arXiv:1909.06436 [pdf, other]

Coupling Rendering and Generative Adversarial Networks for Artificial SAS Image Generation

Authors: Albert Reed, Isaac Gerg, John McKay, Daniel Brown, David Williams, Suren Jayasuriya

Abstract: Acquisition of Synthetic Aperture Sonar (SAS) datasets is bottlenecked by the costly deployment of SAS imaging systems, and even when data acquisition is possible,the data is often skewed towards containing barren seafloor rather than objects of interest. We present a novel pipeline, called SAS GAN, which couples an optical renderer with a generative adversarial network (GAN) to synthesize realist… ▽ More Acquisition of Synthetic Aperture Sonar (SAS) datasets is bottlenecked by the costly deployment of SAS imaging systems, and even when data acquisition is possible,the data is often skewed towards containing barren seafloor rather than objects of interest. We present a novel pipeline, called SAS GAN, which couples an optical renderer with a generative adversarial network (GAN) to synthesize realistic SAS images of targets on the seafloor. This coupling enables high levels of SAS image realism while enabling control over image geometry and parameters. We demonstrate qualitative results by presenting examples of images created with our pipeline. We also present quantitative results through the use of t-SNE and the Fréchet Inception Distance to argue that our generated SAS imagery potentially augments SAS datasets more effectively than an off-the-shelf GAN. △ Less

Submitted 2 October, 2019; v1 submitted 13 September, 2019; originally announced September 2019.

Comments: 10 pages, 9 figures. Submitted to IEEE OCEANS 2019 (Seattle). Updated acknowledgements

arXiv:1808.05279 [pdf, other]

Measuring Human Assessed Complexity in Synthetic Aperture Sonar Imagery Using the Elo Rating System

Authors: Brian Reinhardt, Isaac Gerg, Daniel Brown, Joonho Park

Abstract: Performance of automatic target recognition from synthetic aperture sonar data is heavily dependent on the complexity of the beamformed imagery. Several mechanisms can contribute to this, including unwanted vehicle dynamics, the bathymetry of the scene, and the presence of natural and manmade clutter. To understand the impact of the environmental complexity on image perception, researchers have ta… ▽ More Performance of automatic target recognition from synthetic aperture sonar data is heavily dependent on the complexity of the beamformed imagery. Several mechanisms can contribute to this, including unwanted vehicle dynamics, the bathymetry of the scene, and the presence of natural and manmade clutter. To understand the impact of the environmental complexity on image perception, researchers have taken approaches rooted in information theory, or heuristics. Despite these efforts, a quantitative measure for complexity has not been related to the phenomenology from which it is derived. By using subject matter experts (SMEs) we derive a complexity metric for a set of imagery which accounts for the underlying phenomenology. The goal of this work is to develop an understanding of how several common information theoretic and heuristic measures are related to the SME perceived complexity in synthetic aperture sonar imagery. To achieve this, an ensemble of 10-meter x 10-meter images were cropped from a high-frequency SAS data set that spans multiple environments. The SME's were presented pairs of images from which they could rate the relative image complexity. These comparisons were then converted into the desired sequential ranking using a method first developed by A. Elo for establishing rankings of chess players. The Elo method produced a plausible rank ordering across the broad dataset. The heuristic and information theoretical metrics were then compared to the image rank from which they were derived. The metrics with the highest degree of correlation were those relating to spatial information, e.g. variations in pixel intensity, with an R-squared value of approximately 0.9. However, this agreement was dependent on the scale from which the spatial variation was measured. Results will also be presented for many other measures including lacunarity, image compression, and entropy. △ Less

Submitted 15 August, 2018; originally announced August 2018.

Comments: Accepted for the Institute of Acoustics 4th International Conference on Synthetic Apertures Sonar and Radar Sept 2018

arXiv:1808.02868 [pdf, other]

Additional Representations for Improving Synthetic Aperture Sonar Classification Using Convolutional Neural Networks

Authors: Isaac Gerg, David Williams

Abstract: Object classification in synthetic aperture sonar (SAS) imagery is usually a data starved and class imbalanced problem. There are few objects of interest present among much benign seafloor. Despite these problems, current classification techniques discard a large portion of the collected SAS information. In particular, a beamformed SAS image, which we call a single-look complex (SLC) image, contai… ▽ More Object classification in synthetic aperture sonar (SAS) imagery is usually a data starved and class imbalanced problem. There are few objects of interest present among much benign seafloor. Despite these problems, current classification techniques discard a large portion of the collected SAS information. In particular, a beamformed SAS image, which we call a single-look complex (SLC) image, contains complex pixels composed of real and imaginary parts. For human consumption, the SLC is converted to a magnitude-phase representation and the phase information is discarded. Even more problematic, the magnitude information usually exhibits a large dynamic range (>80dB) and must be dynamic range compressed for human display. Often it is this dynamic range compressed representation, originally designed for human consumption, which is fed into a classifier. Consequently, the classification process is completely void of the phase information. In this work, we show improvements in classification performance using the phase information from the SLC as well as information from an alternate source: photographs. We perform statistical testing to demonstrate the validity of our results. △ Less

Submitted 15 October, 2018; v1 submitted 8 August, 2018; originally announced August 2018.

Comments: Accepted for the Institute of Acoustics 4th International Conference on Synthetic Aperture Sonar and Radar Sept 2018

arXiv:1808.02792 [pdf]

Multiband SAS Imagery

Authors: Isaac D Gerg

Abstract: Advances in unmanned synthetic aperture sonar (SAS) imaging platforms allow for the simultaneous collection of multiband SAS imagery. The imagery is collected over several octaves and the phenomenology's interactions with the sea floor vary greatly over this range -- higher frequencies resolve proud and fine structure of the seafloor while lower frequencies resolve subsurface features and often in… ▽ More Advances in unmanned synthetic aperture sonar (SAS) imaging platforms allow for the simultaneous collection of multiband SAS imagery. The imagery is collected over several octaves and the phenomenology's interactions with the sea floor vary greatly over this range -- higher frequencies resolve proud and fine structure of the seafloor while lower frequencies resolve subsurface features and often induce internal resonance in man-made objects. Currently, analysts examine multiband imagery by viewing a single band at a time. This method makes it difficult to ascertain correlations between any pair of bands collected over the same location. To mitigate this issue, we propose methods which ingest high frequency (HF) and low frequency (LF) SAS imagery and generates a color composite creating what we call a multiband SAS (MSAS) image. The MSAS image contains the relevant portions of the HF and LF images required by an analyst to interpret the scene and are defined using a spatial saliency metric computed for each image. We then combine the saliency and acoustic backscatter measures to form the final MSAS image. △ Less

Submitted 8 August, 2018; originally announced August 2018.

arXiv:1801.02548 [pdf, other]

Bridging the Gap: Simultaneous Fine Tuning for Data Re-Balancing

Authors: John McKay, Isaac Gerg, Vishal Monga

Abstract: There are many real-world classification problems wherein the issue of data imbalance (the case when a data set contains substantially more samples for one/many classes than the rest) is unavoidable. While under-sampling the problematic classes is a common solution, this is not a compelling option when the large data class is itself diverse and/or the limited data class is especially small. We sug… ▽ More There are many real-world classification problems wherein the issue of data imbalance (the case when a data set contains substantially more samples for one/many classes than the rest) is unavoidable. While under-sampling the problematic classes is a common solution, this is not a compelling option when the large data class is itself diverse and/or the limited data class is especially small. We suggest a strategy based on recent work concerning limited data problems which utilizes a supplemental set of images with similar properties to the limited data class to aid in the training of a neural network. We show results for our model against other typical methods on a real-world synthetic aperture sonar data set. Code can be found at github.com/JohnMcKay/dataImbalance. △ Less

Submitted 8 January, 2018; originally announced January 2018.

Comments: Submitted to IGARSS 2018, 4 Pages, 8 Figures

arXiv:1706.09858 [pdf, other]

What's Mine is Yours: Pretrained CNNs for Limited Training Sonar ATR

Authors: John McKay, Isaac Gerg, Vishal Monga, Raghu Raj

Abstract: Finding mines in Sonar imagery is a significant problem with a great deal of relevance for seafaring military and commercial endeavors. Unfortunately, the lack of enormous Sonar image data sets has prevented automatic target recognition (ATR) algorithms from some of the same advances seen in other computer vision fields. Namely, the boom in convolutional neural nets (CNNs) which have been able to… ▽ More Finding mines in Sonar imagery is a significant problem with a great deal of relevance for seafaring military and commercial endeavors. Unfortunately, the lack of enormous Sonar image data sets has prevented automatic target recognition (ATR) algorithms from some of the same advances seen in other computer vision fields. Namely, the boom in convolutional neural nets (CNNs) which have been able to achieve incredible results - even surpassing human actors - has not been an easily feasible route for many practitioners of Sonar ATR. We demonstrate the power of one avenue to incorporating CNNs into Sonar ATR: transfer learning. We first show how well a straightforward, flexible CNN feature-extraction strategy can be used to obtain impressive if not state-of-the-art results. Secondly, we propose a way to utilize the powerful transfer learning approach towards multiple instance target detection and identification within a provided synthetic aperture Sonar data set. △ Less

Submitted 29 June, 2017; originally announced June 2017.

Comments: Accepted to OCEANS 2017 - Anchorage (Conference)

Showing 1–12 of 12 results for author: Gerg, I