-
Diver Identification Using Anthropometric Data Ratios for Underwater Multi-Human-Robot Collaboration
Authors:
Jungseok Hong,
Sadman Sakib Enan,
Junaed Sattar
Abstract:
Recent advances in efficient design, perception algorithms, and computing hardware have made it possible to create improved human-robot interaction (HRI) capabilities for autonomous underwater vehicles (AUVs). To conduct secure missions as underwater human-robot teams, AUVs require the ability to accurately identify divers. However, this remains an open problem due to divers' challenging visual features, mainly caused by similar-looking scuba gear. In this paper, we present a novel algorithm that can perform diver identification using either pre-trained models or models trained during deployment. We exploit anthropometric data obtained from diver pose estimates to generate robust features that are invariant to changes in distance and photometric conditions. We also propose an embedding network that maximizes inter-class distances in the feature space and minimizes those for the intra-class features, which significantly improves classification performance. Furthermore, we present an end-to-end diver identification framework that operates on an AUV and evaluate the accuracy of the proposed algorithm. Quantitative results in controlled-water experiments show that our algorithm achieves a high level of accuracy in diver identification.
Submitted 29 September, 2023;
originally announced October 2023.
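The distance-invariance property mentioned in the abstract above can be illustrated with a minimal sketch. It assumes 2D pose keypoints in pixel coordinates and uses ratios of body-segment lengths as features; the joint names, the choice of segments, and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical 2D pose keypoints in pixels; names and values are illustrative only.
NEAR_POSE = {
    "l_shoulder": (100.0, 100.0),
    "r_shoulder": (160.0, 100.0),
    "l_hip": (110.0, 200.0),
    "r_hip": (150.0, 200.0),
    "l_elbow": (90.0, 150.0),
}


def segment_length(p, q):
    """Euclidean distance between two 2D keypoints."""
    return math.dist(p, q)


def ratio_features(kp):
    """Ratios of segment lengths, normalised by an (assumed) torso segment."""
    torso = segment_length(kp["l_shoulder"], kp["l_hip"])
    shoulder = segment_length(kp["l_shoulder"], kp["r_shoulder"])
    hip = segment_length(kp["l_hip"], kp["r_hip"])
    upper_arm = segment_length(kp["l_shoulder"], kp["l_elbow"])
    return [shoulder / torso, hip / torso, upper_arm / torso]


def scale_pose(kp, s):
    """Uniformly scale all keypoints, a crude stand-in for a change in distance."""
    return {name: (x * s, y * s) for name, (x, y) in kp.items()}


# The same pose seen from farther away appears smaller in the image.
FAR_POSE = scale_pose(NEAR_POSE, 0.4)

near_features = ratio_features(NEAR_POSE)
far_features = ratio_features(FAR_POSE)
```

Because every segment length is multiplied by the same factor, the ratios agree up to floating-point rounding, whereas the raw pixel lengths do not.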
-
A Diver Attention Estimation Framework for Effective Underwater Human-Robot Interaction
Authors:
Sadman Sakib Enan,
Junaed Sattar
Abstract:
Many underwater tasks, such as cable-and-wreckage inspection and search-and-rescue, can benefit from robust Human-Robot Interaction (HRI) capabilities. With the recent advancements in vision-based underwater HRI methods, Autonomous Underwater Vehicles (AUVs) have the capability to interact with their human partners without requiring assistance from a topside operator. However, in these methods, the AUV assumes that the diver is ready for interaction, while in reality, the diver may be distracted. In this paper, we attempt to address this problem by presenting a diver attention estimation framework for AUVs to autonomously determine the attentiveness of a diver, and developing a robot controller to allow the AUV to navigate and reorient itself with respect to the diver before initiating interaction. The core element of the framework is a deep convolutional neural network called DATT-Net. It is based on a pyramid structure that can exploit the geometric relations among 10 facial keypoints of a diver to estimate their head orientation, which we use as an indicator of attentiveness. Our on-the-bench experimental evaluations and real-world experiments during both closed- and open-water robot trials confirm the efficacy of the proposed framework.
Submitted 12 March, 2025; v1 submitted 28 September, 2022;
originally announced September 2022.
-
Robotic Detection of a Human-Comprehensible Gestural Language for Underwater Multi-Human-Robot Collaboration
Authors:
Sadman Sakib Enan,
Michael Fulton,
Junaed Sattar
Abstract:
In this paper, we present a motion-based robotic communication framework that enables non-verbal communication among autonomous underwater vehicles (AUVs) and human divers. We design a gestural language for AUV-to-AUV communication which can be easily understood by divers observing the conversation, unlike typical radio frequency, light, or audio-based AUV communication. To allow AUVs to visually understand a gesture from another AUV, we propose a deep network (RRCommNet) which exploits a self-attention mechanism to learn to recognize each message by extracting maximally discriminative spatio-temporal features. We train this network on diverse simulated and real-world data. Our experimental evaluations, both in simulation and in closed-water robot trials, demonstrate that the proposed RRCommNet architecture is able to decipher gesture-based messages with an average accuracy of 88-94% on simulated data and 73-83% on real data (depending on the version of the model used). Further, by performing a message transcription study with human participants, we also show that the proposed language can be understood by humans, with an overall transcription accuracy of 88%. Finally, we discuss the inference runtime of RRCommNet on embedded GPU hardware, for real-time use on board AUVs in the field.
Submitted 12 July, 2022;
originally announced July 2022.
-
Visual Diver Face Recognition for Underwater Human-Robot Interaction
Authors:
Jungseok Hong,
Sadman Sakib Enan,
Christopher Morse,
Junaed Sattar
Abstract:
This paper presents a deep-learned facial recognition method for underwater robots to identify scuba divers. Specifically, the proposed method is able to recognize divers underwater with faces heavily obscured by scuba masks and breathing apparatus. Our contribution in this research is towards robust facial identification of individuals under significant occlusion of facial features and image degradation from underwater optical distortions. With the ability to correctly recognize divers, autonomous underwater vehicles (AUVs) will be able to engage in collaborative tasks with the correct person in human-robot teams and ensure that instructions are accepted from only those authorized to command the robots. We demonstrate that our proposed framework is able to learn discriminative features from real-world diver faces through different data augmentation and generation techniques. Experimental evaluations show that this framework achieves a 3-fold increase in prediction accuracy compared to state-of-the-art (SOTA) algorithms and is well-suited for embedded inference on robotic platforms.
Submitted 18 November, 2020;
originally announced November 2020.
-
Semantic Segmentation of Underwater Imagery: Dataset and Benchmark
Authors:
Md Jahidul Islam,
Chelsey Edge,
Yuyang Xiao,
Peigen Luo,
Muntaqim Mehtaz,
Christopher Morse,
Sadman Sakib Enan,
Junaed Sattar
Abstract:
In this paper, we present the first large-scale dataset for semantic Segmentation of Underwater IMagery (SUIM). It contains over 1500 images with pixel annotations for eight object categories: fish (vertebrates), reefs (invertebrates), aquatic plants, wrecks/ruins, human divers, robots, and sea-floor. The images have been rigorously collected during oceanic explorations and human-robot collaborative experiments, and annotated by human participants. We also present a benchmark evaluation of state-of-the-art semantic segmentation approaches based on standard performance metrics. In addition, we present SUIM-Net, a fully-convolutional encoder-decoder model that balances the trade-off between performance and computational efficiency. It offers competitive performance while ensuring fast end-to-end inference, which is essential for its use in the autonomy pipeline of visually-guided underwater robots. In particular, we demonstrate its usability benefits for visual servoing, saliency prediction, and detailed scene understanding. With a variety of use cases, the proposed model and benchmark dataset open up promising opportunities for future research in underwater robot vision.
Submitted 13 September, 2020; v1 submitted 2 April, 2020;
originally announced April 2020.
-
Design and Experiments with LoCO AUV: A Low Cost Open-Source Autonomous Underwater Vehicle
Authors:
Chelsey Edge,
Sadman Sakib Enan,
Michael Fulton,
Jungseok Hong,
Jiawei Mo,
Kimberly Barthelemy,
Hunter Bashaw,
Berik Kallevig,
Corey Knutson,
Kevin Orpen,
Junaed Sattar
Abstract:
In this paper we present LoCO AUV, a Low-Cost, Open Autonomous Underwater Vehicle. LoCO is a general-purpose, single-person-deployable, vision-guided AUV, rated to a depth of 100 meters. We discuss the open and expandable design of this underwater robot, as well as the design of a simulator in Gazebo. Additionally, we explore the platform's preliminary local motion control and state estimation abilities, which enable it to perform maneuvers autonomously. In order to demonstrate its usefulness for a variety of tasks, we implement a variety of our previously presented human-robot interaction capabilities on LoCO, including gestural control, diver following, and robot communication via motion. Finally, we discuss the practical concerns of deployment and our experiences in using this robot in pools, lakes, and the ocean. All design details, instructions on assembly, and code will be released under a permissive, open-source license.
Submitted 19 March, 2020;
originally announced March 2020.
-
Underwater Image Super-Resolution using Deep Residual Multipliers
Authors:
Md Jahidul Islam,
Sadman Sakib Enan,
Peigen Luo,
Junaed Sattar
Abstract:
We present a deep residual network-based generative model for single image super-resolution (SISR) of underwater imagery for use by autonomous underwater robots. We also provide an adversarial training pipeline for learning SISR from paired data. In order to supervise the training, we formulate an objective function that evaluates the perceptual quality of an image based on its global content, color, and local style information. Additionally, we present USR-248, a large-scale dataset of three sets of underwater images of 'high' (640x480) and 'low' (80x60, 160x120, and 320x240) spatial resolution. USR-248 contains paired instances for supervised training of 2x, 4x, or 8x SISR models. Furthermore, we validate the effectiveness of our proposed model through qualitative and quantitative experiments and compare the results with several state-of-the-art models' performances. We also analyze its practical feasibility for applications such as scene understanding and attention modeling in noisy visual conditions.
Submitted 24 February, 2020; v1 submitted 20 September, 2019;
originally announced September 2019.
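The relationship between the USR-248 resolutions and the 2x, 4x, and 8x scale factors stated in the abstract above can be checked with a short arithmetic sketch. The resolutions come from the abstract; the helper name `scale_factor` and the dictionary layout are assumptions made for illustration.

```python
# Resolutions as (width, height), taken from the abstract.
HIGH_RES = (640, 480)
LOW_RES = [(80, 60), (160, 120), (320, 240)]


def scale_factor(high, low):
    """Return the integer upscaling factor from `low` to `high`.

    Raises ValueError if the factor is not a whole number or differs
    between width and height.
    """
    fx, rx = divmod(high[0], low[0])
    fy, ry = divmod(high[1], low[1])
    if rx or ry or fx != fy:
        raise ValueError(f"no uniform integer factor from {low} to {high}")
    return fx


# Map each low-resolution size to the SISR factor it pairs with.
factors = {low: scale_factor(HIGH_RES, low) for low in LOW_RES}
```

Each low-resolution set maps to exactly one factor: 80x60 pairs with 8x, 160x120 with 4x, and 320x240 with 2x.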