Search | arXiv e-print repository

doi 10.1109/TAES.2021.3139848

Effect of Kinematics and Fluency in Adversarial Synthetic Data Generation for ASL Recognition with RF Sensors

Authors: M. M. Rahman, E. Malaia, A. C. Gurbuz, D. J. Griffin, C. Crawfordand S. Z. Gurbuz

Abstract: RF sensors have been recently proposed as a new modality for sign language processing technology. They are non-contact, effective in the dark, and acquire a direct measurement of signing kinematic via exploitation of the micro-Doppler effect. First, this work provides an in depth, comparative examination of the kinematic properties of signing as measured by RF sensors for both fluent ASL users and… ▽ More RF sensors have been recently proposed as a new modality for sign language processing technology. They are non-contact, effective in the dark, and acquire a direct measurement of signing kinematic via exploitation of the micro-Doppler effect. First, this work provides an in depth, comparative examination of the kinematic properties of signing as measured by RF sensors for both fluent ASL users and hearing imitation signers. Second, as ASL recognition techniques utilizing deep learning requires a large amount of training data, this work examines the effect of signing kinematics and subject fluency on adversarial learning techniques for data synthesis. Two different approaches for the synthetic training data generation are proposed: 1) adversarial domain adaptation to minimize the differences between imitation signing and fluent signing data, and 2) kinematically-constrained generative adversarial networks for accurate synthesis of RF signing signatures. The results show that the kinematic discrepancies between imitation signing and fluent signing are so significant that training on data directly synthesized from fluent RF signers offers greater performance (93% top-5 accuracy) than that produced by adaptation of imitation signing (88% top-5 accuracy) when classifying 100 ASL signs. △ Less

Submitted 31 December, 2021; originally announced January 2022.

arXiv:2111.05480 [pdf, other]

ASL Trigger Recognition in Mixed Activity/Signing Sequences for RF Sensor-Based User Interfaces

Authors: Emre Kurtoglu, Ali C. Gurbuz, Evie A. Malaia, Darrin Griffin, Chris Crawford, Sevgi Z. Gurbuz

Abstract: The past decade has seen great advancements in speech recognition for control of interactive devices, personal assistants, and computer interfaces. However, Deaf and hard-ofhearing (HoH) individuals, whose primary mode of communication is sign language, cannot use voice-controlled interfaces. Although there has been significant work in video-based sign language recognition, video is not effective… ▽ More The past decade has seen great advancements in speech recognition for control of interactive devices, personal assistants, and computer interfaces. However, Deaf and hard-ofhearing (HoH) individuals, whose primary mode of communication is sign language, cannot use voice-controlled interfaces. Although there has been significant work in video-based sign language recognition, video is not effective in the dark and has raised privacy concerns in the Deaf community when used in the context of human ambient intelligence. RF sensors have been recently proposed as a new modality that can be effective under the circumstances where video is not. This paper considers the problem of recognizing a trigger sign (wake word) in the context of daily living, where gross motor activities are interwoven with signing sequences. The proposed approach exploits multiple RF data domain representations (time-frequency, range-Doppler, and range-angle) for sequential classification of mixed motion data streams. The recognition accuracy of signs with varying kinematic properties is compared and used to make recommendations on appropriate trigger sign selection for RFsensor based user interfaces. The proposed approach achieves a trigger sign detection rate of 98.9% and a classification accuracy of 92% for 15 ASL words and 3 gross motor activities. △ Less

Submitted 18 November, 2021; v1 submitted 9 November, 2021; originally announced November 2021.

Comments: 13 pages, 10 figures

arXiv:2009.01224 [pdf, other]

American Sign Language Recognition Using RF Sensing

Authors: Sevgi Z. Gurbuz, Ali C. Gurbuz, Evie A. Malaia, Darrin J. Griffin, Chris Crawford, M. Mahbubur Rahman, Emre Kurtoglu, Ridvan Aksu, Trevor Macks, Robiulhossain Mdrafi

Abstract: Many technologies for human-computer interaction have been designed for hearing individuals and depend upon vocalized speech, precluding users of American Sign Language (ASL) in the Deaf community from benefiting from these advancements. While great strides have been made in ASL recognition with video or wearable gloves, the use of video in homes has raised privacy concerns, while wearable gloves… ▽ More Many technologies for human-computer interaction have been designed for hearing individuals and depend upon vocalized speech, precluding users of American Sign Language (ASL) in the Deaf community from benefiting from these advancements. While great strides have been made in ASL recognition with video or wearable gloves, the use of video in homes has raised privacy concerns, while wearable gloves severely restrict movement and infringe on daily life. Methods: This paper proposes the use of RF sensors for HCI applications serving the Deaf community. A multi-frequency RF sensor network is used to acquire non-invasive, non-contact measurements of ASL signing irrespective of lighting conditions. The unique patterns of motion present in the RF data due to the micro-Doppler effect are revealed using time-frequency analysis with the Short-Time Fourier Transform. Linguistic properties of RF ASL data are investigated using machine learning (ML). Results: The information content, measured by fractal complexity, of ASL signing is shown to be greater than that of other upper body activities encountered in daily living. This can be used to differentiate daily activities from signing, while features from RF data show that imitation signing by non-signers is 99\% differentiable from native ASL signing. Feature-level fusion of RF sensor network data is used to achieve 72.5\% accuracy in classification of 20 native ASL signs. Implications: RF sensing can be used to study dynamic linguistic properties of ASL and design Deaf-centric smart environments for non-invasive, remote recognition of ASL. ML algorithms should be benchmarked on native, not imitation, ASL data. △ Less

Submitted 2 September, 2020; originally announced September 2020.

Comments: submitted

Journal ref: IEEE Sensors Journal, August 2020

Showing 1–3 of 3 results for author: Malaia, E