Showing 1–9 of 9 results for author: Nawaz, S

Searching in archive eess. Search in all archives.
  1. arXiv:2410.13526  [pdf]

    cs.CV cs.LG eess.IV

    Generative Adversarial Synthesis of Radar Point Cloud Scenes

    Authors: Muhammad Saad Nawaz, Thomas Dallmann, Torsten Schoen, Dirk Heberling

    Abstract: For the validation and verification of automotive radars, datasets of realistic traffic scenarios are required, which, however, are laborious to acquire. In this paper, we introduce radar scene synthesis using GANs as an alternative to the real dataset acquisition and simulation-based approaches. We train a PointNet++ based GAN model to generate realistic radar point cloud scenes and use a binary…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: ICMIM 2024; 7th IEEE MTT Conference

  2. arXiv:2404.09342  [pdf, other]

    cs.CV cs.SD eess.AS

    Face-voice Association in Multilingual Environments (FAME) Challenge 2024 Evaluation Plan

    Authors: Muhammad Saad Saeed, Shah Nawaz, Muhammad Salman Tahir, Rohan Kumar Das, Muhammad Zaigham Zaheer, Marta Moscati, Markus Schedl, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf

    Abstract: The advancements of technology have led to the use of multimodal systems in various real-world applications. Among them, the audio-visual systems are one of the widely used multimodal systems. In the recent years, associating face and voice of a person has gained attention due to presence of unique correlation between them. The Face-voice Association in Multilingual Environments (FAME) Challenge 2…

    Submitted 22 July, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: ACM Multimedia Conference - Grand Challenge

  3. arXiv:2309.09837  [pdf, other]

    cs.SD cs.CY eess.AS

    Frame-to-Utterance Convergence: A Spectra-Temporal Approach for Unified Spoofing Detection

    Authors: Awais Khan, Khalid Mahmood Malik, Shah Nawaz

    Abstract: Voice spoofing attacks pose a significant threat to automated speaker verification systems. Existing anti-spoofing methods often simulate specific attack types, such as synthetic or replay attacks. However, in real-world scenarios, the countermeasures are unaware of the generation schema of the attack, necessitating a unified solution. Current unified solutions struggle to detect spoofing artifact…

    Submitted 18 September, 2023; originally announced September 2023.

  4. arXiv:2308.01966  [pdf, other]

    cs.MM cs.CL cs.LG cs.SD eess.AS

    DCTM: Dilated Convolutional Transformer Model for Multimodal Engagement Estimation in Conversation

    Authors: Vu Ngoc Tu, Van Thong Huynh, Hyung-Jeong Yang, M. Zaigham Zaheer, Shah Nawaz, Karthik Nandakumar, Soo-Hyung Kim

    Abstract: Conversational engagement estimation is posed as a regression problem, entailing the identification of the favorable attention and involvement of the participants in the conversation. This task arises as a crucial pursuit to gain insights into human's interaction dynamics and behavior patterns within a conversation. In this research, we introduce a dilated convolutional Transformer for modeling an…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: Accepted in ACMM Grand Challenge

  5. arXiv:2302.13033  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Speaker Recognition in Realistic Scenario Using Multimodal Data

    Authors: Saqlain Hussain Shah, Muhammad Saad Saeed, Shah Nawaz, Muhammad Haroon Yousaf

    Abstract: In recent years, an association is established between faces and voices of celebrities leveraging large scale audio-visual information from YouTube. The availability of large scale audio-visual datasets is instrumental in developing speaker recognition methods based on standard Convolutional Neural Networks. Thus, the aim of this paper is to leverage large scale audio-visual information to improve…

    Submitted 25 February, 2023; originally announced February 2023.

    Comments: Accepted at the International Conference on Artificial Intelligence (ICAI'2023)

  6. Non-Coherent and Backscatter Communications: Enabling Ultra-Massive Connectivity in 6G Wireless Networks

    Authors: Syed Junaid Nawaz, Shree Krishna Sharma, Babar Mansoor, Mohmammad N. Patwary, Noor M. Khan

    Abstract: With the commencement of the 5G of wireless networks, researchers around the globe have started paying their attention to the imminent challenges that may emerge in the beyond 5G (B5G) era. Various revolutionary technologies and innovative services are offered in 5G networks, which, along with many principal advantages, are anticipated to bring a boom in the number of connected wireless devices an…

    Submitted 20 February, 2021; v1 submitted 21 May, 2020; originally announced May 2020.

    Comments: 6G Wireless Networks, Preprint, 34 pages, 11 Figures

  7. arXiv:2004.13780  [pdf, other]

    cs.CV cs.CL cs.SD eess.AS

    Cross-modal Speaker Verification and Recognition: A Multilingual Perspective

    Authors: Muhammad Saad Saeed, Shah Nawaz, Pietro Morerio, Arif Mahmood, Ignazio Gallo, Muhammad Haroon Yousaf, Alessio Del Bue

    Abstract: Recent years have seen a surge in finding association between faces and voices within a cross-modal biometric application along with speaker recognition. Inspired from this, we introduce a challenging task in establishing association between faces and voices across multiple languages spoken by the same set of persons. The aim of this paper is to answer two closely related questions: "Is face-voice…

    Submitted 22 April, 2021; v1 submitted 28 April, 2020; originally announced April 2020.

    Comments: Accepted: CVPRW

  8. arXiv:1909.08685  [pdf, ps, other]

    cs.CV cs.SD eess.AS

    Deep Latent Space Learning for Cross-modal Mapping of Audio and Visual Signals

    Authors: Shah Nawaz, Muhammad Kamran Janjua, Ignazio Gallo, Arif Mahmood, Alessandro Calefati

    Abstract: We propose a novel deep training algorithm for joint representation of audio and visual information which consists of a single stream network (SSNet) coupled with a novel loss function to learn a shared deep latent space representation of multimodal information. The proposed framework characterizes the shared latent space by leveraging the class centers which helps to eliminate the need for pairwi…

    Submitted 18 September, 2019; originally announced September 2019.

    Comments: Accepted to DICTA 2019

  9. arXiv:1812.02483  [pdf, other]

    eess.SP cs.IT

    Propagation Channels for mmWave Vehicular Communications: State-of-the-art and Future Research Directions

    Authors: Furqan Jameel, Shurjeel Wyne, Syed Junaid Nawaz, Zheng Chang

    Abstract: Vehicular communications essentially support automotive applications for safety and infotainment. For this reason, industry leaders envision an enhanced role of vehicular communications in the fifth generation of mobile communications technology. Over the years, the number of vehicle-mounted sensors has increased steadily, which potentially leads to more volume of critical data communications in a…

    Submitted 6 December, 2018; originally announced December 2018.