
Showing 1–15 of 15 results for author: Naqvi, S M

  1. arXiv:2501.02516   

    eess.AS cs.SD

    A Frequency-aware Augmentation Network for Mental Disorders Assessment from Audio

    Authors: Shuanglin Li, Siyang Song, Rajesh Nair, Syed Mohsen Naqvi

    Abstract: Depression and Attention Deficit Hyperactivity Disorder (ADHD) stand out as common mental health challenges today. In affective computing, speech signals serve as effective biomarkers for mental disorder assessment. Current research, relying on labor-intensive hand-crafted features or simplistic time-frequency representations, often overlooks critical details by not accounting for the differen…

    Submitted 4 March, 2025; v1 submitted 5 January, 2025; originally announced January 2025.

    Comments: We have found some technical problems that will take considerable time to address, and some parts still need to be completed

  2. arXiv:2501.02512  [pdf, other]

    eess.AS cs.SD

    Efficient Long Speech Sequence Modelling for Time-Domain Depression Level Estimation

    Authors: Shuanglin Li, Zhijie Xie, Syed Mohsen Naqvi

    Abstract: Depression significantly affects emotions, thoughts, and daily activities. Recent research indicates that speech signals contain vital cues about depression, sparking interest in audio-based deep-learning methods for estimating its severity. However, most methods rely on time-frequency representations of speech which have recently been criticized for their limitations due to the loss of informatio…

    Submitted 5 January, 2025; originally announced January 2025.

  3. ADHD diagnosis based on action characteristics recorded in videos using machine learning

    Authors: Yichun Li, Syed Mohsen Naqvi, Rajesh Nair

    Abstract: Demand for ADHD diagnosis and treatment is increasing significantly and the existing services are unable to meet the demand in a timely manner. In this work, we introduce a novel action recognition method for ADHD diagnosis by identifying and analysing raw video recordings. Our main contributions include 1) designing and implementing a test focusing on the attention and hyperactivity/impulsivity o…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: Neuroscience Applied

  4. arXiv:2409.02243  [pdf, other]

    cs.CV

    A Novel Audio-Visual Information Fusion System for Mental Disorders Detection

    Authors: Yichun Li, Shuanglin Li, Syed Mohsen Naqvi

    Abstract: Mental disorders are among the foremost contributors to the global healthcare challenge. Research indicates that timely diagnosis and intervention are vital in treating various mental disorders. However, the early somatization symptoms of certain mental disorders may not be immediately evident, often resulting in their oversight and misdiagnosis. Additionally, the traditional diagnosis methods inc…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: 27th International Conference on Information Fusion (FUSION)

  5. arXiv:2310.06854  [pdf, other]

    cs.CV cs.AI eess.IV

    Learning with Noisy Labels for Human Fall Events Classification: Joint Cooperative Training with Trinity Networks

    Authors: Leiyu Xie, Yang Sun, Syed Mohsen Naqvi

    Abstract: With the increasing ageing population, fall events classification has drawn much research attention. In the development of deep learning, the quality of data labels is crucial. Most of the datasets are labelled automatically or semi-automatically, and the samples may be mislabeled, which constrains the performance of Deep Neural Networks (DNNs). Recent research on noisy label learning confirms tha…

    Submitted 27 September, 2023; originally announced October 2023.

  6. arXiv:2309.15635  [pdf, other]

    cs.CV

    Position and Orientation-Aware One-Shot Learning for Medical Action Recognition from Signal Data

    Authors: Leiyu Xie, Yuxing Yang, Zeyu Fu, Syed Mohsen Naqvi

    Abstract: In this work, we propose a position and orientation-aware one-shot learning framework for medical action recognition from signal data. The proposed framework comprises two stages and each stage includes signal-level image generation (SIG), cross-attention (CsA), dynamic time warping (DTW) modules and the information fusion between the proposed privacy-preserved position and orientation features. T…

    Submitted 27 September, 2023; originally announced September 2023.

  7. arXiv:2304.09751  [pdf, other]

    cs.CV cs.AI cs.LG

    Skeleton-based action analysis for ADHD diagnosis

    Authors: Yichun Li, Yi Li, Rajesh Nair, Syed Mohsen Naqvi

    Abstract: Attention Deficit Hyperactivity Disorder (ADHD) is a common neurobehavioral disorder worldwide. While extensive research has focused on machine learning methods for ADHD diagnosis, most research relies on high-cost equipment, e.g., MRI machines and EEG patches. Therefore, low-cost diagnostic methods based on the action characteristics of ADHD are desired. Skeleton-based action recognition has gained…

    Submitted 14 April, 2023; originally announced April 2023.

  8. arXiv:2208.12027  [pdf, other]

    cs.CV cs.AI

    Two-stage Fall Events Classification with Human Skeleton Data

    Authors: Leiyu Xie, Yang Sun, Jonathon A. Chambers, Syed Mohsen Naqvi

    Abstract: Fall detection and classification have become an imperative problem for healthcare applications, particularly with the increasingly ageing population. Currently, most fall classification algorithms provide binary fall or no-fall classification. For better healthcare, binary fall classification is thus not enough; it must be extended to multiple fall events classification. In this work,…

    Submitted 25 August, 2022; originally announced August 2022.

  9. arXiv:2206.04962  [pdf, other]

    cs.SD eess.AS

    Feature Learning and Ensemble Pre-Tasks Based Self-Supervised Speech Denoising and Dereverberation

    Authors: Yi Li, ShuangLin Li, Yang Sun, Syed Mohsen Naqvi

    Abstract: Self-supervised learning (SSL) achieves great success in monaural speech enhancement, while the accuracy of the target speech estimation, particularly for unseen speakers, remains inadequate with existing pre-tasks. As a speech signal contains multi-faceted information including speaker identity, paralinguistics, and spoken content, the latent representation for speech enhancement becomes a tough ta…

    Submitted 10 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: text overlap with arXiv:2112.11142

  10. arXiv:2112.11459  [pdf, other]

    cs.SD eess.AS

    Self-Supervised Learning based Monaural Speech Enhancement with Multi-Task Pre-Training

    Authors: Yi Li, Yang Sun, Syed Mohsen Naqvi

    Abstract: In self-supervised learning, it is challenging to reduce the gap between the enhancement performance on the estimated and target speech signals with existing pre-tasks. In this paper, we propose a multi-task pre-training method to improve the speech enhancement performance with self-supervised learning. Within the pre-training autoencoder (PAE), only a limited set of clean speech signals are requir…

    Submitted 21 December, 2021; originally announced December 2021.

    Comments: Submitted to ICASSP 2022. arXiv admin note: text overlap with arXiv:2112.11142

  11. arXiv:2112.11142  [pdf, other]

    cs.SD eess.AS

    Self-Supervised Learning based Monaural Speech Enhancement with Complex-Cycle-Consistent

    Authors: Yi Li, Yang Sun, Syed Mohsen Naqvi

    Abstract: Recently, self-supervised learning (SSL) techniques have been introduced to solve the monaural speech enhancement problem. Due to the lack of using clean phase information, the enhancement performance is limited in most SSL methods. Therefore, in this paper, we propose a phase-aware self-supervised learning based monaural speech enhancement method. The latent representations of both amplitude and…

    Submitted 21 December, 2021; originally announced December 2021.

  12. U-shaped Transformer with Frequency-Band Aware Attention for Speech Enhancement

    Authors: Yi Li, Yang Sun, Syed Mohsen Naqvi

    Abstract: The state-of-the-art speech enhancement has limited performance in speech estimation accuracy. Recently, in deep learning, the Transformer shows the potential to exploit the long-range dependency in speech by self-attention. Therefore, it is introduced in speech enhancement to improve the speech estimation accuracy from a noise mixture. However, to address the computational cost issue in Transform…

    Submitted 11 December, 2021; originally announced December 2021.

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing (Volume: 31), 2023

  13. Domain Adaptation and Autoencoder Based Unsupervised Speech Enhancement

    Authors: Yi Li, Yang Sun, Kirill Horoshenkov, Syed Mohsen Naqvi

    Abstract: As a category of transfer learning, domain adaptation plays an important role in generalizing the model trained in one task and applying it to other similar tasks or settings. In speech enhancement, a well-trained acoustic model can be exploited to obtain the speech signal in the context of other languages, speakers, and environments. Recent domain adaptation research was developed more effectivel…

    Submitted 9 December, 2021; originally announced December 2021.

    Journal ref: IEEE Transactions on Artificial Intelligence. (2021)

  14. arXiv:1810.12126  [pdf, other]

    eess.IV cs.CV

    ActionXPose: A Novel 2D Multi-view Pose-based Algorithm for Real-time Human Action Recognition

    Authors: Federico Angelini, Zeyu Fu, Yang Long, Ling Shao, Syed Mohsen Naqvi

    Abstract: We present ActionXPose, a novel 2D pose-based algorithm for posture-level Human Action Recognition (HAR). The proposed approach exploits 2D human poses provided by OpenPose detector from RGB videos. ActionXPose aims to process poses data to be provided to a Long Short-Term Memory Neural Network and to a 1D Convolutional Neural Network, which solve the classification problem. ActionXPose is one of…

    Submitted 29 October, 2018; originally announced October 2018.

  15. arXiv:1511.01726  [pdf, other]

    cs.CV

    Multi-Target Tracking and Occlusion Handling with Learned Variational Bayesian Clusters and a Social Force Model

    Authors: Ata-ur-Rehman, Syed Mohsen Naqvi, Lyudmila Mihaylova, Jonathon Chambers

    Abstract: This paper considers the problem of multiple human target tracking in a sequence of video data. A solution is proposed which is able to deal with the challenges of a varying number of targets, interactions and when every target gives rise to multiple measurements. The developed novel algorithm comprises variational Bayesian clustering combined with a social force model, integrated within a particl…

    Submitted 5 November, 2015; originally announced November 2015.

    Comments: 19 pages, 14 figures