Search | arXiv e-print repository

Preserving the beamforming effect for spatial cue-based pseudo-binaural dereverberation of a single source

Authors: Sania Gul, Muhammad Salman Khan, Syed Waqar Shah

Abstract: Reverberations are unavoidable in enclosures, resulting in reduced intelligibility for hearing-impaired and non-native listeners, and even for normal-hearing listeners in noisy circumstances. They also degrade the performance of machine listening applications. In this paper, we propose a novel approach to binaural dereverberation of a single speech source, using the differences in the interaural cues of the direct path signal and the reverberations. Two beamformers, spaced at an interaural distance, are used to extract the reverberations from the reverberant speech. The interaural cues generated by these reverberations and those generated by the direct path signal act as a two-class dataset, used for the training of U-Net (a deep convolutional neural network). After training, the beamformers are removed and the trained U-Net, along with the maximum likelihood estimation (MLE) algorithm, is used to discriminate the direct path cues from the reverberation cues when the system is exposed to the interaural spectrogram of the reverberant speech signal. Our proposed model outperforms the classical signal processing dereverberation model, weighted prediction error, in terms of cepstral distance (CEP), frequency weighted segmental signal to noise ratio (FWSEGSNR) and signal to reverberation modulation energy ratio (SRMR) by 1.4 points, 8 dB and 0.6 dB, respectively. It also achieves better performance than the deep learning based dereverberation model, gaining a 1.3-point improvement in CEP with comparable FWSEGSNR, using a training dataset almost 8 times smaller than that model requires. The proposed model also sustained its performance under relatively similar unseen acoustic conditions and at positions in the vicinity of its training position.

Submitted 10 August, 2022; originally announced August 2022.

Comments: 25 pages, 7 figures

arXiv:2208.04626 [pdf]

Recycling an anechoic pre-trained speech separation deep neural network for binaural dereverberation of a single source

Authors: Sania Gul, Muhammad Salman Khan, Syed Waqar Shah, Ata Ur-Rehman

Abstract: Reverberation results in reduced intelligibility for both normal and hearing-impaired listeners. This paper presents a novel psychoacoustic approach of dereverberation of a single speech source by recycling a pre-trained binaural anechoic speech separation neural network. As training the deep neural network (DNN) is a lengthy and computationally expensive process, the advantage of using a pre-trained separation network for dereverberation is that the network does not need to be retrained, saving both time and computational resources. The interaural cues of a reverberant source are given to this pretrained neural network to discriminate between the direct path signal and the reverberant speech. The results show an average improvement of 1.3% in signal intelligibility, 0.83 dB in SRMR (signal to reverberation energy ratio) and 0.16 points in perceptual evaluation of speech quality (PESQ) over other state-of-the-art signal processing dereverberation algorithms and 14% in intelligibility and 0.35 points in quality over orthogonal matching pursuit with spectral subtraction (OSS), a machine learning based dereverberation algorithm.

Submitted 9 August, 2022; originally announced August 2022.

Comments: 15 pages, 4 figures

arXiv:2204.03687 [pdf, other]

doi 10.1109/TVT.2022.3165467

Statistical QoS Analysis of Reconfigurable Intelligent Surface-assisted D2D Communication

Authors: Syed Waqas Haider Shah, Adnan Noor Mian, Shahid Mumtaz, Anwer Al-Dulaimi, Chih-Lin I, Jon Crowcroft

Abstract: This work performs the statistical QoS analysis of a Rician block-fading reconfigurable intelligent surface (RIS)-assisted D2D link in which the transmit node operates under delay QoS constraints. First, we perform mode selection for the D2D link, in which the D2D pair can either communicate directly by relaying data from RISs or through a base station (BS). Next, we provide closed-form expressions for the effective capacity (EC) of the RIS-assisted D2D link. When channel state information at the transmitter (CSIT) is available, the transmit D2D node communicates with the variable rate $r_t(n)$ (adjustable according to the channel conditions); otherwise, it uses a fixed rate $r_t$. It allows us to model the RIS-assisted D2D link as a Markov system in both cases. We also extend our analysis to overlay and underlay D2D settings. To improve the throughput of the RIS-assisted D2D link when CSIT is unknown, we use the HARQ retransmission scheme and provide the EC analysis of the HARQ-enabled RIS-assisted D2D link. Finally, simulation results demonstrate that: i) the EC increases with an increase in RIS elements, ii) the EC decreases when strict QoS constraints are imposed at the transmit node, iii) the EC decreases with an increase in the variance of the path loss estimation error, iv) the EC increases with an increase in the probability of ON states, v) EC increases by using HARQ when CSIT is unknown, and it can reach up to $5\times$ the usual EC (with no HARQ and without CSIT) by using the optimal number of retransmissions.

Submitted 7 April, 2022; originally announced April 2022.

Comments: Accepted for publication in IEEE Transactions on Vehicular Technology

arXiv:2107.12217 [pdf, other]

Effective Capacity Analysis of HARQ-enabled D2D Communication in Multi-Tier Cellular Networks

Authors: Syed Waqas Haider Shah, Muhammad Mahboob-ur-Rahman, Adnan Noor Mian, Octavia A. Dobre, Jon Crowcroft

Abstract: This work performs the statistical quality-of-service (QoS) analysis of a block-fading device-to-device (D2D) link in a multi-tier cellular network that consists of a macro-BS (BSMC) and a micro-BS (BSmC), both of which operate in full-duplex (FD) mode. For the D2D link under consideration, we first formulate the mode selection problem (whereby the D2D pair could communicate directly, through the BSmC, or through the BSMC) as a ternary hypothesis testing problem. Next, to compute the effective capacity (EC) for the given D2D link, we assume that the channel state information (CSI) is not available at the transmit D2D node, and hence, it transmits at a fixed rate r with a fixed power. This allows us to model the D2D link as a Markov system with six states. We consider both overlay and underlay modes for the D2D link. Moreover, to improve the throughput of the D2D link, we assume that the D2D pair utilizes two special automatic repeat request (ARQ) schemes, i.e., Hybrid-ARQ (HARQ) and truncated HARQ. Furthermore, we consider two distinct queue models at the transmit D2D node, based upon how it responds to the decoding failure at the receive D2D node. Eventually, we provide closed-form expressions for the EC for both the HARQ-enabled D2D link and the truncated HARQ-enabled D2D link, under both queue models. Noting that the EC looks like a quasi-concave function of r, we further maximize the EC by searching for an optimal rate via the gradient-descent method. Simulation results provide the following insights: i) EC decreases with an increase in the QoS exponent, ii) EC of the D2D link improves when HARQ is employed, iii) EC increases with an increase in the quality of self-interference cancellation techniques used at BSmC and BSMC in FD mode.

Submitted 26 July, 2021; originally announced July 2021.

Comments: Accepted for publication in IEEE Transactions on Vehicular Technology

arXiv:2102.13334 [pdf]

Integration of deep learning with expectation maximization for spatial cue based speech separation in reverberant conditions

Authors: Sania Gul, Muhammad Salman Khan, Syed Waqar Shah

Abstract: In this paper, we formulate a blind source separation (BSS) framework, which allows integrating U-Net based deep learning source separation network with probabilistic spatial machine learning expectation maximization (EM) algorithm for separating speech in reverberant conditions. Our proposed model uses a pre-trained deep learning convolutional neural network, U-Net, for clustering the interaural level difference (ILD) cues and machine learning expectation maximization (EM) algorithm for clustering the interaural phase difference (IPD) cues. The integrated model exploits the complementary strengths of the two approaches to BSS: the strong modeling power of supervised neural networks and the ease of unsupervised machine learning algorithms, whose few parameters can be estimated on as little as a single segment of an audio mixture. The results show an average improvement of 4.3 dB in signal to distortion ratio (SDR) and 4.3% in short time speech intelligibility (STOI) over the EM based source separation algorithm MESSL-GS (model-based expectation-maximization source separation and localization with garbage source) and 4.5 dB in SDR and 8% in STOI over deep learning convolutional neural network (U-Net) based speech separation algorithm SONET under the reverberant conditions ranging from anechoic to those mostly encountered in the real world.

Submitted 26 February, 2021; originally announced February 2021.
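
The two spatial cues named in the abstract above, ILD and IPD, can be illustrated with a minimal sketch. It assumes the left and right ear signals are already available as complex time-frequency spectrograms of equal shape; the function name `interaural_cues` and the `eps` guard are hypothetical and are not taken from the paper, which may compute its cues differently.

```python
import numpy as np

def interaural_cues(left, right, eps=1e-12):
    """Return (ILD in dB, IPD in radians) per time-frequency bin.

    left, right: complex spectrograms of the two ear signals (same shape).
    eps: small constant (an assumption of this sketch) that avoids
         division by zero and log of zero in silent bins.
    """
    left = np.asarray(left, dtype=complex)
    right = np.asarray(right, dtype=complex)
    # Level difference: ratio of magnitudes, expressed in decibels.
    ild_db = 20.0 * np.log10((np.abs(left) + eps) / (np.abs(right) + eps))
    # Phase difference: angle of left times conjugate of right, wrapped to (-pi, pi].
    ipd_rad = np.angle(left * np.conj(right))
    return ild_db, ipd_rad

# Toy usage: the left channel is twice as loud and leads by 0.3 rad.
right_spec = np.array([[1 + 1j, 2 - 1j], [0.5j, -3.0]])
left_spec = 2.0 * right_spec * np.exp(1j * 0.3)
ild, ipd = interaural_cues(left_spec, right_spec)
```

In this toy case every bin has the same ILD and IPD; in a real reverberant recording the cues vary across bins, which is what makes them usable as clustering features.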

arXiv:2002.05093 [pdf, other]

On the Effective Capacity of an Underwater Acoustic Channel under Impersonation Attack

Authors: Waqas Aman, Zeeshan Haider, S. Waqas H. Shah, M. Mahboob Ur Rahman, Octavia A. Dobre

Abstract: This paper investigates the impact of authentication on effective capacity (EC) of an underwater acoustic (UWA) channel. Specifically, the UWA channel is under impersonation attack by a malicious node (Eve) present in the close vicinity of the legitimate node pair (Alice and Bob); Eve tries to inject its malicious data into the system by making Bob believe that she is indeed Alice. To thwart the impersonation attack by Eve, Bob utilizes the distance of the transmit node as the feature/fingerprint to carry out feature-based authentication at the physical layer. Due to authentication at Bob, due to lack of channel knowledge at the transmit node (Alice or Eve), and due to the threshold-based decoding error model, the relevant dynamics of the considered system could be modelled by a Markov chain (MC). Thus, we compute the state-transition probabilities of the MC, and the moment generating function for the service process corresponding to each state. This enables us to derive a closed-form expression of the EC in terms of authentication parameters. Furthermore, we compute the optimal transmission rate (at Alice) through gradient-descent (GD) technique and artificial neural network (ANN) method. Simulation results show that the EC decreases under severe authentication constraints (i.e., more false alarms and more transmissions by Eve). Simulation results also reveal that the (optimal transmission rate) performance of the ANN technique is quite close to that of the GD method.

Submitted 12 February, 2020; originally announced February 2020.

Comments: This paper is accepted for presentation at IEEE International Conference on Communications (ICC) 2020

arXiv:1901.02334 [pdf, other]

On the Impact of Mode Selection on Effective Capacity of Device-to-Device Communication

Authors: Syed Waqas Haider Shah, M. Mahboob ur Rahman, Adnan N. Mian, Ali Imran, Shahid Mumtaz, Octavia A. Dobre

Abstract: Consider a device to device (D2D) link which utilizes the mode selection to decide between the direct mode and cellular mode. This paper investigates the impact of mode selection on effective capacity (EC) (the maximum sustainable constant arrival rate at a transmitter's queue under statistical quality of service constraints) of a D2D link for both overlay and underlay scenarios. Due to lack of channel state information, the transmit device sends data at a fixed rate and fixed power; this fact combined with mode selection makes the D2D channel a Markov service process. Thus, the EC is obtained by calculating the entries of the transition probability matrix corresponding to the Markov D2D channel. Numerical results show that the EC decays exponentially (and the gain of overlay D2D over underlay D2D diminishes) with the increase in estimation error of the pathloss measurements utilized by the mode selection.

Submitted 8 January, 2019; originally announced January 2019.

Comments: 11 pages, 2 figures
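
The abstract above obtains the EC from the transition probability matrix of a Markov service process. A minimal sketch of that computation follows, using the standard spectral-radius form EC(θ) = -(1/θ) ln sp(P Φ(θ)), where Φ(θ) is diagonal with entries exp(-θ r_i). The helper name `effective_capacity`, the two-state ON/OFF example, and all numeric values are assumptions made for illustration; they are not the model or the parameters of the paper.

```python
import numpy as np

def effective_capacity(P, rates, theta):
    """EC (per block) of a Markov-modulated service process.

    P:     row-stochastic state transition matrix.
    rates: amount served per block in each state (same units as the EC).
    theta: QoS exponent, must be > 0.
    """
    P = np.asarray(P, dtype=float)
    # Diagonal matrix of per-state moment generating functions at -theta.
    phi = np.diag(np.exp(-theta * np.asarray(rates, dtype=float)))
    spectral_radius = np.max(np.abs(np.linalg.eigvals(P @ phi)))
    return -np.log(spectral_radius) / theta

# Hypothetical ON/OFF link: fixed rate r is delivered with probability p
# in each block, independently of the previous block.
p, r = 0.8, 2.0
P_onoff = [[1 - p, p], [1 - p, p]]   # state 0 = OFF, state 1 = ON
ec_loose = effective_capacity(P_onoff, [0.0, r], theta=0.1)
ec_strict = effective_capacity(P_onoff, [0.0, r], theta=1.0)
```

With identical rows in `P_onoff` the expression reduces to the closed form -(1/θ) ln(1 - p + p e^{-θ r}), and a larger θ (a stricter QoS constraint) gives a smaller EC, consistent with the trend several abstracts in this listing report.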

arXiv:1812.01385 [pdf, other]

Design of an Efficient Single-Stage and 2-Stages Class-E Power Amplifier (2.4GHz) for Internet-of-Things

Authors: Ayyaz Ali, Syed Waqas Haider Shah, Khalid Iqbal

Abstract: In this work, the designs of a single-stage and 2-stage 2.4 GHz power amplifier (PA) are presented. The proposed PAs have been designed to provide high gain and improved efficiency using harmonic suppression and optimized impedance matching techniques. There are two harmonic suppression circuits; each stage of the PA consists of 2 capacitors and 2 inductors, which help to suppress the harmonic frequency for 2.4 GHz. These suppression circuits help to enhance the overall efficiency of the PAs. Both PAs are provided with a VCC supply of 4.2 V. Input and output impedances are matched to 50 ohms. Simulation and experimental results are presented, where the simulated gain and power added efficiency (PAE) for the single-stage PA are 17.58 dB and 53%, respectively, while the experimental gain and PAE are 16.7 dB and 49.5%, respectively. On the other hand, for the 2-stage PA, the simulated gain is 34.6 dB and the PAE is 55%, while the experimental gain and PAE are 30.5 dB and 53.1%, respectively. The final design is being fabricated on a Taconic printed circuit board (PCB) with a thickness of 0.79 mm and a dielectric constant of 3.2; its dimensions are 4.6 cm x 3.4 cm for the single-stage PA and 5.9 cm x 3.6 cm for the 2-stage PA.

Submitted 4 December, 2018; originally announced December 2018.

Comments: 16th International Conference on Frontiers of Information Technology

arXiv:1807.03644 [pdf]

A Technique for Multi-User MIMO using Spatial Channel Model for out-door environments

Authors: Syed Waqas Haider Shah

Abstract: Any wireless communication system needs to specify a propagation channel model, which acts as the basis for performance evaluation and comparison. Spatial channel models can be divided into deterministic models (i.e., ray tracing), measurement-based models, which are based on channel information, and geometry-based stochastic channel models, which are based on the assumption that the directional structure of the channel can be modelled by the last interaction between physical objects and electromagnetic waves before the waves reach the base or mobile station. The multi-user double directional channel model (MDDCM) is a geometry-based channel model which is used to calculate the double directional channel information in a cellular system with a MIMO mobile station and a MIMO base station. In this research, firstly, a Multi-User Multiple Input Multiple Output (MU-MIMO) spatial channel model has been implemented in MATLAB for the outdoor environments Urban Micro and Urban Macro, to find parameters such as the angle of arrival of the user, the user direction and the distance between the user and the access point (AP). Secondly, a coded Multiple Input Multiple Output-Orthogonal Frequency Division Multiplexing (MIMO-OFDM) system has been implemented using a multipath Rayleigh faded channel and a realistic Spatial Channel Model. Different BER improvement techniques are used, such as the Viterbi decoder and time and frequency interleaving. Multi-channel diversity is also observed by using multiple antennas at the transmitting and receiving ends [(2x2) and (2x4)] on both channels. The effect of different modulation techniques on BER performance is also observed.

Submitted 10 July, 2018; originally announced July 2018.

Comments: masters thesis

Showing 1–9 of 9 results for author: Shah, S W