Showing 1–9 of 9 results for author: Lan, Z

Searching in archive eess.
  1. arXiv:2411.06307  [pdf, other]

    cs.SD eess.AS

    Acoustic Volume Rendering for Neural Impulse Response Fields

    Authors: Zitong Lan, Chenhao Zheng, Zhiwei Zheng, Mingmin Zhao

    Abstract: Realistic audio synthesis that captures accurate acoustic phenomena is essential for creating immersive experiences in virtual and augmented reality. Synthesizing the sound received at any position relies on the estimation of impulse response (IR), which characterizes how sound propagates in one scene along different paths before arriving at the listener's position. In this paper, we present Acous…

    Submitted 9 November, 2024; originally announced November 2024.

    Comments: NeurIPS 2024 Spotlight

  2. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li, et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) model is required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor…

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  3. arXiv:2210.16849  [pdf, other]

    cs.SD eess.AS

    TT-Net: Dual-path transformer based sound field translation in the spherical harmonic domain

    Authors: Yiwen Wang, Zijian Lan, Xihong Wu, Tianshu Qu

    Abstract: In the current method for the sound field translation tasks based on spherical harmonic (SH) analysis, the solution based on the additive theorem usually faces the problem of singular values caused by large matrix condition numbers. The influence of different distances and frequencies of the spherical radial function on the stability of the translation matrix will affect the accuracy of the SH coe…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP 2023

  4. arXiv:2210.02166  [pdf, other]

    eess.SY

    Robust Bayesian Inference for Moving Horizon Estimation

    Authors: Wenhan Cao, Chang Liu, Zhiqian Lan, Shengbo Eben Li, Wei Pan, Angelo Alessandri

    Abstract: The accuracy of moving horizon estimation (MHE) suffers significantly in the presence of measurement outliers. Existing methods address this issue by treating measurements leading to large MHE cost function values as outliers, which are subsequently discarded. This strategy, achieved through solving combinatorial optimization problems, is confined to linear systems to guarantee computational tract…

    Submitted 2 October, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: 17 pages

  5. arXiv:2112.10894  [pdf]

    cs.NE cs.LG eess.SP

    Subject-Independent Drowsiness Recognition from Single-Channel EEG with an Interpretable CNN-LSTM model

    Authors: Jian Cui, Zirui Lan, Tianhu Zheng, Yisi Liu, Olga Sourina, Lipo Wang, Wolfgang Müller-Wittig

    Abstract: For EEG-based drowsiness recognition, it is desirable to use subject-independent recognition since conducting calibration on each subject is time-consuming. In this paper, we propose a novel Convolutional Neural Network (CNN)-Long Short-Term Memory (LSTM) model for subject-independent drowsiness recognition from single-channel EEG signals. Different from existing deep learning models that are most…

    Submitted 21 November, 2021; originally announced December 2021.

    Journal ref: 2021 International Conference on Cyberworlds (CW), 2021, pp. 201-208

  6. arXiv:2107.09507  [pdf]

    eess.SP cs.LG cs.NE q-bio.NC

    EEG-based Cross-Subject Driver Drowsiness Recognition with an Interpretable Convolutional Neural Network

    Authors: Jian Cui, Zirui Lan, Olga Sourina, Wolfgang Müller-Wittig

    Abstract: In the context of electroencephalogram (EEG)-based driver drowsiness recognition, it is still challenging to design a calibration-free system, since EEG signals vary significantly among different subjects and recording sessions. Many efforts have been made to use deep learning methods for mental state recognition from EEG signals. However, existing work mostly treats deep learning models as black-…

    Submitted 17 February, 2022; v1 submitted 30 May, 2021; originally announced July 2021.

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2022

  7. arXiv:2106.00613  [pdf]

    eess.SP cs.HC cs.LG cs.NE

    A Compact and Interpretable Convolutional Neural Network for Cross-Subject Driver Drowsiness Detection from Single-Channel EEG

    Authors: Jian Cui, Zirui Lan, Yisi Liu, Ruilin Li, Fan Li, Olga Sourina, Wolfgang Mueller-Wittig

    Abstract: Driver drowsiness is one of main factors leading to road fatalities and hazards in the transportation industry. Electroencephalography (EEG) has been considered as one of the best physiological signals to detect drivers drowsy states, since it directly measures neurophysiological activities in the brain. However, designing a calibration-free system for driver drowsiness detection with EEG is still…

    Submitted 30 May, 2021; originally announced June 2021.

  8. arXiv:2101.08074  [pdf, other]

    eess.SY cs.AI cs.LG cs.MA

    Flocking and Collision Avoidance for a Dynamic Squad of Fixed-Wing UAVs Using Deep Reinforcement Learning

    Authors: Chao Yan, Xiaojia Xiang, Chang Wang, Zhen Lan

    Abstract: Developing the flocking behavior for a dynamic squad of fixed-wing UAVs is still a challenge due to kinematic complexity and environmental uncertainty. In this paper, we deal with the decentralized flocking and collision avoidance problem through deep reinforcement learning (DRL). Specifically, we formulate a decentralized DRL-based decision making framework from the perspective of every follower,…

    Submitted 22 July, 2021; v1 submitted 20 January, 2021; originally announced January 2021.

    Comments: Accepted for publication in the proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)

  9. arXiv:2003.02436  [pdf, other]

    cs.LG cs.NE cs.SD eess.AS stat.ML

    Talking-Heads Attention

    Authors: Noam Shazeer, Zhenzhong Lan, Youlong Cheng, Nan Ding, Le Hou

    Abstract: We introduce "talking-heads attention" - a variation on multi-head attention which includes linear projections across the attention-heads dimension, immediately before and after the softmax operation. While inserting only a small number of additional parameters and a moderate amount of additional computation, talking-heads attention leads to better perplexities on masked language modeling tasks, as we…

    Submitted 5 March, 2020; originally announced March 2020.