Showing 1–10 of 10 results for author: Fei, T

Searching in archive cs.
  1. arXiv:2506.08967  [pdf, ps, other]

    cs.SD cs.CL eess.AS

    Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

    Authors: Ailin Huang, Bingxin Li, Bruce Wang, Boyong Wu, Chao Yan, Chengli Feng, Heng Wang, Hongyu Zhou, Hongyuan Wang, Jingbei Li, Jianjian Sun, Joanna Wang, Mingrui Chen, Peng Liu, Ruihang Miao, Shilei Jiang, Tian Fei, Wang You, Xi Chen, Xuerui Yang, Yechang Huang, Yuxiang Zhang, Zheng Ge, Zheng Gong, Zhewei Huang, et al. (51 additional authors not shown)

    Abstract: Large Audio-Language Models (LALMs) have significantly advanced intelligent human-computer interaction, yet their reliance on text-based outputs limits their ability to generate natural speech responses directly, hindering seamless audio interactions. To address this, we introduce Step-Audio-AQAA, a fully end-to-end LALM designed for Audio Query-Audio Answer (AQAA) tasks. The model integrates a du…

    Submitted 13 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

    Comments: 12 pages, 3 figures

  2. arXiv:2506.03388  [pdf, other]

    cs.CV

    Cross-Modal Urban Sensing: Evaluating Sound-Vision Alignment Across Street-Level and Aerial Imagery

    Authors: Pengyu Chen, Xiao Huang, Teng Fei, Sicheng Wang

    Abstract: Environmental soundscapes convey substantial ecological and social information regarding urban environments; however, their potential remains largely untapped in large-scale geographic analysis. In this study, we investigate the extent to which urban sounds correspond with visual scenes by comparing various visual representation strategies in capturing acoustic semantics. We employ a multimodal ap…

    Submitted 3 June, 2025; originally announced June 2025.

  3. arXiv:2505.12734  [pdf, ps, other]

    cs.SD cs.AI cs.GR cs.HC eess.AS

    SounDiT: Geo-Contextual Soundscape-to-Landscape Generation

    Authors: Junbo Wang, Haofeng Tan, Bowen Liao, Albert Jiang, Teng Fei, Qixing Huang, Zhengzhong Tu, Shan Ye, Yuhao Kang

    Abstract: We present a novel and practically significant problem, Geo-Contextual Soundscape-to-Landscape (GeoS2L) generation, which aims to synthesize geographically realistic landscape images from environmental soundscapes. Prior audio-to-image generation methods typically rely on general-purpose datasets and overlook geographic and environmental contexts, resulting in unrealistic images that are misaligned…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: 14 pages, 5 figures

  4. arXiv:2503.23178  [pdf, other]

    cs.CV

    Intelligent Bear Prevention System Based on Computer Vision: An Approach to Reduce Human-Bear Conflicts in the Tibetan Plateau Area, China

    Authors: Pengyu Chen, Teng Fei, Yunyan Du, Jiawei Yi, Yi Li, John A. Kupfer

    Abstract: Conflicts between humans and bears on the Tibetan Plateau present substantial threats to local communities and hinder wildlife preservation initiatives. This research introduces a novel strategy that incorporates computer vision alongside Internet of Things (IoT) technologies to alleviate these issues. Tailored specifically for the harsh environment of the Tibetan Plateau, the approach utilizes th…

    Submitted 29 March, 2025; originally announced March 2025.

  5. arXiv:2411.18997  [pdf, other]

    q-fin.CP cs.AI

    GRU-PFG: Extract Inter-Stock Correlation from Stock Factors with Graph Neural Network

    Authors: Yonggai Zhuang, Haoran Chen, Kequan Wang, Teng Fei

    Abstract: The complexity of stocks and industries presents challenges for stock prediction. Currently, stock prediction models can be divided into two categories. One category, represented by GRU and ALSTM, relies solely on stock factors for prediction, with limited effectiveness. The other category, represented by HIST and TRA, incorporates not only stock factors but also industry information, industry fin…

    Submitted 28 November, 2024; originally announced November 2024.

    Comments: 17 pages

  6. arXiv:2404.01250  [pdf, other]

    q-bio.NC cs.HC

    Perceptogram: Reconstructing Visual Percepts from EEG

    Authors: Teng Fei, Abhinav Uppal, Ian Jackson, Srinivas Ravishankar, David Wang, Virginia R. de Sa

    Abstract: Visual neural decoding from EEG has improved significantly due to diffusion models that can reconstruct high-quality images from decoded latents. While recent works have focused on relatively complex architectures to achieve good reconstruction performance from EEG, less attention has been paid to the source of this information. In this work, we attempt to discover EEG features that represent perc…

    Submitted 24 February, 2025; v1 submitted 1 April, 2024; originally announced April 2024.

  7. arXiv:2203.09611  [pdf, other]

    cs.LG cs.AI cs.DB cs.SI stat.ML

    STICC: A multivariate spatial clustering method for repeated geographic pattern discovery with consideration of spatial contiguity

    Authors: Yuhao Kang, Kunlin Wu, Song Gao, Ignavier Ng, Jinmeng Rao, Shan Ye, Fan Zhang, Teng Fei

    Abstract: Spatial clustering has been widely used for spatial data mining and knowledge discovery. An ideal multivariate spatial clustering should consider both spatial contiguity and aspatial attributes. Existing spatial clustering approaches may face challenges for discovering repeated geographic patterns with spatial contiguity maintained. In this paper, we propose a Spatial Toeplitz Inverse Covariance-B…

    Submitted 30 March, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Journal ref: International Journal of Geographical Information Science, Year 2022

  8. arXiv:2102.00407  [pdf]

    cs.CY

    Emotion and color in paintings: a novel temporal and spatial quantitative perspective

    Authors: Wenyuan Kong, Teng Fei, Thom Jencks

    Abstract: As subjective artistic creations, artistic paintings carry emotion of their creators. Emotions expressed in paintings and emotion aroused in spectators by paintings are two kinds of emotions that scholars have paid attention to. Traditional studies on emotions expressed by paintings are mainly conducted from qualitative perspectives, with neither quantitative output on the emotional values of a pa…

    Submitted 31 January, 2021; originally announced February 2021.

  9. arXiv:2012.12809  [pdf, other]

    cs.CV cs.AI eess.SP

    Warping of Radar Data into Camera Image for Cross-Modal Supervision in Automotive Applications

    Authors: Christopher Grimm, Tai Fei, Ernst Warsitz, Ridha Farhoud, Tobias Breddermann, Reinhold Haeb-Umbach

    Abstract: We present an approach to automatically generate semantic labels for real recordings of automotive range-Doppler (RD) radar spectra. Such labels are required when training a neural network for object recognition from radar data. The automatic labeling approach rests on the simultaneous recording of camera and lidar data in addition to the radar spectrum. By warping radar spectra into the camera im…

    Submitted 20 June, 2022; v1 submitted 23 December, 2020; originally announced December 2020.

  10. Extracting human emotions at different places based on facial expressions and spatial clustering analysis

    Authors: Yuhao Kang, Qingyuan Jia, Song Gao, Xiaohuan Zeng, Yueyao Wang, Stephan Angsuesser, Yu Liu, Xinyue Ye, Teng Fei

    Abstract: The emergence of big data enables us to evaluate the various human emotions at places from a statistic perspective by applying affective computing. In this study, a novel framework for extracting human emotions from large-scale georeferenced photos at different places is proposed. After the construction of places based on spatial clustering of user generated footprints collected in social media we…

    Submitted 6 May, 2019; originally announced May 2019.

    Comments: 40 pages; 9 figures

    ACM Class: I.2; I.4.9; I.5.3

    Journal ref: Transactions in GIS, Year 2019, Volume 23, Issue 3