
Showing 1–4 of 4 results for author: Ishikawa, Y

Searching in archive eess.
  1. arXiv:2410.00511  [pdf, other]

    eess.AS cs.AI cs.CV

    Pre-training with Synthetic Patterns for Audio

    Authors: Yuchi Ishikawa, Tatsuya Komatsu, Yoshimitsu Aoki

    Abstract: In this paper, we propose to pre-train audio encoders using synthetic patterns instead of real audio data. Our proposed framework consists of two key elements. The first one is Masked Autoencoder (MAE), a self-supervised learning framework that learns from reconstructing data from randomly masked counterparts. MAEs tend to focus on low-level information such as visual patterns and regularities wit…

    Submitted 1 October, 2024; originally announced October 2024.

    Comments: Submitted to ICASSP'25

  2. arXiv:2403.12477  [pdf, other]

    cs.SD eess.AS

    Real-time Speech Extraction Using Spatially Regularized Independent Low-rank Matrix Analysis and Rank-constrained Spatial Covariance Matrix Estimation

    Authors: Yuto Ishikawa, Kohei Konaka, Tomohiko Nakamura, Norihiro Takamune, Hiroshi Saruwatari

    Abstract: Real-time speech extraction is an important challenge with various applications such as speech recognition in a human-like avatar/robot. In this paper, we propose the real-time extension of a speech extraction method based on independent low-rank matrix analysis (ILRMA) and rank-constrained spatial covariance matrix estimation (RCSCME). The RCSCME-based method is a multichannel blind speech extrac…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 5 pages, 3 figures, accepted at HSCMA 2024

  3. arXiv:2311.09646  [pdf, other]

    cs.CV cs.GR eess.IV

    Reconstructing Continuous Light Field From Single Coded Image

    Authors: Yuya Ishikawa, Keita Takahashi, Chihiro Tsutake, Toshiaki Fujii

    Abstract: We propose a method for reconstructing a continuous light field of a target scene from a single observed image. Our method takes the best of two worlds: joint aperture-exposure coding for compressive light-field acquisition, and a neural radiance field (NeRF) for view synthesis. Joint aperture-exposure coding implemented in a camera enables effective embedding of 3-D scene information into an obse…

    Submitted 16 November, 2023; originally announced November 2023.

    Journal ref: IEEE Access, Volume 11, Pages 99387-99396, 2023

  4. arXiv:1904.06851  [pdf]

    cs.SD cs.MM eess.AS

    Proximal binaural sound can induce subjective frisson

    Authors: Shiori Honda, Yuri Ishikawa, Rei Konno, Eiko Imai, Natsumi Nomiyama, Kazuki Sakurada, Takuya Koumura, Hirohito M. Kondo, Shigeto Furukawa, Shinya Fujii, Masashi Nakatani

    Abstract: Auditory frisson is the experience of feeling of cold or shivering related to sound in the absence of a physical cold stimulus. Multiple examples of frisson-inducing sounds have been reported, but the mechanism of auditory frisson remains elusive. Typical frisson-inducing sounds may contain a looming effect, in which a sound appears to approach the listener's peripersonal space. Previous studies o…

    Submitted 8 April, 2020; v1 submitted 15 April, 2019; originally announced April 2019.

    Comments: 21 pages, 3 figures, 3 tables, 3 supplemental figures, 3 supplemental tables

    Journal ref: Front Psychol. 2020 Mar 3;11:316