Showing 1–23 of 23 results for author: Chung, W

  1. arXiv:2504.14915  [pdf, other]

    eess.AS cs.AI

    StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models

    Authors: Yeona Hong, Hyewon Han, Woo-jin Chung, Hong-Goo Kang

    Abstract: In this paper, we propose StableQuant, a novel adaptive post-training quantization (PTQ) algorithm for widely used speech foundation models (SFMs). While PTQ has been successfully employed for compressing large language models (LLMs) due to its ability to bypass additional fine-tuning, directly applying these techniques to SFMs may not yield optimal results, as SFMs utilize distinct network archit…

    Submitted 21 April, 2025; originally announced April 2025.

    Comments: Accepted at ICASSP 2025

  2. arXiv:2502.10822  [pdf, other]

    eess.AS cs.AI cs.SD

    NeuroAMP: A Novel End-to-end General Purpose Deep Neural Amplifier for Personalized Hearing Aids

    Authors: Shafique Ahmed, Ryandhimas E. Zezario, Hui-Guan Yuan, Amir Hussain, Hsin-Min Wang, Wei-Ho Chung, Yu Tsao

    Abstract: The prevalence of hearing aids is increasing. However, optimizing the amplification processes of hearing aids remains challenging due to the complexity of integrating multiple modular components in traditional methods. To address this challenge, we present NeuroAMP, a novel deep neural network designed for end-to-end, personalized amplification in hearing aids. NeuroAMP leverages both spectral fea…

    Submitted 15 February, 2025; originally announced February 2025.

  3. arXiv:2412.15299  [pdf, other]

    cs.CL cs.SD eess.AS

    LAMA-UT: Language Agnostic Multilingual ASR through Orthography Unification and Language-Specific Transliteration

    Authors: Sangmin Lee, Woo-Jin Chung, Hong-Goo Kang

    Abstract: Building a universal multilingual automatic speech recognition (ASR) model that performs equitably across languages has long been a challenge due to its inherent difficulties. To address this task, we introduce a Language-Agnostic Multilingual ASR pipeline through orthography Unification and language-specific Transliteration (LAMA-UT). LAMA-UT operates without any language-specific modules while ma…

    Submitted 22 December, 2024; v1 submitted 19 December, 2024; originally announced December 2024.

  4. arXiv:2411.09838  [pdf, other]

    eess.IV cs.CV

    OneNet: A Channel-Wise 1D Convolutional U-Net

    Authors: Sanghyun Byun, Kayvan Shah, Ayushi Gang, Christopher Apton, Jacob Song, Woo Seong Chung

    Abstract: Many state-of-the-art computer vision architectures leverage U-Net for its adaptability and efficient feature extraction. However, the multi-resolution convolutional design often leads to significant computational demands, limiting deployment on edge devices. We present a streamlined alternative: a 1D convolutional encoder that retains accuracy while enhancing its suitability for edge applications…

    Submitted 14 November, 2024; originally announced November 2024.

  5. arXiv:2410.21276  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY cs.LG cs.SD eess.AS

    GPT-4o System Card

    Authors: OpenAI, Aaron Hurst, Adam Lerer, Adam P. Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, Aleksander Mądry, Alex Baker-Whitcomb, Alex Beutel, Alex Borzunov, Alex Carney, Alex Chow, Alex Kirillov, Alex Nichol, Alex Paino, Alex Renzin, Alex Tachard Passos, Alexander Kirillov, Alexi Christakis, et al. (395 additional authors not shown)

    Abstract: GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 mil…

    Submitted 25 October, 2024; originally announced October 2024.

  6. arXiv:2407.08991  [pdf]

    eess.AS cs.AI cs.CC

    Optimization of DNN-based speaker verification model through efficient quantization technique

    Authors: Yeona Hong, Woo-Jin Chung, Hong-Goo Kang

    Abstract: As Deep Neural Networks (DNNs) rapidly advance in various fields, including speech verification, they typically involve high computational costs and substantial memory consumption, which can be challenging to manage on mobile systems. Quantization of deep models offers a means to reduce both computational and memory expenses. Our research proposes an optimization framework for the quantization of…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: in Korean language, Accepted at Society of Electronic Engineers of Korea Conference 2024

  7. arXiv:2406.17329  [pdf, other]

    eess.SP cs.SD eess.AS physics.bio-ph

    Speaker-Independent Acoustic-to-Articulatory Inversion through Multi-Channel Attention Discriminator

    Authors: Woo-Jin Chung, Hong-Goo Kang

    Abstract: We present a novel speaker-independent acoustic-to-articulatory inversion (AAI) model, overcoming the limitations observed in conventional AAI models that rely on acoustic features derived from restricted datasets. To address these challenges, we leverage representations from a pre-trained self-supervised learning (SSL) model to more effectively estimate the global, local, and kinematic pattern in…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024

  8. arXiv:2404.08212   

    eess.SP

    Mental Stress Detection: Development and Evaluation of a Wearable In-Ear Plethysmography

    Authors: Hika Barki, Wan-Young Chung

    Abstract: Mental stress is a prevalent condition that can have negative impacts on one's health. Early detection and treatment are crucial for preventing related illnesses and maintaining overall wellness. This study presents a new method for identifying mental stress using a wearable biosensor worn in the ear. Data was gathered from 14 participants in a controlled environment using stress-inducing tasks su…

    Submitted 13 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: The paper is being withdrawn because we have identified substantial issues with the data analysis process. To ensure the integrity and accuracy of our findings, we are re-evaluating the data and will resubmit the paper after thorough revisions

  9. arXiv:2403.19180  [pdf]

    eess.SP cs.ET

    A Robust UWOC-assisted Multi-hop Topology for Underwater Sensor Network Nodes

    Authors: Maaz Salman, Javad Bolboli, Wan-Young Chung

    Abstract: The underwater environment is substantially less explored than the Earth's surface due to the lack of robust underwater communication infrastructure. For Internet of Underwater Things connectivity, underwater wireless optical communication can play a vital role, compared to conventional radio frequency communication, due to longer range, high data rate, low latency, and unregulated bandwidth…

    Submitted 31 March, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  10. arXiv:2311.10327  [pdf, other]

    cs.RO eess.SY

    Dimensionality Reduction of Dynamics on Lie Manifolds via Structure-Aware Canonical Correlation Analysis

    Authors: Wooyoung Chung, Daniel Polani, Stas Tiomkin

    Abstract: Incorporating prior knowledge into a data-driven modeling problem can drastically improve performance, reliability, and generalization outside of the training sample. The stronger the structural properties, the more effective these improvements become. Manifolds are a powerful nonlinear generalization of Euclidean space for modeling finite dimensions. Structural impositions in constrained systems…

    Submitted 16 November, 2023; originally announced November 2023.

  11. arXiv:2311.00364  [pdf, other]

    eess.AS cs.SD physics.bio-ph

    C2C: Cough to COVID-19 Detection in BHI 2023 Data Challenge

    Authors: Woo-Jin Chung, Miseul Kim, Hong-Goo Kang

    Abstract: This report describes our submission to the BHI 2023 Data Competition: Sensor challenge. Our Audio Alchemists team designed an acoustic-based COVID-19 diagnosis system, Cough to COVID-19 (C2C), and won 1st place in the challenge. C2C involves three key contributions: pre-processing of input signals, cough-related representation extraction leveraging Wav2vec2.0, and data augmentation. Through exper…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: 1st place winning paper from the BHI 2023 Data Challenge Competition: Sensor Informatics

  12. arXiv:2306.14517  [pdf, other]

    cs.CL cs.SD eess.AS

    Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition

    Authors: Samuel Cahyawijaya, Holy Lovenia, Willy Chung, Rita Frieske, Zihan Liu, Pascale Fung

    Abstract: Speech emotion recognition plays a crucial role in human-computer interactions. However, most speech emotion recognition research is biased toward English-speaking adults, which hinders its applicability to other demographic groups in different languages and age groups. In this work, we analyze the transferability of emotion recognition across three different languages--English, Mandarin Chinese,…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: Accepted in INTERSPEECH 2023

  13. arXiv:2306.09640  [pdf, other]

    eess.AS

    MF-PAM: Accurate Pitch Estimation through Periodicity Analysis and Multi-level Feature Fusion

    Authors: Woo-Jin Chung, Doyeon Kim, Soo-Whan Chung, Hong-Goo Kang

    Abstract: We introduce Multi-level feature Fusion-based Periodicity Analysis Model (MF-PAM), a novel deep learning-based pitch estimation model that accurately estimates pitch trajectory in noisy and reverberant acoustic environments. Our model leverages the periodic characteristics of audio signals and involves two key steps: extracting pitch periodicity using periodic non-periodic convolution (PNP-Conv) b…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: accepted at INTERSPEECH 2023

  14. An empirical study on speech restoration guided by self supervised speech representation

    Authors: Jaeuk Byun, Youna Ji, Soo Whan Chung, Soyeon Choe, Min Seok Choi

    Abstract: Enhancing speech quality is an indispensable yet difficult task as it is often complicated by a range of degradation factors. In addition to additive noise, reverberation, clipping, and speech attenuation can all adversely affect speech quality. Speech restoration aims to recover speech components from these distortions. This paper focuses on exploring the impact of self-supervised speech represen…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: To be presented at ICASSP 2023

  15. arXiv:2304.06475  [pdf, ps, other]

    eess.SP

    WiRiS: Transformer for RIS-Assisted Device-Free Sensing for Joint People Counting and Localization using Wi-Fi CSI

    Authors: Wei-Yu Chung, Li-Hsiang Shen, Kai-Ten Feng, Yuan-Chun Lin, Shih-Cheng Lin, Sheng-Fuh Chang

    Abstract: Channel State Information (CSI) is widely adopted as a feature for indoor localization. Taking advantage of the abundant information from the CSI, people can be accurately sensed even without equipped devices. However, the positioning error increases severely in non-line-of-sight (NLoS) regions. Reconfigurable intelligent surface (RIS) has been introduced to improve signal coverage in NLoS areas,…

    Submitted 9 November, 2023; v1 submitted 25 March, 2023; originally announced April 2023.

  16. arXiv:2202.01946  [pdf, ps, other]

    eess.SP cs.IT cs.LG

    Unsupervised Learning Based Hybrid Beamforming with Low-Resolution Phase Shifters for MU-MIMO Systems

    Authors: Chia-Ho Kuo, Hsin-Yuan Chang, Ronald Y. Chang, Wei-Ho Chung

    Abstract: Millimeter wave (mmWave) is a key technology for fifth-generation (5G) and beyond communications. Hybrid beamforming has been proposed for large-scale antenna systems in mmWave communications. Existing hybrid beamforming designs based on infinite-resolution phase shifters (PSs) are impractical due to hardware cost and power consumption. In this paper, we propose an unsupervised-learning-based sche…

    Submitted 3 February, 2022; originally announced February 2022.

    Comments: IEEE International Conference on Communications (ICC) 2022

  17. arXiv:2107.00418  [pdf, other]

    eess.IV cs.CV

    Supervised Segmentation with Domain Adaptation for Small Sampled Orbital CT Images

    Authors: Sungho Suh, Sojeong Cheon, Wonseo Choi, Yeon Woong Chung, Won-Kyung Cho, Ji-Sun Paik, Sung Eun Kim, Dong-Jin Chang, Yong Oh Lee

    Abstract: Deep neural networks (DNNs) have been widely used for medical image analysis. However, the lack of access to a large-scale annotated dataset poses a great challenge, especially in the case of rare diseases or new domains for the research community. Transferring pre-trained features from a relatively large dataset is a viable solution. In this paper, we have explored supervised segmentation…

    Submitted 1 July, 2021; originally announced July 2021.

  18. Deep Metric Learning-based Image Retrieval System for Chest Radiograph and its Clinical Applications in COVID-19

    Authors: Aoxiao Zhong, Xiang Li, Dufan Wu, Hui Ren, Kyungsang Kim, Younggon Kim, Varun Buch, Nir Neumark, Bernardo Bizzo, Won Young Tak, Soo Young Park, Yu Rim Lee, Min Kyu Kang, Jung Gil Park, Byung Seok Kim, Woo Jin Chung, Ning Guo, Ittai Dayan, Mannudeep K. Kalra, Quanzheng Li

    Abstract: In recent years, deep learning-based image analysis methods have been widely applied in computer-aided detection, diagnosis and prognosis, and have shown their value during the public health crisis of the novel coronavirus disease 2019 (COVID-19) pandemic. Chest radiograph (CXR) has been playing a crucial role in COVID-19 patient triaging, diagnosing and monitoring, particularly in the United States.…

    Submitted 25 November, 2020; originally announced December 2020.

    Comments: Aoxiao Zhong and Xiang Li contribute equally to this work

    Journal ref: Medical Image Analysis. 70 (2021) 101993

  19. arXiv:2009.12610  [pdf]

    eess.IV cs.CV cs.LG

    Deep Learning-based Four-region Lung Segmentation in Chest Radiography for COVID-19 Diagnosis

    Authors: Young-Gon Kim, Kyungsang Kim, Dufan Wu, Hui Ren, Won Young Tak, Soo Young Park, Yu Rim Lee, Min Kyu Kang, Jung Gil Park, Byung Seok Kim, Woo Jin Chung, Mannudeep K. Kalra, Quanzheng Li

    Abstract: Purpose. Imaging plays an important role in assessing the severity of COVID-19 pneumonia. However, semantic interpretation of chest radiography (CXR) findings does not include a quantitative description of radiographic opacities. Most current AI-assisted CXR image analysis frameworks do not quantify regional variations of disease. To address these issues, we proposed a four-region lung segmentation method t…

    Submitted 26 September, 2020; originally announced September 2020.

  20. arXiv:1904.08833  [pdf, other]

    cs.RO eess.SY

    A Passivity-based Nonlinear Admittance Control with Application to Powered Upper-limb Control under Unknown Environmental Interactions

    Authors: Min Jun Kim, Woongyong Lee, Jae Yeon Choi, Goobong Chung, Kyung-Lyong Han, Il Seop Choi, Christian Ott, Wan Kyun Chung

    Abstract: This paper presents an admittance controller based on passivity theory for a powered upper-limb exoskeleton robot governed by a nonlinear equation of motion. Passivity allows us to include a human operator and environmental interaction in the control loop. The robot interacts with the human operator via an F/T sensor and interacts with the environment mainly via end-effectors. Although…

    Submitted 18 April, 2019; originally announced April 2019.

    Comments: Accepted in IEEE/ASME Transactions on Mechatronics (T-MECH)

  21. arXiv:1807.09609  [pdf, other]

    math.OC eess.SP

    Local Cyber-Physical Attack for Masking Line Outage and Topology Attack in Smart Grid

    Authors: Hwei-Ming Chung, Wen-Tai Li, Chau Yuen, Wei-Ho Chung, Yan Zhang, Chao-Kai Wen

    Abstract: Malicious attacks in the power system can eventually result in a large-scale cascade failure if not attended to in time. These attacks, which are traditionally classified into physical and cyber attacks, can be avoided by using the latest and advanced detection mechanisms. However, a new threat called cyber-physical attacks, which jointly target both the physical and cyber layers…

    Submitted 24 July, 2018; originally announced July 2018.

    Comments: accepted by IEEE Transactions on Smart Grid. arXiv admin note: text overlap with arXiv:1708.03201

  22. arXiv:1712.00157  [pdf, other]

    cs.IT eess.SP

    Fundamental Limits on Data Acquisition: Trade-offs between Sample Complexity and Query Difficulty

    Authors: Hye Won Chung, Ji Oon Lee, Alfred O. Hero

    Abstract: We consider query-based data acquisition and the corresponding information recovery problem, where the goal is to recover $k$ binary variables (information bits) from parity measurements of those variables. The queries and the corresponding parity measurements are designed using the encoding rule of Fountain codes. By using Fountain codes, we can design a potentially limitless number of queries, and…

    Submitted 2 January, 2018; v1 submitted 30 November, 2017; originally announced December 2017.

  23. Local Cyber-physical Attack with Leveraging Detection in Smart Grid

    Authors: Hwei-Ming Chung, Wen-Tai Li, Chau Yuen, Wei-Ho Chung, Chao-Kai Wen

    Abstract: A well-designed attack in the power system can cause an initial failure and then result in a large-scale cascade failure. Several works have discussed power system attacks through false data injection, line-maintaining attacks, and line-removing attacks. However, the existing methods need to continuously attack the system for a long time, and, unfortunately, the performance cannot be guaranteed if the…

    Submitted 10 August, 2017; originally announced August 2017.

    Comments: Accepted by IEEE SmartGridComm 2017

    Journal ref: 2017 IEEE International Conference on Smart Grid Communications (SmartGridComm)