Showing 1–32 of 32 results for author: Jung, C

Searching in archive eess.
  1. arXiv:2506.03020  [pdf, ps, other]

    eess.AS

    InfiniteAudio: Infinite-Length Audio Generation with Consistency

    Authors: Chaeyoung Jung, Hojoon Ki, Ji-Hoon Kim, Junmo Kim, Joon Son Chung

    Abstract: This paper presents InfiniteAudio, a simple yet effective strategy for generating infinite-length audio using diffusion-based text-to-audio methods. Current approaches face memory constraints because the output size increases with input length, making long duration generation challenging. A common workaround is to concatenate short audio segments, but this often leads to inconsistencies due to the…

    Submitted 3 June, 2025; originally announced June 2025.

  2. arXiv:2506.00452  [pdf, other]

    eess.SP cs.AI stat.ML

    Attention-Aided MMSE for OFDM Channel Estimation: Learning Linear Filters with Attention

    Authors: TaeJun Ha, Chaehyun Jung, Hyeonuk Kim, Jeongwoo Park, Jeonghun Park

    Abstract: In orthogonal frequency division multiplexing (OFDM), accurate channel estimation is crucial. Classical signal processing based approaches, such as minimum mean-squared error (MMSE) estimation, often require second-order statistics that are difficult to obtain in practice. Recent deep neural networks based methods have been introduced to address this; yet they often suffer from high complexity. Th…

    Submitted 31 May, 2025; originally announced June 2025.

    Comments: 13 pages, 12 figures

  3. arXiv:2505.16798  [pdf, ps, other]

    eess.AS cs.AI

    SEED: Speaker Embedding Enhancement Diffusion Model

    Authors: KiHyun Nam, Jungwoo Heo, Jee-weon Jung, Gangin Park, Chaeyoung Jung, Ha-Jin Yu, Joon Son Chung

    Abstract: A primary challenge when deploying speaker recognition systems in real-world applications is performance degradation caused by environmental mismatch. We propose a diffusion-based method that takes speaker embeddings extracted from a pre-trained speaker recognition model and generates refined embeddings. For training, our approach progressively adds Gaussian noise to both clean and noisy speaker e…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: Accepted to Interspeech 2025. The official code can be found at https://github.com/kaistmm/seed-pytorch

  4. An Addendum to NeBula: Towards Extending TEAM CoSTAR's Solution to Larger Scale Environments

    Authors: Ali Agha, Kyohei Otsu, Benjamin Morrell, David D. Fan, Sung-Kyun Kim, Muhammad Fadhil Ginting, Xianmei Lei, Jeffrey Edlund, Seyed Fakoorian, Amanda Bouman, Fernando Chavez, Taeyeon Kim, Gustavo J. Correa, Maira Saboia, Angel Santamaria-Navarro, Brett Lopez, Boseong Kim, Chanyoung Jung, Mamoru Sobue, Oriana Claudia Peltzer, Joshua Ott, Robert Trybula, Thomas Touma, Marcel Kaufmann, Tiago Stegun Vaquero, et al. (64 additional authors not shown)

    Abstract: This paper presents an appendix to the original NeBula autonomy solution developed by the TEAM CoSTAR (Collaborative SubTerranean Autonomous Robots), participating in the DARPA Subterranean Challenge. Specifically, this paper presents extensions to NeBula's hardware, software, and algorithmic components that focus on increasing the range and scale of the exploration environment. From the algorithm…

    Submitted 18 April, 2025; originally announced April 2025.

    Journal ref: IEEE Transactions on Field Robotics, vol. 1, pp. 476-526, 2024

  5. arXiv:2503.16956  [pdf, other]

    eess.AS cs.AI cs.CV cs.SD

    From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech

    Authors: Ji-Hoon Kim, Jeongsoo Choi, Jaehun Kim, Chaeyoung Jung, Joon Son Chung

    Abstract: The objective of this study is to generate high-quality speech from silent talking face videos, a task also known as video-to-speech synthesis. A significant challenge in video-to-speech synthesis lies in the substantial modality gap between silent video and multi-faceted speech. In this paper, we propose a novel video-to-speech system that effectively bridges this modality gap, significantly enha…

    Submitted 21 March, 2025; originally announced March 2025.

    Comments: CVPR 2025, demo page: https://mm.kaist.ac.kr/projects/faces2voices/

  6. arXiv:2502.02689  [pdf, other]

    eess.SY

    Multidimensional Swarm Flight Approach For Chasing Unauthorized UAVs Leveraging Asynchronous Deep Learning

    Authors: Tae-Won Ban, Kyu-Min Kang, Bang Chul Jung

    Abstract: This paper introduces a novel unmanned aerial vehicles (UAV) chasing system designed to track and chase unauthorized UAVs, significantly enhancing their neutralization effectiveness.

    Submitted 4 February, 2025; originally announced February 2025.

  7. arXiv:2501.18376  [pdf, other]

    cs.CV eess.IV stat.AP

    Cracks in concrete

    Authors: Tin Barisin, Christian Jung, Anna Nowacka, Claudia Redenbach, Katja Schladitz

    Abstract: Finding and properly segmenting cracks in images of concrete is a challenging task. Cracks are thin and rough and being air filled do yield a very weak contrast in 3D images obtained by computed tomography. Enhancing and segmenting dark lower-dimensional structures is already demanding. The heterogeneous concrete matrix and the size of the images further increase the complexity. ML methods have pr…

    Submitted 30 January, 2025; originally announced January 2025.

    Comments: This is a preprint of the chapter: T. Barisin, C. Jung, A. Nowacka, C. Redenbach, K. Schladitz: Cracks in concrete, published in Statistical Machine Learning for Engineering with Applications (LNCS), edited by J. Franke, A. Schöbel, reproduced with permission of Springer Nature Switzerland AG 2024. The final authenticated version is available online at: https://doi.org/10.1007/978-3-031-66253-9

    MSC Class: 60D05

    Journal ref: Statistical Machine Learning for Engineering with Applications (Lecture Notes in Statistics), edited by Jürgen Franke, Anita Schöbel, 2024, Springer Cham

  8. arXiv:2412.19259  [pdf, other]

    eess.AS cs.SD

    VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis

    Authors: Jaemin Jung, Junseok Ahn, Chaeyoung Jung, Tan Dat Nguyen, Youngjoon Jang, Joon Son Chung

    Abstract: We present VoiceDiT, a multi-modal generative model for producing environment-aware speech and audio from text and visual prompts. While aligning speech with text is crucial for intelligible speech, achieving this alignment in noisy conditions remains a significant and underexplored challenge in the field. To address this, we present a novel audio generation pipeline named VoiceDiT. This pipeline…

    Submitted 26 December, 2024; originally announced December 2024.

    Comments: Accepted to ICASSP 2025

  9. arXiv:2411.15490  [pdf, other]

    cs.CV cs.LG eess.IV

    Improving Factuality of 3D Brain MRI Report Generation with Paired Image-domain Retrieval and Text-domain Augmentation

    Authors: Junhyeok Lee, Yujin Oh, Dahyoun Lee, Hyon Keun Joh, Chul-Ho Sohn, Sung Hyun Baik, Cheol Kyu Jung, Jung Hyun Park, Kyu Sung Choi, Byung-Hoon Kim, Jong Chul Ye

    Abstract: Acute ischemic stroke (AIS) requires time-critical management, with hours of delayed intervention leading to an irreversible disability of the patient. Since diffusion weighted imaging (DWI) using the magnetic resonance image (MRI) plays a crucial role in the detection of AIS, automated prediction of AIS from DWI has been a research topic of clinical importance. While text radiology reports contai…

    Submitted 23 November, 2024; originally announced November 2024.

  10. arXiv:2410.01270  [pdf, other]

    cs.CV eess.SY

    Panopticus: Omnidirectional 3D Object Detection on Resource-constrained Edge Devices

    Authors: Jeho Lee, Chanyoung Jung, Jiwon Kim, Hojung Cha

    Abstract: 3D object detection with omnidirectional views enables safety-critical applications such as mobile robot navigation. Such applications increasingly operate on resource-constrained edge devices, facilitating reliable processing without privacy concerns or network delays. To enable cost-effective deployment, cameras have been widely adopted as a low-cost alternative to LiDAR sensors. However, the co…

    Submitted 2 October, 2024; originally announced October 2024.

    Comments: Published at MobiCom 2024

  11. arXiv:2407.00289  [pdf, other]

    eess.SY cs.IT

    Personalised Outfit Recommendation via History-aware Transformers

    Authors: Myong Chol Jung, Julien Monteil, Philip Schulz, Volodymyr Vaskovych

    Abstract: We present the history-aware transformer (HAT), a transformer-based model that uses shoppers' purchase history to personalise outfit predictions. The aim of this work is to recommend outfits that are internally coherent while matching an individual shopper's style and taste. To achieve this, we stack two transformer models, one that produces outfit representations and another one that processes th…

    Submitted 26 September, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

  12. arXiv:2406.09286  [pdf, other]

    eess.AS cs.SD

    FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching

    Authors: Chaeyoung Jung, Suyeon Lee, Ji-Hoon Kim, Joon Son Chung

    Abstract: This work proposes an efficient method to enhance the quality of corrupted speech signals by leveraging both acoustic and visual cues. While existing diffusion-based approaches have demonstrated remarkable quality, their applicability is limited by slow inference speeds and computational complexity. To address this issue, we present FlowAVSE which enhances the inference speed and reduces the numbe…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: INTERSPEECH 2024

  13. Enhancing Battlefield Awareness: An Aerial RIS-assisted ISAC System with Deep Reinforcement Learning

    Authors: Hyunsang Cho, Seonghoon Yoo, Bang Chul Jung, Joonhyuk Kang

    Abstract: This paper considers a joint communication and sensing technique for enhancing situational awareness in practical battlefield scenarios. In particular, we propose an aerial reconfigurable intelligent surface (ARIS)-assisted integrated sensing and communication (ISAC) system consisting of a single access point (AP), an ARIS, multiple users, and a sensing target. With deep reinforcement learning (DR…

    Submitted 30 May, 2024; originally announced May 2024.

  14. arXiv:2405.01264  [pdf, other]

    eess.SY

    Model Predictive Guidance for Fuel-Optimal Landing of Reusable Launch Vehicles

    Authors: Ki-Wook Jung, Sang-Don Lee, Cheol-Goo Jung, Chang-Hun Lee

    Abstract: This paper introduces a landing guidance strategy for reusable launch vehicles (RLVs) using a model predictive approach based on sequential convex programming (SCP). The proposed approach devises two distinct optimal control problems (OCPs): planning a fuel-optimal landing trajectory that accommodates practical path constraints specific to RLVs, and determining real-time optimal tracking commands.…

    Submitted 2 May, 2024; originally announced May 2024.

  15. arXiv:2404.01604  [pdf, other]

    cs.CV eess.IV

    WaveDH: Wavelet Sub-bands Guided ConvNet for Efficient Image Dehazing

    Authors: Seongmin Hwang, Daeyoung Han, Cheolkon Jung, Moongu Jeon

    Abstract: The surge in interest regarding image dehazing has led to notable advancements in deep learning-based single image dehazing approaches, exhibiting impressive performance in recent studies. Despite these strides, many existing methods fall short in meeting the efficiency demands of practical applications. In this paper, we introduce WaveDH, a novel and compact ConvNet designed to address this effic…

    Submitted 17 January, 2025; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: Under Review

  16. arXiv:2401.11620  [pdf, other]

    eess.SY

    Real-Time Systems Optimization with Black-box Constraints and Hybrid Variables

    Authors: Sen Wang, Dong Li, Shao-Yu Huang, Xuanliang Deng, Ashrarul H. Sifat, Changhee Jung, Ryan Williams, Haibo Zeng

    Abstract: When optimizing real-time systems, designers often face a challenging problem where the schedulability constraints are non-convex, non-continuous, or lack an analytical form to understand their properties. Although the optimization framework NORTH proposed in previous work is general (it works with arbitrary schedulability analysis) and scalable, it can only handle problems with continuous variabl…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: Workshop on OPtimization for Embedded and ReAl-time systems (OPERA 2023) co-located with the 44th IEEE Real-Time Systems Symposium (RTSS)

  17. arXiv:2401.03284  [pdf, other]

    eess.SY

    Joint Optimization of Continuous Variables and Priority Assignments for Real-Time Systems with Black-box Schedulability Constraints

    Authors: Sen Wang, Dong Li, Shao-Yu Huang, Xuanliang Deng, Ashrarul H. Sifat, Changhee Jung, Ryan Williams, Haibo Zeng

    Abstract: In real-time systems optimization, designers often face a challenging problem posed by the non-convex and non-continuous schedulability conditions, which may even lack an analytical form to understand their properties. To tackle this challenging problem, we treat the schedulability analysis as a black box that only returns true/false results. We propose a general and scalable framework to optimize…

    Submitted 18 March, 2025; v1 submitted 6 January, 2024; originally announced January 2024.

    Comments: Extension of a conference paper

  18. arXiv:2310.19699  [pdf, other]

    eess.SY cs.OS cs.SC

    Optimizing Logical Execution Time Model for Both Determinism and Low Latency

    Authors: Sen Wang, Dong Li, Ashrarul H. Sifat, Shao-Yu Huang, Xuanliang Deng, Changhee Jung, Ryan Williams, Haibo Zeng

    Abstract: The Logical Execution Time (LET) programming model has recently received considerable attention, particularly because of its timing and dataflow determinism. In LET, task computation appears always to take the same amount of time (called the task's LET interval), and the task reads (resp. writes) at the beginning (resp. end) of the interval. Compared to other communication mechanisms, such as impl…

    Submitted 7 March, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: accepted in RTAS'24

  19. arXiv:2310.19581  [pdf, other]

    eess.AS cs.CV cs.SD

    Seeing Through the Conversation: Audio-Visual Speech Separation based on Diffusion Model

    Authors: Suyeon Lee, Chaeyoung Jung, Youngjoon Jang, Jaehun Kim, Joon Son Chung

    Abstract: The objective of this work is to extract target speaker's voice from a mixture of voices using visual cues. Existing works on audio-visual speech separation have demonstrated their performance with promising intelligibility, but maintaining naturalness remains a challenge. To address this issue, we propose AVDiffuSS, an audio-visual speech separation model based on a diffusion mechanism known for…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Project page with demo: https://mm.kaist.ac.kr/projects/avdiffuss/

  20. arXiv:2309.12306  [pdf, other]

    cs.CV cs.SD eess.AS

    TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning

    Authors: Chaeyoung Jung, Suyeon Lee, Kihyun Nam, Kyeongha Rho, You Jin Kim, Youngjoon Jang, Joon Son Chung

    Abstract: The goal of this work is Active Speaker Detection (ASD), a task to determine whether a person is speaking or not in a series of video frames. Previous works have dealt with the task by exploring network architectures while learning effective representations has been less explored. In this work, we propose TalkNCE, a novel talk-aware contrastive loss. The loss is only applied to part of the full se…

    Submitted 21 September, 2023; originally announced September 2023.

  21. arXiv:2303.09463  [pdf, other]

    cs.RO eess.SY

    An Autonomous System for Head-to-Head Race: Design, Implementation and Analysis; Team KAIST at the Indy Autonomous Challenge

    Authors: Chanyoung Jung, Andrea Finazzi, Hyunki Seong, Daegyu Lee, Seungwook Lee, Bosung Kim, Gyuri Gang, Seungil Han, David Hyunchul Shim

    Abstract: While the majority of autonomous driving research has concentrated on everyday driving scenarios, further safety and performance improvements of autonomous vehicles require a focus on extreme driving conditions. In this context, autonomous racing is a new area of research that has been attracting considerable interest recently. Due to the fact that a vehicle is driven by its perception, planning,…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 35 pages, 31 figures, 5 tables, Field Robotics (accepted)

  22. arXiv:2208.00323  [pdf]

    eess.SP cs.AI cs.LG

    A Multi-View Learning Approach to Enhance Automatic 12-Lead ECG Diagnosis Performance

    Authors: Jae-Won Choi, Dae-Yong Hong, Chan Jung, Eugene Hwang, Sung-Hyuk Park, Seung-Young Roh

    Abstract: The performances of commonly used electrocardiogram (ECG) diagnosis models have recently improved with the introduction of deep learning (DL). However, the impact of various combinations of multiple DL components and/or the role of data augmentation techniques on the diagnosis have not been sufficiently investigated. This study proposes an ensemble-based multi-view learning approach with an ECG au…

    Submitted 30 July, 2022; originally announced August 2022.

    Comments: 9 pages, 3 figures, and 5 tables

  23. arXiv:2207.12232  [pdf, other]

    cs.RO eess.SY

    A Resilient Navigation and Path Planning System for High-speed Autonomous Race Car

    Authors: Daegyu Lee, Chanyoung Jung, Andrea Finazzi, Hyunki Seong, D. Hyunchul Shim

    Abstract: This paper describes a resilient navigation and planning system used in the Indy Autonomous Challenge (IAC) competition. The IAC is a competition where full-scale race cars run autonomously on Indianapolis Motor Speedway(IMS) up to 290 km/h (180 mph). Race cars will experience severe vibrations. Especially at high speeds. These vibrations can degrade standard localization algorithms based on preci…

    Submitted 15 September, 2022; v1 submitted 25 July, 2022; originally announced July 2022.

  24. arXiv:2207.06904  [pdf, other]

    eess.SP cs.AI cs.LG

    Attention mechanisms for physiological signal deep learning: which attention should we take?

    Authors: Seong-A Park, Hyung-Chul Lee, Chul-Woo Jung, Hyun-Lim Yang

    Abstract: Attention mechanisms are widely used to dramatically improve deep learning model performance in various fields. However, their general ability to improve the performance of physiological signal deep learning model is immature. In this study, we experimentally analyze four attention mechanisms (e.g., squeeze-and-excitation, non-local, convolutional block attention module, and multi-head self-attent…

    Submitted 4 July, 2022; originally announced July 2022.

  25. arXiv:2207.02377  [pdf, other]

    eess.IV cs.CV

    Patch-wise Deep Metric Learning for Unsupervised Low-Dose CT Denoising

    Authors: Chanyong Jung, Joonhyung Lee, Sunkyoung You, Jong Chul Ye

    Abstract: The acquisition conditions for low-dose and high-dose CT images are usually different, so that the shifts in the CT numbers often occur. Accordingly, unsupervised deep learning-based approaches, which learn the target image distribution, often introduce CT number distortions and result in detrimental effects in diagnostic performance. To address this, here we propose a novel unsupervised learning…

    Submitted 13 July, 2022; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: MICCAI 2022

  26. arXiv:2206.12930  [pdf, other]

    cs.CV eess.IV

    SVBR-NET: A Non-Blind Spatially Varying Defocus Blur Removal Network

    Authors: Ali Karaali, Claudio Rosito Jung

    Abstract: Defocus blur is a physical consequence of the optical sensors used in most cameras. Although it can be used as a photographic style, it is commonly viewed as an image degradation modeled as the convolution of a sharp image with a spatially-varying blur kernel. Motivated by the advance of blur estimation methods in the past years, we propose a non-blind approach for image deblurring that can deal w…

    Submitted 26 June, 2022; originally announced June 2022.

    Comments: Accepted to ICIP2022

  27. arXiv:2012.01919  [pdf, other]

    eess.SP

    Internal Calibration Process Using Chirp Pulses with Application of the Adam Learning Algorithm

    Authors: Junho Kweon, Chan-Yong Jung, Kyung-Bin Bae, Seong-Ook Park

    Abstract: We propose a new internal calibration process using chirp pulses. Our method is utilized to mitigate thermal drift, which is unwanted changes and usually occurs in active elements such as a high power amplifier and low noise amplifier. The proposed method has advantages from two distinct aspects: calibration signal and algorithm. In respect to the calibration signal, our method does not contain an…

    Submitted 3 December, 2020; originally announced December 2020.

    Comments: This work has been submitted to the IEEE for possible publication

  28. Edge and Identity Preserving Network for Face Super-Resolution

    Authors: Jonghyun Kim, Gen Li, Inyong Yun, Cheolkon Jung, Joongkyu Kim

    Abstract: Face super-resolution (SR) has become an indispensable function in security solutions such as video surveillance and identification system, but the distortion in facial components is a great challenge in it. Most state-of-the-art methods have utilized facial priors with deep neural networks. These methods require extra labels, longer training time, and larger computation memory. In this paper, we…

    Submitted 30 March, 2021; v1 submitted 27 August, 2020; originally announced August 2020.

    Comments: Neurocomputing'2021

  29. arXiv:1910.01091  [pdf, other]

    eess.IV cs.CV q-bio.QM

    W-Net: A CNN-based Architecture for White Blood Cells Image Classification

    Authors: Changhun Jung, Mohammed Abuhamad, Jumabek Alikhanov, Aziz Mohaisen, Kyungja Han, DaeHun Nyang

    Abstract: Computer-aided methods for analyzing white blood cells (WBC) have become widely popular due to the complexity of the manual process. Recent works have shown highly accurate segmentation and detection of white blood cells from microscopic blood images. However, the classification of the observed cells is still a challenge and highly demanded as the distribution of the five types reflects on the con…

    Submitted 2 October, 2019; originally announced October 2019.

  30. arXiv:1909.12116  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Optimal Transport driven CycleGAN for Unsupervised Learning in Inverse Problems

    Authors: Byeongsu Sim, Gyutaek Oh, Jeongsol Kim, Chanyong Jung, Jong Chul Ye

    Abstract: To improve the performance of classical generative adversarial network (GAN), Wasserstein generative adversarial networks (W-GAN) was developed as a Kantorovich dual formulation of the optimal transport (OT) problem using Wasserstein-1 distance. However, it was not clear how cycleGAN-type generative models can be derived from the optimal transport theory. Here we show that a novel cycleGAN archite…

    Submitted 30 August, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

    Comments: accepted for publication in the SIAM Journal on Imaging Sciences

  31. arXiv:1908.06566  [pdf, other]

    cs.CV cs.LG eess.IV

    Adversarial Defense by Suppressing High-frequency Components

    Authors: Zhendong Zhang, Cheolkon Jung, Xiaolong Liang

    Abstract: Recent works show that deep neural networks trained on image classification dataset bias towards textures. Those models are easily fooled by applying small high-frequency perturbations to clean images. In this paper, we learn robust image classification models by removing high-frequency components. Specifically, we develop a differentiable high-frequency suppression module based on discrete Fourie…

    Submitted 3 September, 2019; v1 submitted 18 August, 2019; originally announced August 2019.

    Comments: 3 pages. This paper is a technical report of the 5th place solution in the IJCAI-2019 Alibaba Adversarial AI Challenge. This paper has been accepted by the corresponding workshop

  32. arXiv:1908.02648  [pdf, other]

    eess.IV cs.CV

    Attention-Aware Linear Depthwise Convolution for Single Image Super-Resolution

    Authors: Seongmin Hwang, Gwanghuyn Yu, Cheolkon Jung, Jinyoung Kim

    Abstract: Although deep convolutional neural networks (CNNs) have obtained outstanding performance in image superresolution (SR), their computational cost increases geometrically as CNN models get deeper and wider. Meanwhile, the features of intermediate layers are treated equally across the channel, thus hindering the representational capability of CNNs. In this paper, we propose an attention-aware linear…

    Submitted 29 November, 2019; v1 submitted 7 August, 2019; originally announced August 2019.

    Comments: 9 pages, 8 figures