Showing 1–50 of 586 results for author: Kim, J

Searching in archive eess.
  1. arXiv:2505.09508  [pdf, other]

    eess.SP

    Wearable Tracking of Eye and Body Movements During Breaching Training: Towards Real-Time Blast Injury Monitoring

    Authors: Jeremy P. Kemmerer, James R. Williamson, Joseph Kim, Elizabeth Halford, Hrishikesh M. Rao, Christopher J. Smalt

    Abstract: Repeated exposure to blast overpressure in occupational settings has been associated with changes in cognitive and psychological health, as well as deficits in neurosensory subsystems. In this work, we describe a wearable system to simultaneously monitor physiology and blast exposure levels and demonstrate how this system can identify individualized exposure levels corresponding to acute physiolog…

    Submitted 14 May, 2025; originally announced May 2025.

  2. arXiv:2505.07365  [pdf, ps, other]

    cs.SD cs.AI cs.CL cs.MM eess.AS

    Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge

    Authors: Chao-Han Huck Yang, Sreyan Ghosh, Qing Wang, Jaeyeon Kim, Hengyi Hong, Sonal Kumar, Guirui Zhong, Zhifeng Kong, S Sakshi, Vaibhavi Lokegaonkar, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha, Gunhee Kim, Jun Du, Rafael Valle, Bryan Catanzaro

    Abstract: We present Task 5 of the DCASE 2025 Challenge: an Audio Question Answering (AQA) benchmark spanning multiple domains of sound understanding. This task defines three QA subsets (Bioacoustics, Temporal Soundscapes, and Complex QA) to test audio-language models on interactive question-answering over diverse acoustic scenes. We describe the dataset composition (from marine mammal calls to soundscapes…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: Preprint. DCASE 2025 Audio QA Challenge: https://dcase.community/challenge2025/task-audio-question-answering

  3. arXiv:2505.04432  [pdf, other]

    eess.SP

    SwinLSTM Autoencoder for Temporal-Spatial-Frequency Domain CSI Compression in Massive MIMO Systems

    Authors: Aakash Saini, Yunchou Xing, Jee Hyun Kim, Amir Ahmadian Tehrani, Wolfgang Gerstacker

    Abstract: This study presents a parameter-light, low-complexity artificial intelligence/machine learning (AI/ML) model that enhances channel state information (CSI) feedback in wireless systems by jointly exploiting temporal, spatial, and frequency (TSF) domain correlations. While traditional frameworks use autoencoders for CSI compression at the user equipment (UE) and reconstruction at the network (NW) si…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: 7 pages, 5 figures

  4. arXiv:2505.04105  [pdf]

    eess.IV cs.CV

    MAISY: Motion-Aware Image SYnthesis for Medical Image Motion Correction

    Authors: Andrew Zhang, Hao Wang, Shuchang Ye, Michael Fulham, Jinman Kim

    Abstract: Patient motion during medical image acquisition causes blurring, ghosting, and organ distortion, which makes image interpretation challenging. Current state-of-the-art algorithms using Generative Adversarial Network (GAN)-based methods with their ability to learn the mappings between corrupted images and their ground truth via Structural Similarity Index Measure (SSIM) loss effectively generate mot…

    Submitted 8 May, 2025; v1 submitted 6 May, 2025; originally announced May 2025.

  5. arXiv:2505.02951  [pdf, other]

    cs.IT eess.SP

    Multi-Antenna Users in Cell-Free Massive MIMO: Stream Allocation and Necessity of Downlink Pilots

    Authors: Eren Berk Kama, Junbeom Kim, Emil Björnson

    Abstract: We consider a cell-free massive multiple-input multiple-output (MIMO) system with multiple antennas on the users and access points (APs). In previous works, the downlink spectral efficiency (SE) has been evaluated using the hardening bound that requires no downlink pilots. This approach works well for single-antenna users. In this paper, we show that much higher SEs can be achieved if downlink pil…

    Submitted 5 May, 2025; originally announced May 2025.

    Comments: 13 pages, 9 figures. arXiv admin note: text overlap with arXiv:2404.18516

  6. arXiv:2505.00481  [pdf, other]

    eess.SY

    Stabilization by Controllers Having Integer Coefficients

    Authors: Joowon Lee, Donggil Lee, Junsoo Kim

    Abstract: The system property of "having integer coefficients," that is, a transfer function has an integer monic polynomial as its denominator, is significant in the field of encrypted control as it is required for a dynamic controller to be realized over encrypted data. This paper shows that there always exists a controller with integer coefficients stabilizing a given discrete-time linear time-invarian…

    Submitted 1 May, 2025; originally announced May 2025.

  7. arXiv:2504.19247  [pdf, other]

    cs.RO eess.SY

    Efficient COLREGs-Compliant Collision Avoidance using Turning Circle-based Control Barrier Function

    Authors: Changyu Lee, Jinwook Park, Jinwhan Kim

    Abstract: This paper proposes a computationally efficient collision avoidance algorithm using turning circle-based control barrier functions (CBFs) that comply with international regulations for preventing collisions at sea (COLREGs). Conventional CBFs often lack explicit consideration of turning capabilities and avoidance direction, which are key elements in developing a COLREGs-compliant collision avoidan…

    Submitted 27 April, 2025; originally announced April 2025.

    Comments: This work has been submitted to an IEEE journal for possible publication

  8. Documentation on Encrypted Dynamic Control Simulation Code using Ring-LWE based Cryptosystems

    Authors: Yeongjun Jang, Joowon Lee, Junsoo Kim

    Abstract: Encrypted controllers offer secure computation by employing modern cryptosystems to execute control operations directly over encrypted data without decryption. However, incorporating cryptosystems into dynamic controllers significantly increases the computational load. This paper aims to provide an accessible guideline for running encrypted controllers using an open-source library Lattigo, which s…

    Submitted 17 April, 2025; originally announced April 2025.

    Comments: 6 pages

    Journal ref: Journal of The Society of Instrument and Control Engineers, vol. 64, no. 4, pp. 248-254, 2025

  9. arXiv:2504.09655  [pdf]

    eess.IV cs.CV

    OmniMamba4D: Spatio-temporal Mamba for longitudinal CT lesion segmentation

    Authors: Justin Namuk Kim, Yiqiao Liu, Rajath Soans, Keith Persson, Sarah Halek, Michal Tomaszewski, Jianda Yuan, Gregory Goldmacher, Antong Chen

    Abstract: Accurate segmentation of longitudinal CT scans is important for monitoring tumor progression and evaluating treatment responses. However, existing 3D segmentation models solely focus on spatial information. To address this gap, we propose OmniMamba4D, a novel segmentation model designed for 4D medical images (3D images over time). OmniMamba4D utilizes a spatio-temporal tetra-orientated Mamba block…

    Submitted 24 April, 2025; v1 submitted 13 April, 2025; originally announced April 2025.

    Comments: Accepted at IEEE International Symposium on Biomedical Imaging (ISBI) 2025

  10. arXiv:2504.09248  [pdf, ps, other]

    eess.SY cs.CR

    Asymptotic stabilization under homomorphic encryption: A re-encryption free method

    Authors: Shuai Feng, Qian Ma, Junsoo Kim, Shengyuan Xu

    Abstract: In this paper, we propose methods to encrypt a pre-given dynamic controller with homomorphic encryption, without re-encrypting the control inputs. We first present a preliminary result showing that the coefficients in a pre-given dynamic controller can be scaled up into integers by the zooming-in factor in dynamic quantization, without utilizing re-encryption. However, a sufficiently small zoomi…

    Submitted 12 April, 2025; originally announced April 2025.

  11. arXiv:2504.00244  [pdf, other]

    math.OC eess.SY

    System Identification from Partial Observations under Adversarial Attacks

    Authors: Jihun Kim, Javad Lavaei

    Abstract: This paper is concerned with the partially observed linear system identification, where the goal is to obtain reasonably accurate estimation of the balanced truncation of the true system up to the order $k$ from output measurements. We consider the challenging case of system identification under adversarial attacks, where the probability of having an attack at each time is $Θ(1/k)$ while the value…

    Submitted 31 March, 2025; originally announced April 2025.

    Comments: 9 pages, 2 figures

    MSC Class: 93B15; 93B30; 93C05

  12. arXiv:2503.22829  [pdf]

    eess.IV cs.AI cs.CV cs.LG

    Nonhuman Primate Brain Tissue Segmentation Using a Transfer Learning Approach

    Authors: Zhen Lin, Hongyu Yuan, Richard Barcus, Qing Lyu, Sucheta Chakravarty, Megan E. Lipford, Carol A. Shively, Suzanne Craft, Mohammad Kawas, Jeongchul Kim, Christopher T. Whitlow

    Abstract: Non-human primates (NHPs) serve as critical models for understanding human brain function and neurological disorders due to their close evolutionary relationship with humans. Accurate brain tissue segmentation in NHPs is critical for understanding neurological disorders, but challenging due to the scarcity of annotated NHP brain MRI datasets, the small size of the NHP brain, the limited resolution…

    Submitted 1 April, 2025; v1 submitted 28 March, 2025; originally announced March 2025.

  13. arXiv:2503.20280  [pdf, other]

    cs.RO eess.SY

    Turning Circle-based Control Barrier Function for Efficient Collision Avoidance of Nonholonomic Vehicles

    Authors: Changyu Lee, Kiyong Park, Jinwhan Kim

    Abstract: This paper presents a new control barrier function (CBF) designed to improve the efficiency of collision avoidance for nonholonomic vehicles. Traditional CBFs typically rely on the shortest Euclidean distance to obstacles, overlooking the limited heading change ability of nonholonomic vehicles. This often leads to abrupt maneuvers and excessive speed reductions, which is not desirable and reduces…

    Submitted 26 March, 2025; originally announced March 2025.

    Comments: This work has been submitted to an IEEE journal for possible publication

  14. arXiv:2503.18642  [pdf, other]

    eess.IV cs.CV

    Rethinking Glaucoma Calibration: Voting-Based Binocular and Metadata Integration

    Authors: Taejin Jeong, Joohyeok Kim, Jaehoon Joo, Yeonwoo Jung, Hyeonmin Kim, Seong Jae Hwang

    Abstract: Glaucoma is an incurable ophthalmic disease that damages the optic nerve, leads to vision loss, and ranks among the leading causes of blindness worldwide. Diagnosing glaucoma typically involves fundus photography, optical coherence tomography (OCT), and visual field testing. However, the high cost of OCT often leads to reliance on fundus photography and visual field testing, both of which exhibit…

    Submitted 24 March, 2025; originally announced March 2025.

  15. arXiv:2503.16956  [pdf, other]

    eess.AS cs.AI cs.CV cs.SD

    From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech

    Authors: Ji-Hoon Kim, Jeongsoo Choi, Jaehun Kim, Chaeyoung Jung, Joon Son Chung

    Abstract: The objective of this study is to generate high-quality speech from silent talking face videos, a task also known as video-to-speech synthesis. A significant challenge in video-to-speech synthesis lies in the substantial modality gap between silent video and multi-faceted speech. In this paper, we propose a novel video-to-speech system that effectively bridges this modality gap, significantly enha…

    Submitted 21 March, 2025; originally announced March 2025.

    Comments: CVPR 2025, demo page: https://mm.kaist.ac.kr/projects/faces2voices/

  16. arXiv:2503.09385  [pdf, other]

    cs.SE cs.RO eess.SY

    PCLA: A Framework for Testing Autonomous Agents in the CARLA Simulator

    Authors: Masoud Jamshidiyan Tehrani, Jinhan Kim, Paolo Tonella

    Abstract: Recent research on testing autonomous driving agents has grown significantly, especially in simulation environments. The CARLA simulator is often the preferred choice, and the autonomous agents from the CARLA Leaderboard challenge are regarded as the best-performing agents within this environment. However, researchers who test these agents, rather than training their own ones from scratch, often f…

    Submitted 13 March, 2025; v1 submitted 12 March, 2025; originally announced March 2025.

    Comments: This work will be published at the FSE 2025 demonstration track

  17. arXiv:2503.05848  [pdf, other]

    cs.RO eess.SY

    Merry-Go-Round: Safe Control of Decentralized Multi-Robot Systems with Deadlock Prevention

    Authors: Wonjong Lee, Joonyeol Sim, Joonkyung Kim, Siwon Jo, Wenhao Luo, Changjoo Nam

    Abstract: We propose a hybrid approach for decentralized multi-robot navigation that ensures both safety and deadlock prevention. Building on a standard control formulation, we add a lightweight deadlock prevention mechanism by forming temporary "roundabouts" (circular reference paths). Each robot relies only on local, peer-to-peer communication and a controller for base collision avoidance; a roundabout is…

    Submitted 7 March, 2025; originally announced March 2025.

    Comments: 7 pages, 7 Figures

  18. arXiv:2503.05366  [pdf, other]

    eess.SY

    A Risk-aware Bi-level Bidding Strategy for Virtual Power Plant with Power-to-Hydrogen System

    Authors: Jaehyun Yoo, Jip Kim

    Abstract: This paper presents a risk-aware bi-level bidding strategy for a Virtual Power Plant (VPP) that integrates a Power-to-Hydrogen (P2H) system, addressing the challenges posed by renewable energy variability and market volatility. By incorporating Conditional Value at Risk (CVaR) within the bi-level optimization framework, the proposed strategy enables VPPs to mitigate financial risks associated with unc…

    Submitted 7 March, 2025; originally announced March 2025.

    Comments: 5 pages, 5 figures, 2025 PES General Meeting

  19. arXiv:2503.05361  [pdf, other]

    eess.SY

    Community Energy Management System for Fast Frequency Response: A Hierarchical Control Approach

    Authors: Joonsung Jung, Hyunjoong Kim, Hyunghwan Shin, Jip Kim

    Abstract: The increase in renewable energy sources (RES) has reduced power system inertia, making frequency stabilization more challenging and highlighting the need for fast frequency response (FFR) resources. While building energy management systems (BEMS) equipped with distributed energy resources (DERs) can provide FFR, individual BEMS alone cannot fully meet demand. To address this, we propose a communi…

    Submitted 7 March, 2025; originally announced March 2025.

    Comments: 5 pages, 7 figures, submitted to PES General Meeting 2025

    MSC Class: 90C05; 90C90

    ACM Class: I.2.8; C.3; G.1.6

  20. arXiv:2503.03983  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

    Authors: Sreyan Ghosh, Zhifeng Kong, Sonal Kumar, S Sakshi, Jaehyeon Kim, Wei Ping, Rafael Valle, Dinesh Manocha, Bryan Catanzaro

    Abstract: Understanding and reasoning over non-speech sounds and music are crucial for both humans and AI agents to interact effectively with their environments. In this paper, we introduce Audio Flamingo 2 (AF2), an Audio-Language Model (ALM) with advanced audio understanding and reasoning capabilities. AF2 leverages (i) a custom CLAP model, (ii) synthetic Audio QA data for fine-grained audio reasoning, an…

    Submitted 5 March, 2025; originally announced March 2025.

  21. arXiv:2502.16459  [pdf]

    eess.IV cs.AI cs.CV

    Deep learning approaches to surgical video segmentation and object detection: A Scoping Review

    Authors: Devanish N. Kamtam, Joseph B. Shrager, Satya Deepya Malla, Nicole Lin, Juan J. Cardona, Jake J. Kim, Clarence Hu

    Abstract: Introduction: Computer vision (CV) has had a transformative impact in biomedical fields such as radiology, dermatology, and pathology. Its real-world adoption in surgical applications, however, remains limited. We review the current state-of-the-art performance of deep learning (DL)-based CV models for segmentation and object detection of anatomical structures in videos obtained during surgical pr…

    Submitted 23 February, 2025; originally announced February 2025.

    Comments: 38 pages, 2 figures

  22. arXiv:2502.13986  [pdf, other]

    eess.IV

    Structure-from-Sherds++: Robust Incremental 3D Reassembly of Axially Symmetric Pots from Unordered and Mixed Fragment Collections

    Authors: Seong Jong Yoo, Sisung Liu, Muhammad Zeeshan Arshad, Jinhyeok Kim, Young Min Kim, Yiannis Aloimonos, Cornelia Fermuller, Kyungdon Joo, Jinwook Kim, Je Hyeong Hong

    Abstract: Reassembling multiple axially symmetric pots from fragmentary sherds is crucial for cultural heritage preservation, yet it poses significant challenges due to thin and sharp fracture surfaces that generate numerous false positive matches and hinder large-scale puzzle solving. Existing global approaches, which optimize all potential fragment pairs simultaneously or data-driven models, are prone to…

    Submitted 18 February, 2025; originally announced February 2025.

    Comments: 24 pages

  23. arXiv:2502.10283  [pdf, other]

    cs.CR eess.SY

    Anomaly Detection with LWE Encrypted Control

    Authors: Rijad Alisic, Junsoo Kim, Henrik Sandberg

    Abstract: Detecting attacks using encrypted signals is challenging since encryption hides its information content. We present a novel mechanism for anomaly detection over Learning with Errors (LWE) encrypted signals without using decryption, secure channels, or complex communication schemes. Instead, the detector exploits the homomorphic property of LWE encryption to perform hypothesis tests on transformat…

    Submitted 14 February, 2025; originally announced February 2025.

  24. arXiv:2502.09283  [pdf, other]

    eess.SP

    Rate-Splitting Multiple Access for 6G: Prototypes, Experimental Results and Link/System level Simulations

    Authors: Sundar Aditya, Yong Jin Daniel Kim, David Vargas, David Redgate, Onur Dizdar, Neil Bhushan, Xinze Lyu, Sibo Zhang, Stephen Wang, Bruno Clerckx

    Abstract: Rate-Splitting Multiple Access (RSMA) is a powerful and versatile physical layer multiple access technique that generalizes and has better interference management capabilities than 5G-based Space Division Multiple Access (SDMA). It is also a rapidly maturing technology, all of which makes it a natural successor to SDMA in 6G. In this article, we describe RSMA's suitability for 6G by presenting: i)…

    Submitted 17 February, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

    Comments: Submitted to the IEEE Communications Standards Magazine December 2025 Special Issue on "Wireless Technologies for 6G and Beyond: Applications, Implementations, and Standardization"

  25. arXiv:2502.08675  [pdf, other]

    eess.IV

    Improving Lesion Segmentation in Medical Images by Global and Regional Feature Compensation

    Authors: Chuhan Wang, Zhenghao Chen, Jean Y. H. Yang, Jinman Kim

    Abstract: Automated lesion segmentation of medical images has made tremendous improvements in recent years due to deep learning advancements. However, accurately capturing fine-grained global and regional feature representations remains a challenge. Many existing methods obtain suboptimal performance on complex lesion segmentation due to information loss during typical downsampling operations and the insuff…

    Submitted 11 February, 2025; originally announced February 2025.

  26. arXiv:2502.05330  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Multi-Class Segmentation of Aortic Branches and Zones in Computed Tomography Angiography: The AortaSeg24 Challenge

    Authors: Muhammad Imran, Jonathan R. Krebs, Vishal Balaji Sivaraman, Teng Zhang, Amarjeet Kumar, Walker R. Ueland, Michael J. Fassler, Jinlong Huang, Xiao Sun, Lisheng Wang, Pengcheng Shi, Maximilian Rokuss, Michael Baumgartner, Yannick Kirchhof, Klaus H. Maier-Hein, Fabian Isensee, Shuolin Liu, Bing Han, Bong Thanh Nguyen, Dong-jin Shin, Park Ji-Woo, Mathew Choi, Kwang-Hyun Uhm, Sung-Jea Ko, Chanwoong Lee, et al. (38 additional authors not shown)

    Abstract: Multi-class segmentation of the aorta in computed tomography angiography (CTA) scans is essential for diagnosing and planning complex endovascular treatments for patients with aortic dissections. However, existing methods reduce aortic segmentation to a binary problem, limiting their ability to measure diameters across different branches and zones. Furthermore, no open-source dataset is currently…

    Submitted 7 February, 2025; originally announced February 2025.

  27. arXiv:2502.03505  [pdf, other]

    eess.IV cs.AI cs.LG

    Enhancing Free-hand 3D Photoacoustic and Ultrasound Reconstruction using Deep Learning

    Authors: SiYeoul Lee, SeonHo Kim, Minkyung Seo, SeongKyu Park, Salehin Imrus, Kambaluru Ashok, DongEon Lee, Chunsu Park, SeonYeong Lee, Jiye Kim, Jae-Heung Yoo, MinWoo Kim

    Abstract: This study introduces a motion-based learning network with a global-local self-attention module (MoGLo-Net) to enhance 3D reconstruction in handheld photoacoustic and ultrasound (PAUS) imaging. Standard PAUS imaging is often limited by a narrow field of view and the inability to effectively visualize complex 3D structures. The 3D freehand technique, which aligns sequential 2D images for 3D reconst…

    Submitted 5 February, 2025; originally announced February 2025.

  28. arXiv:2502.01092  [pdf, other]

    cs.RO cs.CV eess.SY

    Enhancing Feature Tracking Reliability for Visual Navigation using Real-Time Safety Filter

    Authors: Dabin Kim, Inkyu Jang, Youngsoo Han, Sunwoo Hwang, H. Jin Kim

    Abstract: Vision sensors are extensively used for localizing a robot's pose, particularly in environments where global localization tools such as GPS or motion capture systems are unavailable. In many visual navigation systems, localization is achieved by detecting and tracking visual features or landmarks, which provide information about the sensor's relative pose. For reliable feature tracking and accurat…

    Submitted 3 February, 2025; originally announced February 2025.

    Comments: 7 pages, 6 figures, Accepted to 2025 IEEE International Conference on Robotics & Automation (ICRA 2025)

  29. arXiv:2502.00619  [pdf, other]

    eess.IV cs.AI cs.CV

    Distribution-aware Fairness Learning in Medical Image Segmentation From A Control-Theoretic Perspective

    Authors: Yujin Oh, Pengfei Jin, Sangjoon Park, Sekeun Kim, Siyeop Yoon, Kyungsang Kim, Jin Sung Kim, Xiang Li, Quanzheng Li

    Abstract: Ensuring fairness in medical image segmentation is critical due to biases in imbalanced clinical data acquisition caused by demographic attributes (e.g., age, sex, race) and clinical factors (e.g., disease severity). To address these challenges, we introduce Distribution-aware Mixture of Experts (dMoE), inspired by optimal control theory. We provide a comprehensive analysis of its underlying mecha…

    Submitted 1 February, 2025; originally announced February 2025.

    Comments: 12 pages, 3 figures, 9 tables

  30. arXiv:2501.15744  [pdf]

    eess.AS cs.SD

    Noise disturbance and lack of privacy: Modeling acoustic dissatisfaction in open-plan offices

    Authors: Manuj Yadav, Jungsoo Kim, Valtteri Hongisto, Densil Cabrera, Richard de Dear

    Abstract: Open-plan offices are well-known to be adversely affected by acoustic issues. This study aims to model acoustic dissatisfaction using measurements of room acoustics, sound environment during occupancy, and occupant surveys (n = 349) in 28 offices representing a diverse range of workplace parameters. As latent factors, the contribution of $\textit{lack of privacy}$ (LackPriv) was 25% higher than…

    Submitted 3 May, 2025; v1 submitted 26 January, 2025; originally announced January 2025.

    Comments: The following article has been submitted to The Journal of the Acoustical Society of America. After it is published, it will be found at https://pubs.aip.org/asa/jasa

  31. arXiv:2501.11631  [pdf, other]

    cs.SD cs.AI eess.AS

    Noise-Agnostic Multitask Whisper Training for Reducing False Alarm Errors in Call-for-Help Detection

    Authors: Myeonghoon Ryu, June-Woo Kim, Minseok Oh, Suji Lee, Han Park

    Abstract: Keyword spotting is often implemented by adding a keyword classifier to the encoder in acoustic models, enabling the classification of predefined or open vocabulary keywords. Although keyword spotting is a crucial task in various applications and can be extended to call-for-help detection in emergencies, the previous method often suffers from scalability limitations due to retraining required to i…

    Submitted 20 January, 2025; originally announced January 2025.

    Comments: Accepted to ICASSP 2025

  32. arXiv:2501.11542  [pdf, other]

    eess.SY cs.LG

    DLinear-based Prediction of Remaining Useful Life of Lithium-Ion Batteries: Feature Engineering through Explainable Artificial Intelligence

    Authors: Minsu Kim, Jaehyun Oh, Sang-Young Lee, Junghwan Kim

    Abstract: Accurate prediction of the Remaining Useful Life (RUL) of lithium-ion batteries is essential for ensuring safety, reducing maintenance costs, and optimizing usage. However, predicting RUL is challenging due to the nonlinear characteristics of the degradation caused by complex chemical reactions. Machine learning allows precise predictions by learning the latent functions of degradation relationshi…

    Submitted 20 January, 2025; originally announced January 2025.

  33. arXiv:2501.03045  [pdf, other]

    eess.AS cs.AI

    Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments

    Authors: Hanbin Bae, Byungjun Kang, Jiwon Kim, Jaeyong Hwang, Hosang Sung, Hoon-Young Cho

    Abstract: This study emphasizes the significance of exploring distance-based source separation (DSS) in outdoor environments. Unlike existing studies that primarily focus on indoor settings, the proposed model is designed to capture the unique characteristics of outdoor audio sources. It incorporates advanced techniques, including a two-stage conformer block, a linear relation-aware self-attention (RSA), an…

    Submitted 6 January, 2025; originally announced January 2025.

    Comments: Accepted by ICASSP2025. © 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component

  34. SNeRV: Spectra-preserving Neural Representation for Video

    Authors: Jina Kim, Jihoo Lee, Je-Won Kang

    Abstract: Neural representation for video (NeRV), which employs a neural network to parameterize video signals, introduces a novel methodology in video representations. However, existing NeRV-based methods have difficulty in capturing fine spatial details and motion patterns due to spectral bias, in which a neural network learns high-frequency (HF) components at a slower rate than low-frequency (LF) compone…

    Submitted 3 January, 2025; originally announced January 2025.

    Comments: ECCV 2024

  35. arXiv:2501.01347  [pdf, other]

    cs.SD cs.CL eess.AS

    AdaptVC: High Quality Voice Conversion with Adaptive Learning

    Authors: Jaehun Kim, Ji-Hoon Kim, Yeunju Choi, Tan Dat Nguyen, Seongkyu Mun, Joon Son Chung

    Abstract: The goal of voice conversion is to transform the speech of a source speaker to sound like that of a reference speaker while preserving the original content. A key challenge is to extract disentangled linguistic content from the source and voice style from the reference. While existing approaches leverage various methods to isolate the two, generalization still requires further attention, especia…

    Submitted 14 January, 2025; v1 submitted 2 January, 2025; originally announced January 2025.

    Comments: ICASSP 2025; demo available https://mm.kaist.ac.kr/projects/AdaptVC

  36. Smooth Reference Command Generation and Control for Transition Flight of VTOL Aircraft Using Time-Varying Optimization

    Authors: Jinrae Kim, John L. Bullock, Sheng Cheng, Naira Hovakimyan

    Abstract: Vertical take-off and landing (VTOL) aircraft pose a challenge in generating reference commands during transition flight. While sparsity between hover and cruise flight modes can be promoted for effective transitions by formulating $\ell_{1}$-norm minimization problems, solving these problems offline pointwise in time can lead to non-smooth reference commands, resulting in abrupt transitions. This…

    Submitted 1 January, 2025; originally announced January 2025.

    Comments: 10 pages, 7 figures, AIAA SciTech 2025 Forum

  37. arXiv:2412.20048  [pdf, other]

    eess.AS cs.AI cs.SD eess.SP

    CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation

    Authors: Ji-Hoon Kim, Hong-Sun Yang, Yoon-Cheol Ju, Il-Hwan Kim, Byeong-Yeol Kim, Joon Son Chung

    Abstract: The goal of this work is to generate natural speech in multiple languages while maintaining the same speaker identity, a task known as cross-lingual speech synthesis. A key challenge of cross-lingual speech synthesis is the language-speaker entanglement problem, which causes the quality of cross-lingual systems to lag behind that of intra-lingual systems. In this paper, we propose CrossSpeech++, w…

    Submitted 28 December, 2024; originally announced December 2024.

  38. arXiv:2412.18545  [pdf]

    eess.IV cs.AI cs.CV

    Advancing Deformable Medical Image Registration with Multi-axis Cross-covariance Attention

    Authors: Mingyuan Meng, Michael Fulham, Lei Bi, Jinman Kim

    Abstract: Deformable image registration is a fundamental requirement for medical image analysis. Recently, transformers have been widely used in deep learning-based registration methods for their ability to capture long-range dependency via self-attention (SA). However, the high computation and memory loads of SA (growing quadratically with the spatial resolution) hinder transformers from processing subtle…

    Submitted 24 December, 2024; originally announced December 2024.

    Comments: Under Review

  39. arXiv:2412.12853  [pdf, other]

    eess.IV cs.CV

    Automatic Left Ventricular Cavity Segmentation via Deep Spatial Sequential Network in 4D Computed Tomography Studies

    Authors: Yuyu Guo, Lei Bi, Zhengbin Zhu, David Dagan Feng, Ruiyan Zhang, Qian Wang, Jinman Kim

    Abstract: Automated segmentation of left ventricular cavity (LVC) in temporal cardiac image sequences (multiple time points) is a fundamental requirement for quantitative analysis of its structural and functional changes. Deep learning based methods for the segmentation of LVC are the state of the art; however, these methods are generally formulated to work on single time points, and fail to exploit the co…

    Submitted 17 December, 2024; originally announced December 2024.

    Comments: 9 pages

  40. arXiv:2412.12687  [pdf, other]

    cs.LG cs.DC cs.IT cs.NI eess.SP

    Uncertainty-Aware Hybrid Inference with On-Device Small and Remote Large Language Models

    Authors: Seungeun Oh, Jinhyuk Kim, Jihong Park, Seung-Woo Ko, Tony Q. S. Quek, Seong-Lyun Kim

    Abstract: This paper studies a hybrid language model (HLM) architecture that integrates a small language model (SLM) operating on a mobile device with a large language model (LLM) hosted at the base station (BS) of a wireless network. The HLM token generation process follows the speculative inference principle: the SLM's vocabulary distribution is uploaded to the LLM, which either accepts or rejects it, wit…

    Submitted 18 March, 2025; v1 submitted 17 December, 2024; originally announced December 2024.

    Comments: 7 pages, 6 figures; to be presented at IEEE International Conference on Machine Learning for Communication and Networking (ICMLCN) 2025

  41. arXiv:2412.12215  [pdf, other]

    cs.LG cs.AI cs.SD eess.AS

    Imagined Speech State Classification for Robust Brain-Computer Interface

    Authors: Byung-Kwan Ko, Jun-Young Kim, Seo-Hyun Lee

    Abstract: This study examines the effectiveness of traditional machine learning classifiers versus deep learning models for detecting imagined speech using electroencephalogram data. Specifically, we evaluated conventional machine learning techniques such as CSP-SVM and LDA-SVM classifiers alongside deep learning architectures such as EEGNet, ShallowConvNet, and DeepConvNet. Machine learning classifiers…

    Submitted 15 December, 2024; originally announced December 2024.

  42. Improving Automatic Fetal Biometry Measurement with Swoosh Activation Function

    Authors: Shijia Zhou, Euijoon Ahn, Hao Wang, Ann Quinton, Narelle Kennedy, Pradeeba Sridar, Ralph Nanan, Jinman Kim

    Abstract: The measurements of fetal thalamus diameter (FTD) and fetal head circumference (FHC) are crucial in identifying abnormal fetal thalamus development as it may lead to certain neuropsychiatric disorders in later life. However, manual measurements from 2D-US images are laborious, prone to high inter-observer variability, and complicated by the high signal-to-noise ratio nature of the images. Deep lear…

    Submitted 15 December, 2024; originally announced December 2024.

    Journal ref: MICCAI 2023

  43. arXiv:2412.10820  [pdf, other]

    eess.SY

    Inertia-aware Unit Commitment and Remuneration Methods for Decarbonized Power System

    Authors: HyunJoong Kim, Jip Kim

    Abstract: To maintain frequency stability in decarbonized power systems, inertia services from synchronous generators (SGs) and inverter-based resources must be procured. However, designing an inertia-aware system operation poses significant challenges in considering the variability and uncertainty of renewable energy sources (RES) and adopting a remuneration method for inertia provision due to SG commitmen…

    Submitted 14 December, 2024; originally announced December 2024.

    Comments: 10 pages, 7 figures

  44. arXiv:2412.09403  [pdf]

    physics.optics eess.IV

    Space-time inverse-scattering of translation-based motion

    Authors: Jeongsoo Kim, Shwetadwip Chowdhury

    Abstract: In optical diffraction tomography (ODT), a sample's 3D refractive-index (RI) is often reconstructed after illuminating it from multiple angles, with the assumption that the sample remains static throughout data collection. When the sample undergoes dynamic motion during this data-collection process, significant artifacts and distortions compromise the fidelity of the reconstructed images. In this…

    Submitted 12 December, 2024; originally announced December 2024.

    Comments: 20 pages, 5 figures

  45. arXiv:2412.09109  [pdf, other]

    eess.SY cs.CE math.OC

    evS2CP: Real-time Simultaneous Speed and Charging Planner for Connected Electric Vehicles

    Authors: Minwoo Gwon, Jiwon Kim, Seungjun Yoo, Kwang-Ki K. Kim

    Abstract: This paper presents evS2CP, an optimization-based framework for simultaneous speed and charging planning designed for connected electric vehicles (EVs). With EVs emerging as competitive alternatives to internal combustion engine vehicles, overcoming challenges such as limited charging infrastructure is crucial. evS2CP addresses these issues by minimizing the travel time, charging time, and energy…

    Submitted 12 December, 2024; originally announced December 2024.

    Comments: 15 pages, 9 figures, 2 tables

    MSC Class: 93C85; 49M37; 65K05; 90C29; 68T40; 70B15

    ACM Class: G.1.6; G.1.2; H.5.2; G.4; H.4.3; J.6

  46. arXiv:2412.06624  [pdf, other]

    eess.IV cs.AI cs.CV

    Fundus Image-based Visual Acuity Assessment with PAC-Guarantees

    Authors: Sooyong Jang, Kuk Jin Jang, Hyonyoung Choi, Yong-Seop Han, Seongjin Lee, Jin-hyun Kim, Insup Lee

    Abstract: Timely detection and treatment are essential for maintaining eye health. Visual acuity (VA), which measures the clarity of vision at a distance, is a crucial metric for managing eye health. Machine learning (ML) techniques have been introduced to assist in VA measurement, potentially alleviating clinicians' workloads. However, the inherent uncertainties in ML models make relying solely on them for…

    Submitted 9 December, 2024; originally announced December 2024.

    Comments: To be published in ML4H 2024

  47. arXiv:2411.19486  [pdf, other]

    cs.CV cs.SD eess.AS

    V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow

    Authors: Jeongsoo Choi, Ji-Hoon Kim, Jinyu Li, Joon Son Chung, Shujie Liu

    Abstract: In this paper, we introduce V2SFlow, a novel Video-to-Speech (V2S) framework designed to generate natural and intelligible speech directly from silent talking face videos. While recent V2S systems have shown promising results on constrained datasets with limited speakers and vocabularies, their performance often degrades on real-world, unconstrained datasets due to the inherent variability and com…

    Submitted 29 November, 2024; originally announced November 2024.

  48. arXiv:2411.18086  [pdf, other]

    cs.RO eess.SY

    DMVC-Tracker: Distributed Multi-Agent Trajectory Planning for Target Tracking Using Dynamic Buffered Voronoi and Inter-Visibility Cells

    Authors: Yunwoo Lee, Jungwon Park, H. Jin Kim

    Abstract: This letter presents a distributed trajectory planning method for multi-agent aerial tracking. The proposed method uses a Dynamic Buffered Voronoi Cell (DBVC) and a Dynamic Inter-Visibility Cell (DIVC) to formulate the distributed trajectory generation. Specifically, the DBVC and the DIVC are time-variant spaces that prevent mutual collisions and occlusions among agents, while enabling them to mai…

    Submitted 5 March, 2025; v1 submitted 27 November, 2024; originally announced November 2024.

    Comments: 8 pages, 6 figures

  49. arXiv:2411.17785  [pdf, other]

    eess.SP cs.LG

    New Test-Time Scenario for Biosignal: Concept and Its Approach

    Authors: Yong-Yeon Jo, Byeong Tak Lee, Beom Joon Kim, Jeong-Ho Hong, Hak Seung Lee, Joon-myoung Kwon

    Abstract: Online Test-Time Adaptation (OTTA) enhances model robustness by updating pre-trained models with unlabeled data during testing. In healthcare, OTTA is vital for real-time tasks like predicting blood pressure from biosignals, which demand continuous adaptation. We introduce a new test-time scenario with streams of unlabeled samples and occasional labeled samples. Our framework combines supervised a…

    Submitted 26 November, 2024; originally announced November 2024.

    Comments: Findings paper presented at Machine Learning for Health (ML4H) symposium 2024, December 15-16, 2024, Vancouver, Canada, 6 pages

  50. arXiv:2411.17277  [pdf, other]

    eess.SY

    Minimizing Conservatism in Safety-Critical Control for Input-Delayed Systems via Adaptive Delay Estimation

    Authors: Yitaek Kim, Ersin Das, Jeeseop Kim, Aaron D. Ames, Joel W. Burdick, Christoffer Sloth

    Abstract: Input delays affect systems such as teleoperation and wirelessly autonomous connected vehicles, and may lead to safety violations. One promising way to ensure safety in the presence of delay is to employ control barrier functions (CBFs), and extensions thereof that account for uncertainty: delay adaptive CBFs (DaCBFs). This paper proposes an online adaptive safety control framework for reducing th…

    Submitted 26 November, 2024; originally announced November 2024.

    Comments: This paper has been submitted to ECC 2025 for possible publication