Showing 1–50 of 2,495 results for author: Park, H

  1. arXiv:2506.13295  [pdf, ps, other]

    eess.AS cs.SD

    Instance-Specific Test-Time Training for Speech Editing in the Wild

    Authors: Taewoo Kim, Uijong Lee, Hayoung Park, Choongsang Cho, Nam In Park, Young Han Lee

    Abstract: Speech editing systems aim to naturally modify speech content while preserving acoustic consistency and speaker identity. However, previous studies often struggle to adapt to unseen and diverse acoustic conditions, resulting in degraded editing performance in real-world scenarios. To address this, we propose an instance-specific test-time training method for speech editing in the wild. Our approac…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: Submitted to IEEE Signal Processing Letters

  2. arXiv:2506.12482  [pdf, ps, other]

    cs.AI

    Tiered Agentic Oversight: A Hierarchical Multi-Agent System for AI Safety in Healthcare

    Authors: Yubin Kim, Hyewon Jeong, Chanwoo Park, Eugene Park, Haipeng Zhang, Xin Liu, Hyeonhoon Lee, Daniel McDuff, Marzyeh Ghassemi, Cynthia Breazeal, Samir Tulebaev, Hae Won Park

    Abstract: Current large language models (LLMs), despite their power, can introduce safety risks in clinical settings due to limitations such as poor error detection and single point of failure. To address this, we propose Tiered Agentic Oversight (TAO), a hierarchical multi-agent framework that enhances AI safety through layered, automated supervision. Inspired by clinical hierarchies (e.g., nurse, physicia…

    Submitted 14 June, 2025; originally announced June 2025.

  3. arXiv:2506.12471  [pdf, ps, other]

    eess.IV cs.CV

    Adaptive Multi-resolution Hash-Encoding Framework for INR-based Dental CBCT Reconstruction with Truncated FOV

    Authors: Hyoung Suk Park, Kiwan Jeon

    Abstract: Implicit neural representation (INR), particularly in combination with hash encoding, has recently emerged as a promising approach for computed tomography (CT) image reconstruction. However, directly applying INR techniques to 3D dental cone-beam CT (CBCT) with a truncated field of view (FOV) is challenging. During the training process, if the FOV does not fully encompass the patient's head, a dis…

    Submitted 14 June, 2025; originally announced June 2025.

    Comments: 18 pages, 4 figures

    MSC Class: 68Wxx

  4. arXiv:2506.11920  [pdf, ps, other]

    quant-ph cond-mat.dis-nn

    Nanoscale Magnetic Resonance Imaging and Control of a Strongly Interacting Dipolar System

    Authors: Piotr Put, Nathaniel T. Leitao, Christina Spaegele, Haoyang Gao, Oksana Makarova, Bartholomeus Machielse, Hengyun Zhou, Federico Capasso, Leigh S. Martin, Hongkun Park, Mikhail D. Lukin

    Abstract: Magnetic Resonance Imaging (MRI) is a fundamental tool for physical and life sciences, yet its spatial resolution is typically limited to macroscopic scales. Here, we demonstrate nanoscale MRI by combining strong, time-dependent local magnetic field gradients with coherent control of a dense ensemble of electron spins hosted in atom-like defects in diamond. Using this platform, we generate and man…

    Submitted 13 June, 2025; originally announced June 2025.

  5. arXiv:2506.11329  [pdf, ps, other]

    cs.AR

    A4: Microarchitecture-Aware LLC Management for Datacenter Servers with Emerging I/O Devices

    Authors: Haneul Park, Jiaqi Lou, Sangjin Lee, Yifan Yuan, Kyoung Soo Park, Yongseok Son, Ipoom Jeong, Nam Sung Kim

    Abstract: In modern server CPUs, the Last-Level Cache (LLC) serves not only as a victim cache for higher-level private caches but also as a buffer for low-latency DMA transfers between CPU cores and I/O devices through Direct Cache Access (DCA). However, prior work has shown that high-bandwidth network-I/O devices can rapidly flood the LLC with packets, often causing significant contention with co-running w…

    Submitted 12 June, 2025; originally announced June 2025.

  6. arXiv:2506.11081  [pdf, ps, other]

    cs.CL

    SAGE:Specification-Aware Grammar Extraction for Automated Test Case Generation with LLMs

    Authors: Aditi, Hyunwoo Park, Sicheol Sung, Yo-Sub Han, Sang-Ki Ko

    Abstract: Grammar-based test case generation has proven effective for competitive programming problems, but generating valid and general grammars from natural language specifications remains a key challenge, especially under limited supervision. Context-Free Grammars with Counters (CCFGs) have recently been introduced as a formalism to represent such specifications with logical constraints by storing and re…

    Submitted 4 June, 2025; originally announced June 2025.

  7. arXiv:2506.10567  [pdf, ps, other]

    cs.CV

    LRSLAM: Low-rank Representation of Signed Distance Fields in Dense Visual SLAM System

    Authors: Hongbeen Park, Minjeong Park, Giljoo Nam, Jinkyu Kim

    Abstract: Simultaneous Localization and Mapping (SLAM) has been crucial across various domains, including autonomous driving, mobile robotics, and mixed reality. Dense visual SLAM, leveraging RGB-D camera systems, offers advantages but faces challenges in achieving real-time performance, robustness, and scalability for large-scale scenes. Recent approaches utilizing neural implicit scene representations sho…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: Accepted at ECCV 2024

  8. arXiv:2506.09993  [pdf, other]

    cs.CV cs.AI cs.LG

    Text-Aware Image Restoration with Diffusion Models

    Authors: Jaewon Min, Jin Hyeon Kim, Paul Hyunbin Cho, Jaeeun Lee, Jihye Park, Minkyu Park, Sangpil Kim, Hyunhee Park, Seungryong Kim

    Abstract: Image restoration aims to recover degraded images. However, existing diffusion-based restoration methods, despite great success in natural image restoration, often struggle to faithfully reconstruct textual regions in degraded images. Those methods frequently generate plausible but incorrect text-like patterns, a phenomenon we refer to as text-image hallucination. In this paper, we introduce Text-…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: Project page: https://cvlab-kaist.github.io/TAIR/

  9. arXiv:2506.08660  [pdf, other]

    cs.LG cs.AI

    Towards Robust Real-World Multivariate Time Series Forecasting: A Unified Framework for Dependency, Asynchrony, and Missingness

    Authors: Jinkwan Jang, Hyungjin Park, Jinmyeong Choi, Taesup Kim

    Abstract: Real-world time series data are inherently multivariate, often exhibiting complex inter-channel dependencies. Each channel is typically sampled at its own period and is prone to missing values due to various practical and operational constraints. These characteristics pose fundamental challenges related to channel dependency, sampling asynchrony, and missingness, all of which must be addressed to…

    Submitted 10 June, 2025; originally announced June 2025.

  10. arXiv:2506.08573  [pdf, ps, other]

    q-fin.MF q-fin.PR

    Designing funding rates for perpetual futures in cryptocurrency markets

    Authors: Jaehyun Kim, Hyungbin Park

    Abstract: In cryptocurrency markets, a key challenge for perpetual future issuers is maintaining alignment between the perpetual future price and target value. This study addresses this challenge by exploring the relationship between funding rates and perpetual future prices. Our results demonstrate that by appropriately designing funding rates, the perpetual future price can remain aligned with the target…

    Submitted 10 June, 2025; originally announced June 2025.

  11. arXiv:2506.08476  [pdf, ps, other]

    cond-mat.soft physics.chem-ph

    Bridging Electrostatic Screening and Ion Transport in Lithium Salt-Doped Ionic Liquids

    Authors: Hyungshick Park, Bong June Sung, Jeongmin Kim

    Abstract: Alkali salt-doped ionic liquids are emerging as promising electrolyte systems for energy applications, owing to their excellent interfacial stability. To address their limited ionic conductivity, various strategies have been proposed, including modifying the ion solvation environment and enhancing the transport of selected ions (e.g., Li$^+$). Despite the pivotal role of electrostatic interactions…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 11 pages, 5 figures

  12. arXiv:2506.07879  [pdf, ps, other]

    hep-ex

    Measurement of the CP asymmetry in $D^+ \to π^+ π^0$ decays at Belle II

    Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, S. Alghamdi, M. Alhakami, A. Aloisio, K. Amos, M. Angelsmark, N. Anh Ky, C. Antonioli, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, N. K. Baghel, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, et al. (380 additional authors not shown)

    Abstract: We measure the CP asymmetry in $D^+ \to π^+ π^0$ decays reconstructed in $e^+ e^-$ collisions at the Belle II experiment using a data set corresponding to an integrated luminosity of 428 fb$^{-1}$. A control sample of $D^+ \to π^+ K_{S}$ decays is used to correct for detection and production asymmetries. The result, $A_{CP}(D^+ \to π^+π^0) =(-1.8 \pm 0.9 \pm 0.1)\%$, where the first uncertainty is…

    Submitted 9 June, 2025; originally announced June 2025.

    Report number: Belle II Preprint 2025-012, KEK Preprint 2025-10

  13. arXiv:2506.04054  [pdf, ps, other]

    cs.CV

    Video Deblurring with Deconvolution and Aggregation Networks

    Authors: Giyong Choi, HyunWook Park

    Abstract: In contrast to single-image deblurring, video deblurring has the advantage that neighbor frames can be utilized to deblur a target frame. However, existing video deblurring algorithms often fail to properly employ the neighbor frames, resulting in sub-optimal performance. In this paper, we propose a deconvolution and aggregation network (DAN) for video deblurring that utilizes the information of n…

    Submitted 4 June, 2025; originally announced June 2025.

  14. arXiv:2506.03892  [pdf, ps, other]

    cs.CV

    Joint Video Enhancement with Deblurring, Super-Resolution, and Frame Interpolation Network

    Authors: Giyong Choi, HyunWook Park

    Abstract: Video quality is often severely degraded by multiple factors rather than a single factor. These low-quality videos can be restored to high-quality videos by sequentially performing appropriate video enhancement techniques. However, the sequential approach was inefficient and sub-optimal because most video enhancement approaches were designed without taking into account that multiple factors togeth…

    Submitted 4 June, 2025; originally announced June 2025.

  15. arXiv:2506.02338  [pdf, other]

    cs.CL

    One Missing Piece for Open-Source Reasoning Models: A Dataset to Mitigate Cold-Starting Short CoT LLMs in RL

    Authors: Hyungjoo Chae, Dongjin Kang, Jihyuk Kim, Beong-woo Kwak, Sunghyun Park, Haeju Park, Jinyoung Yeo, Moontae Lee, Kyungjae Lee

    Abstract: With the release of R1, a publicly available large reasoning model (LRM), researchers commonly train new LRMs by training language models on R1's long chain-of-thought (CoT) inferences. While prior works show that LRMs' capabilities can be reproduced through direct distillation, the continued reliance on the existing models (e.g., R1) remains a critical limitation in advancing the field. As a firs…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: ACL 2025 Industry

  16. arXiv:2506.01620  [pdf, ps, other]

    astro-ph.GA

    Exploring the potential for kinematically colder HI component as a tracer for star-forming gas in nearby galaxies

    Authors: Hye-Jin Park, Andrew J. Battisti, Antoine Marchal, Luca Cortese, Emily Wisnioski, Mark Seibert, Shin-Jeong Kim, Naomi McClure-Griffiths, W. J. G. de Blok, Kathryn Grasha, Barry F. Madore, Jeff A. Rich, Rachael L. Beaton

    Abstract: Atomic hydrogen (HI) dominates the mass of the cold interstellar medium, undergoing thermal condensation to form molecular gas and fuel star formation. Kinematically colder HI components, identified via kinematic decomposition of HI 21 cm data cubes, serve as a crucial transition phase between diffuse warm neutral gas and molecular hydrogen (H$_{2}$). We analyse these colder HI components by decom…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: 18 pages, 9 figures, accepted for publication to MNRAS

  17. arXiv:2506.01411  [pdf, ps, other]

    cs.CV cs.AI

    ViTA-PAR: Visual and Textual Attribute Alignment with Attribute Prompting for Pedestrian Attribute Recognition

    Authors: Minjeong Park, Hongbeen Park, Jinkyu Kim

    Abstract: The Pedestrian Attribute Recognition (PAR) task aims to identify various detailed attributes of an individual, such as clothing, accessories, and gender. To enhance PAR performance, a model must capture features ranging from coarse-grained global attributes (e.g., for identifying gender) to fine-grained local details (e.g., for recognizing accessories) that may appear in diverse regions. Recent re…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Accepted to IEEE ICIP 2025

  18. arXiv:2506.00827  [pdf, ps, other]

    cs.CV

    Improving Keystep Recognition in Ego-Video via Dexterous Focus

    Authors: Zachary Chavis, Stephen J. Guy, Hyun Soo Park

    Abstract: In this paper, we address the challenge of understanding human activities from an egocentric perspective. Traditional activity recognition techniques face unique challenges in egocentric videos due to the highly dynamic nature of the head during many activities. We propose a framework that seeks to address these challenges in a way that is independent of network architecture by restricting the ego…

    Submitted 1 June, 2025; originally announced June 2025.

  19. arXiv:2505.24751  [pdf, ps, other]

    cs.RO

    EL-AGHF: Extended Lagrangian Affine Geometric Heat Flow

    Authors: Sangmin Kim, Hae-Won Park

    Abstract: We propose a constrained Affine Geometric Heat Flow (AGHF) method that evolves so as to suppress the dynamics gaps associated with inadmissible control directions. AGHF provides a unified framework applicable to a wide range of motion planning problems, including both holonomic and non-holonomic systems. However, to generate admissible trajectories, it requires assigning infinite penalties to inad…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: 6 pages, 4 figures

  20. arXiv:2505.23026  [pdf, ps, other]

    cs.CL cs.AI

    Context-Robust Knowledge Editing for Language Models

    Authors: Haewon Park, Gyubin Choi, Minjun Kim, Yohan Jo

    Abstract: Knowledge editing (KE) methods offer an efficient way to modify knowledge in large language models. Current KE evaluations typically assess editing success by considering only the edited knowledge without any preceding contexts. In real-world applications, however, preceding contexts often trigger the retrieval of the original knowledge and undermine the intended edit. To address this issue, we de…

    Submitted 31 May, 2025; v1 submitted 28 May, 2025; originally announced May 2025.

    Comments: ACL 2025 Findings. Our code and datasets are available at https://github.com/holi-lab/CoRE

  21. arXiv:2505.23006  [pdf, ps, other]

    cs.CL cs.AI

    A Practical Approach for Building Production-Grade Conversational Agents with Workflow Graphs

    Authors: Chiwan Park, Wonjun Jang, Daeryong Kim, Aelim Ahn, Kichang Yang, Woosung Hwang, Jihyeon Roh, Hyerin Park, Hyosun Wang, Min Seok Kim, Jihoon Kang

    Abstract: The advancement of Large Language Models (LLMs) has led to significant improvements in various service domains, including search, recommendation, and chatbot applications. However, applying state-of-the-art (SOTA) research to industrial settings presents challenges, as it requires maintaining flexible conversational abilities while also strictly complying with service-specific constraints. This ca…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: Accepted to ACL 2025 Industry Track. 12 pages, 5 figures

    ACM Class: I.2.7

  22. arXiv:2505.21757  [pdf, ps, other]

    cs.CL

    BehaviorSFT: Behavioral Token Conditioning for Clinical Agents Across the Proactivity Spectrum

    Authors: Yubin Kim, Zhiyuan Hu, Hyewon Jeong, Eugene Park, Shuyue Stella Li, Chanwoo Park, Shiyun Xiong, MingYu Lu, Hyeonhoon Lee, Xin Liu, Daniel McDuff, Cynthia Breazeal, Samir Tulebaev, Hae Won Park

    Abstract: Large Language Models (LLMs) as clinical agents require careful behavioral adaptation. While adept at reactive tasks (e.g., diagnosis reasoning), LLMs often struggle with proactive engagement, like unprompted identification of critical missing information or risks. We introduce BehaviorBench, a comprehensive dataset to evaluate agent behaviors across a clinical assistance spectrum, ranging from re…

    Submitted 27 May, 2025; originally announced May 2025.

  23. arXiv:2505.21451  [pdf, ps, other]

    cs.CL

    Words Like Knives: Backstory-Personalized Modeling and Detection of Violent Communication

    Authors: Jocelyn Shen, Akhila Yerukola, Xuhui Zhou, Cynthia Breazeal, Maarten Sap, Hae Won Park

    Abstract: Conversational breakdowns in close relationships are deeply shaped by personal histories and emotional context, yet most NLP research treats conflict detection as a general task, overlooking the relational dynamics that influence how messages are perceived. In this work, we leverage nonviolent communication (NVC) theory to evaluate LLMs in detecting conversational breakdowns and assessing how rela…

    Submitted 27 May, 2025; originally announced May 2025.

  24. arXiv:2505.21380  [pdf, ps, other]

    cs.CL

    PHISH in MESH: Korean Adversarial Phonetic Substitution and Phonetic-Semantic Feature Integration Defense

    Authors: Byungjun Kim, Minju Kim, Hyeonchu Park, Bugeun Kim

    Abstract: As malicious users increasingly employ phonetic substitution to evade hate speech detection, researchers have investigated such strategies. However, two key challenges remain. First, existing studies have overlooked the Korean language, despite its vulnerability to phonetic perturbations due to its phonographic nature. Second, prior work has primarily focused on constructing datasets rather than d…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: Under review

  25. arXiv:2505.21127  [pdf, ps, other]

    astro-ph.GA

    The TYPHOON Stellar Population Synthesis Survey. II. Pushing Full Spectral Fitting to the Limit in the Nearby Grand Design Barred Spiral M83

    Authors: Eva Sextl, Rolf-Peter Kudritzki, Fabio Bresolin, Kathryn Grasha, Hye-Jin Park, Qian-Hui Chen, Andrew J. Battisti, Mark Seibert, Barry F. Madore, Jeffrey A. Rich

    Abstract: We apply population synthesis techniques to analyze TYPHOON long slit spectra of the starburst barred spiral galaxy M83. The analysis covers a central square of 5 arcmin side length. We determine the spatial distribution of dust through the analysis of reddening and extinction, together with star formation rates, ages, and metallicities of young and old stellar populations. For the first time, a s…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: 25 pages, 21 figures

  26. arXiv:2505.20609  [pdf, other]

    cs.AI cs.CL

    Comparisons between a Large Language Model-based Real-Time Compound Diagnostic Medical AI Interface and Physicians for Common Internal Medicine Cases using Simulated Patients

    Authors: Hyungjun Park, Chang-Yun Woo, Seungjo Lim, Seunghwan Lim, Keunho Kwak, Ju Young Jeong, Chong Hyun Suh

    Abstract: Objective To develop an LLM based realtime compound diagnostic medical AI interface and performed a clinical trial comparing this interface and physicians for common internal medicine cases based on the United States Medical License Exam (USMLE) Step 2 Clinical Skill (CS) style exams. Methods A nonrandomized clinical trial was conducted on August 20, 2024. We recruited one general physician, two i…

    Submitted 26 May, 2025; originally announced May 2025.

  27. arXiv:2505.19519  [pdf, ps, other]

    cs.CV

    Regularized Personalization of Text-to-Image Diffusion Models without Distributional Drift

    Authors: Gihoon Kim, Hyungjin Park, Taesup Kim

    Abstract: Personalization using text-to-image diffusion models involves adapting a pretrained model to novel subjects with only a few image examples. This task presents a fundamental challenge, as the model must not only learn the new subject effectively but also preserve its ability to generate diverse and coherent outputs across a wide range of prompts. In other words, successful personalization requires…

    Submitted 27 May, 2025; v1 submitted 26 May, 2025; originally announced May 2025.

  28. arXiv:2505.19401  [pdf, ps, other]

    eess.AS

    Stack Less, Repeat More: A Block Reusing Approach for Progressive Speech Enhancement

    Authors: Jangyeon Kim, Ui-Hyeop Shin, Jaehyun Ko, Hyung-Min Park

    Abstract: This paper presents an efficient speech enhancement (SE) approach that reuses a processing block repeatedly instead of conventional stacking. Rather than increasing the number of blocks for learning deep latent representations, repeating a single block leads to progressive refinement while reducing parameter redundancy. We also minimize domain transformation by keeping an encoder and decoder shall…

    Submitted 25 May, 2025; originally announced May 2025.

    Comments: Accepted to Interspeech 2025

  29. arXiv:2505.16351  [pdf, other]

    eess.AS cs.AI

    Dysfluent WFST: A Framework for Zero-Shot Speech Dysfluency Transcription and Detection

    Authors: Chenxu Guo, Jiachen Lian, Xuanru Zhou, Jinming Zhang, Shuhe Li, Zongli Ye, Hwi Joo Park, Anaisha Das, Zoe Ezzes, Jet Vonk, Brittany Morin, Rian Bogley, Lisa Wauters, Zachary Miller, Maria Gorno-Tempini, Gopala Anumanchipalli

    Abstract: Automatic detection of speech dysfluency aids speech-language pathologists in efficient transcription of disordered speech, enhancing diagnostics and treatment planning. Traditional methods, often limited to classification, provide insufficient clinical insight, and text-independent models misclassify dysfluency, especially in context-dependent cases. This work introduces Dysfluent-WFST, a zero-sh…

    Submitted 24 May, 2025; v1 submitted 22 May, 2025; originally announced May 2025.

    Comments: Accepted for Interspeech2025

  30. arXiv:2505.15922  [pdf, ps, other]

    cs.CL

    Aligning Dialogue Agents with Global Feedback via Large Language Model Reward Decomposition

    Authors: Dong Won Lee, Hae Won Park, Cynthia Breazeal, Louis-Philippe Morency

    Abstract: We propose a large language model based reward decomposition framework for aligning dialogue agents using only a single session-level feedback signal. We leverage the reasoning capabilities of a frozen, pretrained large language model (LLM) to infer fine-grained local implicit rewards by decomposing global, session-level feedback. Our first text-only variant prompts the LLM to perform reward decom…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: 9 pages, 3 figures, 3 tables

  31. arXiv:2505.14814  [pdf, ps, other]

    cs.SD cs.CL eess.AS

    GraphemeAug: A Systematic Approach to Synthesized Hard Negative Keyword Spotting Examples

    Authors: Harry Zhang, Kurt Partridge, Pai Zhu, Neng Chen, Hyun Jin Park, Dhruuv Agarwal, Quan Wang

    Abstract: Spoken Keyword Spotting (KWS) is the task of distinguishing between the presence and absence of a keyword in audio. The accuracy of a KWS model hinges on its ability to correctly classify examples close to the keyword and non-keyword boundary. These boundary examples are often scarce in training data, limiting model performance. In this paper, we propose a method to systematically generate adversa…

    Submitted 24 May, 2025; v1 submitted 20 May, 2025; originally announced May 2025.

    Comments: Accepted at Interspeech 2025

  32. arXiv:2505.13577  [pdf, other]

    cs.SD cs.AI eess.AS

    VocalAgent: Large Language Models for Vocal Health Diagnostics with Safety-Aware Evaluation

    Authors: Yubin Kim, Taehan Kim, Wonjune Kang, Eugene Park, Joonsik Yoon, Dongjae Lee, Xin Liu, Daniel McDuff, Hyeonhoon Lee, Cynthia Breazeal, Hae Won Park

    Abstract: Vocal health plays a crucial role in peoples' lives, significantly impacting their communicative abilities and interactions. However, despite the global prevalence of voice disorders, many lack access to convenient diagnosis and treatment. This paper introduces VocalAgent, an audio large language model (LLM) to address these challenges through vocal health diagnosis. We leverage Qwen-Audio-Chat fi…

    Submitted 26 May, 2025; v1 submitted 19 May, 2025; originally announced May 2025.

  33. arXiv:2505.12231  [pdf, ps, other]

    cs.RO

    Design of a 3-DOF Hopping Robot with an Optimized Gearbox: An Intermediate Platform Toward Bipedal Robots

    Authors: JongHun Choe, Gijeong Kim, Hajun Kim, Dongyun Kang, Min-Su Kim, Hae-Won Park

    Abstract: This paper presents a 3-DOF hopping robot with a human-like lower-limb joint configuration and a flat foot, capable of performing dynamic and repetitive jumping motions. To achieve both high torque output and a large hollow shaft diameter for efficient cable routing, a compact 3K compound planetary gearbox was designed using mixed-integer nonlinear programming for gear tooth optimization. To meet…

    Submitted 20 May, 2025; v1 submitted 18 May, 2025; originally announced May 2025.

  34. arXiv:2505.12222  [pdf, other]

    cs.RO

    Learning Impact-Rich Rotational Maneuvers via Centroidal Velocity Rewards and Sim-to-Real Techniques: A One-Leg Hopper Flip Case Study

    Authors: Dongyun Kang, Gijeong Kim, JongHun Choe, Hajun Kim, Hae-Won Park

    Abstract: Dynamic rotational maneuvers, such as front flips, inherently involve large angular momentum generation and intense impact forces, presenting major challenges for reinforcement learning and sim-to-real transfer. In this work, we propose a general framework for learning and deploying impact-rich, rotation-intensive behaviors through centroidal velocity-based rewards and actuator-aware sim-to-real t…

    Submitted 20 May, 2025; v1 submitted 17 May, 2025; originally announced May 2025.

  35. arXiv:2505.12089  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    NTIRE 2025 Challenge on Efficient Burst HDR and Restoration: Datasets, Methods, and Results

    Authors: Sangmin Lee, Eunpil Park, Angel Canelo, Hyunhee Park, Youngjo Kim, Hyung-Ju Chun, Xin Jin, Chongyi Li, Chun-Le Guo, Radu Timofte, Qi Wu, Tianheng Qiu, Yuchun Dong, Shenglin Ding, Guanghua Pan, Weiyu Zhou, Tao Hu, Yixu Feng, Duwei Dai, Yu Cao, Peng Wu, Wei Dong, Yanning Zhang, Qingsen Yan, Simon J. Larsen, et al. (11 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2025 Efficient Burst HDR and Restoration Challenge, which aims to advance efficient multi-frame high dynamic range (HDR) and restoration techniques. The challenge is based on a novel RAW multi-frame fusion dataset, comprising nine noisy and misaligned RAW frames with various exposure levels per scene. Participants were tasked with developing solutions capable of effect…

    Submitted 17 May, 2025; originally announced May 2025.

  36. arXiv:2505.09705  [pdf, other]

    hep-ex

    Search for a dark Higgs boson produced in association with inelastic dark matter at the Belle II experiment

    Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, S. Alghamdi, M. Alhakami, A. Aloisio, N. Althubiti, K. Amos, M. Angelsmark, N. Anh Ky, C. Antonioli, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, N. K. Baghel, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, et al. (415 additional authors not shown)

    Abstract: Inelastic dark matter models that have two dark matter particles and a massive dark photon can reproduce the observed relic dark matter density without violating cosmological limits. The mass splitting between the two dark matter particles $χ_{1}$ and $χ_{2}$, with $m(χ_{2}) > m(χ_{1})$, is induced by a dark Higgs field and a corresponding dark Higgs boson $h^{\prime}$. We present a search for dar…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: Submitted for publication with Physical Review Letters

    Report number: Belle II Preprint 2025-015, KEK Preprint 2025-14

  37. arXiv:2505.08418  [pdf, ps, other]

    hep-ex

    Search for lepton flavor-violating decay modes $B^0 \to K^{\ast 0}τ^\pm\ell^\mp$ ($\ell = e,μ$) with hadronic B-tagging at Belle and Belle II

    Authors: Belle, Belle II Collaborations: I. Adachi, Y. Ahn, H. Aihara, N. Akopov, S. Alghamdi, M. Alhakami, A. Aloisio, K. Amos, M. Angelsmark, N. Anh Ky, C. Antonioli, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, N. K. Baghel, S. Bahinipati, P. Bambade, Sw. Banerjee, et al. (353 additional authors not shown)

    Abstract: We present the results of a search for the charged-lepton-flavor violating decays $B^0 \rightarrow K^{*0}τ^\pm \ell^{\mp}$, where $\ell^{\mp}$ is either an electron or a muon. The results are based on 365 fb$^{-1}$ and 711 fb$^{-1}$ datasets collected with the Belle II and Belle detectors, respectively. We use an exclusive hadronic $B$-tagging technique, and search for a signal decay in the system…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: 19 pages, 4 figures

    Report number: Belle II preprint: 2025-014, KEK preprint: 2025-13

  38. arXiv:2505.05710  [pdf, ps, other]

    cs.CV cs.AI eess.IV

    HyperspectralMAE: The Hyperspectral Imagery Classification Model using Fourier-Encoded Dual-Branch Masked Autoencoder

    Authors: Wooyoung Jeong, Hyun Jae Park, Seonghun Jeong, Jong Wook Jang, Tae Hoon Lim, Dae Seoung Kim

    Abstract: Hyperspectral imagery provides rich spectral detail but poses unique challenges because of its high dimensionality in both spatial and spectral domains. We propose HyperspectralMAE, a Transformer-based foundation model for hyperspectral data that employs a dual masking strategy: during pre-training we randomly occlude 50% of spatial patches and 50% of spectral bands. This force…

    Submitted 8 May, 2025; originally announced May 2025.

  39. arXiv:2505.05068  [pdf]

    cond-mat.str-el cond-mat.supr-con

    Orbital-Selective Quasiparticle Depletion across the Density Wave Transition in Trilayer Nickelate La$_4$Ni$_3$O$_{10}$

    Authors: Dong-Hyeon Gim, Chung Ha Park, Kee Hoon Kim

    Abstract: We investigate the evolution of polarized electronic Raman response in trilayer nickelate La$_4$Ni$_3$O$_{10}$, uncovering a systematic reduction of the incoherent electron continuum across the density wave transition in the $A_{1g}$ and $B_{1g}$ representations. Analysis based on the Fermi surface band curvatures points to quasiparticle coherence in momentum positions with dominant $d_{x^2-y^2}$…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: (Main text) 12 pages, 4 figures. (Supplemental materials) 8 pages, 5 figures

  40. SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer

    Authors: Young-Hu Park, Rae-Hong Park, Hyung-Min Park

    Abstract: This paper presents an efficient visual speech encoder for lip reading. While most recent lip reading studies have been based on the ResNet architecture and have achieved significant success, they are not sufficiently suitable for efficiently capturing lip reading features due to high computational complexity in modeling spatio-temporal information. Additionally, using a complex visual model not o…

    Submitted 7 May, 2025; originally announced May 2025.

    Journal ref: Neurocomputing, Volume 639, 28 July 2025, 130289

  41. arXiv:2505.03777  [pdf, other]

    cs.LG

    MolMole: Molecule Mining from Scientific Literature

    Authors: LG AI Research, Sehyun Chun, Jiye Kim, Ahra Jo, Yeonsik Jo, Seungyul Oh, Seungjun Lee, Kwangrok Ryoo, Jongmin Lee, Seung Hwan Kim, Byung Jun Kang, Soonyoung Lee, Jun Ha Park, Chanwoo Moon, Jiwon Ham, Haein Lee, Heejae Han, Jaeseung Byun, Soojong Do, Minju Ha, Dongyun Kim, Kyunghoon Bae, Woohyung Lim, Edward Hwayoung Lee, Yongmin Park, et al. (9 additional authors not shown)

    Abstract: The extraction of molecular structures and reaction data from scientific documents is challenging due to their varied, unstructured chemical formats and complex document layouts. To address this, we introduce MolMole, a vision-based deep learning framework that unifies molecule detection, reaction diagram parsing, and optical chemical structure recognition (OCSR) into a single pipeline for automat…

    Submitted 7 May, 2025; v1 submitted 30 April, 2025; originally announced May 2025.

    Comments: 15 pages, 12 figures

  42. arXiv:2505.03306  [pdf]

    quant-ph physics.app-ph

    Magnetic-field dependent VB- spin decoherence in hexagonal boron nitrides: A first-principles study

    Authors: Jaewook Lee, Hyeonsu Kim, Huijin Park, Hosung Seo

    Abstract: The negatively charged boron vacancy (VB-) in h-BN operates as an optically addressable spin qubit in two-dimensional materials. To further advance the spin into a versatile qubit platform, it is imperative to understand its spin decoherence precisely, which is currently one of the major limiting factors for the VB- spin. In this study, we employ a first-principles quantum many-body simulation to…

    Submitted 8 May, 2025; v1 submitted 6 May, 2025; originally announced May 2025.

    Comments: 23 pages, 6 figures

  43. arXiv:2505.02912  [pdf, other]

    hep-ex

    Measurement of the time-integrated $CP$ asymmetry in $D^0 \to \pi^0 \pi^0$ decays at Belle II

    Authors: Belle II Collaboration, I. Adachi, Y. Ahn, N. Akopov, S. Alghamdi, M. Alhakami, A. Aloisio, N. Althubiti, K. Amos, M. Angelsmark, N. Anh Ky, C. Antonioli, D. M. Asner, H. Atmacan, T. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, N. K. Baghel, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, M. Bartl, et al. (350 additional authors not shown)

    Abstract: We measure the time-integrated $CP$ asymmetry, $A_{CP}$, in $D^0 \to \pi^0 \pi^0$ decays reconstructed in $e^+e^-\to c\bar{c}$ events collected by Belle II during 2019--2022. The data corresponds to an integrated luminosity of 428 $\mathrm{fb}^{-1}$. The $D^0$ decays are required to originate from the flavor-conserving $D^{*+} \to D^0 \pi^+$ decay to determine the charm flavor at production time. Control sa…

    Submitted 5 May, 2025; originally announced May 2025.

    Report number: Belle II Preprint 2025-009, KEK Preprint 2025-7

  44. arXiv:2505.02908  [pdf, other]

    astro-ph.HE astro-ph.GA astro-ph.SR

    Early Shock-Cooling Observations and Progenitor Constraints of Type IIb SN 2024uwq

    Authors: Bhagya M. Subrayan, David J. Sand, K. Azalee Bostroem, Saurabh W. Jha, Aravind P. Ravi, Michaela Schwab, Jennifer E. Andrews, Griffin Hosseinzadeh, Stefano Valenti, Yize Dong, Jeniveve Pearson, Manisha Shrestha, Lindsey A. Kwok, Emily Hoang, Jeonghee Rho, Seong Hyun Park, Sung-Chul Yoon, T. R. Geball, Joshua Haislip, Daryl Janzen, Vladimir Kouprianov, Darshana Mehta, Nicolás Meza Retamal, Daniel E. Reichart, Moira Andrews, et al. (4 additional authors not shown)

    Abstract: We present early multi-wavelength photometric and spectroscopic observations of the Type IIb supernova SN 2024uwq, capturing its shock-cooling emission phase and double-peaked light curve evolution. Early spectra reveal broad H-alpha (v ~ 15,500 km s$^{-1}$) and He I P-Cygni profiles of similar strengths. Over time the He I lines increase in strength while the H-alpha decreases, consistent with a…

    Submitted 5 May, 2025; originally announced May 2025.

    Comments: 22 pages, 11 figures, Submitted to ApJL

  45. arXiv:2505.01737  [pdf, other]

    cs.CV

    Learning Multi-frame and Monocular Prior for Estimating Geometry in Dynamic Scenes

    Authors: Seong Hyeon Park, Jinwoo Shin

    Abstract: In monocular videos that capture dynamic scenes, estimating the 3D geometry of video contents has been a fundamental challenge in computer vision. Specifically, the task is significantly challenged by the object motion, where existing models are limited to predict only partial attributes of the dynamic scenes, such as depth or pointmaps spanning only over a pair of frames. Since these attributes a…

    Submitted 3 May, 2025; originally announced May 2025.

  46. arXiv:2505.00735  [pdf, other]

    eess.IV cs.CV

    Leveraging Depth Maps and Attention Mechanisms for Enhanced Image Inpainting

    Authors: Jin Hyun Park, Harine Choi, Praewa Pitiphat

    Abstract: Existing deep learning-based image inpainting methods typically rely on convolutional networks with RGB images to reconstruct images. However, relying exclusively on RGB images may neglect important depth information, which plays a critical role in understanding the spatial and structural context of a scene. Just as human vision leverages stereo cues to perceive depth, incorporating depth maps int…

    Submitted 8 May, 2025; v1 submitted 29 April, 2025; originally announced May 2025.

  47. arXiv:2505.00260  [pdf, other]

    quant-ph cond-mat.mes-hall cond-mat.mtrl-sci

    Wideband covariance magnetometry below the diffraction limit

    Authors: Xuan Hoang Le, Pavel E. Dolgirev, Piotr Put, Eric L. Peterson, Arjun Pillai, Alexander A. Zibrov, Eugene Demler, Hongkun Park, Mikhail D. Lukin

    Abstract: We experimentally demonstrate a method for measuring correlations of wideband magnetic signals with spatial resolution below the optical diffraction limit. Our technique employs two nitrogen-vacancy (NV) centers in diamond as nanoscale magnetometers, spectrally resolved by inhomogeneous optical transitions. Using high-fidelity optical readout and long spin coherence time, we probe correlated MHz-r…

    Submitted 30 April, 2025; originally announced May 2025.

    Comments: 20 pages, 14 figures

  48. arXiv:2504.21784  [pdf, other]

    math.NA

    A Comparison of the Consistent and Independent Second Moment Methods Applied to Thermal Radiative Transfer

    Authors: Samuel Olivier, James S. Warsa, HyeongKae Park

    Abstract: The design of efficient numerical methods for modeling thermal radiative transfer (TRT) is challenging due to the stiff, nonlinear coupling between radiation and material energies, especially at the time scales of interest in high energy density physics and astrophysics. Here, we investigate the use of the Second Moment Method (SMM) to accelerate absorption-emission within the context of the multi…

    Submitted 30 April, 2025; originally announced April 2025.

  49. arXiv:2504.21340  [pdf, other]

    cs.CV cs.LG

    Towards Improved Cervical Cancer Screening: Vision Transformer-Based Classification and Interpretability

    Authors: Khoa Tuan Nguyen, Ho-min Park, Gaeun Oh, Joris Vankerschaver, Wesley De Neve

    Abstract: We propose a novel approach to cervical cell image classification for cervical cancer screening using the EVA-02 transformer model. We developed a four-step pipeline: fine-tuning EVA-02, feature extraction, selecting important features through multiple machine learning models, and training a new artificial neural network with optional loss weighting for improved generalization. With this design, o…

    Submitted 30 April, 2025; originally announced April 2025.

    Comments: Accepted at ISBI 2025 "Challenge 2: Pap Smear Cell Classification Challenge"

  50. Multi-Sensor Fusion for Quadruped Robot State Estimation using Invariant Filtering and Smoothing

    Authors: Ylenia Nisticò, Hajun Kim, João Carlos Virgolino Soares, Geoff Fink, Hae-Won Park, Claudio Semini

    Abstract: This letter introduces two multi-sensor state estimation frameworks for quadruped robots, built on the Invariant Extended Kalman Filter (InEKF) and Invariant Smoother (IS). The proposed methods, named E-InEKF and E-IS, fuse kinematics, IMU, LiDAR, and GPS data to mitigate position drift, particularly along the z-axis, a common issue in proprioceptive-based approaches. We derived observation models…

    Submitted 29 April, 2025; originally announced April 2025.

    Comments: Accepted for publication in IEEE Robotics and Automation Letters