-
NTIRE 2025 Challenge on Real-World Face Restoration: Methods and Results
Authors:
Zheng Chen,
Jingkai Wang,
Kai Liu,
Jue Gong,
Lei Sun,
Zongwei Wu,
Radu Timofte,
Yulun Zhang,
Jianxing Zhang,
Jinlong Wu,
Jun Wang,
Zheng Xie,
Hakjae Jeon,
Suejin Han,
Hyung-Ju Chun,
Hyunhee Park,
Zhicun Yin,
Junjie Chen,
Ming Liu,
Xiaoming Li,
Chao Zhou,
Wangmeng Zuo,
Weixia Zhang,
Dingquan Li,
Kede Ma
, et al. (29 additional authors not shown)
Abstract:
This paper provides a review of the NTIRE 2025 challenge on real-world face restoration, highlighting the proposed solutions and the resulting outcomes. The challenge focuses on generating natural, realistic outputs while maintaining identity consistency. Its goal is to advance state-of-the-art solutions for perceptual quality and realism, without imposing constraints on computational resources or training data. The track of the challenge evaluates performance using a weighted image quality assessment (IQA) score and employs the AdaFace model as an identity checker. The competition attracted 141 registrants, with 13 teams submitting valid models, and ultimately, 10 teams achieved a valid score in the final ranking. This collaborative effort advances the performance of real-world face restoration while offering an in-depth overview of the latest trends in the field.
Submitted 20 April, 2025;
originally announced April 2025.
-
NTIRE 2025 Challenge on Image Super-Resolution ($\times$4): Methods and Results
Authors:
Zheng Chen,
Kai Liu,
Jue Gong,
Jingkai Wang,
Lei Sun,
Zongwei Wu,
Radu Timofte,
Yulun Zhang,
Xiangyu Kong,
Xiaoxuan Yu,
Hyunhee Park,
Suejin Han,
Hakjae Jeon,
Dafeng Zhang,
Hyung-Ju Chun,
Donghun Ryou,
Inju Ha,
Bohyung Han,
Lu Zhao,
Yuyi Zhang,
Pengyu Yan,
Jiawei Hu,
Pengwei Liu,
Fengjun Guo,
Hongyuan Yu
, et al. (86 additional authors not shown)
Abstract:
This paper presents the NTIRE 2025 image super-resolution ($\times$4) challenge, one of the associated competitions of the 10th NTIRE Workshop at CVPR 2025. The challenge aims to recover high-resolution (HR) images from low-resolution (LR) counterparts generated through bicubic downsampling with a $\times$4 scaling factor. The objective is to develop effective network designs or solutions that achieve state-of-the-art SR performance. To reflect the dual objectives of image SR research, the challenge includes two sub-tracks: (1) a restoration track, which emphasizes pixel-wise accuracy and ranks submissions based on PSNR; and (2) a perceptual track, which focuses on visual realism and ranks results by a perceptual score. A total of 286 participants registered for the competition, with 25 teams submitting valid entries. This report summarizes the challenge design, datasets, evaluation protocol, main results, and methods of each team. The challenge serves as a benchmark to advance the state of the art and foster progress in image SR.
Submitted 28 April, 2025; v1 submitted 20 April, 2025;
originally announced April 2025.
-
The Human Robot Social Interaction (HSRI) Dataset: Benchmarking Foundational Models' Social Reasoning
Authors:
Dong Won Lee,
Yubin Kim,
Denison Guvenoz,
Sooyeon Jeong,
Parker Malachowsky,
Louis-Philippe Morency,
Cynthia Breazeal,
Hae Won Park
Abstract:
Our work aims to advance the social reasoning of embodied artificial intelligence (AI) agents in real-world social interactions. Recently, language models (LMs) and foundational models (FMs) have been used as automatic evaluators of human-AI interactions, with the goal of eventually improving the policy of the AI agent. To enable further research in this direction, we introduce a large-scale real-world Human Robot Social Interaction (HSRI) Dataset to benchmark the capabilities of LMs and FMs to identify and reason about social interactions, specifically with regard to robot social errors and competencies. Our dataset consists of 400 real-world human social robot interaction videos and over 10K annotations, detailing the robot's social errors, competencies, rationale, and corrective actions, capturing unique aspects of human-AI interaction only present in real-world interactions. To further assess AI models' ability to reason about social interactions, we propose eight new benchmark tasks centered around whether AI models can (1) evaluate social interactions via detecting social errors and competencies, (2) identify the explanatory factors associated with errors and competencies, (3) understand the flow of real-world social interactions, and (4) provide reasons and corrective actions for social errors. Human studies and experiments with modern LMs and FMs reveal that current models struggle with these tasks, demonstrating that our dataset and benchmark provide a step forward towards socially intelligent AI.
Submitted 7 April, 2025;
originally announced April 2025.
-
DuoLoRA: Cycle-consistent and Rank-disentangled Content-Style Personalization
Authors:
Aniket Roy,
Shubhankar Borse,
Shreya Kadambi,
Debasmit Das,
Shweta Mahajan,
Risheek Garrepalli,
Hyojin Park,
Ankita Nayak,
Rama Chellappa,
Munawar Hayat,
Fatih Porikli
Abstract:
We tackle the challenge of jointly personalizing content and style from a few examples. A promising approach is to train separate Low-Rank Adapters (LoRA) and merge them effectively, preserving both content and style. Existing methods, such as ZipLoRA, treat content and style as independent entities, merging them by learning masks in LoRA's output dimensions. However, content and style are intertwined, not independent. To address this, we propose DuoLoRA, a content-style personalization framework featuring three key components: (i) rank-dimension mask learning, (ii) effective merging via layer priors, and (iii) Constyle loss, which leverages cycle-consistency in the merging process. First, we introduce ZipRank, which performs content-style merging within the rank dimension, offering adaptive rank flexibility and significantly reducing the number of learnable parameters. Additionally, we incorporate SDXL layer priors to apply implicit rank constraints informed by each layer's content-style bias and adaptive merger initialization, enhancing the integration of content and style. To further refine the merging process, we introduce Constyle loss, which leverages the cycle-consistency between content and style. Our experimental results demonstrate that DuoLoRA outperforms state-of-the-art content-style merging methods across multiple benchmarks.
Submitted 15 April, 2025;
originally announced April 2025.
-
The Tenth NTIRE 2025 Image Denoising Challenge Report
Authors:
Lei Sun,
Hang Guo,
Bin Ren,
Luc Van Gool,
Radu Timofte,
Yawei Li,
Xiangyu Kong,
Hyunhee Park,
Xiaoxuan Yu,
Suejin Han,
Hakjae Jeon,
Jia Li,
Hyung-Ju Chun,
Donghun Ryou,
Inju Ha,
Bohyung Han,
Jingyu Ma,
Zhijuan Huang,
Huiyuan Fu,
Hongyuan Yu,
Boqi Zhang,
Jiawei Shi,
Heng Zhang,
Huadong Ma,
Deepak Kumar Tyagi
, et al. (69 additional authors not shown)
Abstract:
This paper presents an overview of the NTIRE 2025 Image Denoising Challenge (σ = 50), highlighting the proposed methodologies and corresponding results. The primary objective is to develop a network architecture capable of achieving high-quality denoising performance, quantitatively evaluated using PSNR, without constraints on computational complexity or model size. The task assumes independent additive white Gaussian noise (AWGN) with a fixed noise level of 50. A total of 290 participants registered for the challenge, with 20 teams successfully submitting valid results, providing insights into the current state-of-the-art in image denoising.
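As a rough illustration of the setting described above, the sketch below adds AWGN with a noise level of 50 to a toy signal and scores it with PSNR. The 0-255 intensity scale, the absence of clipping, and the names `add_awgn` and `psnr` are assumptions made for this example and are not taken from the challenge's evaluation code.

```python
import math
import random


def add_awgn(pixels, sigma=50.0, seed=0):
    """Add i.i.d. zero-mean Gaussian noise with standard deviation `sigma`.

    Assumes intensities on a 0-255 scale; values are not clipped here.
    """
    rng = random.Random(seed)
    return [p + rng.gauss(0.0, sigma) for p in pixels]


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical flat gray "image", flattened to a list of pixels.
clean = [128.0] * 10000
noisy = add_awgn(clean, sigma=50.0)
print(f"PSNR of the noisy input: {psnr(clean, noisy):.2f} dB")
```

Under these assumptions the unprocessed noisy input scores roughly 10·log10(255²/50²) ≈ 14 dB, which gives a sense of the baseline a denoiser has to improve on.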
Submitted 16 April, 2025;
originally announced April 2025.
-
Balancing Graph Embedding Smoothness in Self-Supervised Learning via Information-Theoretic Decomposition
Authors:
Heesoo Jung,
Hogun Park
Abstract:
Self-supervised learning (SSL) in graphs has garnered significant attention, particularly in employing Graph Neural Networks (GNNs) with pretext tasks initially designed for other domains, such as contrastive learning and feature reconstruction. However, it remains uncertain whether these methods effectively reflect essential graph properties, specifically the similarity of a node's representation to those of its neighbors. We observe that existing methods sit at opposite ends of a spectrum driven by the graph embedding smoothness, with each end corresponding to outperformance on specific downstream tasks. Decomposing the SSL objective into three terms via an information-theoretic framework with a neighbor representation variable reveals that this polarization stems from an imbalance among the terms, which existing methods may not effectively maintain. Further insights suggest that balancing between the extremes can lead to improved performance across a wider range of downstream tasks. A framework, BSG (Balancing Smoothness in Graph SSL), introduces novel loss functions designed to supplement the representation quality in graph-based SSL by balancing the three derived terms: neighbor loss, minimal loss, and divergence loss. We present a theoretical analysis of the effects of these loss functions, highlighting their significance from both the SSL and graph smoothness perspectives. Extensive experiments on multiple real-world datasets across node classification and link prediction consistently demonstrate that BSG achieves state-of-the-art performance, outperforming existing methods. Our implementation code is available at https://github.com/steve30572/BSG.
Submitted 16 April, 2025;
originally announced April 2025.
-
Voice Conversion with Diverse Intonation using Conditional Variational Auto-Encoder
Authors:
Soobin Suh,
Dabi Ahn,
Heewoong Park,
Jonghun Park
Abstract:
Voice conversion is the task of synthesizing an utterance with a target speaker's voice while maintaining the linguistic information of the source utterance. While a speaker can produce varying utterances from a single script with different intonations, conventional voice conversion models were limited to producing only one result per source input. To overcome this limitation, we propose a novel approach for voice conversion with diverse intonations using a conditional variational autoencoder (CVAE). Experiments have shown that the speaker's style feature can be mapped into a latent space with a Gaussian distribution. We have also been able to convert voices with more diverse intonation by making the posterior of the latent space more complex with inverse autoregressive flow (IAF). As a result, the converted voice not only has a diversity of intonations, but also has better sound quality than the model without CVAE.
Submitted 16 April, 2025;
originally announced April 2025.
-
Non-orientable Exceptional Points in Twisted Boundary Systems
Authors:
Jung-Wan Ryu,
Jae-Ho Han,
Moon Jip Park,
Hee Chul Park,
Chang-Hwan Yi
Abstract:
Non-orientable manifolds, such as the Möbius strip and the Klein bottle, defy conventional geometric intuition through their twisted boundary conditions. As a result, topological defects on non-orientable manifolds give rise to novel physical phenomena. We study the adiabatic transport of exceptional points (EPs) along non-orientable closed loops and uncover distinct topological responses arising from the lack of global orientation. Notably, we demonstrate that the cyclic permutation of eigenstates across an EP depends sensitively on the loop orientation, yielding inequivalent braid representations for clockwise and counterclockwise encirclement; this is a feature unique to non-orientable geometries. Orientation-dependent geometric quantities, such as the winding number, cannot be consistently defined due to the absence of a global orientation. However, when a boundary is introduced, such quantities become well defined within the local interior, even though the global manifold remains non-orientable. We further demonstrate the adiabatic evolution of EPs and the emergence of orientation-sensitive observables in a Klein Brillouin zone, described by an effective non-Hermitian Hamiltonian that preserves momentum-space glide symmetry. Finally, we numerically implement these ideas in a microdisk cavity with embedded scatterers using synthetic momenta.
Submitted 16 April, 2025;
originally announced April 2025.
-
Making Acoustic Side-Channel Attacks on Noisy Keyboards Viable with LLM-Assisted Spectrograms' "Typo" Correction
Authors:
Seyyed Ali Ayati,
Jin Hyun Park,
Yichen Cai,
Marcus Botacin
Abstract:
The widespread integration of microphones into devices increases the opportunities for Acoustic Side-Channel Attacks (ASCAs), as these can be used to capture keystrokes' audio signals that might reveal sensitive information. However, the current State-Of-The-Art (SOTA) models for ASCAs, including Convolutional Neural Networks (CNNs) and hybrid models, such as CoAtNet, still exhibit limited robustness under realistic noisy conditions. Solving this problem requires either: (i) an increase in the model's capacity to infer contextual information from longer sequences, allowing the model to learn that an initially noisily typed word is the same as a non-noisy word collected later, or (ii) an approach to fix misidentified information from the contexts, as one does not type random words, but the ones that best fit the conversation context. In this paper, we demonstrate that both strategies are viable and complementary solutions for making ASCAs practical. We observed that no existing solution leverages advanced transformer architectures' power for these tasks and propose that: (i) Visual Transformers (VTs) are the candidate solutions for capturing long-term contextual information and (ii) transformer-powered Large Language Models (LLMs) are the candidate solutions to fix the "typos" (mispredictions) the model might make. Thus, we here present the first-of-its-kind approach that integrates VTs and LLMs for ASCAs.
We first show that VTs achieve SOTA performance in classifying keystrokes when compared to the previous CNN benchmark. Second, we demonstrate that LLMs can mitigate the impact of real-world noise. Evaluations on the natural sentences revealed that: (i) incorporating LLMs (e.g., GPT-4o) in our ASCA pipeline boosts the performance of error-correction tasks; and (ii) comparable performance can be attained by a lightweight, fine-tuned smaller LLM (67 times smaller than GPT-4o), using...
Submitted 15 April, 2025;
originally announced April 2025.
-
Test of lepton flavor universality with measurements of $R(D^{+})$ and $R(D^{*+})$ using semileptonic $B$ tagging at the Belle II experiment
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
S. Alghamdi,
M. Alhakami,
A. Aloisio,
N. Althubiti,
K. Amos,
M. Angelsmark,
N. Anh Ky,
C. Antonioli,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
N. K. Baghel,
S. Bahinipati
, et al. (428 additional authors not shown)
Abstract:
We report measurements of the ratios of branching fractions $\mathcal{R}(D^{(*)+}) = \mathcal{B}(\overline{B}{}^0 \to D^{(*)+} \,τ^- \, \overline{ν}_τ) / \mathcal{B}(\overline{B}{}^0 \to D^{(*)+} \, \ell^- \, \overline{ν}_\ell)$, where $\ell$ denotes either an electron or a muon. These ratios test the universality of the charged-current weak interaction. The results are based on a $365\, \mathrm{fb}^{-1}$ data sample collected with the Belle II detector at the SuperKEKB $e^+e^-$ collider, which operates at a center-of-mass energy corresponding to the $Υ(4S)$ resonance, just above the threshold for $B\overline{B}{}$ production. Signal candidates are reconstructed by selecting events in which the companion $B$ meson from the $Υ(4S) \to B\overline{B}{}$ decay is identified in semileptonic modes. The $τ$ lepton is reconstructed via its leptonic decays. We obtain $\mathcal{R}(D^+) = 0.418 \pm 0.074 ~({\mathrm{stat}}) \pm 0.051 ~({\mathrm{syst}})$ and $\mathcal{R}(D^{*+}) = 0.306 \pm 0.034 ~({\mathrm{stat}}) \pm 0.018 ~({\mathrm{syst}})$, which are consistent with world average values. Accounting for the correlation between them, these values differ from the Standard Model expectation by a collective significance of $1.7$ standard deviations.
Submitted 15 April, 2025;
originally announced April 2025.
-
Search for $B^0 \to K^{\ast 0} τ^+ τ^-$ decays at the Belle II experiment
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
M. Alhakami,
A. Aloisio,
N. Althubiti,
M. Angelsmark,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
N. K. Baghel,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett
, et al. (424 additional authors not shown)
Abstract:
We present a search for the rare flavor-changing neutral-current decay $B^0 \to K^{\ast 0} τ^+ τ^-$ with data collected by the Belle II experiment at the SuperKEKB electron-positron collider. The analysis uses a 365 fb$^{-1}$ data sample recorded at the center-of-mass energy of the $Υ(4S)$ resonance. One of the $B$ mesons produced in the $Υ(4S)\to B^0 \bar{B}^0$ process is fully reconstructed in a hadronic decay mode, while its companion $B$ meson is required to decay into a $K^{\ast 0}$ and two $τ$ leptons of opposite charge. The $τ$ leptons are reconstructed in final states with a single electron, muon, charged pion or charged $ρ$ meson, and additional neutrinos. We set an upper limit on the branching ratio of $BR(B^0 \to K^{\ast 0} τ^+ τ^-) < 1.8 \times 10^{-3}$ at the 90% confidence level, which is the most stringent constraint reported to date.
Submitted 14 April, 2025;
originally announced April 2025.
-
PreCi: Pretraining and Continual Improvement of Humanoid Locomotion via Model-Assumption-Based Regularization
Authors:
Hyunyoung Jung,
Zhaoyuan Gu,
Ye Zhao,
Hae-Won Park,
Sehoon Ha
Abstract:
Humanoid locomotion is a challenging task due to its inherent complexity and high-dimensional dynamics, as well as the need to adapt to diverse and unpredictable environments. In this work, we introduce a novel learning framework for effectively training a humanoid locomotion policy that imitates the behavior of a model-based controller while extending its capabilities to handle more complex locomotion tasks, such as more challenging terrain and higher velocity commands. Our framework consists of three key components: pre-training through imitation of the model-based controller, fine-tuning via reinforcement learning, and model-assumption-based regularization (MAR) during fine-tuning. In particular, MAR aligns the policy with actions from the model-based controller only in states where the model assumption holds to prevent catastrophic forgetting. We evaluate the proposed framework through comprehensive simulation tests and hardware experiments on a full-size humanoid robot, Digit, demonstrating a forward speed of 1.5 m/s and robust locomotion across diverse terrains, including slippery, sloped, uneven, and sandy terrains.
Submitted 13 April, 2025;
originally announced April 2025.
-
Is Earendel a Star?: Investigating the Sunrise Arc Using JWST Strong and Weak Gravitational Lensing Analyses
Authors:
Zachary P. Scofield,
M. James Jee,
Sangjun Cha,
Hyosun Park
Abstract:
The galaxy cluster WHL J013719.8-08284 at $z = 0.566$ exhibits a strong-lensing feature known as the Sunrise Arc, which hosts the candidate star Earendel at $z \approx 6.2$, the most distant star candidate observed to date. If this object is a star, or a system of a few stars, its apparent magnitude implies both extreme gravitational lensing magnification and unusually high luminosity. This study revisits Earendel's magnification, which, in previous literature, exhibits significant uncertainty across various lens models ($2μ= 4{,}000$-$35{,}000$). We present an improved cluster mass reconstruction and a tighter constraint on Earendel's magnification using a joint strong- and weak-lensing analysis with JWST data. Our strong-lensing mass model, incorporating newly identified multiple-image systems from JWST imaging data and modifying the existing multiple-image assignment scheme, produces a root-mean-square (RMS) lens-plane scatter of less than $0.''3$. Additionally, our weak-lensing catalog achieves a source density of $\sim 100$ galaxies arcmin$^{-2}$, providing constraints on the mass profile beyond the strong-lensing regime. In our best-fit model, we estimate the magnification of Earendel to be $μ= 43$-$67$, significantly lower than previously proposed and thus calling into question its classification as a star.
Submitted 11 April, 2025;
originally announced April 2025.
-
Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images
Authors:
In-Hwan Jin,
Haesoo Choo,
Seong-Hun Jeong,
Heemoon Park,
Junghwan Kim,
Oh-joon Kwon,
Kyeongbo Kong
Abstract:
To achieve realistic immersion in landscape images, fluids such as water and clouds need to move within the image while revealing new scenes from various camera perspectives. Recently, a field called dynamic scene video has emerged, which combines single image animation with 3D photography. These methods use pseudo 3D space, implicitly represented with Layered Depth Images (LDIs). LDIs separate a single image into depth-based layers, which enables elements like water and clouds to move within the image while revealing new scenes from different camera perspectives. However, landscapes typically consist of continuous elements, including fluids, so a representation that separates a landscape image into discrete layers can lead to diminished depth perception and potential distortions depending on camera movement. Furthermore, due to its implicit modeling of 3D space, the output may be limited to videos in the 2D domain, potentially reducing their versatility. In this paper, we propose representing a complete 3D space for dynamic scene video by modeling explicit representations, specifically 4D Gaussians, from a single image. The framework is focused on optimizing 3D Gaussians by generating multi-view images from a single image and creating 3D motion to optimize 4D Gaussians. The most important part of the proposed framework is consistent 3D motion estimation, which estimates common motion among multi-view images to bring the motion in 3D space closer to actual motions. As far as we know, this is the first attempt that considers animation while representing a complete 3D space from a single landscape image. Our model demonstrates the ability to provide realistic immersion in various landscape images through diverse experiments and metrics. Extensive experimental results are available at https://cvsp-lab.github.io/ICLR2025_3D-MOM/.
Submitted 4 April, 2025;
originally announced April 2025.
-
DropGaussian: Structural Regularization for Sparse-view Gaussian Splatting
Authors:
Hyunwoo Park,
Gun Ryu,
Wonjun Kim
Abstract:
Recently, 3D Gaussian splatting (3DGS) has gained considerable attention in the field of novel view synthesis due to its fast performance and excellent image quality. However, 3DGS in sparse-view settings (e.g., three-view inputs) often faces the problem of overfitting to training views, which significantly degrades the visual quality of novel view images. Many existing approaches have tackled this issue by using strong priors, such as 2D generative contextual information and external depth signals. In contrast, this paper introduces a prior-free method, called DropGaussian, with simple changes to 3D Gaussian splatting. Specifically, we randomly remove Gaussians during the training process in a manner similar to dropout, which allows non-excluded Gaussians to have larger gradients while improving their visibility. This makes the remaining Gaussians contribute more to the optimization process for rendering with sparse input views. Such a simple operation effectively alleviates the overfitting problem and enhances the quality of novel view synthesis. By simply applying DropGaussian to the original 3DGS framework, we can achieve competitive performance with existing prior-based 3DGS methods in sparse-view settings of benchmark datasets without any additional complexity. The code and model are publicly available at: https://github.com/DCVL-3D/DropGaussian_release.
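The dropout-style removal described above might look roughly like the following minimal sketch, which operates on a plain list of per-Gaussian opacities. The function name `drop_gaussians`, the default `drop_rate`, and the 1/(1 - drop_rate) rescaling of the surviving opacities (borrowed from standard inverted dropout) are assumptions for illustration; the abstract does not specify these details, and the released code may differ.

```python
import random


def drop_gaussians(opacities, drop_rate=0.1, training=True, rng=None):
    """Randomly zero out a fraction of Gaussians for one training iteration.

    Hypothetical sketch: a dropped Gaussian gets opacity 0 so it does not
    contribute to rendering; survivors are rescaled as in inverted dropout.
    Rescaled opacities are not clamped here and may exceed 1.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    if not training or drop_rate == 0.0:
        # At test time every Gaussian is kept unchanged.
        return list(opacities)
    if rng is None:
        rng = random.Random()
    scale = 1.0 / (1.0 - drop_rate)
    return [0.0 if rng.random() < drop_rate else o * scale for o in opacities]


# Usage: a fresh random mask would be drawn at every training step.
opacities = [0.5] * 1000
masked = drop_gaussians(opacities, drop_rate=0.2, rng=random.Random(0))
print(sum(1 for o in masked if o == 0.0), "of", len(masked), "Gaussians dropped")
```

Because a new mask is sampled each iteration, every Gaussian is still trained over time, while at any single step the rendering relies on fewer of them.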
Submitted 1 April, 2025;
originally announced April 2025.
-
Does "Reasoning" with Large Language Models Improve Recognizing, Generating, and Reframing Unhelpful Thoughts?
Authors:
Yilin Qi,
Dong Won Lee,
Cynthia Breazeal,
Hae Won Park
Abstract:
Cognitive Reframing, a core element of Cognitive Behavioral Therapy (CBT), helps individuals reinterpret negative experiences by finding positive meaning. Recent advances in Large Language Models (LLMs) have demonstrated improved performance through reasoning-based strategies. This inspires a promising direction of leveraging the reasoning capabilities of LLMs to improve CBT and mental reframing by simulating the process of critical thinking, potentially enabling more effective recognition, generation, and reframing of cognitive distortions. In this work, we investigate the role of various reasoning methods, including pre-trained reasoning LLMs and augmented reasoning strategies such as CoT and self-consistency in enhancing LLMs' ability to perform cognitive reframing tasks. We find that augmented reasoning methods, even when applied to "outdated" LLMs like GPT-3.5, consistently outperform state-of-the-art pretrained reasoning models on recognizing, generating and reframing unhelpful thoughts.
Submitted 31 March, 2025;
originally announced April 2025.
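Self-consistency, one of the augmented reasoning strategies named in the abstract above, is commonly implemented as a majority vote over several independently sampled model answers. The sketch below shows only that aggregation step. The function name `self_consistency` and the example distortion labels are hypothetical and not taken from the paper, and the sampled answers are hard-coded rather than produced by an LLM.

```python
from collections import Counter


def self_consistency(answers):
    """Return the most frequent answer and its vote share.

    answers: final answers extracted from independently sampled
             reasoning chains (hard-coded here for illustration).
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    # most_common(1) yields the label with the highest vote count;
    # ties resolve in favour of the label seen first.
    label, votes = counts.most_common(1)[0]
    return label, votes / len(answers)


# Hypothetical labels for one unhelpful thought, sampled five times.
sampled = ["catastrophizing", "mind reading", "catastrophizing",
           "catastrophizing", "overgeneralization"]
label, share = self_consistency(sampled)
print(label, share)
```

The vote share gives a rough agreement signal: a low share suggests the sampled reasoning chains disagree and the final label deserves less trust.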
-
Semantic Packet Aggregation and Repeated Transmission for Text-to-Image Generation
Authors:
Seunghun Lee,
Jihong Park,
Jinho Choi,
Hyuncheol Park
Abstract:
Text-based communication is expected to be prevalent in 6G applications such as wireless AI-generated content (AIGC). Motivated by this, this paper addresses the challenges of transmitting text prompts over erasure channels for a text-to-image AIGC task by developing the semantic segmentation and repeated transmission (SMART) algorithm. SMART groups words in text prompts into packets, prioritizing the task-specific significance of semantics within these packets, and optimizes the number of repeated transmissions. Simulation results show that SMART achieves higher similarities in received texts and generated images compared to a character-level packetization baseline, while reducing computing latency by orders of magnitude compared to an exhaustive search baseline.
Submitted 31 March, 2025;
originally announced March 2025.
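The trade-off behind repeated transmission in the abstract above can be made concrete with a small calculation. Assuming each transmission is erased independently with probability `p`, a packet sent `k` times is received at least once with probability 1 - p^k. The sketch below computes this, along with the expected importance delivered for a given allocation of repeats. The function names, the importance weights, and the independence assumption are illustrative. This is not the SMART algorithm itself, which also decides how words are grouped into packets.

```python
def delivery_probability(erasure_prob, repeats):
    """Probability that at least one of `repeats` independent copies arrives."""
    if not 0.0 <= erasure_prob <= 1.0:
        raise ValueError("erasure_prob must lie in [0, 1]")
    if repeats < 0:
        raise ValueError("repeats must be non-negative")
    return 1.0 - erasure_prob ** repeats


def expected_importance(importances, repeats, erasure_prob):
    """Expected total importance received, given repeats per packet."""
    if len(importances) != len(repeats):
        raise ValueError("one repeat count is needed per packet")
    return sum(w * delivery_probability(erasure_prob, k)
               for w, k in zip(importances, repeats))


# Two hypothetical packets: the second matters more and is sent twice.
print(delivery_probability(0.5, 2))
print(expected_importance([1.0, 2.0], [1, 2], 0.5))
```

Under a fixed transmission budget, extra repeats have diminishing returns per packet, which is why the allocation across packets is worth optimizing.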
-
Colossal enhancement of spin transmission through magnon confinement in an antiferromagnet
Authors:
Sajid Husain,
Maya Ramesh,
Xinyan Li,
Sergei Prokhorenko,
Shashank Kumar Ojha,
Aiden Ross,
Koushik Das,
Boyang Zhao,
Hyeon Woo Park,
Peter Meisenheimer,
Yousra Nahas,
Lucas Caretta,
Lane W. Martin,
Se Kwon Kim,
Zhi Yao,
Haidan Wen,
Sayeef Salahuddin,
Long-Qing Chen,
Yimo Han,
Rogerio de Sousa,
Laurent Bellaiche,
Manuel Bibes,
Darrell G. Schlom,
Ramamoorthy Ramesh
Abstract:
Since Felix Bloch's introduction of the concept of spin waves in 1930, magnons (the quanta of spin waves) have been extensively studied in a range of materials for spintronics, particularly for non-volatile logic-in-memory devices. Controlling magnons in conventional antiferromagnets and harnessing them in practical applications, however, remains a challenge. In this letter, we demonstrate highly efficient magnon transport in an LaFeO$_3$/BiFeO$_3$/LaFeO$_3$ all-antiferromagnetic system which can be controlled electrically, making it highly desirable for energy-efficient computation. Leveraging spin-orbit-driven spin-charge transduction, we demonstrate that this material architecture permits magnon confinement in ultrathin antiferromagnets, enhancing the output voltage generated by magnon transport by several orders of magnitude, which provides a pathway to enable magnetoelectric memory and logic functionalities. Additionally, its non-volatility enables ultralow-power logic-in-memory processing, where magnonic devices can be efficiently reconfigured via electrically controlled magnon spin currents within magnetoelectric channels.
Submitted 31 March, 2025;
originally announced March 2025.
-
Scalable Geometric Learning with Correlation-Based Functional Brain Networks
Authors:
Kisung You,
Yelim Lee,
Hae-Jeong Park
Abstract:
The correlation matrix is a central representation of functional brain networks in neuroimaging. Traditional analyses often treat pairwise interactions independently in a Euclidean setting, overlooking the intrinsic geometry of correlation matrices. While earlier attempts have embraced the quotient geometry of the correlation manifold, they remain limited by computational inefficiency and numerical instability, particularly in high-dimensional contexts. This paper presents a novel geometric framework that employs diffeomorphic transformations to embed correlation matrices into a Euclidean space, preserving salient manifold properties and enabling large-scale analyses. The proposed method integrates with established learning algorithms - regression, dimensionality reduction, and clustering - and extends naturally to population-level inference of brain networks. Simulation studies demonstrate both improved computational speed and enhanced accuracy compared to conventional manifold-based approaches. Moreover, applications in real neuroimaging scenarios illustrate the framework's utility, enhancing behavior score prediction, subject fingerprinting in resting-state fMRI, and hypothesis testing in electroencephalogram data. An open-source MATLAB toolbox is provided to facilitate broader adoption and advance the application of correlation geometry in functional brain network research.
Submitted 9 April, 2025; v1 submitted 30 March, 2025;
originally announced March 2025.
-
Large Language Models Are Better Logical Fallacy Reasoners with Counterargument, Explanation, and Goal-Aware Prompt Formulation
Authors:
Jiwon Jeong,
Hyeju Jang,
Hogun Park
Abstract:
The advancement of Large Language Models (LLMs) has greatly improved our ability to process complex language. However, accurately detecting logical fallacies remains a significant challenge. This study presents a novel and effective prompt formulation approach for logical fallacy detection, applicable in both supervised (fine-tuned) and unsupervised (zero-shot) settings. Our method enriches input text by incorporating implicit contextual information -- counterarguments, explanations, and goals -- which we query for validity within the context of the argument. We then rank these queries based on confidence scores to inform classification. We evaluate our approach across multiple datasets from 5 domains, covering 29 distinct fallacy types, using models from the GPT and LLaMA series. The results show substantial improvements over state-of-the-art models, with F1 score increases of up to 0.60 in zero-shot settings and up to 0.45 in fine-tuned models. Extensive analyses further illustrate why and how our method excels.
Submitted 30 March, 2025;
originally announced March 2025.
-
Infant Core-collapse Supernovae with Circumstellar Interactions from KMTNet I: Luminous Transitional Case of KSP-SN-2022c
Authors:
Nan Jiang,
Dae-Sik Moon,
Yuan Qi Ni,
Maria R. Drout,
Hong Soo Park,
Santiago González-Gaitán,
Sang Chul Kim,
Youngdae Lee,
Ernest Chang
Abstract:
We present $BVi$ multi-band high-cadence observations of a Type II supernova (SN) KSP-SN-2022c from a star-forming galaxy at $z$ $\simeq$ 0.041 from its infant to nebular phase. Early light curve fitting with a single power-law is consistent with the first detection of roughly 15 minutes after shock breakout. The SN light curves feature a rapid rise and decline across its luminous ($V$ $\simeq$ -18.41 mag) peak together with a short plateau. The presence of the short plateau and rapid post-peak decline place the SN within a small group of transitional type between Type II-P and II-L subtypes. Its (i) broad and asymmetric H profiles with large emission-to-absorption ratios and (ii) near-peak luminosity in excess of predictions from SN shock cooling models both point to circumstellar interactions in this SN. Early colour evolution exhibits a short-lived blueward motion in $B-V$ within the first few days and continuous reddening in $V-i$, inconsistent with simple blackbody heating. Our simulations of SN light curves estimate 13 $M_\odot$ and 680 $R_\odot$ for the mass and radius of the progenitor, respectively, together with CSM of 0.73 $M_\odot$ to account for the excess luminosity and rapid post-peak declines. We discuss the origin of its short plateau and early colour evolution in the context of partial envelope stripping of the progenitor star and a delayed SN shock breakout near the edge of the CSM, respectively, as indicated by our simulations. We establish a correlation between post-peak decline rates and CSM mass in Type II SNe, highlighting that CSM interactions play a major role in shaping the post-peak evolution of transitional types.
Submitted 29 March, 2025;
originally announced March 2025.
-
Thermodynamic anomalies in overdamped systems with time-dependent temperature
Authors:
Shakul Awasthi,
Hyunggyu Park,
Jae Sung Lee
Abstract:
One of the key objectives in investigating small stochastic systems is the development of micrometer-sized engines and the understanding of their thermodynamics. However, the primary mathematical tool used for this purpose, the overdamped approximation, has a critical limitation: it fails to fully capture the thermodynamics when the temperature varies over time, as the velocity is not considered in the approximation. Specifically, we show that heat dissipation and entropy production calculated under the overdamped approximation deviate from their true values. These discrepancies are termed thermodynamic anomalies. To overcome this limitation, we analytically derive expressions for these anomalies in the presence of a general time-varying temperature. One notable feature of the result is that high viscosity and small mass, though both leading to the same overdamped dynamic equations, result in different thermodynamic anomaly relations. Our results have significant implications, particularly for accurately calculating the efficiency of heat engines operating in overdamped environments with time-varying temperatures, without requiring velocity measurements. Additionally, our findings offer a simple method for estimating the kinetic energy of an overdamped system.
Submitted 28 March, 2025;
originally announced March 2025.
-
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
Authors:
Minho Park,
Sunghyun Park,
Jungsoo Lee,
Hyojin Park,
Kyuwoong Hwang,
Fatih Porikli,
Jaegul Choo,
Sungha Choi
Abstract:
This paper addresses the challenge of data scarcity in semantic segmentation by generating datasets through text-to-image (T2I) generation models, reducing image acquisition and labeling costs. Segmentation dataset generation faces two key challenges: 1) aligning generated samples with the target domain and 2) producing informative samples beyond the training data. Fine-tuning T2I models can help generate samples aligned with the target domain. However, fine-tuning often overfits and memorizes training data, limiting the models' ability to generate diverse and well-aligned samples. To overcome these issues, we propose Concept-Aware LoRA (CA-LoRA), a novel fine-tuning approach that selectively identifies and updates only the weights associated with necessary concepts (e.g., style or viewpoint) for domain alignment while preserving the pretrained knowledge of the T2I model to produce informative samples. We demonstrate its effectiveness in generating datasets for urban-scene segmentation, outperforming baseline and state-of-the-art methods in in-domain (few-shot and fully-supervised) settings, as well as in domain generalization tasks, especially under challenging conditions such as adverse weather and varying illumination.
Submitted 28 March, 2025;
originally announced March 2025.
-
sudo rm -rf agentic_security
Authors:
Sejin Lee,
Jian Kim,
Haon Park,
Ashkan Yousefpour,
Sangyoon Yu,
Min Song
Abstract:
Large Language Models (LLMs) are increasingly deployed as computer-use agents, autonomously performing tasks within real desktop or web environments. While this evolution greatly expands practical use cases for humans, it also creates serious security exposures. We present SUDO (Screen-based Universal Detox2Tox Offense), a novel attack framework that systematically bypasses refusal-trained safeguards in commercial computer-use agents, such as Claude for Computer Use. The core mechanism, Detox2Tox, transforms harmful requests (that agents initially reject) into seemingly benign requests via detoxification, secures detailed instructions from advanced vision language models (VLMs), and then reintroduces malicious content via toxification just before execution. Unlike conventional jailbreaks, SUDO iteratively refines its attacks based on a built-in refusal feedback, making it increasingly effective against robust policy filters. In extensive tests spanning 50 real-world tasks and multiple state-of-the-art VLMs, SUDO achieves a stark attack success rate of 24.41% (with no refinement), and up to 41.33% (by its iterative refinement) in Claude for Computer Use. By revealing these vulnerabilities and demonstrating the ease with which they can be exploited in real-world computing environments, this paper highlights an immediate need for robust, context-aware safeguards. WARNING: This paper includes harmful or offensive model outputs
Submitted 8 June, 2025; v1 submitted 26 March, 2025;
originally announced March 2025.
-
Pressure tuning of Kitaev spin liquid candidate Na$_3$Co$_2$SbO$_6$
Authors:
E. H. T. Poldi,
R. Tartaglia,
G. Fabbris,
N. Nguyen,
H. Park,
Z. Liu,
M. van Veenendaal,
R. Kumar,
G. Jose,
S. Samanta,
W. Bi,
Y. Xiao,
D. Popov,
Y. Wu,
J. -W. Kim,
H. Zheng,
J. Yan,
J. F. Mitchell,
R. J. Hemley,
D. Haskel
Abstract:
The search for Kitaev's quantum spin liquid (KQSL) state in real materials has recently expanded with the prediction that honeycomb lattices of divalent, high-spin cobalt ions could host the dominant bond-dependent exchange interactions required to stabilize the elusive entangled quantum state. The layered honeycomb Na$_3$Co$_2$SbO$_6$ has been singled out as a leading candidate provided that the trigonal crystal field acting on Co $3d$ orbitals, which enhances non-Kitaev exchange interactions between $J_{\rm eff}=\frac{1}{2}$ spin-orbital pseudospins, is reduced. We find that applied pressure leads to anisotropic compression of the layered structure, significantly reducing the trigonal distortion of CoO$_6$ octahedra. A strong enhancement of ferromagnetic correlations between pseudospins is observed in the spin-polarized (3 Tesla) phase up to about 60 GPa. Higher pressures drive a spin transition into a low-spin state destroying the $J_{\rm eff}=\frac{1}{2}$ local moments required to map the spin Hamiltonian into Kitaev's model. The spin transition strongly suppresses the low-temperature magnetic susceptibility and appears to stabilize a paramagnetic phase driven by frustration. Although applied pressure fails to realize a KQSL state, the possible emergence of frustrated magnetism of localized, low-spin $S=\frac{1}{2}$ moments opens the door for exploration of novel magnetic quantum states in compressed honeycomb lattices of divalent cobaltates.
Submitted 25 March, 2025;
originally announced March 2025.
-
Combined Annual Modulation Dark Matter Search with COSINE-100 and ANAIS-112
Authors:
N. Carlin,
J. Y. Cho,
J. J. Choi,
S. Choi,
A. C. Ezeribe,
L. E. França,
C. Ha,
I. S. Hahn,
S. J. Hollick,
S. B. Hong,
E. J. Jeon,
H. W. Joo,
W. G. Kang,
M. Kauer,
B. H. Kim,
H. J. Kim,
J. Kim,
K. W. Kim,
S. H. Kim,
S. K. Kim,
W. K. Kim,
Y. D. Kim,
Y. H. Kim,
Y. J. Ko,
D. H. Lee
, et al. (49 additional authors not shown)
Abstract:
The annual modulation signal, claimed to be consistent with dark matter as observed by DAMA/LIBRA in a sodium-iodide based detector, has persisted for over two decades. COSINE-100 and ANAIS-112 were designed to test the claim directly using the same target material. COSINE-100, located at Yangyang Underground Laboratory in South Korea, and ANAIS-112, located at Canfranc Underground Laboratory in Spain, have been taking data since 2016 and 2017, respectively. Each experiment published its respective results independently. In this paper, we present the results of an annual modulation search as a test of the signal observed by DAMA/LIBRA with the first three respective years of data from COSINE-100 and ANAIS-112. Using a Markov Chain Monte Carlo method, we find best-fit values for the modulation amplitude of $-0.0002 {\pm} 0.0026$ cpd/kg/keV in the 1-6 keV and $0.0021 {\pm} 0.0028$ cpd/kg/keV in the 2-6 keV energy regions. These results are incompatible with the annual modulation reported by DAMA/LIBRA at $3.7σ$ and $2.6σ$, respectively. A simple combination of the newly released 6-year datasets from both experiments yields values consistent with no modulation, $0.0005 {\pm} 0.0019$ cpd/kg/keV in the 1-6 keV and $0.0027 {\pm} 0.0021$ cpd/kg/keV in the 2-6 keV energy regions, with $4.68σ$ and $3.53σ$ respective exclusions of the DAMA/LIBRA signal.
Submitted 25 March, 2025;
originally announced March 2025.
-
Global minimizers for fast diffusion versus nonlocal interactions on negatively curved manifolds
Authors:
José A. Carrillo,
Razvan C. Fetecau,
Hansol Park
Abstract:
We investigate the existence of ground states for a free energy functional on Cartan-Hadamard manifolds. The energy, which consists of an entropy and an interaction term, is associated to a macroscopic aggregation model that includes nonlinear diffusion and nonlocal interactions. We consider specifically the regime of fast diffusion, and establish necessary and sufficient conditions on the behaviour of the interaction potential for global energy minimizers to exist. We first consider the case of manifolds with constant bounds of sectional curvatures, then extend the results to manifolds with general curvature bounds. To establish our results we derive several new Carlson-Levin type inequalities for Cartan-Hadamard manifolds.
Submitted 24 March, 2025;
originally announced March 2025.
-
On the number of asynchronous attractors in AND-NOT Boolean networks
Authors:
Van-Giang Trinh,
Samuel Pastva,
Jordan Rozum,
Kyu Hyong Park,
Réka Albert
Abstract:
Boolean Networks (BNs) describe the time evolution of binary states using logic functions on the nodes of a network. They are fundamental models for complex discrete dynamical systems, with applications in various areas of science and engineering, and especially in systems biology. A key aspect of the dynamical behavior of BNs is the number of attractors, which determines the diversity of long-term system trajectories. Due to the noisy nature and incomplete characterization of biological systems, a stochastic asynchronous update scheme is often more appropriate than the deterministic synchronous one. AND-NOT BNs, whose logic functions are the conjunction of literals, are an important subclass of BNs because of their structural simplicity and their usefulness in analyzing biological systems for which the only information available is a collection of interactions among components. In this paper, we establish new theoretical results regarding asynchronous attractors in AND-NOT BNs. We derive two new upper bounds for the number of asynchronous attractors in an AND-NOT BN based on structural properties (strong even cycles and dominating sets, respectively) of the AND-NOT BN. These findings contribute to a more comprehensive understanding of asynchronous dynamics in AND-NOT BNs, with implications for attractor enumeration and counting, as well as for network design and control.
Submitted 24 March, 2025;
originally announced March 2025.
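To make the objects in the abstract above concrete, the following Python sketch encodes a tiny AND-NOT Boolean network, where each node's update function is a conjunction of literals, and enumerates its asynchronous successors and fixed points by brute force. The encoding (a list of `(input_index, negated)` pairs per node), the function names, and the two-node example are illustrative assumptions. The sketch finds fixed points only, not cyclic attractors, and does not implement the paper's upper bounds.

```python
from itertools import product


def node_value(state, literals):
    """Evaluate one AND-NOT update function: a conjunction of literals.

    literals: list of (input_index, negated) pairs; (j, True) means NOT x_j.
    An empty conjunction evaluates to 1.
    """
    return int(all((state[j] == 0) if negated else (state[j] == 1)
                   for j, negated in literals))


def async_successors(state, network):
    """States reachable by updating exactly one node whose value changes."""
    successors = set()
    for i, literals in enumerate(network):
        new_value = node_value(state, literals)
        if new_value != state[i]:
            successors.add(state[:i] + (new_value,) + state[i + 1:])
    return successors


def fixed_points(network):
    """Brute-force enumeration over all 2^n states; only for small n."""
    n = len(network)
    return [s for s in product((0, 1), repeat=n)
            if not async_successors(s, network)]


# Hypothetical example, mutual inhibition: x0 = NOT x1, x1 = NOT x0.
toggle = [[(1, True)], [(0, True)]]
print(fixed_points(toggle))
print(async_successors((0, 0), toggle))
```

In this example every non-fixed state can only move to one of the two fixed points, so the fixed points are the only asynchronous attractors. Larger networks can also have cyclic attractors, which this brute-force check would miss.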
-
Radar-Guided Polynomial Fitting for Metric Depth Estimation
Authors:
Patrick Rim,
Hyoungseob Park,
Vadim Ezhov,
Jeffrey Moon,
Alex Wong
Abstract:
We propose PolyRad, a novel radar-guided depth estimation method that introduces polynomial fitting to transform scaleless depth predictions from pretrained monocular depth estimation (MDE) models into metric depth maps. Unlike existing approaches that rely on complex architectures or expensive sensors, our method is grounded in a simple yet fundamental insight: using polynomial coefficients predicted from cheap, ubiquitous radar data to adaptively adjust depth predictions non-uniformly across depth ranges. Although MDE models often infer reasonably accurate local depth structure within each object or local region, they may misalign these regions relative to one another, making a linear scale-and-shift transformation insufficient given three or more of these regions. In contrast, PolyRad generalizes beyond linear transformations and is able to correct such misalignments by introducing inflection points. Importantly, our polynomial fitting framework preserves structural consistency through a novel training objective that enforces monotonicity via first-derivative regularization. PolyRad achieves state-of-the-art performance on the nuScenes, ZJU-4DRadarCam, and View-of-Delft datasets, outperforming existing methods by 30.3% in MAE and 37.2% in RMSE.
Submitted 21 March, 2025;
originally announced March 2025.
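The two ingredients named in the PolyRad abstract above, a polynomial mapping from scaleless to metric depth and a first-derivative penalty that encourages monotonicity, can be sketched with NumPy. In the paper the coefficients are predicted from radar data; here they are hard-coded placeholders. The function names and the exact form of the penalty (mean of the negative part of the derivative) are assumptions for illustration, not the authors' training objective.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def to_metric_depth(relative_depth, coeffs):
    """Apply p(d) = c0 + c1*d + c2*d^2 + ... to scaleless depth values."""
    return P.polyval(relative_depth, coeffs)


def monotonicity_penalty(relative_depth, coeffs):
    """Mean magnitude by which the slope p'(d) falls below zero.

    The penalty is 0 when the mapping is non-decreasing at every
    sampled depth, so nearer points stay nearer after the transform.
    """
    slope = P.polyval(relative_depth, P.polyder(coeffs))
    return float(np.mean(np.maximum(-slope, 0.0)))


depth = np.linspace(0.0, 1.0, 5)      # scaleless depth in [0, 1]
linear = [0.5, 40.0]                  # plain scale-and-shift mapping
cubic = [0.5, 30.0, -20.0, 35.0]      # non-uniform adjustment across depth ranges

print(to_metric_depth(depth, cubic))
print(monotonicity_penalty(depth, linear), monotonicity_penalty(depth, cubic))
```

A degree-1 polynomial reduces to the scale-and-shift alignment the abstract calls insufficient, so higher-degree terms are what allow different depth ranges to be adjusted by different amounts.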
-
Progressive Test Time Energy Adaptation for Medical Image Segmentation
Authors:
Xiaoran Zhang,
Byung-Woo Hong,
Hyoungseob Park,
Daniel H. Pak,
Anne-Marie Rickmann,
Lawrence H. Staib,
James S. Duncan,
Alex Wong
Abstract:
We propose a model-agnostic, progressive test-time energy adaptation approach for medical image segmentation. Maintaining model performance across diverse medical datasets is challenging, as distribution shifts arise from inconsistent imaging protocols and patient variations. Unlike domain adaptation methods that require multiple passes through target data - impractical in clinical settings - our approach adapts pretrained models progressively as they process test data. Our method leverages a shape energy model trained on source data, which assigns an energy score at the patch level to segmentation maps: low energy represents in-distribution (accurate) shapes, while high energy signals out-of-distribution (erroneous) predictions. By minimizing this energy score at test time, we refine the segmentation model to align with the target distribution. To validate the effectiveness and adaptability, we evaluated our framework on eight public MRI (bSSFP, T1- and T2-weighted) and X-ray datasets spanning cardiac, spinal cord, and lung segmentation. We consistently outperform baselines both quantitatively and qualitatively.
Submitted 20 March, 2025;
originally announced March 2025.
-
Fast Homomorphic Linear Algebra with BLAS
Authors:
Youngjin Bae,
Jung Hee Cheon,
Guillaume Hanrot,
Jai Hyun Park,
Damien Stehlé
Abstract:
Homomorphic encryption is a cryptographic paradigm that allows computing on encrypted data, opening a wide range of applications in privacy-preserving data manipulation, notably in AI. Many of those applications require significant linear algebra computations (matrix x vector products, and matrix x matrix products).
This central role of linear algebra computations goes far beyond homomorphic algebra and applies to most areas of scientific computing. This high versatility led, over time, to the development of a set of highly optimized routines, specified in 1979 under the name BLAS (basic linear algebra subroutines).
Motivated both by the applicative importance of homomorphic linear algebra and the access to highly efficient implementations of cleartext linear algebra able to draw the most out of available hardware, we explore the connections between CKKS-based homomorphic linear algebra and floating-point plaintext linear algebra. The CKKS homomorphic encryption system is the most natural choice in this setting, as it natively handles real numbers and offers a large SIMD parallelism.
We provide reductions for matrix-vector products, vector-vector products for moderate-sized to large matrices to their plaintext equivalents. Combined with BLAS, we demonstrate that the efficiency loss between CKKS-based encrypted square matrix multiplication and double-precision floating-point square matrix multiplication is a mere factor of 4 to 12, depending on the precise situation.
Submitted 20 March, 2025;
originally announced March 2025.
-
V-NAW: Video-based Noise-aware Adaptive Weighting for Facial Expression Recognition
Authors:
JunGyu Lee,
Kunyoung Lee,
Haesol Park,
Ig-Jae Kim,
Gi Pyo Nam
Abstract:
Facial Expression Recognition (FER) plays a crucial role in human affective analysis and has been widely applied in computer vision tasks such as human-computer interaction and psychological assessment. The 8th Affective Behavior Analysis in-the-Wild (ABAW) Challenge aims to assess human emotions using the video-based Aff-Wild2 dataset. This challenge includes various tasks, including the video-based EXPR recognition track, which is our primary focus. In this paper, we demonstrate that addressing label ambiguity and class imbalance, which are known to cause performance degradation, can lead to meaningful performance improvements. Specifically, we propose Video-based Noise-aware Adaptive Weighting (V-NAW), which adaptively assigns importance to each frame in a clip to address label ambiguity and effectively capture temporal variations in facial expressions. Furthermore, we introduce a simple and effective augmentation strategy to reduce redundancy between consecutive frames, which is a primary cause of overfitting. Through extensive experiments, we validate the effectiveness of our approach, demonstrating significant improvements in video-based FER performance.
Submitted 12 May, 2025; v1 submitted 20 March, 2025;
originally announced March 2025.
-
Three-dimensional Supersonic flows for the steady Euler-Poisson system in divergent nozzles
Authors:
Hyangdong Park
Abstract:
We establish the unique existence of an axisymmetric supersonic solution with nonzero vorticity and nonzero angular momentum density for the steady Euler-Poisson system in three-dimensional divergent nozzles when prescribing the velocity, strength of electric field, and the entropy at the entrance. To the best of our knowledge, this is the first study on the three-dimensional divergent nozzle problem for the steady Euler-Poisson system. We first solve the reformulated problem via the method of the Helmholtz decomposition and obtain a strong solution with a Sobolev norm. Then, using the fact that the longitudinal axis is the axis of the time-like variable, we apply the standard elliptic theory for the interior regularity to see that the solution is a classical solution. Furthermore, we deal carefully with singularity issues related to the polar angle on the axis of the divergent nozzle.
Submitted 18 March, 2025;
originally announced March 2025.
-
Signal amplification in a solid-state quantum sensor via asymmetric time-reversal of many-body dynamics
Authors:
Haoyang Gao,
Leigh S. Martin,
Lillian B. Hughes,
Nathaniel T. Leitao,
Piotr Put,
Hengyun Zhou,
Nazli U. Koyluoglu,
Simon A. Meynell,
Ania C. Bleszynski Jayich,
Hongkun Park,
Mikhail D. Lukin
Abstract:
Electronic spins of nitrogen vacancy (NV) centers in diamond constitute a promising system for micro- and nano-scale magnetic sensing, due to their operation under ambient conditions, ease of placement in close proximity to sensing targets, and biological compatibility. At high densities, the electronic spins interact through dipolar coupling, which typically limits but can also potentially enhance sensing performance. Here we report the experimental demonstration of many-body signal amplification in a solid-state, room temperature quantum sensor. Our approach utilizes time-reversed two-axis-twisting interactions, engineered through dynamical control of the quantization axis and Floquet engineering in a two-dimensional ensemble of NV centers. Strikingly, we observe that the optimal amplification occurs when the backward evolution time equals twice the forward evolution time, in sharp contrast to the conventional Loschmidt echo. These observations can be understood as resulting from an underlying time-reversed mirror symmetry of the microscopic dynamics, providing key insights into signal amplification and opening the door towards entanglement-enhanced practical quantum sensing.
Submitted 18 March, 2025;
originally announced March 2025.
-
Iterative Predictor-Critic Code Decoding for Real-World Image Dehazing
Authors:
Jiayi Fu,
Siyu Liu,
Zikun Liu,
Chun-Le Guo,
Hyunhee Park,
Ruiqi Wu,
Guoqing Wang,
Chongyi Li
Abstract:
We propose a novel Iterative Predictor-Critic Code Decoding framework for real-world image dehazing, abbreviated as IPC-Dehaze, which leverages the high-quality codebook prior encapsulated in a pre-trained VQGAN. Unlike previous codebook-based methods that rely on one-shot decoding, our method utilizes high-quality codes obtained in the previous iteration to guide the prediction of the Code-Predictor in the subsequent iteration, improving code prediction accuracy and ensuring stable dehazing performance. Our idea stems from the observations that 1) the degradation of hazy images varies with haze density and scene depth, and 2) clear regions provide crucial cues for restoring dense haze regions. However, it is non-trivial to progressively refine the obtained codes in subsequent iterations, owing to the difficulty in determining which codes should be retained or replaced at each iteration. Another key insight of our study is to propose Code-Critic to capture interrelations among codes. The Code-Critic is used to evaluate code correlations and then resample a set of codes with the highest mask scores, i.e., a higher score indicates that the code is more likely to be rejected, which helps retain more accurate codes and predict difficult ones. Extensive experiments demonstrate the superiority of our method over state-of-the-art methods in real-world dehazing.
Submitted 29 March, 2025; v1 submitted 17 March, 2025;
originally announced March 2025.
-
CompMarkGS: Robust Watermarking for Compressed 3D Gaussian Splatting
Authors:
Sumin In,
Youngdong Jang,
Utae Jeong,
MinHyuk Jang,
Hyeongcheol Park,
Eunbyung Park,
Sangpil Kim
Abstract:
3D Gaussian Splatting (3DGS) is increasingly adopted in various academic and commercial applications due to its real-time and high-quality rendering capabilities, emphasizing the growing need for copyright protection technologies for 3DGS. However, the large model size of 3DGS requires developing efficient compression techniques. This highlights the necessity of an integrated framework that addresses copyright protection and data compression for 3D content. Nevertheless, existing 3DGS watermarking methods significantly degrade watermark performance under 3DGS compression methods, particularly quantization-based approaches that achieve superior compression performance. To ensure reliable watermark detection under compression, we propose a compression-tolerant anchor-based 3DGS watermarking, which preserves watermark integrity and rendering quality. This is achieved by introducing anchor-based 3DGS watermarking. We embed the watermark into the anchor attributes, particularly the anchor feature, to enhance security and rendering quality. We also propose a quantization distortion layer that injects quantization noise during training, preserving the watermark after quantization-based compression. Moreover, we employ a frequency-aware anchor growing strategy that improves rendering quality and watermark performance by effectively identifying Gaussians in high-frequency regions. Extensive experiments demonstrate that our proposed method preserves the watermark even under compression and maintains high rendering quality.
Submitted 11 June, 2025; v1 submitted 17 March, 2025;
originally announced March 2025.
-
ProtoDepth: Unsupervised Continual Depth Completion with Prototypes
Authors:
Patrick Rim,
Hyoungseob Park,
S. Gangopadhyay,
Ziyao Zeng,
Younjoon Chung,
Alex Wong
Abstract:
We present ProtoDepth, a novel prototype-based approach for continual learning of unsupervised depth completion, the multimodal 3D reconstruction task of predicting dense depth maps from RGB images and sparse point clouds. The unsupervised learning paradigm is well-suited for continual learning, as ground truth is not needed. However, when training on new non-stationary distributions, depth completion models will catastrophically forget previously learned information. We address forgetting by learning prototype sets that adapt the latent features of a frozen pretrained model to new domains. Since the original weights are not modified, ProtoDepth does not forget when test-time domain identity is known. To extend ProtoDepth to the challenging setting where the test-time domain identity is withheld, we propose to learn domain descriptors that enable the model to select the appropriate prototype set for inference. We evaluate ProtoDepth on benchmark dataset sequences, where we reduce forgetting compared to baselines by 52.2% for indoor and 53.2% for outdoor to achieve the state of the art.
Submitted 16 March, 2025;
originally announced March 2025.
-
Impact of structural distortions on the correlated electronic structure of orbital-selective Mott insulating Na$_3$Co$_2$SbO$_6$ under strains
Authors:
Nam Nguyen,
Alex Taekyung Lee,
Anh T. Ngo,
Hyowon Park
Abstract:
Na$_{3}$Co$_{2}$SbO$_6$ is a promising candidate to realize the Kitaev spin liquid phase since the large Kitaev spin exchange interaction is tunable via the change in electronic structure, such as the trigonal crystal field splitting ($Δ_{TCF}$). Here, we show that the uncorrelated electronic structure of Na$_{3}$Co$_{2}$SbO$_6$ is rather insensitive to the strain effect due to the low crystal symmetry accompanied by oxygen displacements and the presence of Sb $s$ orbitals. This suggests that the Kitaev spin-exchange interaction obtained from perturbation theory also does not depend much on the strain effect. Using density functional theory plus dynamical mean field theory, we find that the correlated electronic structure of Na$_{3}$Co$_{2}$SbO$_6$ is an orbital-selective Mott insulating state where the trigonal $a_{1g}$ orbital is insulating due to correlation-assisted hybridization, while other $d$ orbitals behave as typical Mott insulators, which effectively makes $Δ_{TCF}$ tunable under strain. Our results show that the local Co-site symmetry and dynamical correlation effects will play an important role in engineering the novel magnetic phase in this and related materials.
Submitted 15 March, 2025;
originally announced March 2025.
-
Chiral Pseudogap Metal Emerging from a Disordered van der Waals Mott Insulator 1T-TaS2-xSex
Authors:
Hyunjin Jung,
Jiwon Jung,
ChoongJae Won,
Hae-Ryong Park,
Sang-Wook Cheong,
Jaeyoung Kim,
Gil Young Cho,
Han Woong Yeom
Abstract:
The emergence of a pseudogap is a hallmark of anomalous electronic states formed through substantial many-body interaction, but the mechanism of the pseudogap formation and its role in related emerging quantum states such as unconventional superconductivity remain largely elusive. Here, we report the emergence of an unusual pseudogap in a representative van der Waals chiral charge density wave (CDW) material with strong electron correlation, 1T-TaS2, through isoelectronic substitution of S. We systematically investigate the evolution of the electronic band dispersions of 1T-TaS2-xSex ($0 \le x \le 2$) using angle-resolved photoemission spectroscopy (ARPES). Our results show that the Se substitution induces a quantum transition from an insulating to a pseudogap metallic phase with the CDW order preserved. Moreover, an asymmetry of the pseudogap spectral function is found, which reflects the chiral nature of the CDW structure. The present observation contrasts with previous suggestions of a Mott transition driven by bandwidth control or charge transfer. Instead, we attribute the pseudogap phase to a disordered Mott insulator, in line with the recent observation of substantial lateral electronic disorder. These findings provide a unique electronic system with a chiral pseudogap, where the complex interplay between CDW, chirality, disorder, and electronic correlation may lead to unconventional emergent physics.
Submitted 15 March, 2025;
originally announced March 2025.
-
SPECTra: Scalable Multi-Agent Reinforcement Learning with Permutation-Free Networks
Authors:
Hyunwoo Park,
Baekryun Seong,
Sang-Ki Ko
Abstract:
In cooperative multi-agent reinforcement learning (MARL), the permutation problem, where the state space grows exponentially with the number of agents, reduces sample efficiency. Additionally, many existing architectures struggle with scalability, relying on a fixed structure tied to a specific number of agents, limiting their applicability to environments with a variable number of entities. While approaches such as graph neural networks (GNNs) and self-attention mechanisms have made progress in addressing these challenges, they have significant limitations, as dense GNNs and self-attention mechanisms incur high computational costs. To overcome these limitations, we propose a novel agent network and a non-linear mixing network that ensure permutation-equivariance and scalability, allowing them to generalize to environments with various numbers of agents. Our agent network significantly reduces computational complexity, and our scalable hypernetwork enables efficient weight generation for non-linear mixing. Additionally, we introduce curriculum learning to improve training efficiency. Experiments on SMACv2 and Google Research Football (GRF) demonstrate that our approach achieves superior learning performance compared to existing methods. By addressing both permutation-invariance and scalability in MARL, our work provides a more efficient and adaptable framework for cooperative MARL. Our code is available at https://github.com/funny-rl/SPECTra.
Submitted 14 March, 2025;
originally announced March 2025.
-
Magnon thermal conductivity in multiferroics with spin cycloids
Authors:
Hyeon Woo Park,
Shu Zhang,
Peter Meisenheimer,
Maya Ramesh,
Sajid Husain,
Isaac Harris,
Jorge Íñiguez-González,
Zhi Yao,
Ramamoorthy Ramesh,
Se Kwon Kim
Abstract:
Multiferroic materials, characterized by the occurrence of two or more ferroic properties, hold potential in future technological applications and also exhibit intriguing phenomena caused by the interplay of multiple orders. One such example is the formation of spin cycloid structures within multiferroic materials, which we investigate in this work by focusing on their magnon excitations and transport based on a general multiferroic Hamiltonian with an antiferromagnetic order. More specifically, we identify the ground state and explore the dynamics of magnon modes, revealing distinct in-plane and out-of-plane modes with anisotropic dispersion relations. The magnon modes include a massless excitation, known as the Goldstone boson, originating from the spontaneous breaking of the translational symmetry by the formation of the cycloid structures. By employing the Boltzmann transport formalism, the magnonic thermal conductivity with spin cycloids and low-temperature anisotropic behaviors is discussed. This work provides pathways to envision the spin-textured multiferroics, which may serve as a fertile ground to look for novel thermal and spin transport with the rich interplay of quasiparticles such as magnons and phonons.
Submitted 14 March, 2025;
originally announced March 2025.
-
Observation of High-Temperature Dissipationless Fractional Chern Insulator
Authors:
Heonjoon Park,
Weijie Li,
Chaowei Hu,
Christiano Beach,
Miguel Gonçalves,
Juan Felipe Mendez-Valderrama,
Jonah Herzog-Arbeitman,
Takashi Taniguchi,
Kenji Watanabe,
David Cobden,
Liang Fu,
B. Andrei Bernevig,
Nicolas Regnault,
Jiun-Haw Chu,
Di Xiao,
Xiaodong Xu
Abstract:
The fractional quantum anomalous Hall effect has recently been experimentally observed in zero-field fractional Chern insulators (FCI). However, an outstanding challenge is the presence of a substantial longitudinal resistance $R_{xx}$ (a few k$Ω$), even though the anomalous Hall resistance $R_{xy}$ is quantized. This dissipative behavior is likely linked to imperfect sample quality. Here, we report transport measurements of a drastically improved twisted $\text{MoTe}_2$ bilayer device, which exhibits quantized $R_{xy}$ and vanishing $R_{xx}$ for the $-2/3$ state, marking a dissipationless FCI. Contrary to fractional quantum Hall states where the energy gap increases with magnetic field, we find that the thermal activation gap of the observed FCI states decreases rapidly as the magnetic field rises from zero, then plateaus above a few teslas. This observation is attributed to the interplay between spin and charge gaps. Due to the spontaneous ferromagnetism, the spin gap dominates at low field, while the charge gap becomes appreciable once the magnetic field freezes spin fluctuations. For the $-2/3$ state, we estimate spin and FCI gaps of about 55 and 20 K, respectively. Our results provide insights into the energy scale of FCI and offer a pathway for quantum engineering of exotic correlated topological states.
Submitted 13 March, 2025;
originally announced March 2025.
-
DNA Nanotechnology for Superradiance
Authors:
Jaewon Lee,
Sung Hun Park,
Jangwon Kim,
Kyung Hun Rho,
Hoyoung Lee,
Soyeon Kim,
Seungwoo Lee
Abstract:
Superradiance, first proposed by Dicke in 1954, is a highly efficient quantum light source that differs from conventional spontaneous emission. Unlike typical spontaneous emission, where intensity scales linearly with the number of electric dipoles, superradiance exhibits an intensity that scales quadratically with the number of electric dipoles. Similarly, the decay rate also increases proportionally to the number of dipoles. To realize superradiance, excited electric dipoles must be arranged in the same orientation with spacing much smaller than the wavelength of the excitation light. While previous studies have accidentally observed superradiance through the random aggregation of quantum dots and organic dyes, a deterministic approach for the materialization of superradiance has yet to be established. Herein, we (i) specifically outline the advantages of DNA nanotechnology in tackling this challenge, (ii) discuss the reasons why superradiance has not yet been realized even with state-of-the-art DNA nanotechnology, and (iii) propose potential solutions for overcoming the current limitations.
Submitted 13 March, 2025;
originally announced March 2025.
-
PMT calibration for the JSNS2-II far detector with an embedded LED system
Authors:
Jisu Park,
M. K. Cheoun,
J. H. Choi,
J. Y. Choi,
T. Dodo,
J. Goh,
M. Harada,
S. Hasegawa,
W. Hwang,
T. Iida,
H. I. Jang,
J. S. Jang,
K. K. Joo,
D. E. Jung,
S. K. Kang,
Y. Kasugai,
T. Kawasaki,
E. M. Kim,
S. B. Kim,
S. Y. Kim,
H. Kinoshita,
T. Konno,
D. H. Lee,
C. Little,
T. Maruyama
, et al. (31 additional authors not shown)
Abstract:
The JSNS2-II (the second phase of JSNS2, J-PARC Sterile Neutrino Search at J-PARC Spallation Neutron Source) is an experiment aimed at searching for sterile neutrinos. This experiment has entered its second phase, employing two liquid scintillator detectors located at near and far positions from the neutrino source. Recently, the far detector of the experiment has been completed and is currently in the calibration phase. This paper presents a detailed description of the calibration process utilizing the LED system. The LED system of the far detector uses two Ultra-Violet (UV) LEDs, which are effective in calibrating all of the PMTs at once. The UV light is converted into visible wavelengths inside the liquid scintillator via the wavelength shifters, providing pseudo-isotropic light. The properties of all functioning Photo-Multiplier-Tubes (PMTs) used to detect the neutrino events in the far detector, such as the gain, its dependence on the supplied High Voltage (HV), and the Peak-to-Valley (PV) ratio, were calibrated. To achieve a good energy resolution for physics events, a relative gain adjustment of up to 10% is required for all functioning PMTs. This will be achieved using the measured HV curves and the LED calibration. The PV ratio values, which distinguish the single photo-electron signal from the pedestal, are similar to those from the production company. Additionally, the precision of PMT signal timing is measured to be 2.1 ns, meeting the event reconstruction requirement of 10 ns.
Submitted 11 March, 2025;
originally announced March 2025.
-
CIMAGE: Exploiting the Conditional Independence in Masked Graph Auto-encoders
Authors:
Jongwon Park,
Heesoo Jung,
Hogun Park
Abstract:
Recent Self-Supervised Learning (SSL) methods encapsulating relational information via masking in Graph Neural Networks (GNNs) have shown promising performance. However, most existing approaches rely on random masking strategies in either feature or graph space, which may fail to capture task-relevant information fully. We posit that this limitation stems from an inability to achieve minimum redundancy between masked and unmasked components while ensuring maximum relevance of both to potential downstream tasks. Conditional Independence (CI) inherently satisfies the minimum redundancy and maximum relevance criteria, but its application typically requires access to downstream labels. To address this challenge, we introduce CIMAGE, a novel approach that leverages Conditional Independence to guide an effective masking strategy within the latent space. CIMAGE utilizes CI-aware latent factor decomposition to generate two distinct contexts, leveraging high-confidence pseudo-labels derived from unsupervised graph clustering. In this framework, the pretext task involves reconstructing the masked second context solely from the information provided by the first context. Our theoretical analysis further supports the superiority of CIMAGE's novel CI-aware masking method by demonstrating that the learned embedding exhibits approximate linear separability, which enables accurate predictions for the downstream task. Comprehensive evaluations across diverse graph benchmarks illustrate the advantage of CIMAGE, with notably higher average rankings on node classification and link prediction tasks. Notably, our proposed model highlights the under-explored potential of CI in enhancing graph SSL methodologies and offers enriched insights for effective graph representation learning.
Submitted 10 March, 2025;
originally announced March 2025.
-
Medical Hallucinations in Foundation Models and Their Impact on Healthcare
Authors:
Yubin Kim,
Hyewon Jeong,
Shan Chen,
Shuyue Stella Li,
Mingyu Lu,
Kumail Alhamoud,
Jimin Mun,
Cristina Grau,
Minseok Jung,
Rodrigo Gameiro,
Lizhou Fan,
Eugene Park,
Tristan Lin,
Joonsik Yoon,
Wonjin Yoon,
Maarten Sap,
Yulia Tsvetkov,
Paul Liang,
Xuhai Xu,
Xin Liu,
Daniel McDuff,
Hyeonhoon Lee,
Hae Won Park,
Samir Tulebaev,
Cynthia Breazeal
Abstract:
Foundation Models that are capable of processing and generating multi-modal data have transformed AI's role in medicine. However, a key limitation of their reliability is hallucination, where inaccurate or fabricated information can impact clinical decisions and patient safety. We define medical hallucination as any instance in which a model generates misleading medical content. This paper examines the unique characteristics, causes, and implications of medical hallucinations, with a particular focus on how these errors manifest themselves in real-world clinical scenarios. Our contributions include (1) a taxonomy for understanding and addressing medical hallucinations, (2) benchmarking models using a medical hallucination dataset and physician-annotated LLM responses to real medical cases, providing direct insight into the clinical impact of hallucinations, and (3) a multi-national clinician survey on their experiences with medical hallucinations. Our results reveal that inference techniques such as Chain-of-Thought (CoT) and Search Augmented Generation can effectively reduce hallucination rates. However, despite these improvements, non-trivial levels of hallucination persist. These findings underscore the ethical and practical imperative for robust detection and mitigation strategies, establishing a foundation for regulatory policies that prioritize patient safety and maintain clinical integrity as AI becomes more integrated into healthcare. The feedback from clinicians highlights the urgent need not only for technical advances but also for clearer ethical and regulatory guidelines to ensure patient safety. A repository organizing the paper resources, summaries, and additional information is available at https://github.com/mitmedialab/medical hallucination.
Submitted 25 February, 2025;
originally announced March 2025.
-
Quantum decoherence of nitrogen-vacancy spin ensembles in a nitrogen spin bath in diamond under dynamical decoupling
Authors:
Huijin Park,
Mykyta Onizhuk,
Eunsang Lee,
Harim Lim,
Junghyun Lee,
Sangwon Oh,
Giulia Galli,
Hosung Seo
Abstract:
The negatively charged nitrogen-vacancy (NV) center in diamond has emerged as a leading qubit platform for quantum technology applications. One of the key challenges for NV-based quantum applications is building an accurate model to predict its decoherence properties and their quantum nature. In this study, we combine theory and experiment to investigate NV decoherence dynamics in the presence of nitrogen donor (P1 center) baths, which is one of the dominant decoherence sources in diamond. We employ a cluster-correlation expansion (CCE) method to compute the NV decoherence under the Hahn-echo (HE) and Carr-Purcell-Meiboom-Gill (CPMG) pulse sequences at various P1 concentrations from 1 ppm to 300 ppm. We show that the coherence time (T2) increases with the number of pi pulses applied, indicating that the NV spin is decoupled from the P1 bath. Notably, we find that T2 scales quadratically as a function of the pulse number, on a logarithmic scale, as opposed to the linear scaling predicted by widely accepted semi-classical theories in the literature. In our experiment, we measure the CPMG signal for two diamond samples with high P1 concentrations of 0.8 ppm and 13 ppm. We demonstrate that the T2 scaling is indeed quadratic, thus confirming our theoretical predictions. Our results show that the quantum bath model combined with the CCE method can accurately capture the quantum nature of the P1-driven NV decoherence. Our study opens a new avenue for developing a complete noise model that could be used to optimize the performance of NV-based quantum devices.
Submitted 7 March, 2025;
originally announced March 2025.
-
Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions
Authors:
Chan Hur,
Jeong-hun Hong,
Dong-hun Lee,
Dabin Kang,
Semin Myeong,
Sang-hyo Park,
Hyeyoung Park
Abstract:
In recent text-video retrieval, the use of additional captions from vision-language models has shown promising effects on the performance. However, existing models using additional captions often have struggled to capture the rich semantics, including temporal changes, inherent in the video. In addition, incorrect information caused by generative models can lead to inaccurate retrieval. To address these issues, we propose a new framework, Narrating the Video (NarVid), which strategically leverages the comprehensive information available from frame-level captions, the narration. The proposed NarVid exploits narration in multiple ways: 1) feature enhancement through cross-modal interactions between narration and video, 2) query-aware adaptive filtering to suppress irrelevant or incorrect information, 3) dual-modal matching score by adding query-video similarity and query-narration similarity, and 4) hard-negative loss to learn discriminative features from multiple perspectives using the two similarities from different views. Experimental results demonstrate that NarVid achieves state-of-the-art performance on various benchmark datasets.
Submitted 25 March, 2025; v1 submitted 7 March, 2025;
originally announced March 2025.
-
One-Shot is Enough: Consolidating Multi-Turn Attacks into Efficient Single-Turn Prompts for LLMs
Authors:
Junwoo Ha,
Hyunjun Kim,
Sangyoon Yu,
Haon Park,
Ashkan Yousefpour,
Yuna Park,
Suhyun Kim
Abstract:
We introduce a novel framework for consolidating multi-turn adversarial "jailbreak" prompts into single-turn queries, significantly reducing the manual overhead required for adversarial testing of large language models (LLMs). While multi-turn human jailbreaks have been shown to yield high attack success rates, they demand considerable human effort and time. Our multi-turn-to-single-turn (M2S) methods -- Hyphenize, Numberize, and Pythonize -- systematically reformat multi-turn dialogues into structured single-turn prompts. Despite removing iterative back-and-forth interactions, these prompts preserve and often enhance adversarial potency: in extensive evaluations on the Multi-turn Human Jailbreak (MHJ) dataset, M2S methods achieve attack success rates from 70.6 percent to 95.9 percent across several state-of-the-art LLMs. Remarkably, the single-turn prompts outperform the original multi-turn attacks by as much as 17.5 percentage points while cutting token usage by more than half on average. Further analysis shows that embedding malicious requests in enumerated or code-like structures exploits "contextual blindness", bypassing both native guardrails and external input-output filters. By converting multi-turn conversations into concise single-turn prompts, the M2S framework provides a scalable tool for large-scale red teaming and reveals critical weaknesses in contemporary LLM defenses.
Submitted 25 May, 2025; v1 submitted 6 March, 2025;
originally announced March 2025.
-
Measurement of the Branching Fraction of $Λ_c^+ \to p K_S^0 π^0$ at Belle
Authors:
The Belle and Belle II Collaborations:
I. Adachi,
L. Aggarwal,
H. Ahmed,
J. K. Ahn,
H. Aihara,
N. Akopov,
M. Alhakami,
A. Aloisio,
N. Althubiti,
M. Angelsmark,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
N. K. Baghel,
S. Bahinipati,
P. Bambade
, et al. (404 additional authors not shown)
Abstract:
We report a precise measurement of the ratio of branching fractions $\mathcal{B}(Λ_c^+\to p K_S^0 π^0)/\mathcal{B}(Λ_c^+\to p K^- π^+)$ using 980 fb$^{-1}$ of $e^+e^-$ data from the Belle experiment. We obtain a value of $\mathcal{B}(Λ_c^+\to p K_S^0 π^0)/\mathcal{B}(Λ_c^+\to p K^- π^+)=0.339\pm 0.002\pm 0.009$, where the first and second uncertainties are statistical and systematic, respectively. This Belle result is consistent with the previous measurement from the CLEO experiment but has a fivefold improvement in precision. By combining our result with the world average of $\mathcal{B}(Λ_c^+\to p K^- π^+)$, we obtain the absolute branching fraction $\mathcal{B}(Λ_c^+\to p K_S^0 π^0)=(2.12\pm 0.01\pm 0.05 \pm 0.10)\%$, where the first uncertainty is statistical, the second is systematic, and the third is due to the absolute branching fraction scale $\mathcal{B}(Λ_c^+\to p K^- π^+)$. This measurement can shed light on the hadronic decay mechanisms of charmed baryons.
Submitted 18 March, 2025; v1 submitted 6 March, 2025;
originally announced March 2025.
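The step from the measured ratio to the absolute branching fraction in the Belle abstract above can be written out as a short derivation. This is a sketch under stated assumptions: the symbols $R$ and $\mathcal{B}_{\mathrm{ref}}$ are introduced here for illustration, the three uncertainty sources are treated as independent and propagated linearly, and the numerical value of the world average is not given in the abstract, so it is left symbolic.

```latex
\begin{align}
R &\equiv \frac{\mathcal{B}(\Lambda_c^+\to p K_S^0 \pi^0)}
               {\mathcal{B}(\Lambda_c^+\to p K^- \pi^+)}
   = 0.339 \pm 0.002\,(\mathrm{stat}) \pm 0.009\,(\mathrm{syst}), \\
\mathcal{B}(\Lambda_c^+\to p K_S^0 \pi^0)
  &= R \,\mathcal{B}_{\mathrm{ref}},
  \qquad \mathcal{B}_{\mathrm{ref}} \equiv \mathcal{B}(\Lambda_c^+\to p K^- \pi^+)
  \ \text{(world average)}, \\
\sigma_{\mathrm{stat}} &= \sigma_{R,\mathrm{stat}} \,\mathcal{B}_{\mathrm{ref}}, \\
\sigma_{\mathrm{syst}} &= \sigma_{R,\mathrm{syst}} \,\mathcal{B}_{\mathrm{ref}}, \\
\sigma_{\mathrm{scale}} &= R \,\sigma_{\mathcal{B}_{\mathrm{ref}}}.
\end{align}
```

Under these assumptions, the first two uncertainties on the absolute branching fraction scale with the measured ratio, while the third comes entirely from the uncertainty on the reference mode, which is why the abstract quotes it separately.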