Showing 1–50 of 1,004 results for author: Kim, K

Searching in archive cs.

  1. arXiv:2507.06125  [pdf, ps, other]

    cs.LG cs.AI

    Subspace-based Approximate Hessian Method for Zeroth-Order Optimization

    Authors: Dongyoon Kim, Sungjae Lee, Wonjin Lee, Kwang In Kim

    Abstract: Zeroth-order optimization addresses problems where gradient information is inaccessible or impractical to compute. While most existing methods rely on first-order approximations, incorporating second-order (curvature) information can, in principle, significantly accelerate convergence. However, the high cost of function evaluations required to estimate Hessian matrices often limits practical appli…

    Submitted 8 July, 2025; originally announced July 2025.

    Comments: 20 pages, 8 figures

  2. arXiv:2507.05296  [pdf]

    cs.CY cs.AI

    Integrating Generative AI in BIM Education: Insights from Classroom Implementation

    Authors: Islem Sahraoui, Kinam Kim, Lu Gao, Zia Din, Ahmed Senouci

    Abstract: This study evaluates the implementation of a Generative AI-powered rule checking workflow within a graduate-level Building Information Modeling (BIM) course at a U.S. university. Over two semesters, 55 students participated in a classroom-based pilot exploring the use of GenAI for BIM compliance tasks, an area with limited prior research. The instructional design included lectures on prompt engine…

    Submitted 5 July, 2025; originally announced July 2025.

  3. arXiv:2507.04683  [pdf, ps, other]

    cs.LG

    Recovering Plasticity of Neural Networks via Soft Weight Rescaling

    Authors: Seungwon Oh, Sangyeon Park, Isaac Han, Kyung-Joong Kim

    Abstract: Recent studies have shown that as training progresses, neural networks gradually lose their capacity to learn new information, a phenomenon known as plasticity loss. Unbounded weight growth is one of the main causes of plasticity loss. Furthermore, it harms generalization capability and disrupts optimization dynamics. Re-initializing the network can be a solution, but it results in the loss of…

    Submitted 7 July, 2025; originally announced July 2025.

  4. arXiv:2507.04364  [pdf]

    cs.CL cs.SI

    Large Language Models' Varying Accuracy in Recognizing Risk-Promoting and Health-Supporting Sentiments in Public Health Discourse: The Cases of HPV Vaccination and Heated Tobacco Products

    Authors: Soojong Kim, Kwanho Kim, Hye Min Kim

    Abstract: Machine learning methods are increasingly applied to analyze health-related public discourse based on large-scale data, but questions remain regarding their ability to accurately detect different types of health sentiments. In particular, Large Language Models (LLMs) have gained attention as a powerful technology, yet their accuracy and feasibility in capturing different opinions and perspectives on…

    Submitted 6 July, 2025; originally announced July 2025.

    Comments: Forthcoming in Social Science & Medicine

  5. arXiv:2507.04014  [pdf, ps, other]

    cs.CL cs.AI cs.CY

    Nunchi-Bench: Benchmarking Language Models on Cultural Reasoning with a Focus on Korean Superstition

    Authors: Kyuhee Kim, Sangah Lee

    Abstract: As large language models (LLMs) become key advisors in various domains, their cultural sensitivity and reasoning skills are crucial in multicultural environments. We introduce Nunchi-Bench, a benchmark designed to evaluate LLMs' cultural understanding, with a focus on Korean superstitions. The benchmark consists of 247 questions spanning 31 topics, assessing factual knowledge, culturally appropria…

    Submitted 5 July, 2025; originally announced July 2025.

  6. arXiv:2507.03378  [pdf, ps, other]

    cs.CL

    Making Sense of Korean Sentences: A Comprehensive Evaluation of LLMs through KoSEnd Dataset

    Authors: Seunguk Yu, Kyeonghyun Kim, Jungmin Yun, Youngbin Kim

    Abstract: Although LLMs have made significant progress in various languages, there are still concerns about their effectiveness with low-resource agglutinative languages compared to languages such as English. In this study, we focused on Korean, a language known for its complex sentence endings, and evaluated LLMs on this challenging aspect. We introduce the Korean Sentence Endings (KoSEnd) dataset, which i…

    Submitted 4 July, 2025; originally announced July 2025.

    Comments: ACL 2025 student research workshop

  7. arXiv:2507.02687  [pdf, ps, other]

    cs.CV cs.AI

    APT: Adaptive Personalized Training for Diffusion Models with Limited Data

    Authors: JungWoo Chae, Jiyoon Kim, JaeWoong Choi, Kyungyul Kim, Sangheum Hwang

    Abstract: Personalizing diffusion models using limited data presents significant challenges, including overfitting, loss of prior knowledge, and degradation of text alignment. Overfitting leads to shifts in the noise prediction distribution, disrupting the denoising trajectory and causing the model to lose semantic coherence. In this paper, we propose Adaptive Personalized Training (APT), a novel framework…

    Submitted 3 July, 2025; originally announced July 2025.

    Comments: CVPR 2025 camera ready. Project page: https://lgcnsai.github.io/apt

    MSC Class: 60J60; 68T07 ACM Class: I.2.6; I.2.10; I.4.9

    Journal ref: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025, pp. 28619-28628

  8. arXiv:2507.00464  [pdf, ps, other]

    cs.RO

    A Miniature High-Resolution Tension Sensor Based on a Photo-Reflector for Robotic Hands and Grippers

    Authors: Hyun-Bin Kim, Kyung-Soo Kim

    Abstract: This paper presents a miniature tension sensor using a photo-reflector, designed for compact tendon-driven grippers and robotic hands. The proposed sensor has a small form factor of 13 mm x 7 mm x 6.5 mm and is capable of measuring tensile forces up to 200 N. A symmetric elastomer structure incorporating fillets and flexure hinges is designed based on Timoshenko beam theory and verified via FEM an…

    Submitted 1 July, 2025; originally announced July 2025.

  9. arXiv:2506.22853  [pdf, ps, other]

    cs.CL cs.AI

    DICE-BENCH: Evaluating the Tool-Use Capabilities of Large Language Models in Multi-Round, Multi-Party Dialogues

    Authors: Kyochul Jang, Donghyeon Lee, Kyusik Kim, Dongseok Heo, Taewhoo Lee, Woojeong Kim, Bongwon Suh

    Abstract: Existing function-calling benchmarks focus on single-turn interactions. However, they overlook the complexity of real-world scenarios. To quantify how existing benchmarks address practical applications, we introduce DICE-SCORE, a metric that evaluates the dispersion of tool-related information such as function name and parameter values throughout the dialogue. Analyzing existing benchmarks through…

    Submitted 2 July, 2025; v1 submitted 28 June, 2025; originally announced June 2025.

    Comments: 9 pages, ACL 2025 Vienna

  10. arXiv:2506.21174  [pdf]

    eess.AS cs.LG

    Performance improvement of spatial semantic segmentation with enriched audio features and agent-based error correction for DCASE 2025 Challenge Task 4

    Authors: Jongyeon Park, Joonhee Lee, Do-Hyeon Lim, Hong Kook Kim, Hyeongcheol Geum, Jeong Eun Lim

    Abstract: This technical report presents submission systems for Task 4 of the DCASE 2025 Challenge. This model incorporates additional audio features (spectral roll-off and chroma features) into the embedding feature extracted from the mel-spectral feature to improve the classification capabilities of an audio-tagging model in the spatial semantic segmentation of sound scenes (S5) system. This approach is…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: DCASE 2025 challenge Task4, 5 pages

  11. arXiv:2506.20551  [pdf, ps, other]

    cs.SE cs.AI

    Large Language Model-Driven Code Compliance Checking in Building Information Modeling

    Authors: Soumya Madireddy, Lu Gao, Zia Din, Kinam Kim, Ahmed Senouci, Zhe Han, Yunpeng Zhang

    Abstract: This research addresses the time-consuming and error-prone nature of manual code compliance checking in Building Information Modeling (BIM) by introducing a Large Language Model (LLM)-driven approach to semi-automate this critical process. The developed system integrates LLMs such as GPT, Claude, Gemini, and Llama, with Revit software to interpret building codes, generate Python scripts, and perfo…

    Submitted 25 June, 2025; originally announced June 2025.

  12. arXiv:2506.12839  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Fair Bayesian Model-Based Clustering

    Authors: Jihu Lee, Kunwoong Kim, Yongdai Kim

    Abstract: Fair clustering has become a socially significant task with the advancement of machine learning technologies and the growing demand for trustworthy AI. Group fairness ensures that the proportions of each sensitive group are similar in all clusters. Most existing group-fair clustering methods are based on the $K$-means clustering and thus require the distance between instances and the number of clu…

    Submitted 15 June, 2025; originally announced June 2025.

  13. arXiv:2506.11271  [pdf, ps, other]

    stat.ML cs.LG

    Collaborative Prediction: To Join or To Disjoin Datasets

    Authors: Kyung Rok Kim, Yansong Wang, Xiaocheng Li, Guanting Chen

    Abstract: With the recent rise of generative Artificial Intelligence (AI), the need to select high-quality datasets to improve machine learning models has garnered increasing attention. However, parts of this topic remain underexplored, even for simple prediction models. In this work, we study the problem of developing practical algorithms that select appropriate datasets to minimize population loss o…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: To be published in the 41st Conference on Uncertainty in Artificial Intelligence (UAI 2025)

  14. arXiv:2506.09229  [pdf, ps, other]

    cs.CV

    Cross-Frame Representation Alignment for Fine-Tuning Video Diffusion Models

    Authors: Sungwon Hwang, Hyojin Jang, Kinam Kim, Minho Park, Jaegul Choo

    Abstract: Fine-tuning Video Diffusion Models (VDMs) at the user level to generate videos that reflect specific attributes of training data presents notable challenges, yet remains underexplored despite its practical importance. Meanwhile, recent work such as Representation Alignment (REPA) has shown promise in improving the convergence and quality of DiT-based image diffusion models by aligning, or assimila…

    Submitted 25 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

    Comments: Project page: https://crepavideo.github.io

  15. arXiv:2506.07424  [pdf, other]

    cs.CL cs.AI

    Plug-in and Fine-tuning: Bridging the Gap between Small Language Models and Large Language Models

    Authors: Kyeonghyun Kim, Jinhee Jang, Juhwan Choi, Yoonji Lee, Kyohoon Jin, YoungBin Kim

    Abstract: Large language models (LLMs) are renowned for their extensive linguistic knowledge and strong generalization capabilities, but their high computational demands make them unsuitable for resource-constrained environments. In contrast, small language models (SLMs) are computationally efficient but often lack the broad generalization capacity of LLMs. To bridge this gap, we propose PiFi, a novel frame…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: ACL 2025 main conference

  16. arXiv:2506.05619  [pdf, ps, other]

    cs.AI cs.LG

    Population-Proportional Preference Learning from Human Feedback: An Axiomatic Approach

    Authors: Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar, Pablo A. Parrilo

    Abstract: Conventional preference learning methods often prioritize opinions held more widely when aggregating preferences from multiple evaluators. This may result in policies that are biased in favor of some types of opinions or groups. The objective of this paper is to develop a novel preference learning framework capable of aligning aggregate opinions and policies proportionally with the true population…

    Submitted 5 June, 2025; originally announced June 2025.

  17. arXiv:2506.05419  [pdf, ps, other]

    cs.CV cs.AI

    Dream to Generalize: Zero-Shot Model-Based Reinforcement Learning for Unseen Visual Distractions

    Authors: Jeongsoo Ha, Kyungsoo Kim, Yusung Kim

    Abstract: Model-based reinforcement learning (MBRL) has been used to efficiently solve vision-based control tasks in high-dimensional image observations. Although recent MBRL algorithms perform well in trained observations, they fail when faced with visual distractions in observations. These task-irrelevant distractions (e.g., clouds, shadows, and light) may be constantly present in real-world scenarios. In…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: AAAI 2023

  18. arXiv:2506.05418  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    Self-Predictive Dynamics for Generalization of Vision-based Reinforcement Learning

    Authors: Kyungsoo Kim, Jeongsoo Ha, Yusung Kim

    Abstract: Vision-based reinforcement learning requires efficient and robust representations of image-based observations, especially when the images contain distracting (task-irrelevant) elements such as shadows, clouds, and light. It becomes more important if those distractions are not exposed during training. We design a Self-Predictive Dynamics (SPD) method to extract task-relevant features efficiently, e…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: IJCAI 2022

  19. arXiv:2506.05415  [pdf, ps, other]

    cs.CL

    Automatically Detecting Amusing Games in Wordle

    Authors: Ronaldo Luo, Gary Liang, Cindy Liu, Adam Kabbara, Minahil Bakhtawar, Kina Kim, Michael Guerzhoy

    Abstract: We explore automatically predicting which Wordle games Reddit users find amusing. We scrape approximately 80k reactions by Reddit users to Wordle games from Reddit, classify the reactions as expressing amusement or not using OpenAI's GPT-3.5 with few-shot prompting, and verify that GPT-3.5's labels roughly correspond to human labels. We then extract features from Wordle games that can predict…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: Accepted to the International Conference on Computational Creativity (ICCC) 2025

  20. arXiv:2506.04704  [pdf, ps, other]

    cs.CV cs.AI

    HoliSafe: Holistic Safety Benchmarking and Modeling with Safety Meta Token for Vision-Language Model

    Authors: Youngwan Lee, Kangsan Kim, Kwanyong Park, Ilcahe Jung, Soojin Jang, Seanie Lee, Yong-Ju Lee, Sung Ju Hwang

    Abstract: Despite emerging efforts to enhance the safety of Vision-Language Models (VLMs), current approaches face two main shortcomings. 1) Existing safety-tuning datasets and benchmarks only partially consider how image-text interactions can yield harmful content, often overlooking contextually unsafe outcomes from seemingly benign pairs. This narrow coverage leaves VLMs vulnerable to jailbreak attacks in…

    Submitted 11 June, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

    Comments: Project page: https://youngwanlee.github.io/holisafe

  21. arXiv:2506.04272  [pdf, ps, other]

    cs.LG

    Understanding the Impact of Sampling Quality in Direct Preference Optimization

    Authors: Kyung Rok Kim, Yumo Bai, Chonghuan Wang, Guanting Chen

    Abstract: We study the role of the sampling distribution in Direct Preference Optimization (DPO) and aim to understand its impact on DPO's training dynamics. Our analyses show that both the solution space and the convergence behavior of DPO depend on the support and quality of the generating distribution. We first analyze how the distribution of responses influences policy updates during gradient descent, drawi…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: Submitted to NeurIPS 2025

  22. arXiv:2506.01947  [pdf, ps, other]

    eess.IV cs.CV

    RAW Image Reconstruction from RGB on Smartphones. NTIRE 2025 Challenge Report

    Authors: Marcos V. Conde, Radu Timofte, Radu Berdan, Beril Besbinar, Daisuke Iso, Pengzhou Ji, Xiong Dun, Zeying Fan, Chen Wu, Zhansheng Wang, Pengbo Zhang, Jiazi Huang, Qinglin Liu, Wei Yu, Shengping Zhang, Xiangyang Ji, Kyungsik Kim, Minkyung Kim, Hwalmin Lee, Hekun Ma, Huan Zheng, Yanyan Wei, Zhao Zhang, Jing Fang, Meilin Gao, et al. (8 additional authors not shown)

    Abstract: Numerous low-level vision tasks operate in the RAW domain due to its linear properties, bit depth, and sensor designs. Despite this, RAW image datasets are scarce and more expensive to collect than the already large and public sRGB datasets. For this reason, many approaches try to generate realistic RAW images using sensor information and sRGB images. This paper covers the second challenge on RAW…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: CVPR 2025 - New Trends in Image Restoration and Enhancement (NTIRE)

  23. arXiv:2506.01420  [pdf, ps, other]

    cs.CL cs.LG

    Self-Refining Language Model Anonymizers via Adversarial Distillation

    Authors: Kyuyoung Kim, Hyunjun Jeon, Jinwoo Shin

    Abstract: Large language models (LLMs) are increasingly used in sensitive domains, where their ability to infer personal data from seemingly benign text poses emerging privacy risks. While recent LLM-based anonymization methods help mitigate such risks, they often rely on proprietary models (e.g., GPT-4), raising concerns about cost and the potential exposure of sensitive data to untrusted external systems.…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Preprint

  24. arXiv:2506.01360  [pdf, ps, other]

    cs.LG

    RDB2G-Bench: A Comprehensive Benchmark for Automatic Graph Modeling of Relational Databases

    Authors: Dongwon Choi, Sunwoo Kim, Juyeon Kim, Kyungho Kim, Geon Lee, Shinhwan Kang, Myunghwan Kim, Kijung Shin

    Abstract: Relational databases (RDBs) are composed of interconnected tables, where relationships between them are defined through foreign keys. Recent research on applying machine learning to RDBs has explored graph-based representations of RDBs, where rows of tables are modeled as nodes, and foreign key relationships are modeled as edges. RDB-to-graph modeling helps capture cross-table dependencies, ultima…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Code and datasets are in https://github.com/chlehdwon/RDB2G-Bench

  25. arXiv:2506.01332  [pdf, other]

    cs.AI cs.CL cs.MA

    An Empirical Study of Group Conformity in Multi-Agent Systems

    Authors: Min Choi, Keonwoo Kim, Sungwon Chae, Sangyeob Baek

    Abstract: Recent advances in Large Language Models (LLMs) have enabled multi-agent systems that simulate real-world interactions with near-human reasoning. While previous studies have extensively examined biases related to protected attributes such as race, the emergence and propagation of biases on socially contentious issues in multi-agent LLM interactions remain underexplored. This study explores how LLM…

    Submitted 2 June, 2025; originally announced June 2025.

    Journal ref: ACL 2025 (findings)

  26. arXiv:2506.01206  [pdf, other]

    cs.CL cs.AI

    Mamba Drafters for Speculative Decoding

    Authors: Daewon Choi, Seunghyuk Oh, Saket Dingliwal, Jihoon Tack, Kyuyoung Kim, Woomin Song, Seojin Kim, Insu Han, Jinwoo Shin, Aram Galstyan, Shubham Katiyar, Sravan Babu Bodapati

    Abstract: Speculative decoding has emerged as a promising approach to accelerating large language model (LLM) generation using a fast drafter while maintaining alignment with the target model's distribution. However, existing approaches face a trade-off: external drafters offer flexibility but can suffer from slower drafting, while self-speculation methods use drafters tailored to the target model but requi…

    Submitted 1 June, 2025; originally announced June 2025.

  27. arXiv:2506.00996  [pdf, other]

    cs.CV

    Temporal In-Context Fine-Tuning for Versatile Control of Video Diffusion Models

    Authors: Kinam Kim, Junha Hyung, Jaegul Choo

    Abstract: Recent advances in text-to-video diffusion models have enabled high-quality video synthesis, but controllable generation remains challenging, particularly under limited data and compute. Existing fine-tuning methods for conditional generation often rely on external encoders or architectural modifications, which demand large datasets and are typically restricted to spatially aligned conditioning, l…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: project page: https://kinam0252.github.io/TIC-FT/

  28. arXiv:2505.21727  [pdf, ps, other]

    cs.DC

    FedCostAware: Enabling Cost-Aware Federated Learning on the Cloud

    Authors: Aditya Sinha, Zilinghan Li, Tingkai Liu, Volodymyr Kindratenko, Kibaek Kim, Ravi Madduri

    Abstract: Federated learning (FL) is a distributed machine learning (ML) approach that allows multiple clients to collaboratively train an ML model without exchanging their original training data, offering a solution that is particularly valuable in sensitive domains such as biomedicine. However, training robust FL models often requires substantial computing resources from participating clients, such as GPUs,…

    Submitted 27 May, 2025; originally announced May 2025.

  29. arXiv:2505.21721  [pdf, ps, other]

    stat.ML cs.LG math.OC stat.CO

    Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference

    Authors: Kyurae Kim, Yi-An Ma, Trevor Campbell, Jacob R. Gardner

    Abstract: We prove that, given a mean-field location-scale variational family, black-box variational inference (BBVI) with the reparametrization gradient converges at an almost dimension-independent rate. Specifically, for strongly log-concave and log-smooth targets, the number of iterations for BBVI with a sub-Gaussian family to achieve an objective $ε$-close to the global optimum is $\mathrm{O}(\log d)$,…

    Submitted 27 May, 2025; originally announced May 2025.

  30. arXiv:2505.21556  [pdf, ps, other]

    cs.CV cs.AI

    Benign-to-Toxic Jailbreaking: Inducing Harmful Responses from Harmless Prompts

    Authors: Hee-Seon Kim, Minbeom Kim, Wonjun Lee, Kihyun Kim, Changick Kim

    Abstract: Optimization-based jailbreaks typically adopt the Toxic-Continuation setting in large vision-language models (LVLMs), following the standard next-token prediction objective. In this setting, an adversarial image is optimized to make the model predict the next token of a toxic prompt. However, we find that the Toxic-Continuation paradigm is effective at continuing already-toxic inputs, but struggle…

    Submitted 26 May, 2025; originally announced May 2025.

    Comments: LVLM, Jailbreak

  31. arXiv:2505.20935  [pdf, other]

    cs.CV

    ISAC: Training-Free Instance-to-Semantic Attention Control for Improving Multi-Instance Generation

    Authors: Sanghyun Jo, Wooyeol Lee, Ziseok Lee, Kyungsu Kim

    Abstract: Text-to-image diffusion models excel at generating single-instance scenes but struggle with multi-instance scenarios, often merging or omitting objects. Unlike previous training-free approaches that rely solely on semantic-level guidance without addressing instance individuation, our training-free method, Instance-to-Semantic Attention Control (ISAC), explicitly resolves incomplete instance format…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: 34 pages

  32. arXiv:2505.20875  [pdf, ps, other]

    cs.CL cs.AI

    Trans-EnV: A Framework for Evaluating the Linguistic Robustness of LLMs Against English Varieties

    Authors: Jiyoung Lee, Seungho Kim, Jieun Han, Jun-Min Lee, Kitaek Kim, Alice Oh, Edward Choi

    Abstract: Large Language Models (LLMs) are predominantly evaluated on Standard American English (SAE), often overlooking the diversity of global English varieties. This narrow focus may raise fairness concerns as degraded performance on non-standard varieties can lead to unequal benefits for users worldwide. Therefore, it is critical to extensively evaluate the linguistic robustness of LLMs on multiple non-…

    Submitted 4 June, 2025; v1 submitted 27 May, 2025; originally announced May 2025.

    Comments: 27 pages, 6 figures, 16 tables

  33. arXiv:2505.20854  [pdf, other]

    cs.SE cs.AI cs.CL

    An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks

    Authors: Xin Zhou, Kisub Kim, Ting Zhang, Martin Weyssow, Luis F. Gomes, Guang Yang, David Lo

    Abstract: Large Language Models (LLMs) and other automated techniques have been increasingly used to support software developers by generating software artifacts such as code snippets, patches, and comments. However, accurately assessing the correctness of these generated artifacts remains a significant challenge. On one hand, human evaluation provides high accuracy but is labor-intensive and lacks scalabil…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: 20 pages

  34. arXiv:2505.20813  [pdf, ps, other]

    cs.CL cs.AI

    RSCF: Relation-Semantics Consistent Filter for Entity Embedding of Knowledge Graph

    Authors: Junsik Kim, Jinwook Park, Kangil Kim

    Abstract: In knowledge graph embedding, leveraging relation-specific entity transformation has markedly enhanced performance. However, the consistency of embedding differences before and after transformation remains unaddressed, risking the loss of valuable inductive bias inherent in the embeddings. This inconsistency stems from two problems. First, transformation representations are specified for relations…

    Submitted 12 June, 2025; v1 submitted 27 May, 2025; originally announced May 2025.

    Comments: Accepted to ACL 2025, 17 pages, 10 figures

  35. arXiv:2505.20455  [pdf, ps, other]

    cs.RO

    HAND Me the Data: Fast Robot Adaptation via Hand Path Retrieval

    Authors: Matthew Hong, Anthony Liang, Kevin Kim, Harshitha Rajaprakash, Jesse Thomason, Erdem Bıyık, Jesse Zhang

    Abstract: We hand the community HAND, a simple and time-efficient method for teaching robots new manipulation tasks through human hand demonstrations. Instead of relying on task-specific robot demonstrations collected via teleoperation, HAND uses easy-to-provide hand demonstrations to retrieve relevant behaviors from task-agnostic robot play data. Using a visual tracking pipeline, HAND extracts the motion o…

    Submitted 1 June, 2025; v1 submitted 26 May, 2025; originally announced May 2025.

  36. arXiv:2505.18111  [pdf, ps, other]

    cs.CV

    Adapting SAM 2 for Visual Object Tracking: 1st Place Solution for MMVPR Challenge Multi-Modal Tracking

    Authors: Cheng-Yen Yang, Hsiang-Wei Huang, Pyong-Kun Kim, Chien-Kai Kuo, Jui-Wei Chang, Kwang-Ju Kim, Chung-I Huang, Jenq-Neng Hwang

    Abstract: We present an effective approach for adapting the Segment Anything Model 2 (SAM2) to the Visual Object Tracking (VOT) task. Our method leverages the powerful pre-trained capabilities of SAM2 and incorporates several key techniques to enhance its performance in VOT applications. By combining SAM2 with our proposed optimizations, we achieved a first-place AUC score of 89.4 on the 2024 ICPR Multi-mod…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: Accepted by ICPR Multi-Modal Visual Pattern Recognition Workshop

  37. arXiv:2505.17818  [pdf, other]

    cs.AI cs.CL

    PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions

    Authors: Daeun Kyung, Hyunseung Chung, Seongsu Bae, Jiho Kim, Jae Ho Sohn, Taerim Kim, Soo Kyung Kim, Edward Choi

    Abstract: Doctor-patient consultations require multi-turn, context-aware communication tailored to diverse patient personas. Training or evaluating doctor LLMs in such settings requires realistic patient interaction systems. However, existing simulators often fail to reflect the full range of personas seen in clinical practice. To address this, we introduce PatientSim, a patient simulator that generates rea…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: 9 pages for main text, 4 pages for references, 27 pages for supplementary materials

  38. arXiv:2505.17561  [pdf, other]

    cs.CV cs.AI

    Model Already Knows the Best Noise: Bayesian Active Noise Selection via Attention in Video Diffusion Model

    Authors: Kwanyoung Kim, Sanghyun Kim

    Abstract: The choice of initial noise significantly affects the quality and prompt alignment of video diffusion models, where different noise seeds for the same prompt can lead to drastically different generations. While recent methods rely on externally designed priors such as frequency filters or inter-frame smoothing, they often overlook internal model signals that indicate which noise seeds are inherent…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: 19 pages, 10 figures

  39. arXiv:2505.17529  [pdf, other]

    cs.CV cs.AI

    Do You Keep an Eye on What I Ask? Mitigating Multimodal Hallucination via Attention-Guided Ensemble Decoding

    Authors: Yeongjae Cho, Keonwoo Kim, Taebaek Hwang, Sungzoon Cho

    Abstract: Recent advancements in Large Vision-Language Models (LVLMs) have significantly expanded their utility in tasks like image captioning and visual question answering. However, they still struggle with object hallucination, where models generate descriptions that inaccurately reflect the visual content by including nonexistent objects or misrepresenting existing ones. While previous methods, such as d…

    Submitted 23 May, 2025; originally announced May 2025.

  40. arXiv:2505.17475  [pdf, ps, other]

    cs.CV

    PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation

    Authors: Uyoung Jeong, Jonathan Freer, Seungryul Baek, Hyung Jin Chang, Kwang In Kim

    Abstract: We study multi-dataset training (MDT) for pose estimation, where skeletal heterogeneity presents a unique challenge that existing methods have yet to address. In traditional domains, e.g., regression and classification, MDT typically relies on dataset merging or multi-head supervision. However, the diversity of skeleton types and limited cross-dataset supervision complicate integration in pose estim…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: accepted to CVPR 2025

  41. arXiv:2505.13660  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Sobolev Gradient Ascent for Optimal Transport: Barycenter Optimization and Convergence Analysis

    Authors: Kaheon Kim, Bohan Zhou, Changbo Zhu, Xiaohui Chen

    Abstract: This paper introduces a new constraint-free concave dual formulation for the Wasserstein barycenter. Tailoring the vanilla dual gradient ascent algorithm to the Sobolev geometry, we derive a scalable Sobolev gradient ascent (SGA) algorithm to compute the barycenter for input distributions supported on a regular grid. Despite the algorithmic simplicity, we provide a global convergence analysis that…

    Submitted 19 May, 2025; originally announced May 2025.

  42. arXiv:2505.09131  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Fair Clustering via Alignment

    Authors: Kunwoong Kim, Jihu Lee, Sangchul Park, Yongdai Kim

    Abstract: Algorithmic fairness in clustering aims to balance the proportions of instances assigned to each cluster with respect to a given sensitive attribute. While recently developed fair clustering algorithms optimize clustering objectives under specific fairness constraints, their inherent complexity or approximation often results in suboptimal clustering utility or numerical instability in practice. To…

    Submitted 23 May, 2025; v1 submitted 14 May, 2025; originally announced May 2025.

    Journal ref: ICML 2025 (Forty-Second International Conference on Machine Learning)

  43. arXiv:2505.09069  [pdf, other]

    cs.RO physics.ins-det

    A Novel 6-axis Force/Torque Sensor Using Inductance Sensors

    Authors: Hyun-Bin Kim, Kyung-Soo Kim

    Abstract: This paper presents a novel six-axis force/torque (F/T) sensor based on inductive sensing technology. Unlike conventional strain gauge-based sensors that require direct contact and external amplification, the proposed sensor utilizes non-contact inductive measurements to estimate force via displacement of a conductive target. A compact, fully integrated architecture is achieved by incorporating a…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: 10 pages, 8 figures

  44. arXiv:2505.06435  [pdf, other]

    stat.ML cs.LG

    Fair Representation Learning for Continuous Sensitive Attributes using Expectation of Integral Probability Metrics

    Authors: Insung Kong, Kunwoong Kim, Yongdai Kim

    Abstract: AI fairness, also known as algorithmic fairness, aims to ensure that algorithms operate without bias or discrimination towards any individual or group. Among various AI algorithms, the Fair Representation Learning (FRL) approach has gained significant interest in recent years. However, existing FRL algorithms have a limitation: they are primarily designed for categorical sensitive attributes and t…

    Submitted 9 May, 2025; originally announced May 2025.

    Comments: 42 pages, 30 figures. IEEE Transactions on Pattern Analysis and Machine Intelligence (2025)

  45. arXiv:2505.06271  [pdf, other]

    cs.LG cs.AI cs.SD

    Tri-MTL: A Triple Multitask Learning Approach for Respiratory Disease Diagnosis

    Authors: June-Woo Kim, Sanghoon Lee, Miika Toikkanen, Daehwan Hwang, Kyunghoon Kim

    Abstract: Auscultation remains a cornerstone of clinical practice, essential for both initial evaluation and continuous monitoring. Clinicians listen to the lung sounds and make a diagnosis by combining the patient's medical history and test results. Given this strong association, multitask learning (MTL) can offer a compelling framework to simultaneously model these relationships, integrating respiratory s…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: Accepted to EMBC 2025

  46. arXiv:2505.06270  [pdf, other]

    cs.LG cs.AI

    Importance Analysis for Dynamic Control of Balancing Parameter in a Simple Knowledge Distillation Setting

    Authors: Seongmin Kim, Kwanho Kim, Minseung Kim, Kanghyun Jo

    Abstract: Although deep learning models owe their remarkable success to deep and complex architectures, this very complexity typically comes at the expense of real-time performance. To address this issue, a variety of model compression techniques have been proposed, among which knowledge distillation (KD) stands out for its strong empirical performance. KD involves two concurrent processes: (i) matching…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: 3 pages, 2 figures, conference preprint for IWIS2025

  47. arXiv:2505.03562  [pdf, ps, other]

    cs.CV cs.AI

    Real-Time Person Image Synthesis Using a Flow Matching Model

    Authors: Jiwoo Jeong, Kirok Kim, Wooju Kim, Nam-Joon Kim

    Abstract: Pose-Guided Person Image Synthesis (PGPIS) generates realistic person images conditioned on a target pose and a source image. This task plays a key role in various real-world applications, such as sign language video generation, AR/VR, gaming, and live streaming. In these scenarios, real-time PGPIS is critical for providing immediate visual feedback and maintaining user immersion. However, achievin…

    Submitted 6 May, 2025; originally announced May 2025.

  48. arXiv:2505.03315  [pdf, other]

    cs.AI

    Artificial Behavior Intelligence: Technology, Challenges, and Future Directions

    Authors: Kanghyun Jo, Jehwan Choi, Kwanho Kim, Seongmin Kim, Duy-Linh Nguyen, Xuan-Thuy Vo, Adri Priadana, Tien-Dat Tran

    Abstract: Understanding and predicting human behavior has emerged as a core capability in various AI application domains such as autonomous driving, smart healthcare, surveillance systems, and social robotics. This paper defines the technical framework of Artificial Behavior Intelligence (ABI), which comprehensively analyzes and interprets human posture, facial expressions, emotions, behavioral sequences, a…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: 9 pages, 6 figures, Pre-print for IWIS2025

  49. arXiv:2505.02305  [pdf, ps, other]

    cs.SE

    Refining Fuzzed Crashing Inputs for Better Fault Diagnosis

    Authors: Kieun Kim, Seongmin Lee, Shin Hong

    Abstract: We present DiffMin, a technique that refines a fuzzed crashing input to gain greater similarities to given passing inputs to help developers analyze the crashing input to identify the failure-inducing condition and locate buggy code for debugging. DiffMin iteratively applies edit actions to transform a fuzzed input while preserving the crash behavior. Our pilot study with the Magma benchmark demon…

    Submitted 6 May, 2025; v1 submitted 4 May, 2025; originally announced May 2025.

    Comments: This paper will be presented in the Posters track at FSE 2025 (https://conf.researchr.org/track/fse-2025/fse-2025-posters)

    ACM Class: D.2.5

  50. arXiv:2505.01933  [pdf]

    cs.LG econ.EM

    Unemployment Dynamics Forecasting with Machine Learning Regression Models

    Authors: Kyungsu Kim

    Abstract: In this paper, I explored how a range of regression and machine learning techniques can be applied to monthly U.S. unemployment data to produce timely forecasts. I compared seven models: Linear Regression, SGDRegressor, Random Forest, XGBoost, CatBoost, Support Vector Regression, and an LSTM network, training each on a historical span of data and then evaluating on a later hold-out period. Input f…

    Submitted 3 May, 2025; originally announced May 2025.

    Comments: 18 pages, 2 charts