Showing 1–50 of 277 results for author: Seo, J

Searching in archive cs.
  1. arXiv:2507.04166  [pdf]

    cs.CY

    Governance and Technological Challenge in Digital Solidarity Economies: A Case Study of a Collaborative Transportation Platform in South Korea

    Authors: Jeongone Seo, Tawfiq Ammari

    Abstract: South Korea's City P illustrates how lofty goals of digital solidarity can falter when challenged by local governance realities. Drawing on Hansmann's ownership theory, collaborative governance concepts, and platform cooperativism, we conducted a qualitative case study involving policy documents, independent assessments, and 11 in-depth interviews with residents, officials, and technology develope… ▽ More

    Submitted 5 July, 2025; originally announced July 2025.

    Comments: 31 pages, 3 figures, under journal review

  2. arXiv:2507.02225  [pdf, ps, other]

    cs.LG

    Metric Design != Metric Behavior: Improving Metric Selection for the Unbiased Evaluation of Dimensionality Reduction

    Authors: Jiyeon Bae, Hyeon Jeon, Jinwook Seo

    Abstract: Evaluating the accuracy of dimensionality reduction (DR) projections in preserving the structure of high-dimensional data is crucial for reliable visual analytics. Diverse evaluation metrics targeting different structural characteristics have thus been developed. However, evaluations of DR projections can become biased if highly correlated metrics--those measuring similar structural characteristic… ▽ More

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: IEEE VIS 2025 (short paper)

  3. arXiv:2506.23102  [pdf, ps, other]

    eess.IV cs.CV

    MedRegion-CT: Region-Focused Multimodal LLM for Comprehensive 3D CT Report Generation

    Authors: Sunggu Kyung, Jinyoung Seo, Hyunseok Lim, Dongyeong Kim, Hyungbin Park, Jimin Sung, Jihyun Kim, Wooyoung Jo, Yoojin Nam, Namkug Kim

    Abstract: The recent release of RadGenome-Chest CT has significantly advanced CT-based report generation. However, existing methods primarily focus on global features, making it challenging to capture region-specific details, which may cause certain abnormalities to go unnoticed. To address this, we propose MedRegion-CT, a region-focused Multi-Modal Large Language Model (MLLM) framework, featuring three key… ▽ More

    Submitted 29 June, 2025; originally announced June 2025.

    Comments: 14 pages, 5 figures, submitted to ICCV 2025

  4. arXiv:2506.21556  [pdf, ps, other]

    cs.CL

    VAT-KG: Knowledge-Intensive Multimodal Knowledge Graph Dataset for Retrieval-Augmented Generation

    Authors: Hyeongcheol Park, MinHyuk Jang, Ha Dam Baek, Gyusam Chang, Jiyoung Seo, Jiwan Park, Hogun Park, Sangpil Kim

    Abstract: Multimodal Knowledge Graphs (MMKGs), which represent explicit knowledge across multiple modalities, play a pivotal role by complementing the implicit knowledge of Multimodal Large Language Models (MLLMs) and enabling more grounded reasoning via Retrieval Augmented Generation (RAG). However, existing MMKGs are generally limited in scope: they are often constructed by augmenting pre-existing knowled… ▽ More

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: Project Page: https://vatkg.github.io/

  5. arXiv:2506.19217  [pdf, ps, other]

    cs.CV cs.AI

    MedErr-CT: A Visual Question Answering Benchmark for Identifying and Correcting Errors in CT Reports

    Authors: Sunggu Kyung, Hyungbin Park, Jinyoung Seo, Jimin Sung, Jihyun Kim, Dongyeong Kim, Wooyoung Jo, Yoojin Nam, Sangah Park, Taehee Kwon, Sang Min Lee, Namkug Kim

    Abstract: Computed Tomography (CT) plays a crucial role in clinical diagnosis, but the growing demand for CT examinations has raised concerns about diagnostic errors. While Multimodal Large Language Models (MLLMs) demonstrate promising comprehension of medical knowledge, their tendency to produce inaccurate information highlights the need for rigorous validation. However, existing medical visual question an… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: 14 pages, 5 figures, submitted to CVPR 2025

  6. Navigating High-Dimensional Backstage: A Guide for Exploring Literature for the Reliable Use of Dimensionality Reduction

    Authors: Hyeon Jeon, Hyunwook Lee, Yun-Hsin Kuo, Taehyun Yang, Daniel Archambault, Sungahn Ko, Takanori Fujiwara, Kwan-Liu Ma, Jinwook Seo

    Abstract: Visual analytics using dimensionality reduction (DR) can easily be unreliable for various reasons, e.g., inherent distortions in representing the original data. The literature has thus proposed a wide range of methodologies to make DR-based visual analytics reliable. However, the diversity and extensiveness of the literature can leave novice analysts and researchers uncertain about where to begin… ▽ More

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: EG/VGTC EuroVis 2025 Short paper

  7. arXiv:2506.13697  [pdf, ps, other]

    cs.CV

    Vid-CamEdit: Video Camera Trajectory Editing with Generative Rendering from Estimated Geometry

    Authors: Junyoung Seo, Jisang Han, Jaewoo Jung, Siyoon Jin, Joungbin Lee, Takuya Narihira, Kazumi Fukuda, Takashi Shibuya, Donghoon Ahn, Shoukang Hu, Seungryong Kim, Yuki Mitsufuji

    Abstract: We introduce Vid-CamEdit, a novel framework for video camera trajectory editing, enabling the re-synthesis of monocular videos along user-defined camera paths. This task is challenging due to its ill-posed nature and the limited multi-view video data for training. Traditional reconstruction methods struggle with extreme trajectory changes, and existing generative models for dynamic novel view synt… ▽ More

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: Our project page can be found at https://cvlab-kaist.github.io/Vid-CamEdit/

  8. arXiv:2506.10197  [pdf]

    cs.HC

    Intergenerational AI Literacy in Korean Immigrant Families: Interpretive Gatekeeping Meets Convenient Critical Deferment

    Authors: Jeongone Seo, Ryan Womack, Tawfiq Ammari

    Abstract: As artificial intelligence (AI) becomes deeply integrated into family life, immigrant families must navigate unique intergenerational, linguistic, and cultural challenges. This study examines how Korean immigrant families in the United States negotiate the use of AI tools such as ChatGPT and smart assistants in their homes. Through 20 semi-structured interviews with parents and teens, we identify… ▽ More

    Submitted 11 June, 2025; originally announced June 2025.

  9. arXiv:2506.08725  [pdf]

    cs.HC cs.LG

    Stop Misusing t-SNE and UMAP for Visual Analytics

    Authors: Hyeon Jeon, Jeongin Park, Sungbok Shin, Jinwook Seo

    Abstract: Misuses of t-SNE and UMAP in visual analytics have become increasingly common. For example, although t-SNE and UMAP projections often do not faithfully reflect true distances between clusters, practitioners frequently use them to investigate inter-cluster relationships. In this paper, we bring this issue to the surface and comprehensively investigate why such misuse occurs and how to prevent it. W… ▽ More

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 9 pages

  10. arXiv:2506.08644  [pdf, other]

    cs.LG

    Semi-gradient DICE for Offline Constrained Reinforcement Learning

    Authors: Woosung Kim, JunHo Seo, Jongmin Lee, Byung-Jun Lee

    Abstract: Stationary Distribution Correction Estimation (DICE) addresses the mismatch between the stationary distribution induced by a policy and the target distribution required for reliable off-policy evaluation (OPE) and policy optimization. DICE-based offline constrained RL particularly benefits from the flexibility of DICE, as it simultaneously maximizes return while estimating costs in offline setting… ▽ More

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: Constrained Offline Reinforcement Learning

  11. arXiv:2506.04283  [pdf, ps, other]

    cs.GR cs.AI cs.CV

    SSIMBaD: Sigma Scaling with SSIM-Guided Balanced Diffusion for AnimeFace Colorization

    Authors: Junpyo Seo, Hanbin Koo, Jieun Yook, Byung-Ro Moon

    Abstract: We propose a novel diffusion-based framework for automatic colorization of Anime-style facial sketches. Our method preserves the structural fidelity of the input sketch while effectively transferring stylistic attributes from a reference image. Unlike traditional approaches that rely on predefined noise schedules -- which often compromise perceptual consistency -- our framework builds on continuous… ▽ More

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: 10 pages, rest of the pages are appendix

  12. arXiv:2505.23806  [pdf, ps, other]

    cs.CL cs.AI

    MedOrchestra: A Hybrid Cloud-Local LLM Approach for Clinical Data Interpretation

    Authors: Sihyeon Lee, Hyunjoo Song, Jong-chan Lee, Yoon Jin Lee, Boram Lee, Hee-Eon Lim, Dongyeong Kim, Jinwook Seo, Bohyoung Kim

    Abstract: Deploying large language models (LLMs) in clinical settings faces critical trade-offs: cloud LLMs, with their extensive parameters and superior performance, pose risks to sensitive clinical data privacy, while local LLMs preserve privacy but often fail at complex clinical interpretation tasks. We propose MedOrchestra, a hybrid framework where a cloud LLM decomposes complex clinical tasks into mana… ▽ More

    Submitted 27 May, 2025; originally announced May 2025.

  13. arXiv:2505.23400  [pdf, ps, other]

    cs.CV

    Bridging Geometric and Semantic Foundation Models for Generalized Monocular Depth Estimation

    Authors: Sanggyun Ma, Wonjoon Choi, Jihun Park, Jaeyeul Kim, Seunghun Lee, Jiwan Seo, Sunghoon Im

    Abstract: We present Bridging Geometric and Semantic (BriGeS), an effective method that fuses geometric and semantic information within foundation models to enhance Monocular Depth Estimation (MDE). Central to BriGeS is the Bridging Gate, which integrates the complementary strengths of depth and segmentation foundation models. This integration is further refined by our Attention Temperature Scaling techniqu… ▽ More

    Submitted 29 May, 2025; originally announced May 2025.

  14. arXiv:2505.21467  [pdf, ps, other]

    cs.CL

    Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion

    Authors: Zhanqiu Hu, Jian Meng, Yash Akhauri, Mohamed S. Abdelfattah, Jae-sun Seo, Zhiru Zhang, Udit Gupta

    Abstract: Diffusion language models offer parallel token generation and inherent bidirectionality, promising more efficient and powerful sequence modeling compared to autoregressive approaches. However, state-of-the-art diffusion models (e.g., Dream 7B, LLaDA 8B) suffer from slow inference. While they match the quality of similarly sized Autoregressive (AR) Models (e.g., Qwen2.5 7B, Llama3 8B), their iterat… ▽ More

    Submitted 27 May, 2025; originally announced May 2025.

  15. arXiv:2505.18326  [pdf, ps, other]

    cs.CY cs.HC

    Pragmatic Disengagement and Culturally Situated Non Use Older Korean Immigrants Strategies for Navigating Digital Noise

    Authors: Jeongone Seo, Tawfiq Ammari

    Abstract: Older immigrant adults often face layered barriers to digital participation, including language exclusion, generational divides, and emotional fatigue. This study examines how older Korean immigrants in the greater NYC area selectively engage with digital tools such as smartphones, YouTube, and AI platforms. Using a community-based participatory research (CBPR) framework and 22 semi-structured int… ▽ More

    Submitted 23 May, 2025; originally announced May 2025.

  16. arXiv:2505.00779  [pdf, other]

    cs.RO cs.LG eess.SY

    Uncertainty-aware Latent Safety Filters for Avoiding Out-of-Distribution Failures

    Authors: Junwon Seo, Kensuke Nakamura, Andrea Bajcsy

    Abstract: Recent advances in generative world models have enabled classical safe control methods, such as Hamilton-Jacobi (HJ) reachability, to generalize to complex robotic systems operating directly from high-dimensional sensor observations. However, obtaining comprehensive coverage of all safety-critical scenarios during world model training is extremely challenging. As a result, latent safety filters bu… ▽ More

    Submitted 1 May, 2025; originally announced May 2025.

  17. arXiv:2504.17080  [pdf, other]

    cs.RO eess.SY

    Geometric Formulation of Unified Force-Impedance Control on SE(3) for Robotic Manipulators

    Authors: Joohwan Seo, Nikhil Potu Surya Prakash, Soomi Lee, Arvind Kruthiventy, Megan Teng, Jongeun Choi, Roberto Horowitz

    Abstract: In this paper, we present an impedance control framework on the SE(3) manifold, which enables force tracking while guaranteeing passivity. Building upon the unified force-impedance control (UFIC) and our previous work on geometric impedance control (GIC), we develop the geometric unified force impedance control (GUFIC) to account for the SE(3) manifold structure in the controller formulation using… ▽ More

    Submitted 23 April, 2025; originally announced April 2025.

    Comments: Submitted to Control Decision Conference (CDC) 2025

  18. arXiv:2504.09859  [pdf, other]

    cs.HC

    Can VLMs Assess Similarity Between Graph Visualizations?

    Authors: Seokweon Jung, Hyeon Jeon, Jeongmin Rhee, Jinwook Seo

    Abstract: Graph visualizations have been studied for tasks such as clustering and temporal analysis, but how these visual similarities relate to established graph similarity measures remains unclear. In this paper, we explore the potential of Vision Language Models (VLMs) to approximate human-like perception of graph similarity. We generate graph datasets of various sizes and densities and compare VLM-deriv… ▽ More

    Submitted 14 April, 2025; originally announced April 2025.

  19. arXiv:2504.06264  [pdf, other]

    cs.CV

    D^2USt3R: Enhancing 3D Reconstruction with 4D Pointmaps for Dynamic Scenes

    Authors: Jisang Han, Honggyu An, Jaewoo Jung, Takuya Narihira, Junyoung Seo, Kazumi Fukuda, Chaehyun Kim, Sunghwan Hong, Yuki Mitsufuji, Seungryong Kim

    Abstract: We address the task of 3D reconstruction in dynamic scenes, where object motions degrade the quality of previous 3D pointmap regression methods, such as DUSt3R, originally designed for static 3D scene reconstruction. Although these methods provide an elegant and powerful solution in static settings, they struggle in the presence of dynamic motions that disrupt alignment based solely on camera pose… ▽ More

    Submitted 8 April, 2025; originally announced April 2025.

    Comments: project page: https://cvlab-kaist.github.io/DDUSt3R/

  20. arXiv:2504.03979  [pdf, other]

    cs.CL cond-mat.mtrl-sci cs.IR

    Structured Extraction of Process Structure Properties Relationships in Materials Science

    Authors: Amit K Verma, Zhisong Zhang, Junwon Seo, Robin Kuo, Runbo Jiang, Emma Strubell, Anthony D Rollett

    Abstract: With the advent of large language models (LLMs), the vast unstructured text within millions of academic papers is increasingly accessible for materials discovery, although significant challenges remain. While LLMs offer promising few- and zero-shot learning capabilities, particularly valuable in the materials domain where expert annotations are scarce, general-purpose LLMs often fail to address ke… ▽ More

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: 16 pages, 3 figures, 13 tables

  21. arXiv:2503.13180  [pdf, other]

    cs.LG cs.AI cs.DC

    GC-Fed: Gradient Centralized Federated Learning with Partial Client Participation

    Authors: Jungwon Seo, Ferhat Ozgur Catak, Chunming Rong, Kibeom Hong, Minhoe Kim

    Abstract: Federated Learning (FL) enables privacy-preserving multi-source information fusion (MSIF) but is challenged by client drift in highly heterogeneous data settings. Many existing drift-mitigation strategies rely on reference-based techniques--such as gradient adjustments or proximal loss--that use historical snapshots (e.g., past gradients or previous global models) as reference points. When only a… ▽ More

    Submitted 20 March, 2025; v1 submitted 17 March, 2025; originally announced March 2025.

  22. arXiv:2503.12806  [pdf, other]

    cs.MM cs.CV cs.SD eess.AS

    AV-Surf: Surface-Enhanced Geometry-Aware Novel-View Acoustic Synthesis

    Authors: Hadam Baek, Hannie Shin, Jiyoung Seo, Chanwoo Kim, Saerom Kim, Hyeongbok Kim, Sangpil Kim

    Abstract: Accurately modeling sound propagation with complex real-world environments is essential for Novel View Acoustic Synthesis (NVAS). While previous studies have leveraged visual perception to estimate spatial acoustics, the combined use of surface normal and structural details from 3D representations in acoustic modeling has been underexplored. Given their direct impact on sound wave reflections and… ▽ More

    Submitted 17 March, 2025; originally announced March 2025.

  23. arXiv:2503.09829  [pdf, other]

    cs.RO cs.LG eess.SY

    SE(3)-Equivariant Robot Learning and Control: A Tutorial Survey

    Authors: Joohwan Seo, Soochul Yoo, Junwoo Chang, Hyunseok An, Hyunwoo Ryu, Soomi Lee, Arvind Kruthiventy, Jongeun Choi, Roberto Horowitz

    Abstract: Recent advances in deep learning and Transformers have driven major breakthroughs in robotics by employing techniques such as imitation learning, reinforcement learning, and LLM-based multimodal perception and decision-making. However, conventional deep learning and Transformer models often struggle to process data with inherent symmetries and invariances, typically relying on large datasets or ex… ▽ More

    Submitted 23 April, 2025; v1 submitted 12 March, 2025; originally announced March 2025.

    Comments: Accepted to International Journal of Control, Automation and Systems (IJCAS)

  24. "Sighted People Have Their Pick Of The Litter": Unpacking The Need For Digital Mental Health (DMH) Tracking Services With And For The Blind Community

    Authors: Omar Khan, JooYoung Seo

    Abstract: The proliferation of digital mental health (DMH) tracking services promises personalized support, yet accessibility barriers limit equal access. This study investigates blind community experiences with DMH tracking services across the United States as a step toward inclusive health technology design. Working with blind advocacy organizations, we distributed a cross-sectional observational survey (… ▽ More

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: Accepted to CHI 2025

  25. arXiv:2503.06491  [pdf, other]

    cs.CL cs.LG

    MoFE: Mixture of Frozen Experts Architecture

    Authors: Jean Seo, Jaeyoon Kim, Hyopil Shin

    Abstract: We propose the Mixture of Frozen Experts (MoFE) architecture, which integrates Parameter-efficient Fine-tuning (PEFT) and the Mixture of Experts (MoE) architecture to enhance both training efficiency and model scalability. By freezing the Feed Forward Network (FFN) layers within the MoE framework, MoFE significantly reduces the number of trainable parameters, improving training efficiency while st… ▽ More

    Submitted 9 March, 2025; originally announced March 2025.

    Comments: NAACL 2025 Industry

  26. arXiv:2503.06413  [pdf, other]

    stat.ML cs.AI cs.LG

    Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models

    Authors: Nguyen Do, Truc Nguyen, Malik Hassanaly, Raed Alharbi, Jung Taek Seo, My T. Thai

    Abstract: Despite a plethora of anomaly detection models developed over the years, their ability to generalize to unseen anomalies remains an issue, particularly in critical systems. This paper aims to address this challenge by introducing Swift Hydra, a new framework for training an anomaly detection method based on generative AI and reinforcement learning (RL). Through featuring an RL policy that operates… ▽ More

    Submitted 24 March, 2025; v1 submitted 8 March, 2025; originally announced March 2025.

    Journal ref: In The Thirteenth International Conference on Learning Representations, 2025

  27. arXiv:2503.04807  [pdf, other]

    cs.CL cs.AI

    Call for Rigor in Reporting Quality of Instruction Tuning Data

    Authors: Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim

    Abstract: Instruction tuning is crucial for adapting large language models (LLMs) to align with user intentions. Numerous studies emphasize the significance of the quality of instruction tuning (IT) data, revealing a strong correlation between IT data quality and the alignment performance of LLMs. In these studies, the quality of IT data is typically assessed by evaluating the performance of LLMs trained wi… ▽ More

    Submitted 15 May, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

    Comments: Accepted to ACL 2025 (main conference)

  28. arXiv:2503.01097  [pdf, other]

    cs.LG

    Measuring the Validity of Clustering Validation Datasets

    Authors: Hyeon Jeon, Michaël Aupetit, DongHwa Shin, Aeri Cho, Seokhyeon Park, Jinwook Seo

    Abstract: Clustering techniques are often validated using benchmark datasets where class labels are used as ground-truth clusters. However, depending on the datasets, class labels may not align with the actual data clusters, and such misalignment hampers accurate validation. Therefore, it is essential to evaluate and compare datasets regarding their cluster-label matching (CLM), i.e., how well their class l… ▽ More

    Submitted 2 March, 2025; originally announced March 2025.

    Comments: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

  29. arXiv:2502.15826  [pdf, other]

    cs.CL cs.AI

    CoME: An Unlearning-based Approach to Conflict-free Model Editing

    Authors: Dahyun Jung, Jaehyung Seo, Jaewook Lee, Chanjun Park, Heuiseok Lim

    Abstract: Large language models (LLMs) often retain outdated or incorrect information from pre-training, which undermines their reliability. While model editing methods have been developed to address such errors without full re-training, they frequently suffer from knowledge conflicts, where outdated information interferes with new knowledge. In this work, we propose Conflict-free Model Editing (CoME), a no… ▽ More

    Submitted 19 February, 2025; originally announced February 2025.

    Comments: Accepted to NAACL 2025 main conference

  30. arXiv:2502.15395  [pdf, ps, other]

    cs.HC

    Beyond Tools: Understanding How Heavy Users Integrate LLMs into Everyday Tasks and Decision-Making

    Authors: Eunhye Kim, Kiroong Choe, Minju Yoo, Sadat Shams Chowdhury, Jinwook Seo

    Abstract: Large language models (LLMs) are increasingly used for both everyday and specialized tasks. While HCI research focuses on domain-specific applications, little is known about how heavy users integrate LLMs into everyday decision-making. Through qualitative interviews with heavy LLM users (n=7) who employ these systems for both intuitive and analytical thinking tasks, our findings show that particip… ▽ More

    Submitted 16 April, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

  31. arXiv:2502.12560  [pdf, other]

    cs.CL

    How does a Language-Specific Tokenizer affect LLMs?

    Authors: Jean Seo, Jaeyoon Kim, SungJoo Byun, Hyopil Shin

    Abstract: The necessity of language-specific tokenizers intuitively appears crucial for effective natural language processing, yet empirical analyses on their significance and underlying reasons are lacking. This study explores how language-specific tokenizers influence the behavior of Large Language Models predominantly trained with English text data, through the case study of Korean. The research unfolds… ▽ More

    Submitted 21 February, 2025; v1 submitted 18 February, 2025; originally announced February 2025.

  32. arXiv:2502.08690  [pdf, other]

    cs.LG cs.AI cs.CV

    Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation

    Authors: Hoigi Seo, Wongi Jeong, Jae-sun Seo, Se Young Chun

    Abstract: Large-scale text encoders in text-to-image (T2I) diffusion models have demonstrated exceptional performance in generating high-quality images from textual prompts. Unlike denoising modules that rely on multiple iterative steps, text encoders require only a single forward pass to produce text embeddings. However, despite their minimal contribution to total inference time and floating-point operatio… ▽ More

    Submitted 12 February, 2025; originally announced February 2025.

  33. Kalman Filter-Based Distributed Gaussian Process for Unknown Scalar Field Estimation in Wireless Sensor Networks

    Authors: Jaemin Seo, Geunsik Bae, Hyondong Oh

    Abstract: In this letter, we propose an online scalar field estimation algorithm of unknown environments using a distributed Gaussian process (DGP) framework in wireless sensor networks (WSNs). While the kernel-based Gaussian process (GP) has been widely employed for estimating unknown scalar fields, its centralized nature is not well-suited for handling a large amount of data from WSNs. To overcome the lim… ▽ More

    Submitted 9 February, 2025; originally announced February 2025.

    Journal ref: Expert Systems with Applications, vol. 285, p. 127822, 2025

  34. arXiv:2502.05609  [pdf, other]

    cs.CL

    Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding

    Authors: Sukmin Cho, Sangjin Choi, Taeho Hwang, Jeongyeon Seo, Soyeong Jeong, Huije Lee, Hoyun Song, Jong C. Park, Youngjin Kwon

    Abstract: Accelerating inference in Large Language Models (LLMs) is critical for real-time interactions, as they have been widely incorporated into real-world services. Speculative decoding, a fully algorithmic solution, has gained attention for improving inference speed by drafting and verifying tokens, thereby generating multiple tokens in a single forward pass. However, current drafting strategies usuall… ▽ More

    Submitted 8 February, 2025; originally announced February 2025.

    Comments: Findings of NAACL 2025

  35. arXiv:2502.05221  [pdf, ps, other]

    math.OC cs.AI

    Blackout DIFUSCO

    Authors: Jun Pyo Seo

    Abstract: This study explores the integration of Blackout Diffusion into the DIFUSCO framework for combinatorial optimization, specifically targeting the Traveling Salesman Problem (TSP). Inspired by the success of discrete-time diffusion models (D3PM) in maintaining structural integrity, we extend the paradigm to a continuous-time framework, leveraging the unique properties of Blackout Diffusion. Continuou… ▽ More

    Submitted 5 June, 2025; v1 submitted 5 February, 2025; originally announced February 2025.

    Comments: 12 pages

  36. arXiv:2502.04018  [pdf, other]

    cs.LG

    PINT: Physics-Informed Neural Time Series Models with Applications to Long-term Inference on WeatherBench 2m-Temperature Data

    Authors: Keon Vin Park, Jisu Kim, Jaemin Seo

    Abstract: This paper introduces PINT (Physics-Informed Neural Time Series Models), a framework that integrates physical constraints into neural time series models to improve their ability to capture complex dynamics. We apply PINT to the ERA5 WeatherBench dataset, focusing on long-term forecasting of 2m-temperature data. PINT incorporates the Simple Harmonic Oscillator Equation as a physics-informed prior,… ▽ More

    Submitted 6 February, 2025; originally announced February 2025.

  37. arXiv:2502.00182  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Understanding Federated Learning from IID to Non-IID dataset: An Experimental Study

    Authors: Jungwon Seo, Ferhat Ozgur Catak, Chunming Rong

    Abstract: As privacy concerns and data regulations grow, federated learning (FL) has emerged as a promising approach for training machine learning models across decentralized data sources without sharing raw data. However, a significant challenge in FL is that client data are often non-IID (non-independent and identically distributed), leading to reduced performance compared to centralized learning. While m… ▽ More

    Submitted 3 June, 2025; v1 submitted 31 January, 2025; originally announced February 2025.

    Journal ref: 36th Norwegian ICT Conference for Research and Education, NIKT 2024

  38. Leveraging Multimodal LLM for Inspirational User Interface Search

    Authors: Seokhyeon Park, Yumin Song, Soohyun Lee, Jaeyoung Kim, Jinwook Seo

    Abstract: Inspirational search, the process of exploring designs to inform and inspire new creative work, is pivotal in mobile user interface (UI) design. However, exploring the vast space of UI references remains a challenge. Existing AI-based UI search methods often miss crucial semantics like target users or the mood of apps. Additionally, these models typically require metadata like view hierarchies, li… ▽ More

    Submitted 15 February, 2025; v1 submitted 29 January, 2025; originally announced January 2025.

    Comments: In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '25)

  39. arXiv:2501.16680  [pdf, ps, other]

    cs.CR cs.DS

    Differentially Private Set Representations

    Authors: Sarvar Patel, Giuseppe Persiano, Joon Young Seo, Kevin Yeo

    Abstract: We study the problem of differentially private (DP) mechanisms for representing sets of size $k$ from a large universe. Our first construction creates $(ε,δ)$-DP representations with error probability of $1/(e^ε+ 1)$ using space at most $1.05 k ε\cdot \log(e)$ bits where the time to construct a representation is $O(k \log(1/δ))$ while decoding time is $O(\log(1/δ))$. We also present a second algor… ▽ More

    Submitted 27 January, 2025; originally announced January 2025.

    Comments: Appears at NeurIPS 2024

  40. arXiv:2501.16510  [pdf, other]

    physics.flu-dyn cs.AI

    Decrypting the temperature field in flow boiling with latent diffusion models

    Authors: UngJin Na, JunYoung Seo, Taeil Kim, ByongGuk Jeon, HangJin Jo

    Abstract: This paper presents an innovative method using Latent Diffusion Models (LDMs) to generate temperature fields from phase indicator maps. By leveraging the BubbleML dataset from numerical simulations, the LDM translates phase field data into corresponding temperature distributions through a two-stage training process involving a vector-quantized variational autoencoder (VQVAE) and a denoising autoen… ▽ More

    Submitted 27 January, 2025; originally announced January 2025.

  41. arXiv:2501.12668  [pdf, other]

    cs.LG cs.AI

    NBDI: A Simple and Effective Termination Condition for Skill Extraction from Task-Agnostic Demonstrations

    Authors: Myunsoo Kim, Hayeong Lee, Seong-Woong Shim, JunHo Seo, Byung-Jun Lee

    Abstract: Intelligent agents are able to make decisions based on different levels of granularity and duration. Recent advances in skill learning enabled the agent to solve complex, long-horizon tasks by effectively guiding the agent in choosing appropriate skills. However, the practice of using fixed-length skills can easily result in skipping valuable decision points, which ultimately limits the potential… ▽ More

    Submitted 20 May, 2025; v1 submitted 22 January, 2025; originally announced January 2025.

  42. Unveiling High-dimensional Backstage: A Survey for Reliable Visual Analytics with Dimensionality Reduction

    Authors: Hyeon Jeon, Hyunwook Lee, Yun-Hsin Kuo, Taehyun Yang, Daniel Archambault, Sungahn Ko, Takanori Fujiwara, Kwan-Liu Ma, Jinwook Seo

    Abstract: Dimensionality reduction (DR) techniques are essential for visually analyzing high-dimensional data. However, visual analytics using DR often face unreliability, stemming from factors such as inherent distortions in DR projections. This unreliability can lead to analytic insights that misrepresent the underlying data, potentially resulting in misguided decisions. To tackle these reliability challe… ▽ More

    Submitted 3 March, 2025; v1 submitted 17 January, 2025; originally announced January 2025.

    Comments: In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '25)

  43. arXiv:2501.09954  [pdf, other]

    cs.LG cs.AI cs.AR

    AIRCHITECT v2: Learning the Hardware Accelerator Design Space through Unified Representations

    Authors: Jamin Seo, Akshat Ramachandran, Yu-Chuan Chuang, Anirudh Itagi, Tushar Krishna

    Abstract: Design space exploration (DSE) plays a crucial role in enabling custom hardware architectures, particularly for emerging applications like AI, where optimized and specialized designs are essential. With the growing complexity of deep neural networks (DNNs) and the introduction of advanced foundational models (FMs), the design space for DNN accelerators is expanding at an exponential rate. Addition…

    Submitted 16 January, 2025; originally announced January 2025.

    Comments: Accepted to DATE 2025

  44. arXiv:2501.05768  [pdf, other]

    cs.LG cs.AI

    Halal or Not: Knowledge Graph Completion for Predicting Cultural Appropriateness of Daily Products

    Authors: Van Thuy Hoang, Tien-Bach-Thanh Do, Jinho Seo, Seung Charlie Kim, Luong Vuong Nguyen, Duong Nguyen Minh Huy, Hyeon-Ju Jeon, O-Joun Lee

    Abstract: The growing demand for halal cosmetic products has exposed significant challenges, especially in Muslim-majority countries. Recently, various machine learning-based strategies, e.g., image-based methods, have shown remarkable success in predicting the halal status of cosmetics. However, these methods mainly focus on analyzing the discrete and specific ingredients within separate cosmetics, which i…

    Submitted 10 January, 2025; originally announced January 2025.

    Comments: 10 pages

  45. StereoMath: An Accessible and Musical Equation Editor

    Authors: Kenneth Ge, JooYoung Seo

    Abstract: For blind and low-vision (BLV) individuals, digital math communication is uniquely difficult due to the lack of accessible tools. Currently, the state of the art is either code-based, like LaTeX, or WYSIWYG, like visual editors. However, both paradigms view math communication as primarily a visual typesetting problem, and may be accessible but difficult to use. In this paper, we present an equatio…

    Submitted 2 January, 2025; originally announced January 2025.

    Comments: 5 pages, 2 figures, accepted as demo paper at ASSETS '24

    Journal ref: ASSETS 2024: Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, Article No.: 129, Pages 1 - 5

  46. arXiv:2412.19450  [pdf, other]

    cs.AI

    Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models

    Authors: Hyeonseok Moon, Jaehyung Seo, Seungyoon Lee, Chanjun Park, Heuiseok Lim

    Abstract: One of the key strengths of Large Language Models (LLMs) is their ability to interact with humans by generating appropriate responses to given instructions. This ability, known as instruction-following capability, has established a foundation for the use of LLMs across various fields and serves as a crucial metric for evaluating their performance. While numerous evaluation benchmarks have been dev…

    Submitted 22 January, 2025; v1 submitted 26 December, 2024; originally announced December 2024.

    Comments: NAACL25-Findings

  47. arXiv:2412.11520  [pdf, other]

    cs.CV cs.AI

    EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting

    Authors: Dong In Lee, Hyeongcheol Park, Jiyoung Seo, Eunbyung Park, Hyunje Park, Ha Dam Baek, Sangheon Shin, Sangmin Kim, Sangpil Kim

    Abstract: Recent advancements in 3D editing have highlighted the potential of text-driven methods in real-time, user-friendly AR/VR applications. However, current methods rely on 2D diffusion models without adequately considering multi-view information, resulting in multi-view inconsistency. While 3D Gaussian Splatting (3DGS) significantly improves rendering quality and speed, its 3D editing process encount…

    Submitted 17 April, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

  48. arXiv:2412.09916  [pdf, other]

    cs.HC

    ProxyLLM : LLM-Driven Framework for Customer Support Through Text-Style Transfer

    Authors: Sehyeong Jo, Jungwon Seo

    Abstract: Chatbot-based customer support services have significantly advanced with the introduction of large language models (LLMs), enabling enhanced response quality and broader application across industries. However, while these advancements focus on reducing business costs and improving customer satisfaction, limited attention has been given to the experiences of customer service agents, who are critica…

    Submitted 17 December, 2024; v1 submitted 13 December, 2024; originally announced December 2024.

  49. arXiv:2411.17338  [pdf, other]

    cs.CL cs.AI cs.CY

    Different Bias Under Different Criteria: Assessing Bias in LLMs with a Fact-Based Approach

    Authors: Changgeon Ko, Jisu Shin, Hoyun Song, Jeongyeon Seo, Jong C. Park

    Abstract: Large language models (LLMs) often reflect real-world biases, leading to efforts to mitigate these effects and make the models unbiased. Achieving this goal requires defining clear criteria for an unbiased state, with any deviation from these criteria considered biased. Some studies define an unbiased state as equal treatment across diverse demographic groups, aiming for balanced outputs from LLMs…

    Submitted 26 November, 2024; originally announced November 2024.

    Comments: Accepted in NeurIPS 2024 Workshop on Socially Responsible Language Modelling Research (SoLaR)

  50. arXiv:2411.09255  [pdf, other]

    cs.CL

    DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine

    Authors: Jean Seo, Jongwon Lim, Dongjun Jang, Hyopil Shin

    Abstract: We introduce DAHL, a benchmark dataset and automated evaluation system designed to assess hallucination in long-form text generation, specifically within the biomedical domain. Our benchmark dataset, meticulously curated from biomedical research papers, consists of 8,573 questions across 29 categories. DAHL evaluates fact-conflicting hallucinations in Large Language Models (LLMs) by deconstructing…

    Submitted 14 November, 2024; originally announced November 2024.

    Comments: EMNLP2024/FEVER