Showing 1–50 of 661 results for author: Kang, J

Searching in archive cs.
  1. arXiv:2505.08787  [pdf, other]

    cs.RO cs.CV

    UniSkill: Imitating Human Videos via Cross-Embodiment Skill Representations

    Authors: Hanjung Kim, Jaehyun Kang, Hyolim Kang, Meedeum Cho, Seon Joo Kim, Youngwoon Lee

    Abstract: Mimicry is a fundamental learning mechanism in humans, enabling individuals to learn new tasks by observing and imitating experts. However, applying this ability to robots presents significant challenges due to the inherent differences between human and robot embodiments in both their visual appearance and physical capabilities. While previous methods bridge this gap using cross-embodiment dataset…

    Submitted 15 May, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

    Comments: Project Page: https://kimhanjung.github.io/UniSkill/

  2. arXiv:2505.08504  [pdf, ps, other]

    cs.CL

    Reassessing Graph Linearization for Sequence-to-sequence AMR Parsing: On the Advantages and Limitations of Triple-Based Encoding

    Authors: Jeongwoo Kang, Maximin Coavoux, Cédric Lopez, Didier Schwab

    Abstract: Sequence-to-sequence models are widely used to train Abstract Meaning Representation (Banarescu et al., 2013, AMR) parsers. To train such models, AMR graphs have to be linearized into a one-line text format. While Penman encoding is typically used for this purpose, we argue that it has limitations: (1) for deep graphs, some closely related nodes are located far apart in the linearized text (2) Pen…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: published at Insights from Negative Results in NLP (workshop EMNLP 2025)

  3. arXiv:2505.08020  [pdf, ps, other]

    math.CO cs.DM cs.DS

    Reconfiguration of List Colourings

    Authors: Stijn Cambie, Wouter Cames van Batenburg, Daniel W. Cranston, Jan van den Heuvel, Ross J. Kang

    Abstract: Given a proper (list) colouring of a graph $G$, a recolouring step changes the colour at a single vertex to another colour (in its list) that is currently unused on its neighbours, hence maintaining a proper colouring. Suppose that each vertex $v$ has its own private list $L(v)$ of allowed colours such that $|L(v)|\ge \mbox{deg}(v)+1$. We prove that if $G$ is connected and its maximum degree $Δ$ i…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 27 pages, 4 figures

  4. arXiv:2505.07290  [pdf, other]

    cs.NI

    Multi-Agent DRL for Multi-Objective Twin Migration Routing with Workload Prediction in 6G-enabled IoV

    Authors: Peng Yin, Wentao Liang, Jinbo Wen, Jiawen Kang, Junlong Chen, Dusit Niyato

    Abstract: Sixth Generation (6G)-enabled Internet of Vehicles (IoV) facilitates efficient data synchronization through ultra-fast bandwidth and high-density connectivity, enabling the emergence of Vehicle Twins (VTs). As highly accurate replicas of vehicles, VTs can support intelligent vehicular applications for occupants in 6G-enabled IoV. Thanks to the full coverage capability of 6G, resource-constrained v…

    Submitted 12 May, 2025; originally announced May 2025.

  5. arXiv:2505.07176  [pdf, other]

    cs.MA cs.AI

    Internet of Agents: Fundamentals, Applications, and Challenges

    Authors: Yuntao Wang, Shaolong Guo, Yanghe Pan, Zhou Su, Fahao Chen, Tom H. Luan, Peng Li, Jiawen Kang, Dusit Niyato

    Abstract: With the rapid proliferation of large language models and vision-language models, AI agents have evolved from isolated, task-specific systems into autonomous, interactive entities capable of perceiving, reasoning, and acting without human intervention. As these agents proliferate across virtual and physical environments, from virtual assistants to embodied robots, the need for a unified, agent-cen…

    Submitted 11 May, 2025; originally announced May 2025.

    Comments: 22 pages, 10 figures, 8 tables. Submitted to IEEE TCCN

  6. arXiv:2505.06378  [pdf, other]

    cs.GT cs.AI

    Bi-LSTM based Multi-Agent DRL with Computation-aware Pruning for Agent Twins Migration in Vehicular Embodied AI Networks

    Authors: Yuxiang Wei, Zhuoqi Zeng, Yue Zhong, Jiawen Kang, Ryan Wen Liu, M. Shamim Hossain

    Abstract: With the advancement of large language models and embodied Artificial Intelligence (AI) in the intelligent transportation scenarios, the combination of them in intelligent transportation spawns the Vehicular Embodied AI Network (VEANs). In VEANs, Autonomous Vehicles (AVs) are typical agents whose local advanced AI applications are defined as vehicular embodied AI agents, enabling capabilities such…

    Submitted 9 May, 2025; originally announced May 2025.

  7. arXiv:2505.05798  [pdf, ps, other]

    cs.LG cs.CV eess.IV eess.SP

    Improving Generalizability of Kolmogorov-Arnold Networks via Error-Correcting Output Codes

    Authors: Youngjoon Lee, Jinu Gong, Joonhyuk Kang

    Abstract: Kolmogorov-Arnold Networks (KAN) offer universal function approximation using univariate spline compositions without nonlinear activations. In this work, we integrate Error-Correcting Output Codes (ECOC) into the KAN framework to transform multi-class classification into multiple binary tasks, improving robustness via Hamming-distance decoding. Our proposed KAN with ECOC method outperforms vanilla…

    Submitted 9 May, 2025; originally announced May 2025.

    Comments: 4 pages

  8. arXiv:2505.03777  [pdf, other]

    cs.LG

    MolMole: Molecule Mining from Scientific Literature

    Authors: LG AI Research, Sehyun Chun, Jiye Kim, Ahra Jo, Yeonsik Jo, Seungyul Oh, Seungjun Lee, Kwangrok Ryoo, Jongmin Lee, Seung Hwan Kim, Byung Jun Kang, Soonyoung Lee, Jun Ha Park, Chanwoo Moon, Jiwon Ham, Haein Lee, Heejae Han, Jaeseung Byun, Soojong Do, Minju Ha, Dongyun Kim, Kyunghoon Bae, Woohyung Lim, Edward Hwayoung Lee, Yongmin Park, et al. (9 additional authors not shown)

    Abstract: The extraction of molecular structures and reaction data from scientific documents is challenging due to their varied, unstructured chemical formats and complex document layouts. To address this, we introduce MolMole, a vision-based deep learning framework that unifies molecule detection, reaction diagram parsing, and optical chemical structure recognition (OCSR) into a single pipeline for automat…

    Submitted 7 May, 2025; v1 submitted 30 April, 2025; originally announced May 2025.

    Comments: 15 pages, 12 figures

  9. arXiv:2505.00055  [pdf, other]

    cs.MA cs.GT

    TinyMA-IEI-PPO: Exploration Incentive-Driven Multi-Agent DRL with Self-Adaptive Pruning for Vehicular Embodied AI Agent Twins Migration

    Authors: Zhuoqi Zeng, Yuxiang Wei, Jiawen Kang

    Abstract: Embodied Artificial Intelligence (EAI) addresses autonomous driving challenges in Vehicular Embodied AI Networks (VEANETs) through multi-modal perception, adaptive decision-making, and hardware-software co-scheduling. However, the computational demands of virtual services and the inherent mobility of autonomous vehicles (AVs) necessitate real-time migration of Vehicular Embodied Agent AI Twins (VE…

    Submitted 30 April, 2025; originally announced May 2025.

  10. RecGaze: The First Eye Tracking and User Interaction Dataset for Carousel Interfaces

    Authors: Santiago de Leon-Martinez, Jingwei Kang, Robert Moro, Maarten de Rijke, Branislav Kveton, Harrie Oosterhuis, Maria Bielikova

    Abstract: Carousel interfaces are widely used in e-commerce and streaming services, but little research has been devoted to them. Previous studies of interfaces for presenting search and recommendation results have focused on single ranked lists, but it appears their results cannot be extrapolated to carousels due to the added complexity. Eye tracking is a highly informative approach to understanding how us…

    Submitted 29 April, 2025; originally announced April 2025.

    Comments: Accepted to Resource & Reproducibility Track SIGIR '25

  11. arXiv:2504.19660  [pdf, other]

    cs.NI eess.SP

    Decentralization of Generative AI via Mixture of Experts for Wireless Networks: A Comprehensive Survey

    Authors: Yunting Xu, Jiacheng Wang, Ruichen Zhang, Changyuan Zhao, Dusit Niyato, Jiawen Kang, Zehui Xiong, Bo Qian, Haibo Zhou, Shiwen Mao, Abbas Jamalipour, Xuemin Shen, Dong In Kim

    Abstract: Mixture of Experts (MoE) has emerged as a promising paradigm for scaling model capacity while preserving computational efficiency, particularly in large-scale machine learning architectures such as large language models (LLMs). Recent advances in MoE have facilitated its adoption in wireless networks to address the increasing complexity and heterogeneity of modern communication systems. This paper…

    Submitted 28 April, 2025; originally announced April 2025.

    Comments: Survey paper, 30 pages, 13 figures

  12. arXiv:2504.19639  [pdf, other]

    cs.LG eess.SP

    A Unified Benchmark of Federated Learning with Kolmogorov-Arnold Networks for Medical Imaging

    Authors: Youngjoon Lee, Jinu Gong, Joonhyuk Kang

    Abstract: Federated Learning (FL) enables model training across decentralized devices without sharing raw data, thereby preserving privacy in sensitive domains like healthcare. In this paper, we evaluate Kolmogorov-Arnold Networks (KAN) architectures against traditional MLP across six state-of-the-art FL algorithms on a blood cell classification dataset. Notably, our experiments demonstrate that KAN can eff…

    Submitted 28 April, 2025; originally announced April 2025.

    Comments: 5 pages

  13. SoCov: Semi-Orthogonal Parametric Pooling of Covariance Matrix for Speaker Recognition

    Authors: Rongjin Li, Weibin Zhang, Dongpeng Chen, Jintao Kang, Xiaofen Xing

    Abstract: In conventional deep speaker embedding frameworks, the pooling layer aggregates all frame-level features over time and computes their mean and standard deviation statistics as inputs to subsequent segment-level layers. Such statistics pooling strategy produces fixed-length representations from variable-length speech segments. However, this method treats different frame-level features equally and d…

    Submitted 23 April, 2025; originally announced April 2025.

    Comments: This paper has been accepted by IEEE ICASSP2025

  14. FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep Learning

    Authors: Ju Yeon Kang, Ji Won Yoon, Semin Kim, Min Hyun Han, Nam Soo Kim

    Abstract: Recently, fake audio detection has gained significant attention, as advancements in speech synthesis and voice conversion have increased the vulnerability of automatic speaker verification (ASV) systems to spoofing attacks. A key challenge in this task is generalizing models to detect unseen, out-of-distribution (OOD) attacks. Although existing approaches have shown promising results, they inheren…

    Submitted 22 April, 2025; originally announced April 2025.

    Comments: Accepted at ICASSP 2025

  15. arXiv:2504.14802  [pdf, other]

    cs.DC

    ReCraft: Self-Contained Split, Merge, and Membership Change of Raft Protocol

    Authors: Kezhi Xiong, Soonwon Moon, Joshua Kang, Bryant Curto, Jieung Kim, Ji-Yong Shin

    Abstract: Designing reconfiguration schemes for consensus protocols is challenging because subtle corner cases during reconfiguration could invalidate the correctness of the protocol. Thus, most systems that embed consensus protocols conservatively implement the reconfiguration and refrain from developing an efficient scheme. Existing implementations often stop the entire system during reconfiguration and r…

    Submitted 27 April, 2025; v1 submitted 20 April, 2025; originally announced April 2025.

    Journal ref: The 55th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (2025)

  16. arXiv:2504.14326  [pdf, other]

    cs.NI

    Diffusion-based Dynamic Contract for Federated AI Agent Construction in Mobile Metaverses

    Authors: Jinbo Wen, Jiawen Kang, Yang Zhang, Yue Zhong, Dusit Niyato, Jie Xu, Jianhang Tang, Chau Yuen

    Abstract: Mobile metaverses have attracted significant attention from both academia and industry, which are envisioned as the next-generation Internet, providing users with immersive and ubiquitous metaverse services through mobile devices. Driven by Large Language Models (LLMs) and Vision-Language Models (VLMs), Artificial Intelligence (AI) agents hold the potential to empower the creation, maintenance, an…

    Submitted 19 April, 2025; originally announced April 2025.

  17. arXiv:2504.12339  [pdf, other]

    cs.CL cs.SD eess.AS

    GOAT-TTS: LLM-based Text-To-Speech Generation Optimized via A Dual-Branch Architecture

    Authors: Yaodong Song, Hongjie Chen, Jie Lian, Yuxin Zhang, Guangmin Xia, Zehan Li, Genliang Zhao, Jian Kang, Yongxiang Li, Jie Li

    Abstract: While large language models (LLMs) have revolutionized text-to-speech (TTS) synthesis through discrete tokenization paradigms, current architectures exhibit fundamental tensions between three critical dimensions: 1) irreversible loss of acoustic characteristics caused by quantization of speech prompts; 2) stringent dependence on precisely aligned prompt speech-text pairs that limit real-world depl…

    Submitted 14 April, 2025; originally announced April 2025.

  18. arXiv:2504.11305  [pdf, other]

    cs.CV cs.AI

    CFIS-YOLO: A Lightweight Multi-Scale Fusion Network for Edge-Deployable Wood Defect Detection

    Authors: Jincheng Kang, Yi Cen, Yigang Cen, Ke Wang, Yuhan Liu

    Abstract: Wood defect detection is critical for ensuring quality control in the wood processing industry. However, current industrial applications face two major challenges: traditional methods are costly, subjective, and labor-intensive, while mainstream deep learning models often struggle to balance detection accuracy and computational efficiency for edge deployment. To address these issues, this study pr…

    Submitted 15 April, 2025; originally announced April 2025.

    Comments: 10 pages, 11 figures

  19. arXiv:2504.09609  [pdf, other]

    cs.RO cs.AI

    A highly maneuverable flying squirrel drone with agility-improving foldable wings

    Authors: Dohyeon Lee, Jun-Gill Kang, Soohee Han

    Abstract: Drones, like most airborne aerial vehicles, face inherent disadvantages in achieving agile flight due to their limited thrust capabilities. These physical constraints cannot be fully addressed through advancements in control algorithms alone. Drawing inspiration from the winged flying squirrel, this paper proposes a highly maneuverable drone equipped with agility-enhancing foldable wings. By lever…

    Submitted 8 May, 2025; v1 submitted 13 April, 2025; originally announced April 2025.

    Comments: Accepted to IEEE Robotics and Automation Letters. Project Page : https://jgkang1210.github.io/fsdrone_ral/ , Video : https://www.youtube.com/watch?v=tckIF3KCJig , Dohyeon Lee and Jun-Gill Kang are co-authors

  20. arXiv:2504.09513  [pdf, other]

    cs.CV

    DiffuMural: Restoring Dunhuang Murals with Multi-scale Diffusion

    Authors: Puyu Han, Jiaju Kang, Yuhang Pan, Erting Pan, Zeyu Zhang, Qunchao Jin, Juntao Jiang, Zhichen Liu, Luqi Gong

    Abstract: Large-scale pre-trained diffusion models have produced excellent results in the field of conditional image generation. However, restoration of ancient murals, as an important downstream task in this field, poses significant challenges to diffusion model-based restoration methods due to its large defective area and scarce training samples. Conditional restoration tasks are more concerned with wheth…

    Submitted 13 April, 2025; originally announced April 2025.

  21. arXiv:2504.09478  [pdf, other]

    cs.RO cs.AI

    A highly maneuverable flying squirrel drone with controllable foldable wings

    Authors: Jun-Gill Kang, Dohyeon Lee, Soohee Han

    Abstract: Typical drones with multi rotors are generally less maneuverable due to unidirectional thrust, which may be unfavorable to agile flight in very narrow and confined spaces. This paper suggests a new bio-inspired drone that is empowered with high maneuverability in a lightweight and easy-to-carry way. The proposed flying squirrel inspired drone has controllable foldable wings to cover a wider range…

    Submitted 13 April, 2025; originally announced April 2025.

    Comments: Accepted at 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Project Page : https://jgkang1210.github.io/fsdrone/ , Video : https://youtu.be/Cfc-llDb3_k?si=Cal5beZw6f3HZ2ZW , Jun-Gill Kang and Dohyeon Lee are co-authors

  22. arXiv:2504.08134  [pdf, other]

    cs.NI

    Hybrid Reinforcement Learning-based Sustainable Multi-User Computation Offloading for Mobile Edge-Quantum Computing

    Authors: Minrui Xu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Mingzhe Chen, Dong In Kim, Xuemin Shen

    Abstract: Exploiting quantum computing at the mobile edge holds immense potential for facilitating large-scale network design, processing multimodal data, optimizing resource management, and enhancing network security. In this paper, we propose a pioneering paradigm of mobile edge quantum computing (MEQC) that integrates quantum computing capabilities into classical edge computing servers that are proximate…

    Submitted 10 April, 2025; originally announced April 2025.

    Comments: arXiv admin note: substantial text overlap with arXiv:2211.06681

  23. arXiv:2504.06979  [pdf]

    q-bio.QM cs.LG

    Artificial Intelligence for Pediatric Height Prediction Using Large-Scale Longitudinal Body Composition Data

    Authors: Dohyun Chun, Hae Woon Jung, Jongho Kang, Woo Young Jang, Jihun Kim

    Abstract: This study developed an accurate artificial intelligence model for predicting future height in children and adolescents using anthropometric and body composition data from the GP Cohort Study (588,546 measurements from 96,485 children aged 7-18). The model incorporated anthropometric measures, body composition, standard deviation scores, and growth velocity parameters, with performance evaluated u…

    Submitted 9 April, 2025; originally announced April 2025.

    Comments: 23 pages, 7 figures, 2 tables

    MSC Class: 62P10; 68T05

  24. arXiv:2504.05615  [pdf, other]

    cs.LG cs.AI

    FedEFC: Federated Learning Using Enhanced Forward Correction Against Noisy Labels

    Authors: Seunghun Yu, Jin-Hyun Ahn, Joonhyuk Kang

    Abstract: Federated Learning (FL) is a powerful framework for privacy-preserving distributed learning. It enables multiple clients to collaboratively train a global model without sharing raw data. However, handling noisy labels in FL remains a major challenge due to heterogeneous data distributions and communication constraints, which can severely degrade model performance. To address this issue, we propose…

    Submitted 7 April, 2025; originally announced April 2025.

    Comments: 9 pages, 3 figures

  25. arXiv:2504.04981  [pdf, other]

    cs.CV cs.AI

    DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation

    Authors: Sohyun Lee, Nayeong Kim, Juwon Kang, Seong Joon Oh, Suha Kwak

    Abstract: This paper studies continual test-time adaptation (CTTA), the task of adapting a model to constantly changing unseen domains in testing while preserving previously learned knowledge. Existing CTTA methods mostly focus on adaptation to the current test domain only, overlooking generalization to arbitrary test domains a model may face in the future. To tackle this limitation, we present a novel onli…

    Submitted 7 April, 2025; originally announced April 2025.

  26. arXiv:2504.01855  [pdf, other]

    cs.LG cs.AI

    Enhanced Diffusion Sampling via Extrapolation with Multiple ODE Solutions

    Authors: Jinyoung Choi, Junoh Kang, Bohyung Han

    Abstract: Diffusion probabilistic models (DPMs), while effective in generating high-quality samples, often suffer from high computational costs due to their iterative sampling process. To address this, we propose an enhanced ODE-based sampling method for DPMs inspired by Richardson extrapolation, which reduces numerical error and improves convergence rates. Our method, RX-DPM, leverages multiple ODE solutio…

    Submitted 2 April, 2025; originally announced April 2025.

    Comments: ICLR 2025

  27. arXiv:2504.01162  [pdf, ps, other]

    cs.IR

    Information Retrieval for Climate Impact

    Authors: Maarten de Rijke, Bart van den Hurk, Flora Salim, Alaa Al Khourdajie, Nan Bai, Renato Calzone, Declan Curran, Getnet Demil, Lesley Frew, Noah Gießing, Mukesh Kumar Gupta, Maria Heuss, Sanaa Hobeichi, David Huard, Jingwei Kang, Ana Lucic, Tanwi Mallick, Shruti Nath, Andrew Okem, Barbara Pernici, Thilina Rajapakse, Hira Saleem, Harry Scells, Nicole Schneider, Damiano Spina, et al. (6 additional authors not shown)

    Abstract: The purpose of the MANILA24 Workshop on information retrieval for climate impact was to bring together researchers from academia, industry, governments, and NGOs to identify and discuss core research problems in information retrieval to assess climate change impacts. The workshop aimed to foster collaboration by bringing communities together that have so far not been very well connected -- informa…

    Submitted 1 April, 2025; originally announced April 2025.

    Comments: Report on the MANILA24 Workshop

    ACM Class: H.3.3

  28. arXiv:2503.23361  [pdf, other]

    cs.CL

    Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base

    Authors: Linxin Song, Xuwei Ding, Jieyu Zhang, Taiwei Shi, Ryotaro Shimizu, Rahul Gupta, Yang Liu, Jian Kang, Jieyu Zhao

    Abstract: Large language models (LLMs) possess impressive linguistic capabilities but often fail to faithfully retain factual knowledge, leading to hallucinations and unreliable outputs. Understanding LLMs' knowledge deficiencies by exhaustively evaluating against full-scale knowledge bases is computationally prohibitive, especially for closed-weight models. We propose stochastic error ascent (SEA), a scala…

    Submitted 30 March, 2025; originally announced March 2025.

  29. arXiv:2503.23290  [pdf, other]

    cs.NI

    Efficient Twin Migration in Vehicular Metaverses: Multi-Agent Split Deep Reinforcement Learning with Spatio-Temporal Trajectory Generation

    Authors: Junlong Chen, Jiawen Kang, Minrui Xu, Fan Wu, Hongliang Zhang, Huawei Huang, Dusit Niyato, Shiwen Mao

    Abstract: Vehicle Twins (VTs) as digital representations of vehicles can provide users with immersive experiences in vehicular metaverse applications, e.g., Augmented Reality (AR) navigation and embodied intelligence. VT migration is an effective way that migrates the VT when the locations of physical entities keep changing to maintain seamless immersive VT services. However, an efficient VT migration is ch…

    Submitted 29 March, 2025; originally announced March 2025.

  30. arXiv:2503.19791  [pdf, other]

    cs.CV

    SITA: Structurally Imperceptible and Transferable Adversarial Attacks for Stylized Image Generation

    Authors: Jingdan Kang, Haoxin Yang, Yan Cai, Huaidong Zhang, Xuemiao Xu, Yong Du, Shengfeng He

    Abstract: Image generation technology has brought significant advancements across various fields but has also raised concerns about data misuse and potential rights infringements, particularly with respect to creating visual artworks. Current methods aimed at safeguarding artworks often employ adversarial attacks. However, these methods face challenges such as poor transferability, high computational costs,…

    Submitted 25 March, 2025; originally announced March 2025.

  31. arXiv:2503.16823  [pdf, other]

    cs.ET cs.GT eess.SY

    Federated Digital Twin Construction via Distributed Sensing: A Game-Theoretic Online Optimization with Overlapping Coalitions

    Authors: Ruoyang Chen, Changyan Yi, Fuhui Zhou, Jiawen Kang, Yuan Wu, Dusit Niyato

    Abstract: In this paper, we propose a novel federated framework for constructing the digital twin (DT) model, referring to a living and self-evolving visualization model empowered by artificial intelligence, enabled by distributed sensing under edge-cloud collaboration. In this framework, the DT model to be built at the cloud is regarded as a global one being split into and integrating from multiple functio…

    Submitted 20 March, 2025; originally announced March 2025.

  32. arXiv:2503.15796  [pdf, other]

    cs.LG cs.AI

    Blend the Separated: Mixture of Synergistic Experts for Data-Scarcity Drug-Target Interaction Prediction

    Authors: Xinlong Zhai, Chunchen Wang, Ruijia Wang, Jiazheng Kang, Shujie Li, Boyu Chen, Tengfei Ma, Zikai Zhou, Cheng Yang, Chuan Shi

    Abstract: Drug-target interaction prediction (DTI) is essential in various applications including drug discovery and clinical application. There are two perspectives of input data widely used in DTI prediction: Intrinsic data represents how drugs or targets are constructed, and extrinsic data represents how drugs or targets are related to other biological entities. However, any of the two perspectives of in…

    Submitted 19 March, 2025; originally announced March 2025.

  33. arXiv:2503.14838  [pdf, other]

    cs.SE

    Think Like Human Developers: Harnessing Community Knowledge for Structured Code Reasoning

    Authors: Chengran Yang, Zhensu Sun, Hong Jin Kang, Jieke Shi, David Lo

    Abstract: Large Language Models (LLMs) have significantly advanced automated code generation, yet they struggle with complex coding tasks requiring multi-step logical reasoning. High-quality reasoning data is crucial for improving LLMs' reasoning capabilities, but such datasets remain scarce. Existing approaches either rely on computationally expensive reinforcement learning (RL) or error-prone reasoning ch…

    Submitted 18 March, 2025; originally announced March 2025.

  34. arXiv:2503.13369  [pdf, other]

    cs.AI cs.CV cs.HC

    Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions

    Authors: Wan Ju Kang, Eunki Kim, Na Min An, Sangryul Kim, Haemin Choi, Ki Hoon Kwak, James Thorne

    Abstract: Often, the needs and visual abilities differ between the annotator group and the end user group. Generating detailed diagram descriptions for blind and low-vision (BLV) users is one such challenging domain. Sighted annotators could describe visuals with ease, but existing studies have shown that direct generations by them are costly, bias-prone, and somewhat lacking by BLV standards. In this study…

    Submitted 17 March, 2025; originally announced March 2025.

    Comments: 37 pages, 10 figures, 21 tables

  35. arXiv:2503.13347  [pdf, other]

    cs.CV

    TriDF: Triplane-Accelerated Density Fields for Few-Shot Remote Sensing Novel View Synthesis

    Authors: Jiaming Kang, Keyan Chen, Zhengxia Zou, Zhenwei Shi

    Abstract: Remote sensing novel view synthesis (NVS) offers significant potential for 3D interpretation of remote sensing scenes, with important applications in urban planning and environmental monitoring. However, remote sensing scenes frequently lack sufficient multi-view images due to acquisition constraints. While existing NVS methods tend to overfit when processing limited input views, advanced few-shot…

    Submitted 17 March, 2025; originally announced March 2025.

  36. arXiv:2503.10002  [pdf, ps, other]

    math.CO cs.DM math.PR

    Triangle-free graphs with the fewest independent sets

    Authors: Pjotr Buys, Jan van den Heuvel, Ross J. Kang

    Abstract: Given $d>0$ and a positive integer $n$, let $G$ be a triangle-free graph on $n$ vertices with average degree $d$. With an elegant induction, Shearer (1983) tightened a seminal result of Ajtai, Komlós and Szemerédi (1980/1981) by proving that $G$ contains an independent set of size at least $(1+o(1))\frac{\log d}{d}n$ as $d\to\infty$. By a generalisation of Shearer's method, we prove that the num…

    Submitted 12 March, 2025; originally announced March 2025.

    Comments: 12 pages, 1 figure

    MSC Class: 05C35; 05D40; 05C80; 05C69; 60C05; 05A16; 82B20

  37. arXiv:2503.06149  [pdf, other]

    cs.IT eess.SP

    Wireless Hallucination in Generative AI-enabled Communications: Concepts, Issues, and Solutions

    Authors: Xudong Wang, Jiacheng Wang, Lei Feng, Dusit Niyato, Ruichen Zhang, Jiawen Kang, Zehui Xiong, Hongyang Du, Shiwen Mao

    Abstract: Generative AI (GenAI) is driving the intelligence of wireless communications. Due to data limitations, random generation, and dynamic environments, GenAI may generate channel information or optimization strategies that violate physical laws or deviate from actual real-world requirements. We refer to this phenomenon as wireless hallucination, which results in invalid channel information, spectrum w…

    Submitted 8 March, 2025; originally announced March 2025.

    Comments: 7 pages, 4 figures

  38. arXiv:2503.03998  [pdf, other]

    cs.RO

    Robotic Compliant Object Prying Using Diffusion Policy Guided by Vision and Force Observations

    Authors: Jeon Ho Kang, Sagar Joshi, Ruopeng Huang, Satyandra K. Gupta

    Abstract: The growing adoption of batteries in the electric vehicle industry and various consumer products has created an urgent need for effective recycling solutions. These products often contain a mix of compliant and rigid components, making robotic disassembly a critical step toward achieving scalable recycling processes. Diffusion policy has emerged as a promising approach for learning low-level skill…

    Submitted 17 March, 2025; v1 submitted 5 March, 2025; originally announced March 2025.

    Comments: Accepted to IEEE RA-L. (C) 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media. 8 pages with 9 figures

  39. arXiv:2503.02720  [pdf, other]

    cs.RO cs.AI

    Vibration-Assisted Hysteresis Mitigation for Achieving High Compensation Efficiency

    Authors: Myeongbo Park, Chunggil An, Junhyun Park, Jonghyun Kang, Minho Hwang

    Abstract: Tendon-sheath mechanisms (TSMs) are widely used in minimally invasive surgical (MIS) applications, but their inherent hysteresis, caused by friction, backlash, and tendon elongation, leads to significant tracking errors. Conventional modeling and compensation methods struggle with these nonlinearities and require extensive parameter tuning. To address this, we propose a vibration-assisted hysteresis…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: 8 pages, 7 figures, and 2 tables

  40. arXiv:2503.01449  [pdf, other]

    cs.SE

    Benchmarking Large Language Models for Multi-Language Software Vulnerability Detection

    Authors: Ting Zhang, Chengran Yang, Yindu Su, Martin Weyssow, Hung Nguyen, Tan Bui, Hong Jin Kang, Yikun Li, Eng Lieh Ouh, Lwin Khin Shar, David Lo

    Abstract: Recent advancements in generative AI have led to the widespread adoption of large language models (LLMs) in software engineering, addressing numerous long-standing challenges. However, a comprehensive study examining the capabilities of LLMs in software vulnerability detection (SVD), a crucial aspect of software security, is currently lacking. Existing research primarily focuses on evaluating LLMs…

    Submitted 3 March, 2025; originally announced March 2025.

  41. arXiv:2503.00795  [pdf, other]

    cs.SE cs.AI

    Towards Reliable LLM-Driven Fuzz Testing: Vision and Road Ahead

    Authors: Yiran Cheng, Hong Jin Kang, Lwin Khin Shar, Chaopeng Dong, Zhiqiang Shi, Shichao Lv, Limin Sun

    Abstract: Fuzz testing is a crucial component of software security assessment, yet its effectiveness heavily relies on valid fuzz drivers and diverse seed inputs. Recent advancements in Large Language Models (LLMs) offer transformative potential for automating fuzz testing (LLM4Fuzz), particularly in generating drivers and seeds. However, current LLM4Fuzz solutions face critical reliability challenges, incl…

    Submitted 2 March, 2025; originally announced March 2025.

  42. arXiv:2502.19930  [pdf, other

    cs.CV

    Identity-preserving Distillation Sampling by Fixed-Point Iterator

    Authors: SeonHwa Kim, Jiwon Kim, Soobin Park, Donghoon Ahn, Jiwon Kang, Seungryong Kim, Kyong Hwan Jin, Eunju Cha

    Abstract: Score distillation sampling (SDS) demonstrates a powerful capability for text-conditioned 2D image and 3D object generation by distilling the knowledge from learned score functions. However, SDS often suffers from blurriness caused by noisy gradients. When SDS meets the image editing, such degradations can be reduced by adjusting bias shifts using reference pairs, but the de-biasing techniques are…

    Submitted 25 March, 2025; v1 submitted 27 February, 2025; originally announced February 2025.

  43. arXiv:2502.19849  [pdf, other

    cs.LG

    Revisit the Stability of Vanilla Federated Learning Under Diverse Conditions

    Authors: Youngjoon Lee, Jinu Gong, Sun Choi, Joonhyuk Kang

    Abstract: Federated Learning (FL) is a distributed machine learning paradigm enabling collaborative model training across decentralized clients while preserving data privacy. In this paper, we revisit the stability of the vanilla FedAvg algorithm under diverse conditions. Despite its conceptual simplicity, FedAvg exhibits remarkably stable performance compared to more advanced FL techniques. Our experiments…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: 10 pages

  44. arXiv:2502.19630  [pdf, other

    cs.CV

    Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras

    Authors: Hoonhee Cho, Jae-young Kang, Youngho Kim, Kuk-Jin Yoon

    Abstract: Detecting 3D objects in point clouds plays a crucial role in autonomous driving systems. Recently, advanced multi-modal methods incorporating camera information have achieved notable performance. For a safe and effective autonomous driving system, algorithms that excel not only in accuracy but also in speed and low latency are essential. However, existing algorithms fail to meet these requirements…

    Submitted 26 February, 2025; originally announced February 2025.

    Comments: Accepted by CVPR2025

  45. arXiv:2502.18934  [pdf, other

    cs.CL cs.LG

    Kanana: Compute-efficient Bilingual Language Models

    Authors: Kanana LLM Team, Yunju Bak, Hojin Lee, Minho Ryu, Jiyeon Ham, Seungjae Jung, Daniel Wontae Nam, Taegyeong Eo, Donghun Lee, Doohae Jung, Boseop Kim, Nayeon Kim, Jaesun Park, Hyunho Kim, Hyunwoong Ko, Changmin Lee, Kyoung-Woon On, Seulye Baeg, Junrae Cho, Sunghee Jung, Jieun Kang, EungGyun Kim, Eunhwa Kim, Byeongil Ko, Daniel Lee, et al. (4 additional authors not shown)

    Abstract: We introduce Kanana, a series of bilingual language models that demonstrate exceeding performance in Korean and competitive performance in English. The computational cost of Kanana is significantly lower than that of state-of-the-art models of similar size. The report details the techniques employed during pre-training to achieve compute-efficient yet competitive models, including high quality dat…

    Submitted 28 February, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

    Comments: 40 pages, 15 figures

  46. arXiv:2502.18791  [pdf, other

    cs.CL cs.AI cs.LG

    Can LLMs Help Uncover Insights about LLMs? A Large-Scale, Evolving Literature Analysis of Frontier LLMs

    Authors: Jungsoo Park, Junmo Kang, Gabriel Stanovsky, Alan Ritter

    Abstract: The surge of LLM studies makes synthesizing their findings challenging. Analysis of experimental results from literature can uncover important trends across studies, but the time-consuming nature of manual data extraction limits its use. Our study presents a semi-automated approach for literature analysis that accelerates data extraction using LLMs. It automatically identifies relevant arXiv paper…

    Submitted 10 April, 2025; v1 submitted 25 February, 2025; originally announced February 2025.

    Comments: 22 pages, 9 figures

  47. arXiv:2502.18356  [pdf, other

    cs.LG

    WebGames: Challenging General-Purpose Web-Browsing AI Agents

    Authors: George Thomas, Alex J. Chan, Jikun Kang, Wenqi Wu, Filippos Christianos, Fraser Greenlee, Andy Toulis, Marvin Purtorab

    Abstract: We introduce WebGames, a comprehensive benchmark suite designed to evaluate general-purpose web-browsing AI agents through a collection of 50+ interactive challenges. These challenges are specifically crafted to be straightforward for humans while systematically testing the limitations of current AI systems across fundamental browser interactions, advanced input processing, cognitive tasks, workfl…

    Submitted 25 February, 2025; originally announced February 2025.

  48. arXiv:2502.17475  [pdf, other

    eess.SP cs.AI cs.CL cs.LG

    ECG-Expert-QA: A Benchmark for Evaluating Medical Large Language Models in Heart Disease Diagnosis

    Authors: Xu Wang, Jiaju Kang, Puyu Han, Yubao Zhao, Qian Liu, Liwenfei He, Lingqiong Zhang, Lingyun Dai, Yongcheng Wang, Jie Tao

    Abstract: We present ECG-Expert-QA, a comprehensive multimodal dataset for evaluating diagnostic capabilities in electrocardiogram (ECG) interpretation. It combines real-world clinical ECG data with systematically generated synthetic cases, covering 12 essential diagnostic tasks and totaling 47,211 expert-validated QA pairs. These encompass diverse clinical scenarios, from basic rhythm recognition to comple…

    Submitted 7 April, 2025; v1 submitted 16 February, 2025; originally announced February 2025.

  49. arXiv:2502.14883  [pdf, other

    cs.CV cs.AI

    Can LVLMs and Automatic Metrics Capture Underlying Preferences of Blind and Low-Vision Individuals for Navigational Aid?

    Authors: Na Min An, Eunki Kim, Wan Ju Kang, Sangryul Kim, Hyunjung Shim, James Thorne

    Abstract: Vision is a primary means of how humans perceive the environment, but Blind and Low-Vision (BLV) people need assistance understanding their surroundings, especially in unfamiliar environments. The emergence of semantic-based systems as assistance tools for BLV users has motivated many researchers to explore responses from Large Vision-Language Models (LVLMs). However, it has yet been studied prefe…

    Submitted 15 February, 2025; originally announced February 2025.

    Comments: 26 pages, 12 figures, 14 tables

  50. arXiv:2502.14258  [pdf, other

    cs.CL cs.AI

    Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information

    Authors: Yein Park, Chanwoong Yoon, Jungwoo Park, Minbyul Jeong, Jaewoo Kang

    Abstract: While the ability of language models to elicit facts has been widely investigated, how they handle temporally changing facts remains underexplored. We discover Temporal Heads, specific attention heads primarily responsible for processing temporal knowledge through circuit analysis. We confirm that these heads are present across multiple models, though their specific locations may vary, and their r…

    Submitted 19 February, 2025; originally announced February 2025.