Showing 1–50 of 60 results for author: Kang, T

Searching in archive cs.
  1. arXiv:2505.16819  [pdf, other]

    cs.CV

    Action2Dialogue: Generating Character-Centric Narratives from Scene-Level Prompts

    Authors: Taewon Kang, Ming C. Lin

    Abstract: Recent advances in scene-based video generation have enabled systems to synthesize coherent visual narratives from structured prompts. However, a crucial dimension of storytelling -- character-driven dialogue and speech -- remains underexplored. In this paper, we present a modular pipeline that transforms action-level prompts into visually and auditorily grounded narrative dialogue, enriching visu…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: 18 pages, 5 figures

  2. arXiv:2505.16222  [pdf, ps, other]

    cs.CL cs.SE

    Don't Judge Code by Its Cover: Exploring Biases in LLM Judges for Code Evaluation

    Authors: Jiwon Moon, Yerin Hwang, Dongryeol Lee, Taegwan Kang, Yongil Kim, Kyomin Jung

    Abstract: With the growing use of large language models (LLMs) as evaluators, their application has expanded to code evaluation tasks, where they assess the correctness of generated code without relying on reference implementations. While this offers scalability and flexibility, it also raises a critical, unresolved question: Can LLM judges fairly and robustly evaluate semantically equivalent code with super…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: 26 pages

  3. arXiv:2505.15249  [pdf, ps, other]

    cs.CL cs.CV

    Fooling the LVLM Judges: Visual Biases in LVLM-Based Evaluation

    Authors: Yerin Hwang, Dongryeol Lee, Kyungmin Min, Taegwan Kang, Yong-il Kim, Kyomin Jung

    Abstract: Recently, large vision-language models (LVLMs) have emerged as the preferred tools for judging text-image alignment, yet their robustness along the visual modality remains underexplored. This work is the first study to address a key research question: Can adversarial visual manipulations systematically fool LVLM judges into assigning unfairly inflated scores? We define potential image induced bias…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: (21pgs, 12 Tables, 9 Figures)

  4. arXiv:2504.14396  [pdf, other]

    cs.CV

    SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation

    Authors: Minho Park, Taewoong Kang, Jooyeol Yun, Sungwon Hwang, Jaegul Choo

    Abstract: The increasing demand for AR/VR applications has highlighted the need for high-quality 360-degree panoramic content. However, generating high-quality 360-degree panoramic images and videos remains a challenging task due to the severe distortions introduced by equirectangular projection (ERP). Existing approaches either fine-tune pretrained diffusion models on limited ERP datasets or attempt tuning…

    Submitted 19 April, 2025; originally announced April 2025.

  5. A real-time anomaly detection method for robots based on a flexible and sparse latent space

    Authors: Taewook Kang, Bum-Jae You, Juyoun Park, Yisoo Lee

    Abstract: The growing demand for robots to operate effectively in diverse environments necessitates robust real-time anomaly detection techniques during robotic operations. However, deep learning-based models in robotics face significant challenges due to limited training data and highly noisy signal features. In this paper, we present Sparse Masked Autoregressive Flow-based Adversarial AutoEnc…

    Submitted 22 June, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

    Comments: 20 pages, 11 figures

  6. arXiv:2503.22057  [pdf, other]

    cs.CE

    A production planning benchmark for real-world refinery-petrochemical complexes

    Authors: Wenli Du, Chuan Wang, Chen Fan, Zhi Li, Yeke Zhong, Tianao Kang, Ziting Liang, Minglei Yang, Feng Qian, Xin Dai

    Abstract: To achieve digital intelligence transformation and carbon neutrality, effective production planning is crucial for integrated refinery-petrochemical complexes. Modern refinery planning relies on advanced optimization techniques, whose development requires reproducible benchmark problems. However, existing benchmarks lack practical context or impose oversimplified assumptions, limiting their applic…

    Submitted 27 March, 2025; originally announced March 2025.

  7. arXiv:2503.20102  [pdf, other]

    cs.LG cs.RO

    Extendable Long-Horizon Planning via Hierarchical Multiscale Diffusion

    Authors: Chang Chen, Hany Hamed, Doojin Baek, Taegu Kang, Yoshua Bengio, Sungjin Ahn

    Abstract: This paper tackles a novel problem, extendable long-horizon planning: enabling agents to plan trajectories longer than those in training data without compounding errors. To tackle this, we propose the Hierarchical Multiscale Diffuser (HM-Diffuser) and Progressive Trajectory Extension (PTE), an augmentation method that iteratively generates longer trajectories by stitching shorter ones. HM-Diffuser…

    Submitted 10 April, 2025; v1 submitted 25 March, 2025; originally announced March 2025.

    Comments: First two authors contributed equally

  8. arXiv:2503.06310  [pdf, other]

    cs.CV

    Text2Story: Advancing Video Storytelling with Text Guidance

    Authors: Taewon Kang, Divya Kothandaraman, Ming C. Lin

    Abstract: Generating coherent long-form video sequences from discrete input using only text prompts is a critical task in content creation. While diffusion-based models excel at short video synthesis, long-form storytelling from text remains largely unexplored and challenging because of the difficulty of maintaining temporal coherency, semantic meaning, and action continuity across the video. We introduce…

    Submitted 8 March, 2025; originally announced March 2025.

    Comments: 15 pages, 6 figures

  9. arXiv:2503.00861  [pdf, other]

    cs.CV

    Zero-Shot Head Swapping in Real-World Scenarios

    Authors: Taewoong Kang, Sohyun Jeong, Hyojin Jang, Jaegul Choo

    Abstract: With growing demand in media and social networks for personalized images, the need for advanced head-swapping techniques, integrating an entire head from the head image with the body from the body image, has increased. However, traditional head swapping methods heavily rely on face-centered cropped data with primarily frontal facing views, which limits their effectiveness in real world application…

    Submitted 24 March, 2025; v1 submitted 2 March, 2025; originally announced March 2025.

    Comments: CVPR'25

  10. arXiv:2502.17715  [pdf, other]

    cs.CL cs.AI cs.HC

    Bridging Information Gaps with Comprehensive Answers: Improving the Diversity and Informativeness of Follow-Up Questions

    Authors: Zhe Liu, Taekyu Kang, Haoyu Wang, Seyed Hossein Alavi, Vered Shwartz

    Abstract: Effective conversational systems are expected to dynamically generate contextual follow-up questions to elicit new information while maintaining the conversation flow. While humans excel at asking diverse and informative questions by intuitively assessing both obtained and missing information, existing models often fall short of human performance on this task. To mitigate this, we propose a method…

    Submitted 24 February, 2025; originally announced February 2025.

    Comments: 8 pages, 2 figures, submitted to ACL 2025

  11. arXiv:2502.04362  [pdf, other]

    cs.CL cs.AI

    LLMs can be easily Confused by Instructional Distractions

    Authors: Yerin Hwang, Yongil Kim, Jahyun Koo, Taegwan Kang, Hyunkyung Bae, Kyomin Jung

    Abstract: Despite the fact that large language models (LLMs) show exceptional skill in instruction following tasks, this strength can turn into a vulnerability when the models are required to disregard certain instructions. Instruction-following tasks typically involve a clear task description and input text containing the target data to be processed. However, when the input itself resembles an instruction,…

    Submitted 4 February, 2025; originally announced February 2025.

    Comments: 8 pages

  12. arXiv:2502.00903  [pdf, other]

    cs.CL cs.AI cs.CY cs.SI

    Embracing Dialectic Intersubjectivity: Coordination of Different Perspectives in Content Analysis with LLM Persona Simulation

    Authors: Taewoo Kang, Kjerstin Thorson, Tai-Quan Peng, Dan Hiaeshutter-Rice, Sanguk Lee, Stuart Soroka

    Abstract: This study attempts to advance content analysis methodology from consensus-oriented to coordination-oriented practices, thereby embracing diverse coding outputs and exploring the dynamics among differential perspectives. As an exploratory investigation of this approach, we evaluate six GPT-4o configurations to analyze sentiment in Fox News and MSNBC transcripts on Biden and Trump during the 2020…

    Submitted 4 February, 2025; v1 submitted 2 February, 2025; originally announced February 2025.

  13. arXiv:2501.14249  [pdf, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  14. arXiv:2412.19002  [pdf, other]

    cs.AR cs.AI

    Tempus Core: Area-Power Efficient Temporal-Unary Convolution Core for Low-Precision Edge DLAs

    Authors: Prabhu Vellaisamy, Harideep Nair, Thomas Kang, Yichen Ni, Haoyang Fan, Bin Qi, Jeff Chen, Shawn Blanton, John Paul Shen

    Abstract: The increasing complexity of deep neural networks (DNNs) poses significant challenges for edge inference deployment due to resource and power constraints of edge devices. Recent works on unary-based matrix multiplication hardware aim to leverage data sparsity and low-precision values to enhance hardware efficiency. However, the adoption and integration of such unary hardware into commercial deep l…

    Submitted 25 December, 2024; originally announced December 2024.

    Comments: Accepted in DATE 2025

  15. arXiv:2411.07806  [pdf, other]

    cs.LG cs.CR eess.SP

    Federated Low-Rank Adaptation with Differential Privacy over Wireless Networks

    Authors: Tianqu Kang, Zixin Wang, Hengtao He, Jun Zhang, Shenghui Song, Khaled B. Letaief

    Abstract: Fine-tuning large pre-trained foundation models (FMs) on distributed edge devices presents considerable computational and privacy challenges. Federated fine-tuning (FedFT) mitigates some privacy issues by facilitating collaborative model training without the need to share raw data. To lessen the computational burden on resource-limited devices, combining low-rank adaptation (LoRA) with federated l…

    Submitted 27 November, 2024; v1 submitted 12 November, 2024; originally announced November 2024.

    Comments: 6 pages, 3 figures

  16. arXiv:2410.19503  [pdf, other]

    cs.CL

    SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models

    Authors: Jahyun Koo, Yerin Hwang, Yongil Kim, Taegwan Kang, Hyunkyung Bae, Kyomin Jung

    Abstract: Despite the success of Large Language Models (LLMs), they still face challenges related to high inference costs and memory requirements. To address these issues, Knowledge Distillation (KD) has emerged as a popular method for model compression, with student-generated outputs (SGOs) as training data being particularly notable for reducing the mismatch between training and inference. However, SGOs o…

    Submitted 22 April, 2025; v1 submitted 25 October, 2024; originally announced October 2024.

    Comments: NAACL 2025 Findings

  17. arXiv:2410.11682  [pdf, other]

    cs.GR cs.AI cs.CV

    SurFhead: Affine Rig Blending for Geometrically Accurate 2D Gaussian Surfel Head Avatars

    Authors: Jaeseong Lee, Taewoong Kang, Marcel C. Bühler, Min-Jung Kim, Sungwon Hwang, Junha Hyung, Hyojin Jang, Jaegul Choo

    Abstract: Recent advancements in head avatar rendering using Gaussian primitives have achieved significantly high-fidelity results. Although precise head geometry is crucial for applications like mesh reconstruction and relighting, current methods struggle to capture intricate geometric details and render unseen poses due to their reliance on similarity transformations, which cannot handle stretch and shear…

    Submitted 18 April, 2025; v1 submitted 15 October, 2024; originally announced October 2024.

    Comments: ICLR 2025, Project page with videos: https://summertight.github.io/SurFhead/

  18. arXiv:2408.06157  [pdf, other]

    cs.CV

    3D-free meets 3D priors: Novel View Synthesis from a Single Image with Pretrained Diffusion Guidance

    Authors: Taewon Kang, Divya Kothandaraman, Dinesh Manocha, Ming C. Lin

    Abstract: Recent 3D novel view synthesis (NVS) methods often require extensive 3D data for training, and also typically lack generalization beyond the training distribution. Moreover, they tend to be object centric and struggle with complex and intricate scenes. Conversely, 3D-free methods can generate text-controlled views of complex, in-the-wild scenes using a pretrained stable diffusion model without the…

    Submitted 27 November, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

    Comments: 18 pages, 13 figures, v4: methodology and result improvement

  19. arXiv:2407.02945  [pdf, other]

    cs.CV

    VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors

    Authors: Sungwon Hwang, Min-Jung Kim, Taewoong Kang, Jayeon Kang, Jaegul Choo

    Abstract: Neural rendering-based urban scene reconstruction methods commonly rely on images collected from driving vehicles with cameras facing and moving forward. Although these methods can successfully synthesize from views similar to training camera trajectory, directing the novel view outside the training camera distribution does not guarantee on-par performance. In this paper, we tackle the Extrapolate…

    Submitted 13 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: The first two authors contributed equally. Project Page: https://vegs3d.github.io/

  20. The Effect of Quantization in Federated Learning: A Rényi Differential Privacy Perspective

    Authors: Tianqu Kang, Lumin Liu, Hengtao He, Jun Zhang, S. H. Song, Khaled B. Letaief

    Abstract: Federated Learning (FL) is an emerging paradigm that holds great promise for privacy-preserving machine learning using distributed data. To enhance privacy, FL can be combined with Differential Privacy (DP), which involves adding Gaussian noise to the model weights. However, FL faces a significant challenge in terms of large communication overhead when transmitting these model weights. To address…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 6 pages, 5 figures, submitted to 2024 IEEE MeditCom

  21. arXiv:2404.13300  [pdf, other]

    cs.LG

    Capturing Momentum: Tennis Match Analysis Using Machine Learning and Time Series Theory

    Authors: Jingdi Lei, Tianqi Kang, Yuluan Cao, Shiwei Ren

    Abstract: This paper presents an analysis of momentum in tennis matches. Owing to its generalization performance, it can help in constructing a system to predict the results of sports games and to analyze player performance based on technical statistics. We first use hidden Markov models to predict the momentum, which is defined as the performance of players. Then we use XGBoost to prove…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 16 pages, 18 figures

  22. arXiv:2402.18330  [pdf, other]

    cs.CV

    Attention-Propagation Network for Egocentric Heatmap to 3D Pose Lifting

    Authors: Taeho Kang, Youngki Lee

    Abstract: We present EgoTAP, a heatmap-to-3D pose lifting method for highly accurate stereo egocentric 3D pose estimation. Severe self-occlusion and out-of-view limbs in egocentric camera views make accurate pose estimation a challenging problem. To address the challenge, prior methods employ joint heatmaps (probabilistic 2D representations of the body pose), but heatmap-to-3D pose conversion still remains an…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 16 pages, 9 figures, to be published as CVPR 2024 paper

  23. arXiv:2402.17127  [pdf, other]

    cs.SD eess.AS

    Experimental Study: Enhancing Voice Spoofing Detection Models with wav2vec 2.0

    Authors: Taein Kang, Soyul Han, Sunmook Choi, Jaejin Seo, Sanghyeok Chung, Seungeun Lee, Seungsang Oh, Il-Youp Kwak

    Abstract: Conventional spoofing detection systems have heavily relied on the use of handcrafted features derived from speech data. However, a notable shift has recently emerged towards the direct utilization of raw speech waveforms, as demonstrated by methods like SincNet filters. This shift underscores the demand for more sophisticated audio sample features. Moreover, the success of deep learning models, p…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 5 pages

    MSC Class: 00A71; ACM Class: I.2.6

  24. arXiv:2311.09627  [pdf, other]

    cs.AI cs.CL cs.LG

    Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination

    Authors: Nakyeong Yang, Taegwan Kang, Jungkyu Choi, Honglak Lee, Kyomin Jung

    Abstract: Instruction-following language models often show undesirable biases. These undesirable biases may be accelerated in the real-world usage of language models, where a wide range of instructions is used through zero-shot example prompting. To solve this problem, we first define the bias neuron, which significantly affects biased outputs, and prove its existence empirically. Furthermore, we propose a…

    Submitted 5 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: accepted to ACL 2024

  25. arXiv:2310.10073  [pdf, other]

    cs.CV

    Expression Domain Translation Network for Cross-domain Head Reenactment

    Authors: Taewoong Kang, Jeongsik Oh, Jaeseong Lee, Sunghyun Park, Jaegul Choo

    Abstract: Despite the remarkable advancements in head reenactment, the existing methods face challenges in cross-domain head reenactment, which aims to transfer human motions to domains outside the human, including cartoon characters. It is still difficult to extract motion from out-of-domain images due to the distinct appearances, such as large eyes. Recent work introduced a large-scale anime d…

    Submitted 6 November, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Project page with videos: https://keh0t0.github.io/research/EDTN/

  26. Ego3DPose: Capturing 3D Cues from Binocular Egocentric Views

    Authors: Taeho Kang, Kyungjin Lee, Jinrui Zhang, Youngki Lee

    Abstract: We present Ego3DPose, a highly accurate binocular egocentric 3D pose reconstruction system. The binocular egocentric setup offers practicality and usefulness in various applications; however, it remains largely under-explored. It has been suffering from low pose estimation accuracy due to viewing distortion, severe self-occlusion, and limited field-of-view of the joints in egocentric 2D images. He…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: 12 pages, 10 figures, to be published as SIGGRAPH Asia 2023 Conference Papers

  27. arXiv:2308.11639  [pdf, other]

    eess.SP cs.AI cs.LG

    An Empirical Study on Fault Detection and Root Cause Analysis of Indium Tin Oxide Electrodes by Processing S-parameter Patterns

    Authors: Tae Yeob Kang, Haebom Lee, Sungho Suh

    Abstract: In the field of optoelectronics, indium tin oxide (ITO) electrodes play a crucial role in various applications, such as displays, sensors, and solar cells. Effective fault diagnosis and root cause analysis of the ITO electrodes are essential to ensure the performance and reliability of the devices. However, traditional visual inspection is challenging with transparent ITO electrodes, and existing…

    Submitted 10 June, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: Accepted in IEEE Transactions on Device and Materials Reliability

  28. arXiv:2307.13220  [pdf]

    eess.IV cs.AI physics.med-ph

    One for Multiple: Physics-informed Synthetic Data Boosts Generalizable Deep Learning for Fast MRI Reconstruction

    Authors: Zi Wang, Xiaotong Yu, Chengyan Wang, Weibo Chen, Jiazheng Wang, Ying-Hua Chu, Hongwei Sun, Rushuai Li, Peiyong Li, Fan Yang, Haiwei Han, Taishan Kang, Jianzhong Lin, Chen Yang, Shufu Chang, Zhang Shi, Sha Hua, Yan Li, Juan Hu, Liuhong Zhu, Jianjun Zhou, Meijing Lin, Jiefeng Guo, Congbo Cai, Zhong Chen, et al. (3 additional authors not shown)

    Abstract: Magnetic resonance imaging (MRI) is a widely used radiological modality renowned for its radiation-free, comprehensive insights into the human body, facilitating medical diagnoses. However, the drawback of prolonged scan times hinders its accessibility. The k-space undersampling offers a solution, yet the resultant artifacts necessitate meticulous removal during image reconstruction. Although Deep…

    Submitted 28 February, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: 38 pages, 19 figures, 5 tables

  29. arXiv:2306.09681  [pdf]

    physics.med-ph cs.LG

    Magnetic Resonance Spectroscopy Quantification Aided by Deep Estimations of Imperfection Factors and Macromolecular Signal

    Authors: Dicheng Chen, Meijin Lin, Huiting Liu, Jiayu Li, Yirong Zhou, Taishan Kang, Liangjie Lin, Zhigang Wu, Jiazheng Wang, Jing Li, Jianzhong Lin, Xi Chen, Di Guo, Xiaobo Qu

    Abstract: Objective: Magnetic Resonance Spectroscopy (MRS) is an important technique for biomedical detection. However, it is challenging to accurately quantify metabolites with proton MRS due to serious overlaps of metabolite signals, imperfections because of non-ideal acquisition conditions, and interference with strong background signals mainly from macromolecules. The most popular method, LCModel, adopt…

    Submitted 9 October, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

  30. arXiv:2306.07713  [pdf, other]

    cs.CV

    Robustness of SAM: Segment Anything Under Corruptions and Beyond

    Authors: Yu Qiao, Chaoning Zhang, Taegoo Kang, Donghun Kim, Chenshuang Zhang, Choong Seon Hong

    Abstract: Segment anything model (SAM), as the name suggests, is claimed to be capable of cutting out any object and demonstrates impressive zero-shot transfer performance with the guidance of prompts. However, there is currently a lack of comprehensive evaluation regarding its robustness under various corruptions. Understanding the robustness of SAM across different corruption scenarios is crucial for its…

    Submitted 4 September, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: The first work evaluates the robustness of SAM under various corruptions such as style transfer, local occlusion, and adversarial patch attack

  31. arXiv:2306.06211  [pdf, other]

    cs.CV

    A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets Prompt Engineering

    Authors: Chaoning Zhang, Joseph Cho, Fachrina Dewi Puspitasari, Sheng Zheng, Chenghao Li, Yu Qiao, Taegoo Kang, Xinru Shan, Chenshuang Zhang, Caiyan Qin, Francois Rameau, Lik-Hang Lee, Sung-Ho Bae, Choong Seon Hong

    Abstract: The Segment Anything Model (SAM), developed by Meta AI Research, represents a significant breakthrough in computer vision, offering a robust framework for image and video segmentation. This survey provides a comprehensive exploration of the SAM family, including SAM and SAM 2, highlighting their advancements in granularity and contextual understanding. Our study demonstrates SAM's versatility acro…

    Submitted 19 October, 2024; v1 submitted 12 May, 2023; originally announced June 2023.

    Comments: Comprehensive survey on SAM family. 21 pages, 14 figures

  32. arXiv:2306.02637  [pdf, other]

    cs.HC

    Gotta Go Fast: Measuring Input/Output Latencies of Virtual Reality 3D Engines for Cognitive Experiments

    Authors: Taeho Kang, Christian Wallraven

    Abstract: Virtual Reality (VR) is seeing increased adoption across many fields. The field of experimental cognitive science is also testing utilization of the technology combined with physiological measures such as electroencephalography (EEG) and eye tracking. Quantitative measures of human behavior and cognition process, however, are sensitive to minuscule time resolutions that are often overlooked in the…

    Submitted 5 June, 2023; originally announced June 2023.

  33. arXiv:2305.03609  [pdf, other]

    stat.ML cs.CG cs.CR cs.LG math.AT

    Differentially Private Topological Data Analysis

    Authors: Taegyu Kang, Sehwan Kim, Jinwon Sohn, Jordan Awan

    Abstract: This paper is the first to attempt differentially private (DP) topological data analysis (TDA), producing near-optimal private persistence diagrams. We analyze the sensitivity of persistence diagrams in terms of the bottleneck distance, and we show that the commonly used Čech complex has sensitivity that does not decrease as the sample size $n$ increases. This makes it challenging for the persiste…

    Submitted 3 November, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

    Comments: 23 pages before references and appendices, 42 pages total, 8 figures

  34. arXiv:2305.00866  [pdf, other]

    cs.CV cs.AI

    Attack-SAM: Towards Attacking Segment Anything Model With Adversarial Examples

    Authors: Chenshuang Zhang, Chaoning Zhang, Taegoo Kang, Donghun Kim, Sung-Ho Bae, In So Kweon

    Abstract: Segment Anything Model (SAM) has attracted significant attention recently, due to its impressive performance on various downstream tasks in a zero-shot manner. The computer vision (CV) area might follow the natural language processing (NLP) area to embark on a path from task-specific vision models toward foundation models. However, deep vision models are widely recognized as vulnerable to adversarial…

    Submitted 8 May, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

    Comments: The first work to attack Segment Anything Model with adversarial examples

  35. arXiv:2304.11666  [pdf]

    cs.CL cs.CY stat.AP

    Hold the Suspect! : An Analysis on Media Framing of Itaewon Halloween Crowd Crush

    Authors: TaeYoung Kang

    Abstract: Based on the 10.9K articles from the top 40 news providers of South Korea, this paper analyzed the media framing of Itaewon Halloween Crowd Crush during the first 72 hours after the incident. By adopting word-vector embedding and clustering, we figured out that conservative media focused on political parties' responses and the suspect's identity while the liberal media covered the responsibility of th…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: 6 pages, 3 figures, 2 tables

  36. arXiv:2304.10207  [pdf, other]

    cs.LG

    Non-destructive Fault Diagnosis of Electronic Interconnects by Learning Signal Patterns of Reflection Coefficient in the Frequency Domain

    Authors: Tae Yeob Kang, Haebom Lee, Sungho Suh

    Abstract: Fault detection and diagnosis of the interconnects are crucial for prognostics and health management (PHM) of electronics. Traditional methods, which rely on electronic signals as prognostic factors, often struggle to accurately identify the root causes of defects without resorting to destructive testing. Furthermore, these methods are vulnerable to noise interference, which can result in false al…

    Submitted 4 October, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted in Microelectronics Reliability

  37. A Survey on Graph Diffusion Models: Generative AI in Science for Molecule, Protein and Material

    Authors: Mengchun Zhang, Maryam Qamar, Taegoo Kang, Yuna Jung, Chenshuang Zhang, Sung-Ho Bae, Chaoning Zhang

    Abstract: Diffusion models have become a new SOTA generative modeling method in various fields, for which there are multiple survey works that provide an overall survey. With the number of articles on diffusion models increasing exponentially in the past few years, there is an increasing need for surveys of diffusion models on specific fields. In this work, we are committed to conducting a survey on the gra…

    Submitted 4 April, 2023; originally announced April 2023.

  38. arXiv:2211.15868  [pdf, other]

    cs.CV

    Kinematic-aware Hierarchical Attention Network for Human Pose Estimation in Videos

    Authors: Kyung-Min Jin, Byoung-Sung Lim, Gun-Hee Lee, Tae-Kyung Kang, Seong-Whan Lee

    Abstract: Previous video-based human pose estimation methods have shown promising results by leveraging aggregated features of consecutive frames. However, most approaches compromise accuracy to mitigate jitter or do not sufficiently comprehend the temporal aspects of human motion. Furthermore, occlusion increases uncertainty between consecutive frames, which results in unsmooth results. To address these is…

    Submitted 28 November, 2022; originally announced November 2022.

  39. arXiv:2211.10707  [pdf]

    cs.CL cs.CY cs.SI

    Suffering from Vaccines or from Government? : Partisan Bias in COVID-19 Vaccine Adverse Events Coverage

    Authors: TaeYoung Kang, Hanbin Lee

    Abstract: Vaccine adverse events have been presumed to be a relatively objective measure that is immune to political polarization. The real-world data, however, shows the correlation between presidential disapproval ratings and the subjective severity of adverse events. This paper investigates the partisan bias in COVID vaccine adverse events coverage with language models that can classify the topic of vacc…

    Submitted 19 November, 2022; originally announced November 2022.

    Comments: 5 pages, 5 figures, 2 tables

  40. arXiv:2210.11388  [pdf]

    eess.IV cs.CV

    Physics-informed Deep Diffusion MRI Reconstruction with Synthetic Data: Break Training Data Bottleneck in Artificial Intelligence

    Authors: Chen Qian, Haoyu Zhang, Yuncheng Gao, Mingyang Han, Zi Wang, Dan Ruan, Yu Shen, Yaping Wu, Yirong Zhou, Chengyan Wang, Boyu Jiang, Ran Tao, Zhigang Wu, Jiazheng Wang, Liuhong Zhu, Yi Guo, Taishan Kang, Jianzhong Lin, Tao Gong, Chen Yang, Guoqiang Fei, Meijin Lin, Di Guo, Jianjun Zhou, Meiyun Wang, et al. (1 additional author not shown)

    Abstract: Diffusion magnetic resonance imaging (MRI) is the only imaging modality for non-invasive movement detection of in vivo water molecules, with significant clinical and research applications. Diffusion weighted imaging (DWI) MRI acquired by multi-shot techniques can achieve higher resolution, better signal-to-noise ratio, and lower geometric distortion than single-shot, but suffers from inter-shot mo…

    Submitted 3 May, 2025; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: 10 pages, 8 figures

  41. arXiv:2209.11153  [pdf, other]

    quant-ph cs.ET

    Bosonic Qiskit

    Authors: Timothy J Stavenger, Eleanor Crane, Kevin Smith, Christopher T Kang, Steven M Girvin, Nathan Wiebe

    Abstract: The practical benefits of hybrid quantum information processing hardware that contains continuous-variable objects (bosonic modes such as mechanical or electromagnetic oscillators) in addition to traditional (discrete-variable) qubits have recently been demonstrated by experiments with bosonic codes that reach the break-even point for quantum error correction and by efficient Gaussian boson sampli…

    Submitted 2 December, 2022; v1 submitted 22 September, 2022; originally announced September 2022.

  42. arXiv:2207.09662  [pdf, other]

    cs.CV

    HTNet: Anchor-free Temporal Action Localization with Hierarchical Transformers

    Authors: Tae-Kyung Kang, Gun-Hee Lee, Seong-Whan Lee

    Abstract: Temporal action localization (TAL) is a task of identifying a set of actions in a video, which involves localizing the start and end frames and classifying each action instance. Existing methods have addressed this task by using predefined anchor windows or heuristic bottom-up boundary-matching strategies, which are major bottlenecks in inference time. Additionally, the main challenge is the inabi…

    Submitted 20 July, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: 6 pages

    MSC Class: 68T45

  43. arXiv:2204.03262  [pdf]

    cs.CL cs.CY

    Korean Online Hate Speech Dataset for Multilabel Classification: How Can Social Science Improve Dataset on Hate Speech?

    Authors: TaeYoung Kang, Eunrang Kwon, Junbum Lee, Youngeun Nam, Junmo Song, JeongKyu Suh

    Abstract: We suggest a multilabel Korean online hate speech dataset that covers seven categories of hate speech: (1) Race and Nationality, (2) Religion, (3) Regionalism, (4) Ageism, (5) Misogyny, (6) Sexual Minorities, and (7) Male. Our 35K dataset consists of 24K online comments with Krippendorff's Alpha label accordance of .713, 2.2K neutral sentences from Wikipedia, 1.7K additionally labeled sentences ge…

    Submitted 8 April, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: 12 pages, 3 tables

  44. MIDAS: Multi-sensorial Immersive Dynamic Autonomous System Improves Motivation of Stroke Affected Patients for Hand Rehabilitation

    Authors: Fok-Chi-Seng Fok Kow, Anoop Kumar Sinha, Zhang Jin Ming, Bao Songyu, Jake Tan Jun Kang, Hong Yan Jack Jeffrey, Galina Mihaleva, Nadia Magnenat Thalmann, Yiyu Cai

    Abstract: Majority of stroke survivors are left with poorly functioning paretic hands. Current rehabilitation devices have failed to motivate the patients enough to continue rehabilitation exercises. The objective of this project, MIDAS (Multi-sensorial Immersive Dynamic Autonomous System) is a proof of concept by using an immersive system to improve motivation of stroke patients for hand rehabilitation. MI…

    Submitted 20 March, 2022; originally announced March 2022.

  45. arXiv:2202.02545  [pdf]

    cs.SD cs.CL eess.AS

    Optimization of a Real-Time Wavelet-Based Algorithm for Improving Speech Intelligibility

    Authors: Tianqu Kang, Anh-Dung Dinh, Binghong Wang, Tianyuan Du, Yijia Chen, Kevin Chau

    Abstract: The optimization of a wavelet-based algorithm to improve speech intelligibility along with the full data set and results are reported. The discrete-time speech signal is split into frequency sub-bands via a multi-level discrete wavelet transform. Various gains are applied to the sub-band signals before they are recombined to form a modified version of the speech. The sub-band gains are adjusted wh…

    Submitted 21 July, 2022; v1 submitted 5 February, 2022; originally announced February 2022.

    Comments: 16 pages, 7 figures, 4 tables

  46. arXiv:2111.08270  [pdf, other]

    cs.CV

    Data Augmentation using Random Image Cropping for High-resolution Virtual Try-On (VITON-CROP)

    Authors: Taewon Kang, Sunghyun Park, Seunghwan Choi, Jaegul Choo

    Abstract: Image-based virtual try-on provides the capacity to transfer a clothing item onto a photo of a given person, which is usually accomplished by warping the item to a given human pose and adjusting the warped item to the person. However, the results of real-world synthetic images (e.g., selfies) from the previous method is not realistic because of the limitations which result in the neck being misrep…

    Submitted 16 November, 2021; originally announced November 2021.

    Comments: 4 pages, 3 figures

  47. arXiv:2109.11706  [pdf, other]

    cs.RO eess.SP

    Indoor Navigation Algorithm Based on a Smartphone Inertial Measurement Unit and Map Matching

    Authors: Taewon Kang, Younghoon Shin

    Abstract: We propose an indoor navigation algorithm based on pedestrian dead reckoning (PDR) using an inertial measurement unit in a smartphone and map matching. The proposed indoor navigation system is user-friendly and convenient because it requires no additional device except a smartphone and works with a pedestrian in a casual posture who is walking with a smartphone in their hand. Because the performan…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: Submitted to ICTC 2021

  48. arXiv:2106.15516  [pdf, other]

    cs.LG physics.chem-ph

    GeoT: A Geometry-aware Transformer for Reliable Molecular Property Prediction and Chemically Interpretable Representation Learning

    Authors: Bumju Kwak, Jiwon Park, Taewon Kang, Jeonghee Jo, Byunghan Lee, Sungroh Yoon

    Abstract: In recent years, molecular representation learning has emerged as a key area of focus in various chemical tasks. However, many existing models fail to fully consider the geometric information of molecular structures, resulting in less intuitive representations. Moreover, the widely used message-passing mechanism is limited to provide the interpretation of experimental results from a chemical persp…

    Submitted 28 June, 2023; v1 submitted 29 June, 2021; originally announced June 2021.

    Comments: This paper is currently under review. The code is available at this URL: https://github.com/oleneyl/geometry-aware-transformer

  49. arXiv:2103.14471  [pdf, other]

    cs.CV

    Multiple GAN Inversion for Exemplar-based Image-to-Image Translation

    Authors: Taewon Kang

    Abstract: Existing state-of-the-art techniques in exemplar-based image-to-image translation hold several critical concerns. Existing methods related to exemplar-based image-to-image translation are impossible to translate on an image tuple input (source, target) that is not aligned. Additionally, we can confirm that the existing method exhibits limited generalization ability to unseen images. In order to ov…

    Submitted 19 August, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

    Comments: Accepted to ICCV 2021 Workshop, 3rd International Workshop on Real-World Computer Vision From Inputs With Limited Quality (RLQ). Check out the project page for more information: http://itsc.kr/2021/06/24/mgi2021/, 9 pages, 10 figures, v2: corrected typos, extended version of arXiv:2011.09330

  50. arXiv:2011.09330  [pdf, other]

    cs.CV

    Online Exemplar Fine-Tuning for Image-to-Image Translation

    Authors: Taewon Kang, Soohyun Kim, Sunwoo Kim, Seungryong Kim

    Abstract: Existing techniques to solve exemplar-based image-to-image translation within deep convolutional neural networks (CNNs) generally require a training phase to optimize the network parameters on domain-specific and task-specific benchmarks, thus having limited applicability and generalization ability. In this paper, we propose a novel framework, for the first time, to solve exemplar-based translatio…

    Submitted 18 November, 2020; originally announced November 2020.

    Comments: 10 pages, 13 figures