
Showing 1–50 of 67 results for author: Park, I

Searching in archive cs.
  1. arXiv:2509.24274  [pdf, ps, other]

    cs.LG cs.AI

    Adversarial Reinforcement Learning Framework for ESP Cheater Simulation

    Authors: Inkyu Park, Jeong-Gwan Lee, Taehwan Kwon, Juheon Choi, Seungku Kim, Junsu Kim, Kimin Lee

    Abstract: Extra-Sensory Perception (ESP) cheats, which reveal hidden in-game information such as enemy locations, are difficult to detect because their effects are not directly observable in player behavior. The lack of observable evidence makes it difficult to collect reliably labeled data, which is essential for training effective anti-cheat systems. Furthermore, cheaters often adapt their behavior by lim…

    Submitted 29 September, 2025; originally announced September 2025.

  2. arXiv:2507.14307  [pdf, ps, other]

    cs.CL

    How LLMs Comprehend Temporal Meaning in Narratives: A Case Study in Cognitive Evaluation of LLMs

    Authors: Karin de Langis, Jong Inn Park, Andreas Schramm, Bin Hu, Khanh Chi Le, Michael Mensink, Ahn Thu Tong, Dongyeop Kang

    Abstract: Large language models (LLMs) exhibit increasingly sophisticated linguistic capabilities, yet the extent to which these behaviors reflect human-like cognition versus advanced pattern recognition remains an open question. In this study, we investigate how LLMs process the temporal meaning of linguistic aspect in narratives that were previously used in human studies. Using an Expert-in-the-Loop probi…

    Submitted 18 July, 2025; originally announced July 2025.

  3. arXiv:2507.13314  [pdf, ps, other]

    cs.CV cs.AI

    Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark

    Authors: Junsu Kim, Naeun Kim, Jaeho Lee, Incheol Park, Dongyoon Han, Seungryul Baek

    Abstract: The reasoning-based pose estimation (RPE) benchmark has emerged as a widely adopted evaluation standard for pose-aware multimodal large language models (MLLMs). Despite its significance, we identified critical reproducibility and benchmark-quality issues that hinder fair and consistent quantitative evaluations. Most notably, the benchmark utilizes different image indices from those of the original…

    Submitted 17 July, 2025; originally announced July 2025.

    Comments: To be presented as a poster at MMFM 2025

  4. arXiv:2506.13295  [pdf, ps, other]

    eess.AS cs.SD

    Instance-Specific Test-Time Training for Speech Editing in the Wild

    Authors: Taewoo Kim, Uijong Lee, Hayoung Park, Choongsang Cho, Nam In Park, Young Han Lee

    Abstract: Speech editing systems aim to naturally modify speech content while preserving acoustic consistency and speaker identity. However, previous studies often struggle to adapt to unseen and diverse acoustic conditions, resulting in degraded editing performance in real-world scenarios. To address this, we propose an instance-specific test-time training method for speech editing in the wild. Our approac…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: Submitted to IEEE Signal Processing Letters

  5. arXiv:2506.08059  [pdf, ps, other]

    q-bio.QM cs.AI cs.LG

    CaliciBoost: Performance-Driven Evaluation of Molecular Representations for Caco-2 Permeability Prediction

    Authors: Huong Van Le, Weibin Ren, Junhong Kim, Yukyung Yun, Young Bin Park, Young Jun Kim, Bok Kyung Han, Inho Choi, Jong IL Park, Hwi-Yeol Yun, Jae-Mun Choi

    Abstract: Caco-2 permeability serves as a critical in vitro indicator for predicting the oral absorption of drug candidates during early-stage drug discovery. To enhance the accuracy and efficiency of computational predictions, we systematically investigated the impact of eight molecular feature representation types including 2D/3D descriptors, structural fingerprints, and deep learning-based embeddings com…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: 49 pages, 11 figures

  6. arXiv:2506.03610  [pdf, ps, other]

    cs.AI

    Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

    Authors: Dongmin Park, Minkyu Kim, Beongjun Choi, Junhyuck Kim, Keon Lee, Jonghyun Lee, Inkyu Park, Byeong-Uk Lee, Jaeyoung Hwang, Jaewoo Ahn, Ameya S. Mahabaleshwarkar, Bilal Kartal, Pritam Biswas, Yoshi Suhara, Kangwook Lee, Jaewoong Cho

    Abstract: Large Language Model (LLM) agents are reshaping the game industry, particularly with more intelligent and human-preferable game characters. However, existing game benchmarks fall short of practical needs: they lack evaluations of diverse LLM capabilities across various game genres, studies of agentic modules crucial for complex gameplay, and fine-tuning datasets for aligning pre-trained LLMs into…

    Submitted 28 September, 2025; v1 submitted 4 June, 2025; originally announced June 2025.

  7. arXiv:2505.18162  [pdf]

    eess.SP cs.LG

    Accelerating Battery Material Optimization through iterative Machine Learning

    Authors: Seon-Hwa Lee, Insoo Ye, Changhwan Lee, Jieun Kim, Geunho Choi, Sang-Cheol Nam, Inchul Park

    Abstract: The performance of battery materials is determined by their composition and the processing conditions employed during commercial-scale fabrication, where raw materials undergo complex processing steps with various additives to yield final products. As the complexity of these parameters expands with the development of industry, conventional one-factor-at-a-time (OFAT) experiment becomes old fashion…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 25 pages, 5 figures

  8. arXiv:2505.07906  [pdf]

    cond-mat.mtrl-sci cs.CV cs.LG

    Image-Guided Microstructure Optimization using Diffusion Models: Validated with Li-Mn-rich Cathode Precursors

    Authors: Geunho Choi, Changhwan Lee, Jieun Kim, Insoo Ye, Keeyoung Jung, Inchul Park

    Abstract: Microstructure often dictates materials performance, yet it is rarely treated as an explicit design variable because microstructure is hard to quantify, predict, and optimize. Here, we introduce an image centric, closed-loop framework that makes microstructural morphology into a controllable objective and demonstrate its use case with Li- and Mn-rich layered oxide cathode precursors. This work pre…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 37 pages, 10 figures

  9. arXiv:2505.07333  [pdf, other]

    cs.CV

    Link to the Past: Temporal Propagation for Fast 3D Human Reconstruction from Monocular Video

    Authors: Matthew Marchellus, Nadhira Noor, In Kyu Park

    Abstract: Fast 3D clothed human reconstruction from monocular video remains a significant challenge in computer vision, particularly in balancing computational efficiency with reconstruction quality. Current approaches are either focused on static image reconstruction but too computationally intensive, or achieve high quality through per-video optimization that requires minutes to hours of processing, makin…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: Accepted in CVPR 2025

  10. arXiv:2504.18805  [pdf, other]

    cs.CL cs.AI cs.LG

    Stealing Creator's Workflow: A Creator-Inspired Agentic Framework with Iterative Feedback Loop for Improved Scientific Short-form Generation

    Authors: Jong Inn Park, Maanas Taneja, Qianwen Wang, Dongyeop Kang

    Abstract: Generating engaging, accurate short-form videos from scientific papers is challenging due to content complexity and the gap between expert authors and readers. Existing end-to-end methods often suffer from factual inaccuracies and visual artifacts, limiting their utility for scientific dissemination. To address these issues, we propose SciTalk, a novel multi-LLM agentic framework, grounding videos…

    Submitted 26 April, 2025; originally announced April 2025.

    Comments: Project page: https://minnesotanlp.github.io/scitalk-project-page/

  11. arXiv:2504.05770  [pdf, other]

    cs.CV cs.AI

    A Lightweight Multi-Module Fusion Approach for Korean Character Recognition

    Authors: Inho Jake Park, Jaehoon Jay Jeong, Ho-Sang Jo

    Abstract: Optical Character Recognition (OCR) is essential in applications such as document processing, license plate recognition, and intelligent surveillance. However, existing OCR models often underperform in real-world scenarios due to irregular text layouts, poor image quality, character variability, and high computational costs. This paper introduces SDA-Net (Stroke-Sensitive Attention and Dynamic C…

    Submitted 8 April, 2025; originally announced April 2025.

    Comments: 12 pages, 5 figures, 5 tables

    MSC Class: 68T07; ACM Class: I.2.10

  12. arXiv:2504.02789  [pdf, other]

    cs.CL

    A Framework for Robust Cognitive Evaluation of LLMs

    Authors: Karin de Langis, Jong Inn Park, Bin Hu, Khanh Chi Le, Andreas Schramm, Michael C. Mensink, Andrew Elfenbein, Dongyeop Kang

    Abstract: Emergent cognitive abilities in large language models (LLMs) have been widely observed, but their nature and underlying mechanisms remain poorly understood. A growing body of research draws on cognitive science to investigate LLM cognition, but standard methodologies and experimental pipelines have not yet been established. To address this gap we develop CognitivEval, a framework for systematical…

    Submitted 3 April, 2025; originally announced April 2025.

  13. arXiv:2502.06047  [pdf, other]

    cs.LG

    Neural Shortest Path for Surface Reconstruction from Point Clouds

    Authors: Yesom Park, Imseong Park, Jooyoung Hahn, Myungjoo Kang

    Abstract: In this paper, we propose the neural shortest path (NSP), a vector-valued implicit neural representation (INR) that approximates a distance function and its gradient. The key feature of NSP is to learn the exact shortest path (ESP), which directs an arbitrary point to its nearest point on the target surface. The NSP is decomposed into its magnitude and direction, and a variable splitting method is…

    Submitted 9 February, 2025; originally announced February 2025.

  14. arXiv:2501.14249  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1087 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 25 September, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  15. arXiv:2410.05454  [pdf, other]

    stat.ML cs.LG q-bio.NC

    Meta-Dynamical State Space Models for Integrative Neural Data Analysis

    Authors: Ayesha Vermani, Josue Nassar, Hyungju Jeon, Matthew Dowling, Il Memming Park

    Abstract: Learning shared structure across environments facilitates rapid learning and adaptive behavior in neural systems. This has been widely demonstrated and applied in machine learning to train models that are capable of generalizing to novel settings. However, there has been limited work exploiting the shared structure in neural activity during similar tasks for learning latent dynamics from neural re…

    Submitted 7 April, 2025; v1 submitted 7 October, 2024; originally announced October 2024.

  16. Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation

    Authors: Youngwan Jin, Incheol Park, Hanbin Song, Hyeongjin Ju, Yagiz Nalcakan, Shiho Kim

    Abstract: This paper proposes Pix2Next, a novel image-to-image translation framework designed to address the challenge of generating high-quality Near-Infrared (NIR) images from RGB inputs. Our approach leverages a state-of-the-art Vision Foundation Model (VFM) within an encoder-decoder architecture, incorporating cross-attention mechanisms to enhance feature integration. This design captures detailed globa…

    Submitted 23 April, 2025; v1 submitted 25 September, 2024; originally announced September 2024.

    Comments: 19 pages, 12 figures

  17. arXiv:2409.15753  [pdf, other]

    cs.LG cs.AI

    Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm

    Authors: Yooseok Lim, Inbeom Park, Sujee Lee

    Abstract: Appropriate medication dosages in the intensive care unit (ICU) are critical for patient survival. Heparin, used to treat thrombosis and inhibit blood clotting in the ICU, requires careful administration due to its complexity and sensitivity to various factors, including patient clinical characteristics, underlying medical conditions, and potential drug interactions. Incorrect dosing can lead to s…

    Submitted 24 September, 2024; originally announced September 2024.

  18. arXiv:2408.04650  [pdf]

    cs.CL cs.AI cs.HC cs.LG

    Building Trust in Mental Health Chatbots: Safety Metrics and LLM-Based Evaluation Tools

    Authors: Jung In Park, Mahyar Abbasian, Iman Azimi, Dawn T. Bounds, Angela Jun, Jaesu Han, Robert M. McCarron, Jessica Borelli, Parmida Safavi, Sanaz Mirbaha, Jia Li, Mona Mahmoudi, Carmen Wiedenhoeft, Amir M. Rahmani

    Abstract: Objective: This study aims to develop and validate an evaluation framework to ensure the safety and reliability of mental health chatbots, which are increasingly popular due to their accessibility, human-like interactions, and context-aware support. Materials and Methods: We created an evaluation framework with 100 benchmark questions and ideal responses, and five guideline questions for chatbot r…

    Submitted 28 February, 2025; v1 submitted 3 August, 2024; originally announced August 2024.

  19. arXiv:2408.00109  [pdf, other]

    q-bio.NC cs.NE nlin.AO

    Back to the Continuous Attractor

    Authors: Ábel Ságodi, Guillermo Martín-Sánchez, Piotr Sokół, Il Memming Park

    Abstract: Continuous attractors offer a unique class of solutions for storing continuous-valued variables in recurrent system states for indefinitely long time intervals. Unfortunately, continuous attractors suffer from severe structural instability in general--they are destroyed by most infinitesimal changes of the dynamical law that defines them. This fragility limits their utility especially in biologica…

    Submitted 17 January, 2025; v1 submitted 31 July, 2024; originally announced August 2024.

    Journal ref: In Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS 2024)

  20. arXiv:2406.07488  [pdf, other]

    cs.CV

    ReduceFormer: Attention with Tensor Reduction by Summation

    Authors: John Yang, Le An, Su Inn Park

    Abstract: Transformers have excelled in many tasks including vision. However, efficient deployment of transformer models in low-latency or high-throughput applications is hindered by the computation in the attention mechanism which involves expensive operations such as matrix multiplication and Softmax. To address this, we introduce ReduceFormer, a family of models optimized for efficiency with the spirit o…

    Submitted 11 June, 2024; originally announced June 2024.

  21. arXiv:2406.06004  [pdf, other]

    cs.CV cs.AI cs.CL

    FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model

    Authors: Yebin Lee, Imseong Park, Myungjoo Kang

    Abstract: Most existing image captioning evaluation metrics focus on assigning a single numerical score to a caption by comparing it with reference captions. However, these methods do not provide an explanation for the assigned score. Moreover, reference captions are expensive to acquire. In this paper, we propose FLEUR, an explainable reference-free metric to introduce explainability into image captioning…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted at ACL (Main) 2024

  22. arXiv:2405.03958  [pdf, other]

    cs.CV cs.AI cs.LG

    Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model

    Authors: Joo Young Choi, Jaesung R. Park, Inkyu Park, Jaewoong Cho, Albert No, Ernest K. Ryu

    Abstract: Current state-of-the-art diffusion models employ U-Net architectures containing convolutional and (qkv) self-attention layers. The U-Net processes images while being conditioned on the time embedding input for each sampling step and the class or caption embedding input corresponding to the desired conditional generation. Such conditioning involves scale-and-shift operations to the convolutional la…

    Submitted 4 October, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  23. arXiv:2404.11615  [pdf, other]

    cs.CV

    Factorized Diffusion: Perceptual Illusions by Noise Decomposition

    Authors: Daniel Geng, Inbum Park, Andrew Owens

    Abstract: Given a factorization of an image into a sum of linear components, we present a zero-shot method to control each individual component through diffusion model sampling. For example, we can decompose an image into low and high spatial frequencies and condition these components on different text prompts. This produces hybrid images, which change appearance depending on viewing distance. By decomposin…

    Submitted 10 January, 2025; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: ECCV 2024 camera ready version + more readable size

  24. arXiv:2403.01371  [pdf, other]

    stat.ML cs.LG

    eXponential FAmily Dynamical Systems (XFADS): Large-scale nonlinear Gaussian state-space modeling

    Authors: Matthew Dowling, Yuan Zhao, Il Memming Park

    Abstract: State-space graphical models and the variational autoencoder framework provide a principled apparatus for learning dynamical systems from data. State-of-the-art probabilistic approaches are often able to scale to large problems at the cost of flexibility of the variational posterior or expressivity of the dynamics model. However, those consolidations can be detrimental if the ultimate goal is to l…

    Submitted 3 November, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

  25. arXiv:2401.16553  [pdf, other]

    cs.CL cs.AI

    SelectLLM: Can LLMs Select Important Instructions to Annotate?

    Authors: Ritik Sachin Parkar, Jaehyung Kim, Jong Inn Park, Dongyeop Kang

    Abstract: Instruction tuning benefits from large and diverse datasets; however, creating such datasets involves a high cost of human labeling. While synthetic datasets generated by large language models (LLMs) have partly solved this issue, they often contain low-quality data. One effective solution is selectively annotating unlabelled instructions, especially given the relative ease of acquiring unlabeled…

    Submitted 27 August, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: First Authors: Ritik Sachin Parkar and Jaehyung Kim | Second Author: Jong Inn Park | PI: Dongyeop Kang

  26. arXiv:2401.08655  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG cs.MM

    SAiD: Speech-driven Blendshape Facial Animation with Diffusion

    Authors: Inkyu Park, Jaewoong Cho

    Abstract: Speech-driven 3D facial animation is challenging due to the scarcity of large-scale visual-audio datasets despite extensive research. Most prior works, typically focused on learning regression models on a small dataset using the method of least squares, encounter difficulties generating diverse lip movements from speech and require substantial effort in refining the generated outputs. To address t…

    Submitted 24 January, 2024; v1 submitted 24 December, 2023; originally announced January 2024.

    Comments: Fix bug related to the font size

  27. arXiv:2311.17919  [pdf, other]

    cs.CV

    Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

    Authors: Daniel Geng, Inbum Park, Andrew Owens

    Abstract: We address the problem of synthesizing multi-view optical illusions: images that change appearance upon a transformation, such as a flip or rotation. We propose a simple, zero-shot method for obtaining these illusions from off-the-shelf text-to-image diffusion models. During the reverse diffusion process, we estimate the noise from different views of a noisy image, and then combine these noise est…

    Submitted 2 April, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: CVPR 2024 camera ready

  28. arXiv:2309.17012  [pdf, other]

    cs.CL cs.AI cs.LG

    Benchmarking Cognitive Biases in Large Language Models as Evaluators

    Authors: Ryan Koo, Minhwa Lee, Vipul Raheja, Jong Inn Park, Zae Myung Kim, Dongyeop Kang

    Abstract: Large Language Models are cognitively biased judges. Large Language Models (LLMs) have recently been shown to be effective as automatic evaluators with simple prompting and in-context learning. In this work, we assemble 15 LLMs of four different size ranges and evaluate their output responses by preference ranking from the other LLMs as evaluators, such as System Star is better than System Square.…

    Submitted 25 September, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: Published at ACL 2024. 29 pages, 9 figures, 14 tables

    ACM Class: I.2.7

  29. arXiv:2308.12585  [pdf, other]

    q-bio.NC cs.LG cs.NE nlin.AO

    Persistent learning signals and working memory without continuous attractors

    Authors: Il Memming Park, Ábel Ságodi, Piotr Aleksander Sokół

    Abstract: Neural dynamical systems with stable attractor structures, such as point attractors and continuous attractors, are hypothesized to underlie meaningful temporal behavior that requires working memory. However, working memory may not support useful learning signals necessary to adapt to changes in the temporal structure of the environment. We show that in addition to the continuous attractors that ar…

    Submitted 24 August, 2023; originally announced August 2023.

  30. arXiv:2308.05542  [pdf, other]

    cs.CV

    Robust Asymmetric Loss for Multi-Label Long-Tailed Learning

    Authors: Wongi Park, Inhyuk Park, Sungeun Kim, Jongbin Ryu

    Abstract: In real medical data, training samples typically show long-tailed distributions with multiple labels. Class distribution of the medical data has a long-tailed shape, in which the incidence of different diseases is quite varied, and at the same time, it is not unusual for images taken from symptomatic patients to be multi-label diseases. Therefore, in this paper, we concurrently address these two i…

    Submitted 10 August, 2023; originally announced August 2023.

    Journal ref: ICCVW 2023

  31. arXiv:2306.13776  [pdf, other]

    cs.CV cs.LG

    Swin-Free: Achieving Better Cross-Window Attention and Efficiency with Size-varying Window

    Authors: Jinkyu Koo, John Yang, Le An, Gwenaelle Cunha Sergio, Su Inn Park

    Abstract: Transformer models have shown great potential in computer vision, following their success in language tasks. Swin Transformer is one of them that outperforms convolution-based architectures in terms of accuracy, while improving efficiency when compared to Vision Transformer (ViT) and its variants, which have quadratic complexity with respect to the input size. Swin Transformer features shifting wi…

    Submitted 23 June, 2023; originally announced June 2023.

    Comments: 8 pages, 3 figures

  32. arXiv:2306.01802  [pdf, other]

    q-bio.NC cs.LG stat.AP stat.ML

    Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains

    Authors: Matthew Dowling, Yuan Zhao, Il Memming Park

    Abstract: Latent Gaussian process (GP) models are widely used in neuroscience to uncover hidden state evolutions from sequential observations, mainly in neural activity recordings. While latent GP models provide a principled and powerful solution in theory, the intractable posterior in non-conjugate settings necessitates approximate inference schemes, which may lack scalability. In this work, we propose cvH…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: Published at ICML 2023

  33. arXiv:2305.11278  [pdf, other]

    stat.ML cs.LG q-bio.NC

    Real-Time Variational Method for Learning Neural Trajectory and its Dynamics

    Authors: Matthew Dowling, Yuan Zhao, Il Memming Park

    Abstract: Latent variable models have become instrumental in computational neuroscience for reasoning about neural computation. This has fostered the development of powerful offline algorithms for extracting latent neural trajectories from neural recordings. However, despite the potential of real time alternatives to give immediate feedback to experimentalists, and enhance experimental design, they have rec…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: Published at ICLR 2023

  34. arXiv:2305.04468  [pdf, other]

    cs.LG cs.AI

    AnomalyBERT: Self-Supervised Transformer for Time Series Anomaly Detection using Data Degradation Scheme

    Authors: Yungi Jeong, Eunseok Yang, Jung Hyun Ryu, Imseong Park, Myungjoo Kang

    Abstract: Mechanical defects in real situations affect observation values and cause abnormalities in multivariate time series, such as sensor values or network data. To perceive abnormalities in such data, it is crucial to understand the temporal context and interrelation between variables simultaneously. The anomaly detection task for time series, especially for unlabeled data, has been a challenging probl…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: 11 pages, Presented at ICLR 2023 workshop on Machine Learning for IoT

  35. arXiv:2303.02060  [pdf, other]

    stat.ML cs.LG

    Spectral learning of Bernoulli linear dynamical systems models

    Authors: Iris R. Stone, Yotam Sagiv, Il Memming Park, Jonathan W. Pillow

    Abstract: Latent linear dynamical systems with Bernoulli observations provide a powerful modeling framework for identifying the temporal dynamics underlying binary time series data, which arise in a variety of contexts such as binary decision-making and discrete stochastic processes (e.g., binned neural spike trains). Here we develop a spectral learning method for fast, efficient fitting of probit-Bernoulli…

    Submitted 26 July, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: Published in Transactions on Machine Learning Research (https://jmlr.org/tmlr/papers/)

    Journal ref: Transactions on Machine Learning Research (2023)

  36. arXiv:2212.04319  [pdf, other]

    cs.CV cs.AI

    On the Robustness of Normalizing Flows for Inverse Problems in Imaging

    Authors: Seongmin Hong, Inbum Park, Se Young Chun

    Abstract: Conditional normalizing flows can generate diverse image samples for solving inverse problems. Most normalizing flows for inverse problems in imaging employ the conditional affine coupling layer that can generate diverse images quickly. However, unintended severe artifacts are occasionally observed in the output of them. In this work, we address this critical issue by investigating the origins of…

    Submitted 16 March, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: 16 pages

  37. arXiv:2211.07077  [pdf, other]

    cs.CV

    IFQA: Interpretable Face Quality Assessment

    Authors: Byungho Jo, Donghyeon Cho, In Kyu Park, Sungeun Hong

    Abstract: Existing face restoration models have relied on general assessment metrics that do not consider the characteristics of facial regions. Recent works have therefore assessed their methods using human studies, which is not scalable and involves significant effort. This paper proposes a novel face-centric metric based on an adversarial framework where a generator simulates face restoration and a discr…

    Submitted 16 November, 2022; v1 submitted 13 November, 2022; originally announced November 2022.

    Comments: WACV 2023, Code: https://github.com/VCLLab/IFQA

  38. arXiv:2208.08005  [pdf, other]

    cs.CL cs.AI

    Transformer Encoder for Social Science

    Authors: Haosen Ge, In Young Park, Xuancheng Qian, Grace Zeng

    Abstract: High-quality text data has become an important data source for social scientists. We have witnessed the success of pretrained deep neural network models, such as BERT and RoBERTa, in recent social science research. In this paper, we propose a compact pretrained deep neural network, Transformer Encoder for Social Science (TESS), explicitly designed to tackle text processing tasks in social science…

    Submitted 16 August, 2022; originally announced August 2022.

  39. arXiv:2204.13791  [pdf, other]

    cs.CV cs.LG

    Depth Estimation with Simplified Transformer

    Authors: John Yang, Le An, Anurag Dixit, Jinkyu Koo, Su Inn Park

    Abstract: Transformer and its variants have shown state-of-the-art results in many vision tasks recently, ranging from image classification to dense prediction. Despite their success, limited work has been reported on improving the model efficiency for deployment in latency-critical applications, such as autonomous driving and robotic navigation. In this paper, we aim at improving upon the existing trans…

    Submitted 27 May, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

    Comments: Accepted for the CVPR 2022 Transformers For Vision (T4V) workshop

  40. arXiv:2204.01264  [pdf, other]

    cs.CV

    Probabilistic Implicit Scene Completion

    Authors: Dongsu Zhang, Changwoon Choi, Inbum Park, Young Min Kim

    Abstract: We propose a probabilistic shape completion method extended to the continuous geometry of large-scale 3D scenes. Real-world scans of 3D scenes suffer from a considerable amount of missing data cluttered with unsegmented objects. The problem of shape completion is inherently ill-posed, and high-quality result requires scalable solutions that consider multiple possible outcomes. We employ the Genera…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: Accepted to ICLR 2022 as spotlight, code available at https://github.com/96lives/gca

  41. Human and Scene Motion Deblurring using Pseudo-blur Synthesizer

    Authors: Jonathan Samuel Lumentut, In Kyu Park

    Abstract: Present-day deep learning-based motion deblurring methods utilize the pair of synthetic blur and sharp data to regress any particular framework. This task is designed for directly translating a blurry image input into its restored version as output. The aforementioned approach relies heavily on the quality of the synthetic blurry data, which are only available before the training stage. Handling t…

    Submitted 24 November, 2021; originally announced November 2021.

  42. arXiv:2109.04463  [pdf, other]

    cs.LG q-bio.NC

    Neural Latents Benchmark '21: Evaluating latent variable models of neural population activity

    Authors: Felix Pei, Joel Ye, David Zoltowski, Anqi Wu, Raeed H. Chowdhury, Hansem Sohn, Joseph E. O'Doherty, Krishna V. Shenoy, Matthew T. Kaufman, Mark Churchland, Mehrdad Jazayeri, Lee E. Miller, Jonathan Pillow, Il Memming Park, Eva L. Dyer, Chethan Pandarinath

    Abstract: Advances in neural recording present increasing opportunities to study neural activity in unprecedented detail. Latent variable models (LVMs) are promising tools for analyzing this rich activity across diverse neural systems and behaviors, as LVMs do not depend on known relationships between the activity and external experimental variables. However, progress with LVMs for neuronal population activ…

    Submitted 17 January, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

  43. arXiv:2107.07098  [pdf, other]

    stat.ML cs.LG

    Hida-Matérn Kernel

    Authors: Matthew Dowling, Piotr Sokół, Il Memming Park

    Abstract: We present the class of Hida-Matérn kernels, which is the canonical family of covariance functions over the entire space of stationary Gauss-Markov Processes. It extends upon Matérn kernels, by allowing for flexible construction of priors over processes with oscillatory components. Any stationary kernel, including the widely used squared-exponential and spectral mixture kernels, are either directl…

    Submitted 27 December, 2021; v1 submitted 14 July, 2021; originally announced July 2021.

  44. Deep Context- and Relation-Aware Learning for Aspect-based Sentiment Analysis

    Authors: Shinhyeok Oh, Dongyub Lee, Taesun Whang, IlNam Park, Gaeun Seo, EungGyun Kim, Harksoo Kim

    Abstract: Existing works for aspect-based sentiment analysis (ABSA) have adopted a unified approach, which allows the interactive relations among subtasks. However, we observe that these methods tend to predict polarities based on the literal meaning of aspect and opinion terms and mainly consider relations implicitly among subtasks at the word level. In addition, identifying multiple aspect-opinion pairs w…

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: Accepted to ACL-IJCNLP 2021

  45. arXiv:2103.16851  [pdf, other]

    cs.CV

    Attention Map-guided Two-stage Anomaly Detection using Hard Augmentation

    Authors: Jou Won Song, Kyeongbo Kong, Ye In Park, Suk-Ju Kang

    Abstract: Anomaly detection is a task that recognizes whether an input sample is included in the distribution of a target normal class or an anomaly class. Conventional generative adversarial network (GAN)-based methods utilize an entire image including foreground and background as an input. However, in these methods, a useless region unrelated to the normal class (e.g., unrelated background) is learned as…

    Submitted 31 March, 2021; originally announced March 2021.

  46. arXiv:2102.11517  [pdf, other]

    cs.LG cs.DB cs.SI

    SliceNStitch: Continuous CP Decomposition of Sparse Tensor Streams

    Authors: Taehyung Kwon, Inkyu Park, Dongjin Lee, Kijung Shin

    Abstract: Consider traffic data (i.e., triplets in the form of source-destination-timestamp) that grow over time. Tensors (i.e., multi-dimensional arrays) with a time mode are widely used for modeling and analyzing such multi-aspect data streams. In such tensors, however, new entries are added only once per period, which is often an hour, a day, or even a year. This discreteness of tensors has limited their…

    Submitted 2 March, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

    Comments: Updated Figures 4, 5, 6, 7, and 8 after fixing a bug in preprocessing the Divvy dataset. To appear at the 37th IEEE International Conference on Data Engineering (ICDE '21)

    ACM Class: H.2.8

  47. arXiv:2012.04729  [pdf, other]

    cs.LG

    On 1/n neural representation and robustness

    Authors: Josue Nassar, Piotr Aleksander Sokol, SueYeon Chung, Kenneth D. Harris, Il Memming Park

    Abstract: Understanding the nature of representation in neural networks is a goal shared by neuroscience and machine learning. It is therefore exciting that both fields converge not only on shared questions but also on similar approaches. A pressing question in these areas is understanding how the structure of the representation used by neural networks affects both their generalization, and robustness to pe…

    Submitted 8 December, 2020; originally announced December 2020.

  48. arXiv:2010.12362  [pdf, other]

    stat.ML cs.LG

    Rescuing neural spike train models from bad MLE

    Authors: Diego M. Arribas, Yuan Zhao, Il Memming Park

    Abstract: The standard approach to fitting an autoregressive spike train model is to maximize the likelihood for one-step prediction. This maximum likelihood estimation (MLE) often leads to models that perform poorly when generating samples recursively for more than one time step. Moreover, the generated spike trains can fail to capture important features of the data and even show diverging firing rates. To…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: To appear in Advances in Neural Information Processing 2020

  49. arXiv:2009.01362  [pdf, other]

    stat.ML cs.LG

    Non-parametric generalized linear model

    Authors: Matthew Dowling, Yuan Zhao, Il Memming Park

    Abstract: A fundamental problem in statistical neuroscience is to model how neurons encode information by analyzing electrophysiological recordings. A popular and widely-used approach is to fit the spike trains with an autoregressive point process model. These models are characterized by a set of convolutional temporal filters, whose subsequent analysis can help reveal how neurons encode stimuli, interact w…

    Submitted 2 September, 2020; originally announced September 2020.

  50. Integrated Eojeol Embedding for Erroneous Sentence Classification in Korean Chatbots

    Authors: DongHyun Choi, IlNam Park, Myeong Cheol Shin, EungGyun Kim, Dong Ryeol Shin

    Abstract: This paper attempts to analyze the Korean sentence classification system for a chatbot. Sentence classification is the task of classifying an input sentence based on predefined categories. However, spelling or spacing errors contained in the input sentence cause problems in morphological analysis and tokenization. This paper proposes a novel approach of Integrated Eojeol (Korean syntactic word separ…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: 9 pages, 2 figures

    Journal ref: IEEE Access, 2021