Showing 1–50 of 136 results for author: Kim, J W

  1. arXiv:2505.17982  [pdf, other]

    cs.CV

    Few-Shot Learning from Gigapixel Images via Hierarchical Vision-Language Alignment and Modeling

    Authors: Bryan Wong, Jong Woo Kim, Huazhu Fu, Mun Yong Yi

    Abstract: Vision-language models (VLMs) have recently been integrated into multiple instance learning (MIL) frameworks to address the challenge of few-shot, weakly supervised classification of whole slide images (WSIs). A key trend involves leveraging multi-scale information to better represent hierarchical tissue structures. However, existing methods often face two key limitations: (1) insufficient modelin…

    Submitted 27 May, 2025; v1 submitted 23 May, 2025; originally announced May 2025.

  2. arXiv:2505.10251  [pdf, ps, other]

    cs.RO

    SRT-H: A Hierarchical Framework for Autonomous Surgery via Language Conditioned Imitation Learning

    Authors: Ji Woong Kim, Juo-Tung Chen, Pascal Hansen, Lucy X. Shi, Antony Goldenberg, Samuel Schmidgall, Paul Maria Scheikl, Anton Deguet, Brandon M. White, De Ru Tsai, Richard Cha, Jeffrey Jopling, Chelsea Finn, Axel Krieger

    Abstract: Research on autonomous robotic surgery has largely focused on simple task automation in controlled environments. However, real-world surgical applications require dexterous manipulation over extended time scales while demanding generalization across diverse variations in human tissue. These challenges remain difficult to address using existing logic-based or conventional end-to-end learning strate…

    Submitted 15 May, 2025; originally announced May 2025.

  3. arXiv:2504.11573  [pdf]

    cond-mat.mtrl-sci

    X-ray scattering investigation of hydride surface segregation in epitaxial Nb films

    Authors: David A. Garcia-Wetten, Philip J. Ryan, Jong Woo Kim, Dominic P Goronzy, Roger J. Reinertsen, Mark C. Hersam, Michael J. Bedzyk

    Abstract: Hydride precipitation in niobium-based, superconducting circuits is a damaging side-effect of hydrofluoric acid treatments used to clean and thin the Nb surface oxides and Si oxides. The precipitate microstructure is difficult to probe because of the high hydrogen mobility in the niobium matrix. In particular, destructive techniques used to prepare samples for elemental depth profiling can change…

    Submitted 15 April, 2025; originally announced April 2025.

    Comments: 11 pages, 5 figures

  4. arXiv:2412.20706  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Effect of disorder on the strain-tuned charge density wave multicriticality in Pd$_x$ErTe$_3$

    Authors: Anisha G. Singh, Matthew Krogstad, Maja D. Bachmann, Paul Thompson, Stephan Rosenkranz, Ray Osborn, Alan Fang, Aharon Kapitulnik, Jong Woo Kim, Philip J. Ryan, Steven A. Kivelson, Ian R. Fisher

    Abstract: We explore, through a combination of x-ray diffraction and elastoresistivity measurements, the effect of disorder on the strain-tuned charge density wave and associated multicriticality in Pd$_x$ErTe$_3$ (x = 0, 0.01, 0.02 and 0.026). We focus particularly on the behavior near the strain-tuned bicritical point that occurs in pristine ErTe$_3$ (x=0). Our study reveals that while Pd intercalation so…

    Submitted 29 December, 2024; originally announced December 2024.

  5. arXiv:2412.12906  [pdf, other]

    cs.CV

    CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image

    Authors: Wonseok Roh, Hwanhee Jung, Jong Wook Kim, Seunggwan Lee, Innfarn Yoo, Andreas Lugmayr, Seunggeun Chi, Karthik Ramani, Sangpil Kim

    Abstract: Recently, generalizable feed-forward methods based on 3D Gaussian Splatting have gained significant attention for their potential to reconstruct 3D scenes using finite resources. These approaches create a 3D radiance field, parameterized by per-pixel 3D Gaussian primitives, from just a few images in a single forward pass. However, unlike multi-view methods that benefit from cross-view corresponden…

    Submitted 3 February, 2025; v1 submitted 17 December, 2024; originally announced December 2024.

  6. arXiv:2411.05727  [pdf]

    physics.app-ph

    Cascade hot carriers via broad-band resonant tunneling

    Authors: Kamal Kumar Paul, Ashok Mondal, Jae Woo Kim, Ji-Hee Kim, Young Hee Lee

    Abstract: Extraction of hot carriers (HCs) over the band-edge is key to harvesting solar energy beyond the Shockley-Queisser limit [1]. Graphene is known as a HC-layered material due to the phonon bottleneck effect near the Dirac point, but is limited by low photocarrier density [2]. Graphene/transition metal dichalcogenide (TMD) heterostructures circumvent this issue by ultrafast carrier transfer from TMD to graphene [2,3]. Nevert…

    Submitted 8 November, 2024; originally announced November 2024.

  7. arXiv:2410.21276  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY cs.LG cs.SD eess.AS

    GPT-4o System Card

    Authors: OpenAI, Aaron Hurst, Adam Lerer, Adam P. Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, Aleksander Mądry, Alex Baker-Whitcomb, Alex Beutel, Alex Borzunov, Alex Carney, Alex Chow, Alex Kirillov, Alex Nichol, Alex Paino, Alex Renzin, Alex Tachard Passos, Alexander Kirillov, Alexi Christakis, et al. (395 additional authors not shown)

    Abstract: GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 mil…

    Submitted 25 October, 2024; originally announced October 2024.

  8. arXiv:2410.20026  [pdf, other]

    cs.CV

    Towards Robust Algorithms for Surgical Phase Recognition via Digital Twin Representation

    Authors: Hao Ding, Yuqian Zhang, Wenzheng Cheng, Xinyu Wang, Xu Lian, Chenhao Yu, Hongchao Shu, Ji Woong Kim, Axel Krieger, Mathias Unberath

    Abstract: Surgical phase recognition (SPR) is an integral component of surgical data science, enabling high-level surgical analysis. End-to-end trained neural networks that predict surgical phase directly from videos have shown excellent performance on benchmarks. However, these models struggle with robustness due to non-causal associations in the training set. Our goal is to improve model robustness to var…

    Submitted 1 March, 2025; v1 submitted 25 October, 2024; originally announced October 2024.

  9. arXiv:2410.02486  [pdf, other]

    cs.CR cs.LG

    Encryption-Friendly LLM Architecture

    Authors: Donghwan Rho, Taeseong Kim, Minje Park, Jung Woo Kim, Hyunsik Chae, Ernest K. Ryu, Jung Hee Cheon

    Abstract: Large language models (LLMs) offer personalized responses based on user interactions, but this use case raises serious privacy concerns. Homomorphic encryption (HE) is a cryptographic protocol supporting arithmetic computations in encrypted states and provides a potential solution for privacy-preserving machine learning (PPML). However, the computational intensity of transformers poses challenges…

    Submitted 20 February, 2025; v1 submitted 3 October, 2024; originally announced October 2024.

    Comments: 27 pages

  10. arXiv:2410.00046  [pdf, other]

    eess.IV cs.CV cs.LG

    Mixture of Multicenter Experts in Multimodal Generative AI for Advanced Radiotherapy Target Delineation

    Authors: Yujin Oh, Sangjoon Park, Xiang Li, Wang Yi, Jonathan Paly, Jason Efstathiou, Annie Chan, Jun Won Kim, Hwa Kyung Byun, Ik Jae Lee, Jaeho Cho, Chan Woo Wee, Peng Shu, Peilong Wang, Nathan Yu, Jason Holmes, Jong Chul Ye, Quanzheng Li, Wei Liu, Woong Sub Koom, Jin Sung Kim, Kyungsang Kim

    Abstract: Clinical experts employ diverse philosophies and strategies in patient care, influenced by regional patient populations. However, existing medical artificial intelligence (AI) models are often trained on data distributions that disproportionately reflect highly prevalent patterns, reinforcing biases and overlooking the diverse expertise of clinicians. To overcome this limitation, we introduce the…

    Submitted 26 October, 2024; v1 submitted 27 September, 2024; originally announced October 2024.

    Comments: 39 pages

  11. arXiv:2407.12998  [pdf, other]

    cs.RO

    Surgical Robot Transformer (SRT): Imitation Learning for Surgical Tasks

    Authors: Ji Woong Kim, Tony Z. Zhao, Samuel Schmidgall, Anton Deguet, Marin Kobilarov, Chelsea Finn, Axel Krieger

    Abstract: We explore whether surgical manipulation tasks can be learned on the da Vinci robot via imitation learning. However, the da Vinci system presents unique challenges which hinder straightforward implementation of imitation learning. Notably, its forward kinematics is inconsistent due to imprecise joint measurements, and naively training a policy using such approximate kinematics data often leads to…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 8 pages

  12. arXiv:2407.07317  [pdf, other]

    physics.flu-dyn

    Flow-acoustic resonance in deep and inclined cavities

    Authors: You Wei Ho, Jae Wook Kim

    Abstract: This paper presents numerical investigations of flow-acoustic resonances in deep and inclined cavities using wall-resolved large eddy simulations. The study focuses on cavity configurations with an aspect ratio of $D/L = 2.632$, subjected to two Mach numbers of $0.2$ and $0.3$ at three different inclination angles ($\alpha=30^{\circ}$, $60^{\circ}$, and $90^{\circ}$). Fully turbulent boundary layers ge…

    Submitted 9 July, 2024; originally announced July 2024.

  13. arXiv:2405.07650  [pdf, other]

    math.OC

    Arrow of Time in Estimation and Control: Duality Theory Beyond the Linear Gaussian Model

    Authors: Jin Won Kim, Prashant G. Mehta

    Abstract: Duality between estimation and control is a foundational concept in Control Theory. Most students learn about the elementary duality -- between observability and controllability -- in their first graduate course in linear systems theory. Therefore, it comes as a surprise that for a more general class of nonlinear stochastic systems (hidden Markov models or HMMs), duality is incomplete. Our objec…

    Submitted 8 October, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

  14. arXiv:2405.02066  [pdf, other]

    cs.CV eess.IV

    WateRF: Robust Watermarks in Radiance Fields for Protection of Copyrights

    Authors: Youngdong Jang, Dong In Lee, MinHyuk Jang, Jong Wook Kim, Feng Yang, Sangpil Kim

    Abstract: The advances in the Neural Radiance Fields (NeRF) research offer extensive applications in diverse domains, but protecting their copyrights has not yet been researched in depth. Recently, NeRF watermarking has been considered one of the pivotal solutions for safely deploying NeRF-based 3D representations. However, existing methods are designed to apply only to implicit or explicit NeRF representat…

    Submitted 11 July, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  15. arXiv:2405.01127  [pdf, other]

    math.PR math.OC

    Backward Map for Filter Stability Analysis

    Authors: Jin Won Kim, Anant A. Joshi, Prashant G. Mehta

    Abstract: In this paper, a backward map is introduced for the purposes of analysis of the nonlinear (stochastic) filter stability. The backward map is important because the filter-stability in the sense of $\chi^2$-divergence follows from showing a certain variance decay property for the backward map. To show this property requires additional assumptions on the model properties of the hidden Markov model (H…

    Submitted 8 October, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: Conference proceeding related to arXiv:2305.12850

  16. arXiv:2404.15779  [pdf, ps, other]

    math.PR

    Divergence metrics in the study of Markov and hidden Markov processes

    Authors: Jin Won Kim, Amirhossein Taghvaei, Prashant G. Mehta

    Abstract: This paper is divided into two parts. The first part reviews the formulae for f-divergences in the study of continuous-time Markov processes and explores their applications in areas such as stochastic stability, the second law of thermodynamics, and its non-equilibrium extensions. This sets the foundation for the second part, which focuses on f-divergence in the study of hidden Markov processes. I…

    Submitted 2 October, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  17. arXiv:2403.14111  [pdf, other]

    cs.CR cs.LG

    HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption

    Authors: Seewoo Lee, Garam Lee, Jung Woo Kim, Junbum Shin, Mun-Kyu Lee

    Abstract: Transfer learning is a de facto standard method for efficiently training machine learning models for data-scarce problems by adding and fine-tuning new classification layers to a model pre-trained on large datasets. Although numerous previous studies proposed to use homomorphic encryption to resolve the data privacy issue in transfer learning in the machine learning as a service setting, most of t…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: ICML 2023, Appendix D includes some updates after official publication

    Journal ref: PMLR 202:19010-19035, 2023

  18. arXiv:2403.08187  [pdf, other]

    cs.CL cs.SD eess.AS

    Automatic Speech Recognition (ASR) for the Diagnosis of pronunciation of Speech Sound Disorders in Korean children

    Authors: Taekyung Ahn, Yeonjung Hong, Younggon Im, Do Hyung Kim, Dayoung Kang, Joo Won Jeong, Jae Won Kim, Min Jung Kim, Ah-ra Cho, Dae-Hyun Jang, Hosung Nam

    Abstract: This study presents a model of automatic speech recognition (ASR) designed to diagnose pronunciation issues in children with speech sound disorders (SSDs) to replace manual transcriptions in clinical procedures. Since ASR models trained for general purposes primarily predict input speech into real words, employing a well-known high-performance ASR model for evaluating pronunciation in children wit…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 12 pages, 2 figures

    ACM Class: I.2.7

  19. arXiv:2403.05949  [pdf, other]

    cs.CV cs.LG q-bio.TO

    General surgery vision transformer: A video pre-trained foundation model for general surgery

    Authors: Samuel Schmidgall, Ji Woong Kim, Jeffrey Jopling, Axel Krieger

    Abstract: The absence of openly accessible data and specialized foundation models is a major barrier for computational research in surgery. Toward this, (i) we open-source the largest dataset of general surgery videos to date, consisting of 680 hours of surgical videos, including data from robotic and laparoscopic techniques across 28 procedures; (ii) we propose a technique for video pre-training a general…

    Submitted 12 April, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  20. arXiv:2402.08113  [pdf, other]

    cs.CL cs.HC

    Addressing cognitive bias in medical language models

    Authors: Samuel Schmidgall, Carl Harris, Ime Essien, Daniel Olshvang, Tawsifur Rahman, Ji Woong Kim, Rojin Ziaei, Jason Eshraghian, Peter Abadir, Rama Chellappa

    Abstract: There is increasing interest in the application of large language models (LLMs) to the medical field, in part because of their impressive performance on medical exam questions. While promising, exam questions do not reflect the complexity of real patient-doctor interactions. In reality, physicians' decisions are shaped by many complex factors, such as patient compliance, personal experience, ethical…

    Submitted 20 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  21. arXiv:2401.18006  [pdf, other]

    q-bio.QM cs.LG eess.SP

    EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation

    Authors: Jonathan W. Kim, Ahmed Alaa, Danilo Bernardo

    Abstract: In conventional machine learning (ML) approaches applied to electroencephalography (EEG), there is often a limited focus, isolating specific brain activities occurring across disparate temporal scales (from transient spikes in milliseconds to seizures lasting minutes) and spatial scales (from localized high-frequency oscillations to global sleep activity). This siloed approach limits the developmen…

    Submitted 3 February, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

  22. arXiv:2401.14430  [pdf, other]

    physics.flu-dyn physics.ao-ph

    A Westervelt equation for acoustic wave propagation through weakly stratified, arbitrary Mach number atmospheres

    Authors: Liam J. Tope, Jae Wook Kim, Peter Spence

    Abstract: Nonlinear distortion of infrasonic waves through atmospheres up to thermospheric altitudes governs large-range ground-level observations of explosive noise sources, causing large differences between the near and far field. Propagation modelling in this scenario to include realistic nonlinear effects has thus far been limited to high-fidelity, numerically intensive Direct Numerical Simulations of th…

    Submitted 24 January, 2024; originally announced January 2024.

  23. Machine learning for industrial sensing and control: A survey and practical perspective

    Authors: Nathan P. Lawrence, Seshu Kumar Damarla, Jong Woo Kim, Aditya Tulsyan, Faraz Amjad, Kai Wang, Benoit Chachuat, Jong Min Lee, Biao Huang, R. Bhushan Gopaluni

    Abstract: With the rise of deep learning, there has been renewed interest within the process industries to utilize data on large-scale nonlinear sensing and control problems. We identify key statistical and machine learning techniques that have seen practical success in the process industries. To do so, we start with hybrid modeling to provide a methodological framework underlying core application areas: so…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 48 pages

    Journal ref: Control Engineering Practice 2024

  24. arXiv:2401.00678  [pdf, other]

    cs.RO cs.LG q-bio.TO

    General-purpose foundation models for increased autonomy in robot-assisted surgery

    Authors: Samuel Schmidgall, Ji Woong Kim, Alan Kuntz, Ahmed Ezzat Ghazi, Axel Krieger

    Abstract: The dominant paradigm for end-to-end robot learning focuses on optimizing task-specific objectives that solve a single robotic problem such as picking up an object or reaching a target position. However, recent work on high-capacity models in robotics has shown promise toward being trained on large collections of diverse and task-agnostic datasets of video demonstrations. These models have shown i…

    Submitted 1 January, 2024; originally announced January 2024.

  25. arXiv:2312.01631  [pdf, other]

    cs.RO

    Cooperative vs. Teleoperation Control of the Steady Hand Eye Robot with Adaptive Sclera Force Control: A Comparative Study

    Authors: Mojtaba Esfandiari, Ji Woong Kim, Botao Zhao, Golchehr Amirkhani, Muhammad Hadi, Peter Gehlbach, Russell H. Taylor, Iulian Iordachita

    Abstract: A surgeon's physiological hand tremor can significantly impact the outcome of delicate and precise retinal surgery, such as retinal vein cannulation (RVC) and epiretinal membrane peeling. Robot-assisted eye surgery technology provides ophthalmologists with advanced capabilities such as hand tremor cancellation, hand motion scaling, and safety constraints that enable them to perform these otherwise…

    Submitted 4 December, 2023; originally announced December 2023.

  26. arXiv:2309.02706  [pdf, other]

    cs.CL

    HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models

    Authors: Guijin Son, Hanwool Lee, Suwan Kim, Huiseo Kim, Jaecheol Lee, Je Won Yeom, Jihyu Jung, Jung Woo Kim, Songseong Kim

    Abstract: Large language models (LLMs) trained on massive corpora demonstrate impressive capabilities in a wide range of tasks. While there are ongoing efforts to adapt these models to languages beyond English, the attention given to their evaluation methodologies remains limited. Current multilingual benchmarks often rely on back translations or re-implementations of English tests, limiting their capacity…

    Submitted 20 March, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: Accepted at LREC-COLING 2024

  27. arXiv:2308.07788  [pdf, ps, other]

    eess.AS

    GIST-AiTeR Speaker Diarization System for VoxCeleb Speaker Recognition Challenge (VoxSRC) 2023

    Authors: Dongkeon Park, Ji Won Kim, Kang Ryeol Kim, Do Hyun Lee, Hong Kook Kim

    Abstract: This report describes the submission system by the GIST-AiTeR team for the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC-23) Track 4. Our submission system focuses on implementing diverse speaker diarization (SD) techniques, including ResNet293 and MFA-Conformer with different combinations of segment and hop length. Then, those models are combined into an ensemble model. The ResNet293 and MF…

    Submitted 25 August, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

    Comments: VoxSRC 2023 Track4

  28. arXiv:2306.17421  [pdf, other]

    cs.RO

    Micromanipulation in Surgery: Autonomous Needle Insertion Inside the Eye for Targeted Drug Delivery

    Authors: Ji Woong Kim, Peiyao Zhang, Peter Gehlbach, Iulian Iordachita, Marin Kobilarov

    Abstract: We consider a micromanipulation problem in eye surgery, specifically retinal vein cannulation (RVC). RVC involves inserting a microneedle into a retinal vein for the purpose of targeted drug delivery. The procedure requires accurately guiding a needle to a target vein and inserting it while avoiding damage to the surrounding tissues. RVC can be considered similar to the reach or push task studied…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: Experiment-oriented Locomotion and Manipulation Research, RSS 2023 workshop. arXiv admin note: text overlap with arXiv:2306.10133

  29. arXiv:2306.14755  [pdf]

    cond-mat.mtrl-sci

    Emergent Tetragonality in a Fundamentally Orthorhombic Material

    Authors: Anisha G. Singh, Maja D. Bachmann, Joshua J. Sanchez, Akshat Pandey, Aharon Kapitulnik, Jong Woo Kim, Philip J. Ryan, Steven A. Kivelson, Ian R. Fisher

    Abstract: Symmetry plays a key role in determining the physical properties of materials. By Neumann's principle, the properties of a material are invariant under the symmetry operations of the space group to which the material belongs. Continuous phase transitions are associated with a spontaneous reduction in symmetry. (For example, the onset of ferromagnetism spontaneously breaks time reversal symmetry.)…

    Submitted 29 May, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

  30. arXiv:2306.10133  [pdf, other]

    cs.RO

    Deep Learning Guided Autonomous Surgery: Guiding Small Needles into Sub-Millimeter Scale Blood Vessels

    Authors: Ji Woong Kim, Peiyao Zhang, Peter Gehlbach, Iulian Iordachita, Marin Kobilarov

    Abstract: We propose a general strategy for autonomous guidance and insertion of a needle into a retinal blood vessel. The main challenges underpinning this task are the accurate placement of the needle-tip on the target vein and a careful needle insertion maneuver to avoid double-puncturing the vein, while dealing with challenging kinematic constraints and depth-estimation uncertainty. Following how surgeo…

    Submitted 16 June, 2023; originally announced June 2023.

  31. arXiv:2306.10127  [pdf, other]

    cs.RO

    Towards Deep Learning Guided Autonomous Eye Surgery Using Microscope and iOCT Images

    Authors: Ji Woong Kim, Shuwen Wei, Peiyao Zhang, Peter Gehlbach, Jin U. Kang, Iulian Iordachita, Marin Kobilarov

    Abstract: Recent advancements in retinal surgery have paved the way for a modern operating room equipped with a surgical robot, a microscope, and intraoperative optical coherence tomography (iOCT), a depth sensor widely used in retinal surgery. Integrating these tools raises the fundamental question of how to effectively combine them to enable surgical autonomy. In this work, we tackle this question by deve…

    Submitted 27 July, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: pending submission to a journal

  32. arXiv:2306.06461  [pdf]

    eess.AS cs.SD

    Semi-supervised Learning-based Sound Event Detection using Frequency Dynamic Convolution with Large Kernel Attention for DCASE Challenge 2023 Task 4

    Authors: Ji Won Kim, Sang Won Son, Yoonah Song, Hong Kook Kim, Il Hoon Song, Jeong Eun Lim

    Abstract: This report proposes a frequency dynamic convolution (FDY) with a large kernel attention (LKA)-convolutional recurrent neural network (CRNN) with a pre-trained bidirectional encoder representation from audio transformers (BEATs) embedding-based sound event detection (SED) model that employs a mean-teacher and pseudo-label approach to address the challenge of limited labeled data for DCASE 2023 Tas…

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: DCASE 2023 Challenge Task 4A, 5 pages

  33. Variance Decay Property for Filter Stability

    Authors: Jin Won Kim, Prashant G. Mehta

    Abstract: This paper is concerned with the problem of nonlinear (stochastic) filter stability of a hidden Markov model (HMM) with white noise observations. A contribution is the variance decay property which is used to conclude filter stability. For this purpose, a new notion of the Poincaré inequality (PI) is introduced for the nonlinear filter. PI is related to both the ergodicity of the Markov process as…

    Submitted 26 June, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: 16 pages

    Journal ref: IEEE Transactions on Automatic Control, 2024

  34. arXiv:2304.12727  [pdf, ps, other]

    math.NA math.DS math.ST

    On forward-backward SDE approaches to continuous-time minimum variance estimation

    Authors: Jin Won Kim, Sebastian Reich

    Abstract: The work of Kalman and Bucy has established a duality between filtering and optimal estimation in the context of time-continuous linear systems. This duality has recently been extended to time-continuous nonlinear systems in terms of an optimization problem constrained by a backward stochastic partial differential equation. Here we revisit this problem from the perspective of appropriate forward-b…

    Submitted 14 August, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

    MSC Class: 90E10; 90E11; 60G35; 62M20; 93E11; 93E20

  35. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  36. Autonomous Needle Navigation in Retinal Microsurgery: Evaluation in ex vivo Porcine Eyes

    Authors: Peiyao Zhang, Ji Woong Kim, Peter Gehlbach, Iulian Iordachita, Marin Kobilarov

    Abstract: Important challenges in retinal microsurgery include prolonged operating time, inadequate force feedback, and poor depth perception due to a constrained top-down view of the surgery. The introduction of robot-assisted technology could potentially deal with such challenges and improve the surgeon's performance. Motivated by such challenges, this work develops a strategy for autonomous needle naviga…

    Submitted 27 January, 2023; originally announced January 2023.

  37. arXiv:2301.02064  [pdf, other]

    cs.CV cs.AI

    Single-round Self-supervised Distributed Learning using Vision Transformer

    Authors: Sangjoon Park, Ik-Jae Lee, Jun Won Kim, Jong Chul Ye

    Abstract: Despite the recent success of deep learning in the field of medicine, the issue of data scarcity is exacerbated by concerns about privacy and data ownership. Distributed learning approaches, including federated learning, have been investigated to address these issues. However, they are hindered by the need for cumbersome communication overheads and weaknesses in privacy protection. To tackle these…

    Submitted 15 April, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

  38. arXiv:2212.04356  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Robust Speech Recognition via Large-Scale Weak Supervision

    Authors: Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, Ilya Sutskever

    Abstract: We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual and multitask supervision, the resulting models generalize well to standard benchmarks and are often competitive with prior fully supervised results but in a zero-shot transfer setting without the need for any fine-tuni…

    Submitted 6 December, 2022; originally announced December 2022.

  39. A Design Method of Distributed Algorithms via Discrete-time Blended Dynamics Theorem

    Authors: Jeong Woo Kim, Jin Gyu Lee, Donggil Lee, Hyungbo Shim

    Abstract: We develop a discrete-time version of the blended dynamics theorem for use in designing distributed computation algorithms. The blended dynamics theorem makes it possible to predict the behavior of heterogeneous multi-agent systems. Therefore, once we obtain the blended dynamics for a particular computational task, design ideas for the node dynamics of individual heterogeneous agents arise easily. In the cont…

    Submitted 11 October, 2022; originally announced October 2022.

    Journal ref: Automatica, vol. 159, pp. 111371, Jan 2024

  40. Modern Machine Learning Tools for Monitoring and Control of Industrial Processes: A Survey

    Authors: R. Bhushan Gopaluni, Aditya Tulsyan, Benoit Chachuat, Biao Huang, Jong Min Lee, Faraz Amjad, Seshu Kumar Damarla, Jong Woo Kim, Nathan P. Lawrence

    Abstract: Over the last ten years, we have seen a significant increase in industrial data, tremendous improvement in computational power, and major theoretical advances in machine learning. This opens up an opportunity to use modern machine learning tools on large-scale nonlinear monitoring and control problems. This article provides a survey of recent results with applications in the process industry.

    Submitted 22 September, 2022; originally announced September 2022.

    Comments: IFAC World Congress 2020

  41. arXiv:2209.10357  [pdf, other]

    eess.AS

    GIST-AiTeR System for the Diarization Task of the 2022 VoxCeleb Speaker Recognition Challenge

    Authors: Dongkeon Park, Yechan Yu, Kyeong Wan Park, Ji Won Kim, Hong Kook Kim

    Abstract: This report describes the submission system of the GIST-AiTeR team at the 2022 VoxCeleb Speaker Recognition Challenge (VoxSRC) Track 4. Our system mainly includes speech enhancement, voice activity detection, multi-scaled speaker embedding, probabilistic linear discriminant analysis-based speaker clustering, and overlapped speech detection models. We first construct four different diarization sys…

    Submitted 6 October, 2022; v1 submitted 21 September, 2022; originally announced September 2022.

    Comments: 2022 VoxSRC Track4

  42. arXiv:2209.01083  [pdf, other]

    cs.LG

    When Bioprocess Engineering Meets Machine Learning: A Survey from the Perspective of Automated Bioprocess Development

    Authors: Nghia Duong-Trung, Stefan Born, Jong Woo Kim, Marie-Therese Schermeyer, Katharina Paulick, Maxim Borisyak, Mariano Nicolas Cruz-Bournazou, Thorben Werner, Randolf Scholz, Lars Schmidt-Thieme, Peter Neubauer, Ernesto Martinez

    Abstract: Machine learning (ML) is becoming increasingly crucial in many fields of engineering but has not yet played out its full potential in bioprocess engineering. While experimentation has been accelerated by increasing levels of lab automation, experimental planning and data modeling still largely depend on human intervention. ML can be seen as a set of tools that contribute to the automation of…

    Submitted 1 November, 2022; v1 submitted 2 September, 2022; originally announced September 2022.

  43. arXiv:2208.09183  [pdf]

    cs.CV cs.AI

    Improved Image Classification with Token Fusion

    Authors: Keong Hun Choi, Jin Woo Kim, Yao Wang, Jong Eun Ha

    Abstract: In this paper, we propose a method using the fusion of CNN and transformer structure to improve image classification performance. In the case of CNN, information about a local area on an image can be extracted well, but there is a limit to the extraction of global information. On the other hand, the transformer has an advantage in relatively global extraction, but has a disadvantage in that it req…

    Submitted 19 August, 2022; originally announced August 2022.

  44. arXiv:2208.06587  [pdf, other]

    math.OC math.PR

    Duality for Nonlinear Filtering II: Optimal Control

    Authors: Jin Won Kim, Prashant G. Mehta

    Abstract: This paper is concerned with the development and use of duality theory for a nonlinear filtering model with white noise observations. The main contribution of this paper is to introduce a stochastic optimal control problem as a dual to the nonlinear filtering problem. The mathematical statement of the dual relationship between the two problems is given in the form of a duality principle. The const…

    Submitted 13 August, 2022; originally announced August 2022.

  45. arXiv:2208.06586  [pdf, other]

    math.OC math.PR

    Duality for Nonlinear Filtering I: Observability

    Authors: Jin Won Kim, Prashant G. Mehta

    Abstract: This paper is concerned with the development and use of duality theory for a hidden Markov model (HMM) with white noise observations. The main contribution of this work is to introduce a backward stochastic differential equation (BSDE) as a dual control system. A key outcome is that stochastic observability (resp. detectability) of the HMM is expressed in dual terms: as controllability (resp. stab…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: arXiv admin note: text overlap with arXiv:2207.07709

  46. arXiv:2207.07709  [pdf, other]

    math.OC math.PR

    Duality for nonlinear filtering

    Authors: Jin Won Kim

    Abstract: This thesis is concerned with the stochastic filtering problem for a hidden Markov model (HMM) with the white noise observation model. For this filtering problem, we make three types of original contributions: (1) dual controllability characterization of stochastic observability, (2) dual minimum variance optimal control formulation of the stochastic filtering problem, and (3) filter stability ana…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: Ph.D. Thesis of the author

  47. arXiv:2206.02222  [pdf, other]

    math.OC cs.GT cs.MA eess.SY

    How does a Rational Agent Act in an Epidemic?

    Authors: S. Yagiz Olmez, Shubham Aggarwal, Jin Won Kim, Erik Miehling, Tamer Başar, Matthew West, Prashant G. Mehta

    Abstract: Evolution of disease in a large population is a function of the top-down policy measures from a centralized planner, as well as the self-interested decisions (to be socially active) of individual agents in a large heterogeneous population. This paper is concerned with understanding the latter based on a mean-field type optimal control model. Specifically, the model is used to investigate the role…

    Submitted 5 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: text overlap with arXiv:2111.10422

  48. arXiv:2205.06468  [pdf, other]

    cs.CV

    Monocular Human Digitization via Implicit Re-projection Networks

    Authors: Min-Gyu Park, Ju-Mi Kang, Je Woo Kim, Ju Hong Yoon

    Abstract: We present an approach to generating 3D human models from images. The key to our framework is that we predict double-sided orthographic depth maps and color images from a single perspective projected image. Our framework consists of three networks. The first network predicts normal maps to recover geometric details such as wrinkles in the clothes and facial regions. The second network predicts sha…

    Submitted 15 May, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

    Comments: Presented at CVPRW (AI for Content Creation workshop) 2022

  49. Controllable emergent spatial spin modulation in Sr2IrO4 by in situ shear strain

    Authors: S. Pandey, H. Zhang, J. Yang, A. F. May, J. Sanchez, Z. Liu, J.-H. Chu, J. W. Kim, P. J. Ryan, H. D. Zhou, J. Liu

    Abstract: Symmetric anisotropic interaction can be ferromagnetic and antiferromagnetic at the same time but for different crystallographic axes. We show that inducing competition of anisotropic interactions of orthogonal irreducible representations represents a general route to obtain new exotic magnetic states. We demonstrate it here by observing the emergence of a continuously tunable 12-layer spatial spi…

    Submitted 19 April, 2022; originally announced April 2022.

  50. arXiv:2203.07211  [pdf, other]

    q-bio.QM eess.SY

    Model predictive control and moving horizon estimation for adaptive optimal bolus feeding in high-throughput cultivation of E. coli

    Authors: Jong Woo Kim, Niels Krausch, Judit Aizpuru, Tilman Barz, Sergio Lucia, Peter Neubauer, Mariano Nicolas Cruz Bournazou

    Abstract: We discuss the application of a nonlinear model predictive control (MPC) and a moving horizon estimation (MHE) to achieve an optimal operation of E. coli fed-batch cultivations with intermittent bolus feeding. 24 parallel experiments were considered in a high-throughput microbioreactor platform at a 10 mL scale. The robotic island in question can run up to 48 fed-batch processes in parall…

    Submitted 6 February, 2023; v1 submitted 14 March, 2022; originally announced March 2022.