Showing 1–50 of 75 results for author: Bae, S

Searching in archive eess.
  1. arXiv:2505.22568  [pdf]

    eess.IV cs.CV

    Multipath cycleGAN for harmonization of paired and unpaired low-dose lung computed tomography reconstruction kernels

    Authors: Aravind R. Krishnan, Thomas Z. Li, Lucas W. Remedios, Michael E. Kim, Chenyu Gao, Gaurav Rudravaram, Elyssa M. McMaster, Adam M. Saunders, Shunxing Bao, Kaiwen Xu, Lianrui Zuo, Kim L. Sandler, Fabien Maldonado, Yuankai Huo, Bennett A. Landman

    Abstract: Reconstruction kernels in computed tomography (CT) affect spatial resolution and noise characteristics, introducing systematic variability in quantitative imaging measurements such as emphysema quantification. Choosing an appropriate kernel is therefore essential for consistent quantitative analysis. We propose a multipath cycleGAN model for CT kernel harmonization, trained on a mixture of paired…

    Submitted 28 May, 2025; originally announced May 2025.

  2. arXiv:2505.09091  [pdf, ps, other]

    cs.SD cs.AI cs.CV cs.LG eess.AS

    DPN-GAN: Inducing Periodic Activations in Generative Adversarial Networks for High-Fidelity Audio Synthesis

    Authors: Zeeshan Ahmad, Shudi Bao, Meng Chen

    Abstract: In recent years, generative adversarial networks (GANs) have made significant progress in generating audio sequences. However, these models typically rely on bandwidth-limited mel-spectrograms, which constrain the resolution of generated audio sequences, and lead to mode collapse during conditional generation. To address this issue, we propose Deformable Periodic Network based GAN (DPN-GAN), a nov…

    Submitted 13 May, 2025; originally announced May 2025.

    Journal ref: IEEE Access, vol. 13, pp. 69324-69340, 2025

  3. arXiv:2505.03695  [pdf, other]

    cs.RO eess.SY

    Frenet Corridor Planner: An Optimal Local Path Planning Framework for Autonomous Driving

    Authors: Faizan M. Tariq, Zheng-Hang Yeh, Avinash Singh, David Isele, Sangjae Bae

    Abstract: Motivated by the requirements for effectiveness and efficiency, path-speed decomposition-based trajectory planning methods have widely been adopted for autonomous driving applications. While a global route can be pre-computed offline, real-time generation of adaptive local paths remains crucial. Therefore, we present the Frenet Corridor Planner (FCP), an optimization-based local path planning stra…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: 8 pages, 10 figures - Presented at 2025 IEEE 36th Intelligent Vehicles Symposium (IV)

  4. arXiv:2504.18539  [pdf, other]

    eess.AS cs.LG cs.MM cs.SD

    Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation

    Authors: Sungnyun Kim, Sungwoo Cho, Sangmin Bae, Kangwook Jang, Se-Young Yun

    Abstract: Audio-visual speech recognition (AVSR) incorporates auditory and visual modalities to improve recognition accuracy, particularly in noisy environments where audio-only speech systems are insufficient. While previous research has largely addressed audio disruptions, few studies have dealt with visual corruptions, e.g., lip occlusions or blurred videos, which are also detrimental. To address this re…

    Submitted 30 April, 2025; v1 submitted 23 January, 2025; originally announced April 2025.

    Comments: ICLR 2025; 22 pages, 6 figures, 14 tables

  5. arXiv:2504.12616  [pdf, other]

    cs.RO eess.SY

    Graph-based Path Planning with Dynamic Obstacle Avoidance for Autonomous Parking

    Authors: Farhad Nawaz, Minjun Sung, Darshan Gadginmath, Jovin D'sa, Sangjae Bae, David Isele, Nadia Figueroa, Nikolai Matni, Faizan M. Tariq

    Abstract: Safe and efficient path planning in parking scenarios presents a significant challenge due to the presence of cluttered environments filled with static and dynamic obstacles. To address this, we propose a novel and computationally efficient planning strategy that seamlessly integrates the predictions of dynamic obstacles into the planning process, ensuring the generation of collision-free paths. O…

    Submitted 7 May, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

    Comments: IEEE Intelligent Vehicles Symposium 2025

  6. arXiv:2503.12593  [pdf, other]

    eess.IV cs.AI cs.LG physics.bio-ph q-bio.QM

    Fourier-Based 3D Multistage Transformer for Aberration Correction in Multicellular Specimens

    Authors: Thayer Alshaabi, Daniel E. Milkie, Gaoxiang Liu, Cyna Shirazinejad, Jason L. Hong, Kemal Achour, Frederik Görlitz, Ana Milunovic-Jevtic, Cat Simmons, Ibrahim S. Abuzahriyeh, Erin Hong, Samara Erin Williams, Nathanael Harrison, Evan Huang, Eun Seok Bae, Alison N. Killilea, David G. Drubin, Ian A. Swinburne, Srigokul Upadhyayula, Eric Betzig

    Abstract: High-resolution tissue imaging is often compromised by sample-induced optical aberrations that degrade resolution and contrast. While wavefront sensor-based adaptive optics (AO) can measure these aberrations, such hardware solutions are typically complex, expensive to implement, and slow when serially mapping spatially varying aberrations across large fields of view. Here, we introduce AOViFT (Ada…

    Submitted 23 May, 2025; v1 submitted 16 March, 2025; originally announced March 2025.

    Comments: 55 pages, 6 figures, 26 SI figures, 8 SI tables

  7. arXiv:2502.20636  [pdf, ps, other]

    cs.RO eess.SY

    Delayed-Decision Motion Planning in the Presence of Multiple Predictions

    Authors: David Isele, Alexandre Miranda Anon, Faizan M. Tariq, Goro Yeh, Avinash Singh, Sangjae Bae

    Abstract: Reliable automated driving technology is challenged by various sources of uncertainties, in particular, behavioral uncertainties of traffic agents. It is common for traffic agents to have intentions that are unknown to others, leaving an automated driving car to reason over multiple possible behaviors. This paper formalizes a behavior planning scheme in the presence of multiple possible futures wi…

    Submitted 6 June, 2025; v1 submitted 27 February, 2025; originally announced February 2025.

  8. arXiv:2502.10447  [pdf, other]

    eess.AS cs.CL cs.LG

    MoHAVE: Mixture of Hierarchical Audio-Visual Experts for Robust Speech Recognition

    Authors: Sungnyun Kim, Kangwook Jang, Sangmin Bae, Sungwoo Cho, Se-Young Yun

    Abstract: Audio-visual speech recognition (AVSR) has become critical for enhancing speech recognition in noisy environments by integrating both auditory and visual modalities. However, existing AVSR systems struggle to scale up without compromising computational efficiency. In this study, we introduce MoHAVE (Mixture of Hierarchical Audio-Visual Experts), a novel robust AVSR framework designed to address th…

    Submitted 21 May, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

    Comments: Accepted to ICML 2025

  9. arXiv:2501.13071  [pdf]

    cs.CV eess.IV

    Robust Body Composition Analysis by Generating 3D CT Volumes from Limited 2D Slices

    Authors: Lianrui Zuo, Xin Yu, Dingjie Su, Kaiwen Xu, Aravind R. Krishnan, Yihao Liu, Shunxing Bao, Fabien Maldonado, Luigi Ferrucci, Bennett A. Landman

    Abstract: Body composition analysis provides valuable insights into aging, disease progression, and overall health conditions. Due to concerns of radiation exposure, two-dimensional (2D) single-slice computed tomography (CT) imaging has been used repeatedly for body composition analysis. However, this approach introduces significant spatial variability that can impact the accuracy and robustness of the anal…

    Submitted 22 January, 2025; originally announced January 2025.

  10. arXiv:2501.13068  [pdf]

    cs.CV eess.IV

    Beyond the Lungs: Extending the Field of View in Chest CT with Latent Diffusion Models

    Authors: Lianrui Zuo, Kaiwen Xu, Dingjie Su, Xin Yu, Aravind R. Krishnan, Yihao Liu, Shunxing Bao, Thomas Li, Kim L. Sandler, Fabien Maldonado, Bennett A. Landman

    Abstract: The interconnection between the human lungs and other organs, such as the liver and kidneys, is crucial for understanding the underlying risks and effects of lung diseases and improving patient care. However, most research chest CT imaging is focused solely on the lungs due to considerations of cost and radiation dose. This restricted field of view (FOV) in the acquired images poses challenges to…

    Submitted 22 January, 2025; originally announced January 2025.

  11. arXiv:2412.11277  [pdf, other]

    eess.IV cs.AI cs.CV

    Macro2Micro: Cross-modal Magnetic Resonance Imaging Synthesis Leveraging Multi-scale Brain Structures

    Authors: Sooyoung Kim, Joonwoo Kwon, Junbeom Kwon, Sangyoon Bae, Yuewei Lin, Shinjae Yoo, Jiook Cha

    Abstract: Spanning multiple scales, from macroscopic anatomy down to intricate microscopic architecture, the human brain exemplifies a complex system that demands integrated approaches to fully understand its complexity. Yet, mapping nonlinear relationships between these scales remains challenging due to technical limitations and the high cost of multimodal Magnetic Resonance Imaging (MRI) acquisition. Here,…

    Submitted 15 December, 2024; originally announced December 2024.

    Comments: The code will be made available upon acceptance

  12. arXiv:2410.02898  [pdf, other]

    eess.SY cs.LG cs.RO

    Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients

    Authors: Gabriel Chenevert, Jingqi Li, Achyuta kannan, Sangjae Bae, Donggun Lee

    Abstract: Reach-Avoid-Stay (RAS) optimal control enables systems such as robots and air taxis to reach their targets, avoid obstacles, and stay near the target. However, current methods for RAS often struggle with handling complex, dynamic environments and scaling to high-dimensional systems. While reinforcement learning (RL)-based reachability analysis addresses these challenges, it has yet to tackle the R…

    Submitted 7 October, 2024; v1 submitted 3 October, 2024; originally announced October 2024.

  13. arXiv:2407.06116  [pdf]

    eess.IV cs.CV cs.LG

    Data-driven Nucleus Subclassification on Colon H&E using Style-transferred Digital Pathology

    Authors: Lucas W. Remedios, Shunxing Bao, Samuel W. Remedios, Ho Hin Lee, Leon Y. Cai, Thomas Li, Ruining Deng, Nancy R. Newlin, Adam M. Saunders, Can Cui, Jia Li, Qi Liu, Ken S. Lau, Joseph T. Roland, Mary K Washington, Lori A. Coburn, Keith T. Wilson, Yuankai Huo, Bennett A. Landman

    Abstract: Understanding the way cells communicate, co-locate, and interrelate is essential to furthering our understanding of how the body functions. H&E is widely available; however, cell subtyping often requires expert knowledge and the use of specialized stains. To reduce the annotation burden, AI has been proposed for the classification of cells on H&E. For example, the recent Colon Nucleus Identificati…

    Submitted 15 May, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2401.05602

  14. arXiv:2407.03563  [pdf, other]

    eess.AS cs.CL cs.LG eess.IV

    Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition

    Authors: Sungnyun Kim, Kangwook Jang, Sangmin Bae, Hoirin Kim, Se-Young Yun

    Abstract: Audio-visual speech recognition (AVSR) aims to transcribe human speech using both audio and video modalities. In practical environments with noise-corrupted audio, the role of video information becomes crucial. However, prior works have primarily focused on enhancing audio features in AVSR, overlooking the importance of video features. In this study, we strengthen the video features by learning th…

    Submitted 14 October, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted at SLT 2024 Main Conference; Code is available at https://github.com/sungnyun/avsr-temporal-dynamics

  15. arXiv:2407.00596  [pdf, other]

    eess.IV cs.CV

    HATs: Hierarchical Adaptive Taxonomy Segmentation for Panoramic Pathology Image Analysis

    Authors: Ruining Deng, Quan Liu, Can Cui, Tianyuan Yao, Juming Xiong, Shunxing Bao, Hao Li, Mengmeng Yin, Yu Wang, Shilin Zhao, Yucheng Tang, Haichun Yang, Yuankai Huo

    Abstract: Panoramic image segmentation in computational pathology presents a remarkable challenge due to the morphologically complex and variably scaled anatomy. For instance, the intricate organization in kidney pathology spans multiple layers, from regions like the cortex and medulla to functional units such as glomeruli, tubules, and vessels, down to various cell types. In this paper, we propose a novel…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.19286

  16. arXiv:2406.12254  [pdf, other]

    eess.IV cs.CV

    Enhancing Single-Slice Segmentation with 3D-to-2D Unpaired Scan Distillation

    Authors: Xin Yu, Qi Yang, Han Liu, Ho Hin Lee, Yucheng Tang, Lucas W. Remedios, Michael E. Kim, Rendong Zhang, Shunxing Bao, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman

    Abstract: 2D single-slice abdominal computed tomography (CT) enables the assessment of body habitus and organ health with low radiation exposure. However, single-slice data necessitates the use of 2D networks for segmentation, but these networks often struggle to capture contextual information effectively. Consequently, even when trained on identical datasets, 3D networks typically achieve superior segmenta…

    Submitted 12 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  17. arXiv:2405.05426  [pdf, other]

    eess.SY

    ATLS: Automated Trailer Loading for Surface Vessels

    Authors: Amer Abughaida, Meet Gandhi, Jun Heo, Vaishnav Tadiparthi, Yosuke Sakamoto, Joohyun Woo, Sangjae Bae

    Abstract: Automated docking technologies for marine boats have been addressed by a growing body of literature. This paper contributes to the literature by proposing a mathematical framework that automates "trailer loading" in the presence of wind disturbances, which is unexplored despite its importance to boat owners. The comprehensive pipeline of localization, system identification, and trajectory o…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: To be presented at IEEE Intelligent Vehicles Symposium (IV 2024)

  18. arXiv:2405.02996  [pdf, other]

    cs.SD cs.AI eess.AS

    RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification

    Authors: June-Woo Kim, Miika Toikkanen, Sangmin Bae, Minseok Kim, Ho-Young Jung

    Abstract: Recent advancements in AI have democratized its deployment as a healthcare assistant. While pretrained models from large-scale visual and audio datasets have demonstrably generalized to this task, surprisingly, no studies have explored pretrained speech models, which, as human-originated sounds, intuitively would share closer resemblance to lung sounds. This paper explores the efficacy of pretrain…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted at EMBC 2024

  19. Lane-Change in Dense Traffic with Model Predictive Control and Neural Networks

    Authors: Sangjae Bae, David Isele, Alireza Nakhaei, Peng Xu, Alexandre Miranda Anon, Chiho Choi, Kikuo Fujimura, Scott Moura

    Abstract: This paper presents an online smooth-path lane-change control framework. We focus on dense traffic where inter-vehicle space gaps are narrow, and cooperation with surrounding drivers is essential to achieve the lane-change maneuver. We propose a two-stage control framework that harmonizes Model Predictive Control (MPC) with Generative Adversarial Networks (GAN) by utilizing driving intentions to g…

    Submitted 28 March, 2024; originally announced March 2024.

    Journal ref: IEEE Transactions on Control Systems Technology (Volume: 31, Issue: 2, March 2023)

  20. arXiv:2403.01898  [pdf, other]

    cs.CV eess.IV

    Revisiting Learning-based Video Motion Magnification for Real-time Processing

    Authors: Hyunwoo Ha, Oh Hyun-Bin, Kim Jun-Seong, Kwon Byung-Ki, Kim Sung-Bin, Linh-Tam Tran, Ji-Yun Kim, Sung-Ho Bae, Tae-Hyun Oh

    Abstract: Video motion magnification is a technique to capture and amplify subtle motion in a video that is invisible to the naked eye. The deep learning-based prior work successfully demonstrates the modelling of the motion magnification problem with outstanding quality compared to conventional signal processing-based ones. However, it still lags behind real-time performance, which prevents it from being e…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 19 pages

  21. arXiv:2402.05350  [pdf, other]

    cs.CV eess.IV

    Descanning: From Scanned to the Original Images with a Color Correction Diffusion Model

    Authors: Junghun Cha, Ali Haider, Seoyun Yang, Hoeyeong Jin, Subin Yang, A. F. M. Shahab Uddin, Jaehyoung Kim, Soo Ye Kim, Sung-Ho Bae

    Abstract: A significant volume of analog information, i.e., documents and images, has been digitized in the form of scanned copies for storing, sharing, and/or analyzing in the digital world. However, the quality of such content is severely degraded by various distortions caused by printing, storing, and scanning processes in the physical world. Although restoring high-quality content from scanned copies…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted to AAAI 2024

  22. arXiv:2401.06305  [pdf, other]

    cs.RO eess.SY

    Multi-Profile Quadratic Programming (MPQP) for Optimal Gap Selection and Speed Planning of Autonomous Driving

    Authors: Alexandre Miranda Anon, Sangjae Bae, Manish Saroya, David Isele

    Abstract: Smooth and safe speed planning is imperative for the successful deployment of autonomous vehicles. This paper presents a mathematical formulation for the optimal speed planning of autonomous driving, which has been validated in high-fidelity simulations and real-road demonstrations with practical constraints. The algorithm explores the inter-traffic gaps in the time and space domain using a breadt…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: Submitted to ICRA 2024

  23. Super-resolution multi-contrast unbiased eye atlases with deep probabilistic refinement

    Authors: Ho Hin Lee, Adam M. Saunders, Michael E. Kim, Samuel W. Remedios, Lucas W. Remedios, Yucheng Tang, Qi Yang, Xin Yu, Shunxing Bao, Chloe Cho, Louise A. Mawn, Tonia S. Rex, Kevin L. Schey, Blake E. Dewey, Jeffrey M. Spraggins, Jerry L. Prince, Yuankai Huo, Bennett A. Landman

    Abstract: Purpose: Eye morphology varies significantly across the population, especially for the orbit and optic nerve. These variations limit the feasibility and robustness of generalizing population-wise features of eye organs to an unbiased spatial reference. Approach: To tackle these limitations, we propose a process for creating high-resolution unbiased eye atlases. First, to restore spatial details…

    Submitted 14 November, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

    Comments: Published in SPIE Journal of Medical Imaging (https://doi.org/10.1117/1.JMI.11.6.064004). 27 pages, 6 figures

    Journal ref: J. Med. Imag. 11(6), 064004 (2024)

  24. arXiv:2401.00661  [pdf, ps, other]

    eess.SY cs.GT

    Personalized Dynamic Pricing Policy for Electric Vehicles: Reinforcement learning approach

    Authors: Sangjun Bae, Balazs Kulcsar, Sebastien Gros

    Abstract: With the increasing number of fast-electric vehicle charging stations (fast-EVCSs) and the popularization of information technology, electricity price competition between fast-EVCSs is highly expected, in which the utilization of public and/or privacy-preserved information will play a crucial role. Self-interested electric vehicle (EV) users, on the other hand, try to select a fast-EVCS for charging…

    Submitted 31 December, 2023; originally announced January 2024.

  25. arXiv:2312.09603  [pdf, other]

    cs.SD cs.LG eess.AS

    Stethoscope-guided Supervised Contrastive Learning for Cross-domain Adaptation on Respiratory Sound Classification

    Authors: June-Woo Kim, Sangmin Bae, Won-Yang Cho, Byungjo Lee, Ho-Young Jung

    Abstract: Despite the remarkable advances in deep learning technology, achieving satisfactory performance in lung sound classification remains a challenge due to the scarcity of available data. Moreover, the respiratory sound samples are collected from a variety of electronic stethoscopes, which could potentially introduce biases into the trained models. When a significant distribution shift occurs within t…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted to ICASSP 2024

  26. arXiv:2311.06480  [pdf, other]

    cs.SD cs.LG eess.AS

    Adversarial Fine-tuning using Generated Respiratory Sound to Address Class Imbalance

    Authors: June-Woo Kim, Chihyeon Yoon, Miika Toikkanen, Sangmin Bae, Ho-Young Jung

    Abstract: Deep generative models have emerged as a promising approach in the medical image domain to address data scarcity. However, their use for sequential data like respiratory sounds is less explored. In this work, we propose a straightforward approach to augment imbalanced respiratory sound data using an audio diffusion model as a conditional neural vocoder. We also demonstrate a simple yet effective a…

    Submitted 11 November, 2023; originally announced November 2023.

    Comments: Accepted at the NeurIPS 2023 Workshop on Deep Generative Models for Health (DGM4H)

  27. arXiv:2309.12953  [pdf]

    eess.IV cs.CV

    Inter-vendor harmonization of Computed Tomography (CT) reconstruction kernels using unpaired image translation

    Authors: Aravind R. Krishnan, Kaiwen Xu, Thomas Li, Chenyu Gao, Lucas W. Remedios, Praitayini Kanakaraj, Ho Hin Lee, Shunxing Bao, Kim L. Sandler, Fabien Maldonado, Ivana Isgum, Bennett A. Landman

    Abstract: The reconstruction kernel in computed tomography (CT) generation determines the texture of the image. Consistency in reconstruction kernels is important as the underlying CT texture can impact measurements during quantitative image analysis. Harmonization (i.e., kernel conversion) minimizes differences in measurements due to inconsistent reconstruction kernels. Existing methods investigate harmoni…

    Submitted 26 January, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: 10 pages, 6 figures, 1 table. Submitted to SPIE Medical Imaging: Image Processing, San Diego, CA, February 2024

  28. arXiv:2309.12531  [pdf, other]

    cs.RO eess.SY

    RCMS: Risk-Aware Crash Mitigation System for Autonomous Vehicles

    Authors: Faizan M. Tariq, David Isele, John S. Baras, Sangjae Bae

    Abstract: We propose a risk-aware crash mitigation system (RCMS), to augment any existing motion planner (MP), that enables an autonomous vehicle to perform evasive maneuvers in high-risk situations and minimize the severity of collision if a crash is inevitable. In order to facilitate a smooth transition between RCMS and MP, we develop a novel activation mechanism that combines instantaneous as well as pre…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: Presented at the 26th IEEE International Conference on Intelligent Transportation Systems (ITSC) 2023, Bilbao, Bizkaia, Spain

  29. arXiv:2309.09392  [pdf, other]

    eess.IV cs.CV

    Deep conditional generative models for longitudinal single-slice abdominal computed tomography harmonization

    Authors: Xin Yu, Qi Yang, Yucheng Tang, Riqiang Gao, Shunxing Bao, Leon Y. Cai, Ho Hin Lee, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman

    Abstract: Two-dimensional single-slice abdominal computed tomography (CT) provides a detailed tissue map with high resolution allowing quantitative characterization of relationships between health conditions and aging. However, longitudinal analysis of body composition changes using these scans is difficult due to positional variation between slices acquired in different years, which leads to different or…

    Submitted 17 September, 2023; originally announced September 2023.

  30. arXiv:2309.04071  [pdf, other]

    eess.IV cs.CV

    Enhancing Hierarchical Transformers for Whole Brain Segmentation with Intracranial Measurements Integration

    Authors: Xin Yu, Yucheng Tang, Qi Yang, Ho Hin Lee, Shunxing Bao, Yuankai Huo, Bennett A. Landman

    Abstract: Whole brain segmentation with magnetic resonance imaging (MRI) enables the non-invasive measurement of brain regions, including total intracranial volume (TICV) and posterior fossa volume (PFV). Enhancing the existing whole brain segmentation methodology to incorporate intracranial measurements offers a heightened level of comprehensiveness in the analysis of brain structures. Despite its potentia…

    Submitted 10 April, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

  31. arXiv:2309.02563  [pdf, other]

    eess.IV cs.CV

    Evaluation Kidney Layer Segmentation on Whole Slide Imaging using Convolutional Neural Networks and Transformers

    Authors: Muhao Liu, Chenyang Qi, Shunxing Bao, Quan Liu, Ruining Deng, Yu Wang, Shilin Zhao, Haichun Yang, Yuankai Huo

    Abstract: The segmentation of kidney layer structures, including cortex, outer stripe, inner stripe, and inner medulla within human kidney whole slide images (WSI) plays an essential role in automated image analysis in renal pathology. However, the current manual segmentation process proves labor-intensive and infeasible for handling the extensive digital pathology images encountered at a large scale. In re…

    Submitted 5 September, 2023; originally announced September 2023.

  32. arXiv:2308.05785  [pdf, other]

    eess.IV cs.CV

    Leverage Weakly Annotation to Pixel-wise Annotation via Zero-shot Segment Anything Model for Molecular-empowered Learning

    Authors: Xueyuan Li, Ruining Deng, Yucheng Tang, Shunxing Bao, Haichun Yang, Yuankai Huo

    Abstract: Precise identification of multiple cell classes in high-resolution Giga-pixel whole slide imaging (WSI) is critical for various clinical scenarios. Building an AI model for this purpose typically requires pixel-level annotations, which are often unscalable and must be done by skilled domain experts (e.g., pathologists). However, these annotations can be prone to errors, especially when distinguish…

    Submitted 10 August, 2023; originally announced August 2023.

  33. arXiv:2308.05784  [pdf, other]

    eess.IV cs.CV

    High-performance Data Management for Whole Slide Image Analysis in Digital Pathology

    Authors: Haoju Leng, Ruining Deng, Shunxing Bao, Dazheng Fang, Bryan A. Millis, Yucheng Tang, Haichun Yang, Xiao Wang, Yifan Peng, Lipeng Wan, Yuankai Huo

    Abstract: When dealing with giga-pixel digital pathology in whole-slide imaging, a notable proportion of data records holds relevance during each analysis operation. For instance, when deploying an image analysis algorithm on whole-slide images (WSI), the computational bottleneck often lies in the input-output (I/O) system. This is particularly notable as patch-level processing introduces a considerable I/O…

    Submitted 20 August, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

  34. arXiv:2308.05782  [pdf, other]

    eess.IV cs.CV

    Multi-scale Multi-site Renal Microvascular Structures Segmentation for Whole Slide Imaging in Renal Pathology

    Authors: Franklin Hu, Ruining Deng, Shunxing Bao, Haichun Yang, Yuankai Huo

    Abstract: Segmentation of microvascular structures, such as arterioles, venules, and capillaries, from human kidney whole slide images (WSI) has become a focal point in renal pathology. Current manual segmentation techniques are time-consuming and not feasible for large-scale digital pathology images. While deep learning-based methods offer a solution for automatic segmentation, most suffer from a limitatio…

    Submitted 10 August, 2023; originally announced August 2023.

  35. arXiv:2307.11951  [pdf, other]

    eess.SP

    A Simple and Efficient RSS-AOA Based Localization with Heterogeneous Anchor Nodes

    Authors: Weizhong Ding, Shengming Chang, Shudi Bao

    Abstract: Accurate and reliable localization is crucial for various wireless communication applications. Numerous studies have proposed accurate localization methods using hybrid received signal strength (RSS) and angle of arrival (AOA) measurements. However, these studies typically assume identical measurement noise distributions for different anchor nodes, which may not accurately reflect real-world scena…

    Submitted 21 July, 2023; originally announced July 2023.

  36. arXiv:2307.11950  [pdf, other]

    eess.SP

    Accurate RSS-Based Localization Using an Opposition-Based Learning Simulated Annealing Algorithm

    Authors: Weizhong Ding, Shengming Chang, Shudi Bao, Meng Chen, Jie Sun

    Abstract: Wireless sensor networks require accurate target localization, often achieved through received signal strength (RSS) localization estimation based on maximum likelihood (ML). However, ML-based algorithms can suffer from issues such as low diversity, slow convergence, and local optima, which can significantly affect localization performance. In this paper, we propose a novel localization algorithm…

    Submitted 21 July, 2023; originally announced July 2023.

  37. arXiv:2307.07409  [pdf, other]

    cs.CL cs.AI eess.IV

    KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization

    Authors: Gangwoo Kim, Hajung Kim, Lei Ji, Seongsu Bae, Chanhwi Kim, Mujeen Sung, Hyunjae Kim, Kun Yan, Eric Chang, Jaewoo Kang

    Abstract: In this paper, we introduce CheXOFA, a new pre-trained vision-language model (VLM) for the chest X-ray domain. Our model is initially pre-trained on various multimodal datasets within the general domain before being transferred to the chest X-ray domain. Following a prominent VLM, we unify various domain-specific tasks into a simple sequence-to-sequence schema. It enables the model to effectively…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: Published at BioNLP workshop @ ACL 2023

  38. arXiv:2306.15681  [pdf, other]

    q-bio.QM cs.LG eess.SP

    ECG-QA: A Comprehensive Question Answering Dataset Combined With Electrocardiogram

    Authors: Jungwoo Oh, Gyubok Lee, Seongsu Bae, Joon-myoung Kwon, Edward Choi

    Abstract: Question answering (QA) in the field of healthcare has received much attention due to significant advancements in natural language processing. However, existing healthcare QA datasets primarily focus on medical images, clinical notes, or structured electronic health record tables. This leaves the vast potential of combining electrocardiogram (ECG) data with these systems largely untapped. To addre…

    Submitted 10 October, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: Accepted at NeurIPS 2023 Datasets and Benchmarks Track (10 pages for main text, 2 pages for references, 28 pages for supplementary materials)

  39. arXiv:2306.01853  [pdf, other]

    eess.IV cs.CV

    Multi-Contrast Computed Tomography Atlas of Healthy Pancreas

    Authors: Yinchi Zhou, Ho Hin Lee, Yucheng Tang, Xin Yu, Qi Yang, Shunxing Bao, Jeffrey M. Spraggins, Yuankai Huo, Bennett A. Landman

    Abstract: With the substantial diversity in population demographics, such as differences in age and body composition, the volumetric morphology of the pancreas varies greatly, resulting in distinctive variations in shape and appearance. Such variations increase the difficulty of generalizing population-wide pancreas features. A volumetric spatial reference is needed to adapt the morphological variability for or…

    Submitted 2 June, 2023; originally announced June 2023.

  40. Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification

    Authors: Sangmin Bae, June-Woo Kim, Won-Yang Cho, Hyerim Baek, Soyoun Son, Byungjo Lee, Changwan Ha, Kyongpil Tae, Sungnyun Kim, Se-Young Yun

    Abstract: Respiratory sound contains crucial information for the early diagnosis of fatal lung diseases. Since the COVID-19 pandemic, there has been a growing interest in contact-free medical care based on electronic stethoscopes. To this end, cutting-edge deep learning models have been developed to diagnose lung diseases; however, it is still challenging due to the scarcity of medical data. In this study,…

    Submitted 26 December, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: INTERSPEECH 2023, Code URL: https://github.com/raymin0223/patch-mix_contrastive_learning

  41. arXiv:2304.12149  [pdf, other]

    cs.CV eess.IV

    Exploring shared memory architectures for end-to-end gigapixel deep learning

    Authors: Lucas W. Remedios, Leon Y. Cai, Samuel W. Remedios, Karthik Ramadass, Aravind Krishnan, Ruining Deng, Can Cui, Shunxing Bao, Lori A. Coburn, Yuankai Huo, Bennett A. Landman

    Abstract: Deep learning has made great strides in medical imaging, enabled by hardware advances in GPUs. One major constraint for the development of new models has been the saturation of GPU memory resources during training. This is especially true in computational pathology, where images regularly contain more than 1 billion pixels. These pathological images are traditionally divided into small patches to…

    Submitted 24 April, 2023; originally announced April 2023.

  42. arXiv:2304.04155  [pdf, other]

    eess.IV cs.CV

    Segment Anything Model (SAM) for Digital Pathology: Assess Zero-shot Segmentation on Whole Slide Imaging

    Authors: Ruining Deng, Can Cui, Quan Liu, Tianyuan Yao, Lucas W. Remedios, Shunxing Bao, Bennett A. Landman, Lee E. Wheless, Lori A. Coburn, Keith T. Wilson, Yaohong Wang, Shilin Zhao, Agnes B. Fogo, Haichun Yang, Yucheng Tang, Yuankai Huo

    Abstract: The segment anything model (SAM) was released as a foundation model for image segmentation. The promptable segmentation model was trained by over 1 billion masks on 11M licensed and privacy-respecting images. The model supports zero-shot image segmentation with various segmentation prompts (e.g., points, boxes, masks). It makes the SAM attractive for medical image analysis, especially for digital…

    Submitted 9 April, 2023; originally announced April 2023.

  43. arXiv:2304.00216  [pdf, other]

    eess.IV cs.CV cs.LG

    Cross-scale Multi-instance Learning for Pathological Image Diagnosis

    Authors: Ruining Deng, Can Cui, Lucas W. Remedios, Shunxing Bao, R. Michael Womick, Sophie Chiron, Jia Li, Joseph T. Roland, Ken S. Lau, Qi Liu, Keith T. Wilson, Yaohong Wang, Lori A. Coburn, Bennett A. Landman, Yuankai Huo

    Abstract: Analyzing high resolution whole slide images (WSIs) with regard to information across multiple scales poses a significant challenge in digital pathology. Multi-instance learning (MIL) is a common solution for working with high resolution images by classifying bags of objects (i.e. sets of smaller image patches). However, such processing is typically performed at a single scale (e.g., 20x magnifica…

    Submitted 16 February, 2024; v1 submitted 31 March, 2023; originally announced April 2023.

  44. arXiv:2303.13336  [pdf, other]

    cs.SD cs.AI cs.LG cs.MM eess.AS

    A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI

    Authors: Chenshuang Zhang, Chaoning Zhang, Sheng Zheng, Mengchun Zhang, Maryam Qamar, Sung-Ho Bae, In So Kweon

    Abstract: Generative AI has demonstrated impressive performance in various fields, among which speech synthesis is an interesting direction. With the diffusion model as the most popular generative model, numerous works have attempted two active tasks: text to speech and speech enhancement. This work conducts a survey on audio diffusion model, which is complementary to existing surveys that either lack the r…

    Submitted 2 April, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: 18 pages

  45. arXiv:2303.05785  [pdf, other]

    eess.IV cs.CV cs.LG

    Scaling Up 3D Kernels with Bayesian Frequency Re-parameterization for Medical Image Segmentation

    Authors: Ho Hin Lee, Quan Liu, Shunxing Bao, Qi Yang, Xin Yu, Leon Y. Cai, Thomas Li, Yuankai Huo, Xenofon Koutsoukos, Bennett A. Landman

    Abstract: With the inspiration of vision transformers, the concept of depth-wise convolution revisits to provide a large Effective Receptive Field (ERF) using Large Kernel (LK) sizes for medical image segmentation. However, the segmentation performance might be saturated and even degraded as the kernel sizes scaled up (e.g., $21\times 21\times 21$) in a Convolutional Neural Network (CNN). We hypothesize tha…

    Submitted 5 June, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    Comments: Accepted to MICCAI 2023 (top 13.6%), both codes and pretrained models are available at: https://github.com/MASILab/RepUX-Net

  46. SLAS: Speed and Lane Advisory System for Highway Navigation

    Authors: Faizan M. Tariq, David Isele, John S. Baras, Sangjae Bae

    Abstract: This paper proposes a hierarchical autonomous vehicle navigation architecture, composed of a high-level speed and lane advisory system (SLAS) coupled with low-level trajectory generation and trajectory following modules. Specifically, we target a multi-lane highway driving scenario where an autonomous ego vehicle navigates in traffic. We propose a novel receding horizon mixed-integer optimization…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: Presented at the 2022 IEEE 61st Conference on Decision and Control (CDC), Cancun, Mexico

    Journal ref: 2022 IEEE 61st Conference on Decision and Control (CDC), Cancun, Mexico, 2022, pp. 6979-6986

  47. arXiv:2302.00171  [pdf, other]

    cs.RO cs.LG eess.SY math.OC

    Active Uncertainty Reduction for Safe and Efficient Interaction Planning: A Shielding-Aware Dual Control Approach

    Authors: Haimin Hu, David Isele, Sangjae Bae, Jaime F. Fisac

    Abstract: The ability to accurately predict others' behavior is central to the safety and efficiency of interactive robotics. Unfortunately, robots often lack access to key information on which these predictions may hinge, such as other agents' goals, attention, and willingness to cooperate. Dual control theory addresses this challenge by treating unknown parameters of a predictive model as stochastic hidde…

    Submitted 1 November, 2023; v1 submitted 31 January, 2023; originally announced February 2023.

    Comments: The International Journal of Robotics Research. arXiv admin note: text overlap with arXiv:2202.07720

  48. arXiv:2301.06622  [pdf, other]

    cs.DC eess.SY

    IOPathTune: Adaptive Online Parameter Tuning for Parallel File System I/O Path

    Authors: Md. Hasanur Rashid, Youbiao He, Forrest Sheng Bao, Dong Dai

    Abstract: Parallel file systems contain complicated I/O paths from clients to storage servers. An efficient I/O path requires proper settings of multiple parameters, as the default settings often fail to deliver optimal performance, especially for diverse workloads in the HPC environment. Existing tuning strategies have shortcomings in being adaptive, timely, and flexible. We propose IOPathTune, which adapt…

    Submitted 16 January, 2023; originally announced January 2023.

  49. Interaction-Aware Trajectory Planning for Autonomous Vehicles with Analytic Integration of Neural Networks into Model Predictive Control

    Authors: Piyush Gupta, David Isele, Donggun Lee, Sangjae Bae

    Abstract: Autonomous vehicles (AVs) must share the driving space with other drivers and often employ conservative motion planning strategies to ensure safety. These conservative strategies can negatively impact AV's performance and significantly slow traffic throughput. Therefore, to avoid conservatism, we design an interaction-aware motion planner for the ego vehicle (AV) that interacts with surrounding ve…

    Submitted 1 March, 2023; v1 submitted 13 January, 2023; originally announced January 2023.

  50. arXiv:2212.00059  [pdf, other]

    eess.IV cs.CV

    Single Slice Thigh CT Muscle Group Segmentation with Domain Adaptation and Self-Training

    Authors: Qi Yang, Xin Yu, Ho Hin Lee, Leon Y. Cai, Kaiwen Xu, Shunxing Bao, Yuankai Huo, Ann Zenobia Moore, Sokratis Makrogiannis, Luigi Ferrucci, Bennett A. Landman

    Abstract: Objective: Thigh muscle group segmentation is important for assessment of muscle anatomy, metabolic disease and aging. Many efforts have been put into quantifying muscle tissues with magnetic resonance (MR) imaging including manual annotation of individual muscles. However, leveraging publicly available annotations in MR images to achieve muscle group segmentation on single slice computed tomograp…

    Submitted 30 November, 2022; originally announced December 2022.