Showing 1–50 of 60 results for author: Jung, S

Searching in archive eess.
  1. arXiv:2503.15769  [pdf, other]

    cs.DC cs.LG eess.SY

    Prediction of Permissioned Blockchain Performance for Resource Scaling Configurations

    Authors: Seungwoo Jung, Yeonho Yoo, Gyeongsik Yang, Chuck Yoo

    Abstract: Blockchain is increasingly offered as blockchain-as-a-service (BaaS) by cloud service providers. However, configuring BaaS appropriately for optimal performance and reliability relies on trial and error. A key challenge is that BaaS is often perceived as a "black box," leading to uncertainties in performance and resource provisioning. Previous studies attempted to address this challenge; however,…

    Submitted 19 March, 2025; originally announced March 2025.

    Journal ref: ICT Express, Volume 10, Issue 6, December 2024, Pages 1253-1258

  2. arXiv:2503.13199  [pdf, other]

    eess.SP

    Equalization-Enhanced Phase Noise: Modeling and DSP-aware Analysis

    Authors: Sebastian Jung, Tim Janz, Vahid Aref, Stephan ten Brink

    Abstract: In coherent optical communication systems the laser phase noise is commonly modeled as a Wiener process. We propose a sliding-window based linearization of the phase noise, enabling a novel description. We show that, by stochastically modeling the residual error introduced by this approximation, equalization-enhanced phase noise (EEPN) can be described and decomposed into four different components… ▽ More

    Submitted 17 March, 2025; originally announced March 2025.

  3. arXiv:2503.09906  [pdf, other]

    eess.AS cs.SD

    ValSub: Subsampling Validation Data to Mitigate Forgetting during ASR Personalization

    Authors: Haaris Mehmood, Karthikeyan Saravanan, Pablo Peso Parada, David Tuckey, Mete Ozay, Gil Ho Lee, Jungin Lee, Seokyeong Jung

    Abstract: Automatic Speech Recognition (ASR) is widely used within consumer devices such as mobile phones. Recently, personalization or on-device model fine-tuning has shown that adaptation of ASR models towards target user speech improves their performance over rare words or accented speech. Despite these gains, fine-tuning on user data (target domain) risks the personalized model to forget knowledge about… ▽ More

    Submitted 7 April, 2025; v1 submitted 12 March, 2025; originally announced March 2025.

    Comments: Accepted at ICASSP 2025

  4. arXiv:2502.20427  [pdf, other]

    cs.CR cs.AI cs.SD eess.AS

    DeePen: Penetration Testing for Audio Deepfake Detection

    Authors: Nicolas Müller, Piotr Kawa, Adriana Stan, Thien-Phuc Doan, Souhwan Jung, Wei Herng Choong, Philip Sperl, Konstantin Böttinger

    Abstract: Deepfakes - manipulated or forged audio and video media - pose significant security risks to individuals, organizations, and society at large. To address these challenges, machine learning-based classifiers are commonly employed to detect deepfake content. In this paper, we assess the robustness of such classifiers through a systematic penetration testing methodology, which we introduce as DeePen.… ▽ More

    Submitted 5 March, 2025; v1 submitted 27 February, 2025; originally announced February 2025.

  5. arXiv:2502.03132  [pdf, other]

    cs.RO eess.SY

    SPARK: A Modular Benchmark for Humanoid Robot Safety

    Authors: Yifan Sun, Rui Chen, Kai S. Yun, Yikuan Fang, Sebin Jung, Feihan Li, Bowei Li, Weiye Zhao, Changliu Liu

    Abstract: This paper introduces the Safe Protective and Assistive Robot Kit (SPARK), a comprehensive benchmark designed to ensure safety in humanoid autonomy and teleoperation. Humanoid robots pose significant safety risks due to their physical capabilities of interacting with complex environments. The physical structures of humanoid robots further add complexity to the design of general safety solutions. T… ▽ More

    Submitted 5 February, 2025; originally announced February 2025.

  6. arXiv:2501.17171  [pdf]

    cs.CV cs.AI cs.LG eess.IV

    Separated Inter/Intra-Modal Fusion Prompts for Compositional Zero-Shot Learning

    Authors: Sua Jung

    Abstract: Compositional Zero-Shot Learning (CZSL) aims to recognize subtle differences in meaning or the combination of states and objects through the use of known and unknown concepts during training. Existing methods either focused on prompt configuration or on using prompts to tune the pre-trained Vision-Language model. However, these methods faced challenges in accurately identifying subtle differences… ▽ More

    Submitted 21 January, 2025; originally announced January 2025.

    Comments: AIAP 2025

    Journal ref: Published at AIAP 2025

  7. arXiv:2501.09113  [pdf, other]

    eess.AS cs.SD

    persoDA: Personalized Data Augmentation for Personalized ASR

    Authors: Pablo Peso Parada, Spyros Fontalis, Md Asif Jalal, Karthikeyan Saravanan, Anastasios Drosou, Mete Ozay, Gil Ho Lee, Jungin Lee, Seokyeong Jung

    Abstract: Data augmentation (DA) is ubiquitously used in training of Automatic Speech Recognition (ASR) models. DA offers increased data variability, robustness and generalization against different acoustic distortions. Recently, personalization of ASR models on mobile devices has been shown to improve Word Error Rate (WER). This paper evaluates data augmentation in this context and proposes persoDA; a DA m… ▽ More

    Submitted 17 January, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

    Comments: ICASSP'25-Copyright 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  8. Deformation-Aware Segmentation Network Robust to Motion Artifacts for Brain Tissue Segmentation using Disentanglement Learning

    Authors: Sunyoung Jung, Yoonseok Choi, Mohammed A. Al-masni, Minyoung Jung, Dong-Hyun Kim

    Abstract: Motion artifacts caused by prolonged acquisition time are a significant challenge in Magnetic Resonance Imaging (MRI), hindering accurate tissue segmentation. These artifacts appear as blurred images that mimic tissue-like appearances, making segmentation difficult. This study proposes a novel deep learning framework that demonstrates superior performance in both motion correction and robust brain… ▽ More

    Submitted 5 December, 2024; originally announced December 2024.

    Comments: Medical Image Computing and Computer Assisted Intervention, MICCAI 2024

    Journal ref: Medical Image Computing and Computer Assisted Intervention MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15009. Springer, Cham

  9. arXiv:2408.10966  [pdf, ps, other]

    eess.IV cs.CV

    ISLES'24: Final Infarct Prediction with Multimodal Imaging and Clinical Data. Where Do We Stand?

    Authors: Ezequiel de la Rosa, Ruisheng Su, Mauricio Reyes, Evamaria O. Riedel, Hakim Baazaoui, Roland Wiest, Florian Kofler, Kaiyuan Yang, David Robben, Mahsa Mojtahedi, Laura van Poppel, Lucas de Vries, Anthony Winder, Kimberly Amador, Nils D. Forkert, Gyeongyeon Hwang, Jiwoo Song, Dohyun Kim, Eneko Uruñuela, Annabella Bregazzi, Matthias Wilms, Hyun Yang, Jin Tae Kwak, Sumin Jung, Luan Matheus Trindade Dalmazo, et al. (15 additional authors not shown)

    Abstract: Accurate estimation of brain infarction (i.e., irreversibly damaged tissue) is critical for guiding treatment decisions in acute ischemic stroke. Reliable infarct prediction informs key clinical interventions, including the need for patient transfer to comprehensive stroke centers, the potential benefit of additional reperfusion attempts during mechanical thrombectomy, decisions regarding secondar… ▽ More

    Submitted 7 July, 2025; v1 submitted 20 August, 2024; originally announced August 2024.

  10. arXiv:2408.04266  [pdf, other]

    cs.RO eess.SY

    BPMP-Tracker: A Versatile Aerial Target Tracker Using Bernstein Polynomial Motion Primitives

    Authors: Yunwoo Lee, Jungwon Park, Boseong Jeon, Seungwoo Jung, H. Jin Kim

    Abstract: This letter presents a versatile trajectory planning pipeline for aerial tracking. The proposed tracker is capable of handling various chasing settings such as complex unstructured environments, crowded dynamic obstacles and multiple-target following. Among the entire pipeline, we focus on developing a predictor for future target motion and a chasing trajectory planner. For rapid computation, we e… ▽ More

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: 8 pages, 9 figures

  11. arXiv:2406.16994  [pdf, other]

    eess.SP cs.AI

    Quantum Multi-Agent Reinforcement Learning for Cooperative Mobile Access in Space-Air-Ground Integrated Networks

    Authors: Gyu Seon Kim, Yeryeong Cho, Jaehyun Chung, Soohyun Park, Soyi Jung, Zhu Han, Joongheon Kim

    Abstract: Achieving global space-air-ground integrated network (SAGIN) access only with CubeSats presents significant challenges such as the access sustainability limitations in specific regions (e.g., polar regions) and the energy efficiency limitations in CubeSats. To tackle these problems, high-altitude long-endurance unmanned aerial vehicles (HALE-UAVs) can complement these CubeSat shortcomings for prov… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 17 pages, 22 figures

  12. Lesion-Aware Cross-Phase Attention Network for Renal Tumor Subtype Classification on Multi-Phase CT Scans

    Authors: Kwang-Hyun Uhm, Seung-Won Jung, Sung-Hoo Hong, Sung-Jea Ko

    Abstract: Multi-phase computed tomography (CT) has been widely used for the preoperative diagnosis of kidney cancer due to its non-invasive nature and ability to characterize renal lesions. However, since enhancement patterns of renal lesions across CT phases are different even for the same lesion type, the visual assessment by radiologists suffers from inter-observer variability in clinical practice. Altho… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: This article has been accepted for publication in Computers in Biology and Medicine

    Journal ref: Computers in Biology and Medicine, 108746, 2024

  13. arXiv:2404.03991  [pdf, other]

    eess.IV cs.CV cs.LG

    Towards Efficient and Accurate CT Segmentation via Edge-Preserving Probabilistic Downsampling

    Authors: Shahzad Ali, Yu Rim Lee, Soo Young Park, Won Young Tak, Soon Ki Jung

    Abstract: Downsampling images and labels, often necessitated by limited resources or to expedite network training, leads to the loss of small objects and thin boundaries. This undermines the segmentation network's capacity to interpret images accurately and predict detailed labels, resulting in diminished performance compared to processing at original resolutions. This situation exemplifies the trade-off be… ▽ More

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 5 pages (4 figures, 1 table); This work has been submitted to the IEEE Signal Processing Letters

  14. arXiv:2403.14154  [pdf, other]

    eess.SY

    LR-FHSS Transceiver for Direct-to-Satellite IoT Communications: Design, Implementation, and Verification

    Authors: Sooyeob Jung, Seongah Jeong, Jinkyu Kang, Gyeongrae Im, Sangjae Lee, Mi-Kyung Oh, Joon Gyu Ryu, Joonhyuk Kang

    Abstract: This paper proposes a long range-frequency hopping spread spectrum (LR-FHSS) transceiver design for the Direct-to-Satellite Internet of Things (DtS-IoT) communication system. The DtS-IoT system has recently attracted attention as a promising nonterrestrial network (NTN) solution to provide high-traffic and low-latency data transfer services to IoT devices in global coverage. In particular, this st… ▽ More

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 17 pages, 23 figures

  15. arXiv:2403.05093  [pdf, other]

    cs.CV eess.IV

    Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile

    Authors: Seokjun Lee, Seung-Won Jung, Hyunseok Seo

    Abstract: Currently, image generation and synthesis have remarkably progressed with generative models. Despite photo-realistic results, intrinsic discrepancies are still observed in the frequency domain. The spectral discrepancy appears not only in generative adversarial networks but also in diffusion models. In this study, we propose a framework to effectively mitigate the disparity in the frequency domain of the…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted to AAAI 2024

  16. arXiv:2401.13921  [pdf, other]

    eess.AS cs.SD

    Intelli-Z: Toward Intelligible Zero-Shot TTS

    Authors: Sunghee Jung, Won Jang, Jaesam Yoon, Bongwan Kim

    Abstract: Although numerous recent studies have suggested new frameworks for zero-shot TTS using large-scale, real-world data, studies that focus on the intelligibility of zero-shot TTS are relatively scarce. Zero-shot TTS demands additional efforts to ensure clear pronunciation and speech quality due to its inherent requirement of replacing a core parameter (speaker embedding or acoustic prompt) with a new… ▽ More

    Submitted 24 January, 2024; originally announced January 2024.

  17. arXiv:2401.13146  [pdf, other]

    eess.AS cs.CL cs.SD

    Locality enhanced dynamic biasing and sampling strategies for contextual ASR

    Authors: Md Asif Jalal, Pablo Peso Parada, George Pavlidis, Vasileios Moschopoulos, Karthikeyan Saravanan, Chrysovalantis-Giorgos Kontoulis, Jisi Zhang, Anastasios Drosou, Gil Ho Lee, Jungin Lee, Seokyeong Jung

    Abstract: Automatic Speech Recognition (ASR) still faces challenges when recognizing time-variant rare phrases. Contextual biasing (CB) modules bias the ASR model towards such contextually relevant phrases. During training, a list of biasing phrases is selected from a large pool of phrases following a sampling strategy. In this work we firstly analyse different sampling strategies to provide insights into the t…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted for IEEE ASRU 2023

  18. arXiv:2401.12085  [pdf, other]

    eess.AS cs.SD

    Consistency Based Unsupervised Self-training For ASR Personalisation

    Authors: Jisi Zhang, Vandana Rajan, Haaris Mehmood, David Tuckey, Pablo Peso Parada, Md Asif Jalal, Karthikeyan Saravanan, Gil Ho Lee, Jungin Lee, Seokyeong Jung

    Abstract: On-device Automatic Speech Recognition (ASR) models trained on speech data of a large population might underperform for individuals unseen during training. This is due to a domain shift between user data and the original training data, which differ in the user's speaking characteristics and environmental acoustic conditions. ASR personalisation is a solution that aims to exploit user data to improve model…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: Accepted for IEEE ASRU 2023

  19. arXiv:2312.05548  [pdf, other]

    eess.IV cs.CV cs.LG

    A Unified Multi-Phase CT Synthesis and Classification Framework for Kidney Cancer Diagnosis with Incomplete Data

    Authors: Kwang-Hyun Uhm, Seung-Won Jung, Moon Hyung Choi, Sung-Hoo Hong, Sung-Jea Ko

    Abstract: Multi-phase CT is widely adopted for the diagnosis of kidney cancer due to the complementary information among phases. However, the complete set of multi-phase CT is often not available in practical clinical applications. In recent years, there have been some studies to generate the missing modality image from the available data. Nevertheless, the generated images are not guaranteed to be effectiv… ▽ More

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: This article has been accepted for publication in IEEE Journal of Biomedical and Health Informatics

    Journal ref: JBHI, 2022

  20. arXiv:2312.05528  [pdf, other]

    eess.IV cs.CV

    Exploring 3D U-Net Training Configurations and Post-Processing Strategies for the MICCAI 2023 Kidney and Tumor Segmentation Challenge

    Authors: Kwang-Hyun Uhm, Hyunjun Cho, Zhixin Xu, Seohoon Lim, Seung-Won Jung, Sung-Hoo Hong, Sung-Jea Ko

    Abstract: In 2023, it is estimated that 81,800 kidney cancer cases will be newly diagnosed, and 14,890 people will die from this cancer in the United States. Preoperative dynamic contrast-enhanced abdominal computed tomography (CT) is often used for detecting lesions. However, there exists inter-observer variability due to subtle differences in the imaging features of kidney and kidney tumors. In this paper… ▽ More

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: MICCAI 2023, KITS 2023 challenge 2nd place

  21. arXiv:2312.01638  [pdf, other]

    eess.IV cs.CV

    J-Net: Improved U-Net for Terahertz Image Super-Resolution

    Authors: Woon-Ha Yeo, Seung-Hwan Jung, Seung Jae Oh, Inhee Maeng, Eui Su Lee, Han-Cheol Ryu

    Abstract: Terahertz (THz) waves are electromagnetic waves in the 0.1 to 10 THz frequency range, and THz imaging is utilized in a range of applications, including security inspections, biomedical fields, and the non-destructive examination of materials. However, THz images have low resolution due to the long wavelength of THz waves. Therefore, improving the resolution of THz images is one of the current hot… ▽ More

    Submitted 4 December, 2023; originally announced December 2023.

  22. arXiv:2311.15683  [pdf]

    eess.AS cs.SD eess.SP

    Ultrasensitive Textile Strain Sensors Redefine Wearable Silent Speech Interfaces with High Machine Learning Efficiency

    Authors: Chenyu Tang, Muzi Xu, Wentian Yi, Zibo Zhang, Edoardo Occhipinti, Chaoqun Dong, Dafydd Ravenscroft, Sung-Min Jung, Sanghyo Lee, Shuo Gao, Jong Min Kim, Luigi G. Occhipinti

    Abstract: Our research presents a wearable Silent Speech Interface (SSI) technology that excels in device comfort, time-energy efficiency, and speech decoding accuracy for real-world use. We developed a biocompatible, durable textile choker with an embedded graphene-based strain sensor, capable of accurately detecting subtle throat movements. This sensor, surpassing other strain sensors in sensitivity by 42… ▽ More

    Submitted 7 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: 5 figures in the article; 11 figures and 4 tables in supplementary information

    Journal ref: npj Flexible Electronics (2024)

  23. arXiv:2307.13343  [pdf, other]

    eess.AS cs.CR cs.SD

    On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer

    Authors: Md Asif Jalal, Pablo Peso Parada, Jisi Zhang, Karthikeyan Saravanan, Mete Ozay, Myoungji Han, Jung In Lee, Seokyeong Jung

    Abstract: Smart devices serviced by large-scale AI models necessitate user data transfer to the cloud for inference. For speech applications, this means transferring private user information, e.g., speaker identity. Our paper proposes a privacy-enhancing framework that targets speaker identity anonymization while preserving speech recognition accuracy for our downstream task, Automatic Speech Recognition…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: Proceedings of INTERSPEECH 2023

  24. arXiv:2306.09382  [pdf, ps, other]

    cs.SD cs.LG cs.MM eess.AS

    Sound Demixing Challenge 2023 Music Demixing Track Technical Report: TFC-TDF-UNet v3

    Authors: Minseok Kim, Jun Hyung Lee, Soonyoung Jung

    Abstract: In this report, we present our award-winning solutions for the Music Demixing Track of Sound Demixing Challenge 2023. First, we propose TFC-TDF-UNet v3, a time-efficient music source separation model that achieves state-of-the-art results on the MUSDB benchmark. We then give full details regarding our solutions for each Leaderboard, including a loss masking approach for noise-robust training. Code… ▽ More

    Submitted 21 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: 5 pages, 4 tables

  25. arXiv:2306.04137  [pdf, other]

    cs.MA eess.SY

    Multi-Agent Reinforcement Learning for Cooperative Air Transportation Services in City-Wide Autonomous Urban Air Mobility

    Authors: Chanyoung Park, Gyu Seon Kim, Soohyun Park, Soyi Jung, Joongheon Kim

    Abstract: The development of urban air mobility (UAM) is progressing rapidly, and demand for efficient transportation management systems is rising due to the multifaceted environmental uncertainties. Thus, this paper proposes a novel air transportation service management algorithm based on multi-agent deep reinforcement learning (MADRL) to address the challenges of multi-UAM cooperatio…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: 15 pages, 14 figures

  26. arXiv:2305.13779  [pdf, other]

    cs.AR eess.SP

    Transceiver Design and Performance Analysis for LR-FHSS-based Direct-to-Satellite IoT

    Authors: Sooyeob Jung, Seongah Jeong, Jinkyu Kang, Joon Gyu Ryu, Joonhyuk Kang

    Abstract: This paper presents a novel transceiver design aimed at enabling Direct-to-Satellite Internet of Things (DtS-IoT) systems based on long range-frequency hopping spread spectrum (LR-FHSS). Our focus lies in developing an accurate transmission method through the analysis of the frame structure and key parameters outlined in Long Range Wide-Area Network (LoRaWAN) [1]. To address the Doppler effect in… ▽ More

    Submitted 25 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 5 pages, 6 figures

    Report number: CL2023-1147

  27. Cross-domain Denoising for Low-dose Multi-frame Spiral Computed Tomography

    Authors: Yucheng Lu, Zhixin Xu, Moon Hyung Choi, Jimin Kim, Seung-Won Jung

    Abstract: Computed tomography (CT) has been used worldwide as a non-invasive test to assist in diagnosis. However, the ionizing nature of X-ray exposure raises concerns about potential health risks such as cancer. The desire for lower radiation doses has driven researchers to improve reconstruction quality. Although previous studies on low-dose computed tomography (LDCT) denoising have demonstrated the effe… ▽ More

    Submitted 28 June, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

    Journal ref: IEEE Transactions on Medical Imaging (2024)

  28. arXiv:2304.05920  [pdf, other]

    eess.SP physics.optics

    Learning to exploit z-Spatial Diversity for Coherent Nonlinear Optical Fiber Communication

    Authors: Sebastian Jung, Tim Uhlemann, Alexander Span, Maximilian Bauhofer, Stephan ten Brink

    Abstract: Higher-order solitons inherently possess a spatial periodicity along the propagation axis. The pulse expands and compresses in both the frequency and time domains. This property is exploited for a bandwidth-limited receiver by sampling the optical signal at two different distances. Numerical simulations show that when pure solitons are transmitted and the second (i.e., further propagated) signal is als…

    Submitted 12 April, 2023; originally announced April 2023.

  29. arXiv:2302.14273  [pdf, other]

    cs.RO eess.SY

    QP Chaser: Polynomial Trajectory Generation for Autonomous Aerial Tracking

    Authors: Yunwoo Lee, Jungwon Park, Seungwoo Jung, Boseong Jeon, Dahyun Oh, H. Jin Kim

    Abstract: Maintaining the visibility of the target is one of the major objectives of aerial tracking missions. This paper proposes a target-visible trajectory planning pipeline using quadratic programming (QP). Our approach can handle various tracking settings, including 1) single- and dual-target following and 2) both static and dynamic environments, unlike other works that focus on a single specific setup… ▽ More

    Submitted 26 November, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: 18 pages, 16 figures

  30. arXiv:2301.03815  [pdf, other]

    eess.SY

    Marine IoT Systems with Space-Air-Sea Integrated Networks: Hybrid LEO and UAV Edge Computing

    Authors: Sooyeob Jung, Seongah Jeong, Jinkyu Kang, Joonhyuk Kang

    Abstract: Marine Internet of Things (IoT) systems have grown substantially with the development of non-terrestrial networks (NTN) via aerial and space vehicles in the upcoming sixth-generation (6G), thereby assisting environment protection, military reconnaissance, and sea transportation. Due to unpredictable climate changes and the extreme channel conditions of maritime networks, however, it is challenging… ▽ More

    Submitted 10 January, 2023; originally announced January 2023.

    Comments: 12 pages, 8 figures, 3 tables, submission in IEEE IoT Journal

    Report number: IoT-27450-2022

  31. arXiv:2301.00124  [pdf, other]

    eess.SY cs.RO

    Situation-Aware Deep Reinforcement Learning for Autonomous Nonlinear Mobility Control in Cyber-Physical Loitering Munition Systems

    Authors: Hyunsoo Lee, Soohyun Park, Won Joon Yun, Soyi Jung, Joongheon Kim

    Abstract: With the rapid development of drone technologies, drones are widely used in many applications, including military domains. In this paper, a novel situation-aware DRL-based autonomous nonlinear drone mobility control algorithm is proposed for cyber-physical loitering munition applications. On the battlefield, the design of DRL-based autonomous control algorithms is not straightforward because real-world…

    Submitted 31 December, 2022; originally announced January 2023.

  32. arXiv:2211.03502  [pdf, other]

    eess.SP cs.CV

    Neural Architectural Nonlinear Pre-Processing for mmWave Radar-based Human Gesture Perception

    Authors: Hankyul Baek, Yoo Jeong Ha, Minjae Yoo, Soyi Jung, Joongheon Kim

    Abstract: In modern on-driving computing environments, many sensors are used for context-aware applications. This paper utilizes two deep learning models, U-Net and EfficientNet, which consist of a convolutional neural network (CNN), to detect hand gestures and remove noise in the Range Doppler Map image that was measured through a millimeter-wave (mmWave) radar. To improve the performance of classification… ▽ More

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: 4 pages, 7 figures

  33. arXiv:2208.07639  [pdf, other]

    eess.IV

    RAWtoBit: A Fully End-to-end Camera ISP Network

    Authors: Wooseok Jeong, Seung-Won Jung

    Abstract: Image compression is an essential and final processing unit in the camera image signal processing (ISP) pipeline. While many studies have sought to replace the conventional ISP pipeline with a single end-to-end optimized deep learning model, image compression is rarely considered as a part of the model. In this paper, we investigate the design of a fully end-to-end optimized camera ISP incorp…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: Accepted at ECCV2022

  34. Lightweight Encoder-Decoder Architecture for Foot Ulcer Segmentation

    Authors: Shahzad Ali, Arif Mahmood, Soon Ki Jung

    Abstract: Continuous monitoring of foot ulcer healing is needed to ensure the efficacy of a given treatment and to avoid any possibility of deterioration. Foot ulcer segmentation is an essential step in wound diagnosis. We developed a model that is similar in spirit to the well-established encoder-decoder and residual convolution neural networks. Our model includes a residual connection along with a channel… ▽ More

    Submitted 6 July, 2022; originally announced July 2022.

    Comments: Published version of this article is available at https://link.springer.com/chapter/10.1007/978-3-031-06381-7_17

    Journal ref: Frontiers of Computer Vision. IW-FCV 2022. Communications in Computer and Information Science, vol 1578. Springer, Cham (2022)

  35. arXiv:2204.00491  [pdf, other]

    cs.CV eess.IV

    FrequencyLowCut Pooling -- Plug & Play against Catastrophic Overfitting

    Authors: Julia Grabinski, Steffen Jung, Janis Keuper, Margret Keuper

    Abstract: Over the last years, Convolutional Neural Networks (CNNs) have been the dominating neural architecture in a wide range of computer vision tasks. From an image and signal processing point of view, this success might be a bit surprising as the inherent spatial pyramid design of most CNNs apparently violates basic signal processing laws, i.e., the sampling theorem, in their down-sampling operations. Ho…

    Submitted 20 September, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

    Comments: accepted at ECCV 2022

  36. arXiv:2203.16852  [pdf, other]

    eess.AS cs.LG cs.SD

    JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech

    Authors: Dan Lim, Sunghee Jung, Eesung Kim

    Abstract: In neural text-to-speech (TTS), two-stage systems, or cascades of separately learned models, have shown synthesis quality close to human speech. For example, FastSpeech2 transforms an input text into a mel-spectrogram and HiFi-GAN then generates a raw waveform from the mel-spectrogram; the two are called an acoustic feature generator and a neural vocoder, respectively. However, their training pipeline i…

    Submitted 1 July, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Accepted to INTERSPEECH 2022

  37. arXiv:2202.10456  [pdf, other]

    cs.LG cs.CR cs.CV eess.IV

    Feasibility Study of Multi-Site Split Learning for Privacy-Preserving Medical Systems under Data Imbalance Constraints in COVID-19, X-Ray, and Cholesterol Dataset

    Authors: Yoo Jeong Ha, Gusang Lee, Minjae Yoo, Soyi Jung, Seehwan Yoo, Joongheon Kim

    Abstract: It seems as though progressively more people are in the race to upload content, data, and information online; and hospitals haven't neglected this trend either. Hospitals are now at the forefront for multi-site medical data sharing to provide groundbreaking advancements in the way health records are shared and patients are diagnosed. Sharing of medical data is essential in modern medical research.… ▽ More

    Submitted 20 February, 2022; originally announced February 2022.

  38. arXiv:2201.05843  [pdf, other]

    eess.SY cs.AI cs.LG cs.RO

    Cooperative Multi-Agent Deep Reinforcement Learning for Reliable Surveillance via Autonomous Multi-UAV Control

    Authors: Won Joon Yun, Soohyun Park, Joongheon Kim, MyungJae Shin, Soyi Jung, David A. Mohaisen, Jae-Hyun Kim

    Abstract: CCTV-based surveillance using unmanned aerial vehicles (UAVs) is considered a key technology for security in smart city environments. This paper creates a case where the UAVs with CCTV cameras fly over the city area for flexible and reliable surveillance services. UAVs should be deployed to cover a large area while minimizing overlapping and shadow areas for a reliable surveillance system. However,…

    Submitted 15 January, 2022; originally announced January 2022.

    Comments: 10 pages, 6 figures, Accepted for publication in IEEE Transactions on Industrial Informatics (TII)

  39. arXiv:2111.13321  [pdf, other]

    eess.AS cs.LG cs.SD

    Learning source-aware representations of music in a discrete latent space

    Authors: Jinsung Kim, Yeong-Seok Jeong, Woosung Choi, Jaehwa Chung, Soonyoung Jung

    Abstract: In recent years, neural network based methods have been proposed as a method that can generate representations from music, but they are not human readable and hardly analyzable or editable by a human. To address this issue, we propose a novel method to learn source-aware latent representations of music through Vector-Quantized Variational Auto-Encoder (VQ-VAE). We train our VQ-VAE to encode an input mi…

    Submitted 26 November, 2021; originally announced November 2021.

    Comments: MDX Workshop @ ISMIR 2021, 7 pages, 2 figures

  40. arXiv:2111.12516  [pdf, other]

    eess.AS cs.LG cs.SD

    LightSAFT: Lightweight Latent Source Aware Frequency Transform for Source Separation

    Authors: Yeong-Seok Jeong, Jinsung Kim, Woosung Choi, Jaehwa Chung, Soonyoung Jung

    Abstract: Conditioned source separations have attracted significant attention because of their flexibility, applicability and extensionality. Their performance was usually inferior to the existing approaches, such as the single source separation model. However, a recently proposed method called LaSAFT-Net has shown that conditioned models can show comparable performance against existing single-source separa…

    Submitted 26 January, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: MDX Workshop @ ISMIR 2021, 6 pages, 1 figure

  41. arXiv:2111.12203  [pdf, other]

    eess.AS cs.SD

    KUIELab-MDX-Net: A Two-Stream Neural Network for Music Demixing

    Authors: Minseok Kim, Woosung Choi, Jaehwa Chung, Daewon Lee, Soonyoung Jung

    Abstract: Recently, many methods based on deep learning have been proposed for music source separation. Some state-of-the-art methods have shown that stacking many layers with many skip connections improves the SDR performance. Although such a deep and complex architecture shows outstanding performance, it usually requires numerous computing resources and time for training and evaluation. This paper proposes…

    Submitted 23 November, 2021; originally announced November 2021.

    Comments: MDX Workshop @ ISMIR 2021, 7 pages, 3 figures

  42. arXiv:2110.08796  [pdf, other]

    eess.SY

    Stable Marriage Matching for Traffic-Aware Space-Air-Ground Integrated Networks: A Gale-Shapley Algorithmic Approach

    Authors: Hyunsoo Lee, Haemin Lee, Soyi Jung, Joongheon Kim

    Abstract: In keeping with the rapid development of communication technology, a new communication structure is required in a next-generation communication system. In particular, research using High Altitude Platform (HAP) or Unmanned Aerial Vehicle (UAV) in existing terrestrial networks is active. In this paper, we propose matching HAP and UAV using the Gale-Shapley algorithm in a relay communication situatio…

    Submitted 17 October, 2021; originally announced October 2021.

  43. arXiv:2108.10147  [pdf, other]

    cs.LG cs.AI eess.IV

    Spatio-Temporal Split Learning for Privacy-Preserving Medical Platforms: Case Studies with COVID-19 CT, X-Ray, and Cholesterol Data

    Authors: Yoo Jeong Ha, Minjae Yoo, Gusang Lee, Soyi Jung, Sae Won Choi, Joongheon Kim, Seehwan Yoo

    Abstract: Machine learning requires a large volume of sample data, especially when it is used in high-accuracy medical applications. However, patient records are among the most sensitive private information and are not usually shared among institutes. This paper presents spatio-temporal split learning, a distributed deep neural network framework, which is a turning point in allowing collaboration among pri…

    Submitted 20 August, 2021; originally announced August 2021.

  44. arXiv:2108.00626  [pdf, ps, other]

    eess.SY

    Quantum Scheduling for Millimeter-Wave Observation Satellite Constellation

    Authors: Joongheon Kim, Yunseok Kwak, Soyi Jung, Jae-Hyun Kim

    Abstract: In beyond 5G and 6G network scenarios, the use of satellites has been actively discussed for extending target monitoring areas, even for extreme circumstances, where the monitoring functionalities can be realized due to the usage of millimeter-wave wireless links. This paper designs an efficient scheduling algorithm which minimizes overlapping monitoring areas among observation satellite constella…

    Submitted 2 August, 2021; originally announced August 2021.

  45. arXiv:2107.11790  [pdf, other]

    eess.SP

    Distributed and Autonomous Aerial Data Collection in Smart City Surveillance Applications

    Authors: Haemin Lee, Soyi Jung, Joongheon Kim

    Abstract: The massive growth of Smart City and Internet of Things applications enables safety and security. The data produced by surveillance cameras in aerial devices such as unmanned aerial networks (UAVs) need to be transferred to ground stations for secure data analysis. When the scale of the network is relatively large compared to the wireless communication coverage of a device, it is not al…

    Submitted 25 July, 2021; originally announced July 2021.

  46. Progressive Joint Low-light Enhancement and Noise Removal for Raw Images

    Authors: Yucheng Lu, Seung-Won Jung

    Abstract: Low-light imaging on mobile devices is typically challenging due to insufficient incident light coming through the relatively small aperture, resulting in a low signal-to-noise ratio. Most of the previous works on low-light image processing focus either only on a single task such as illumination adjustment, color enhancement, or noise removal; or on a joint illumination adjustment and denoising ta…

    Submitted 2 September, 2022; v1 submitted 28 June, 2021; originally announced June 2021.

  47. arXiv:2104.13553  [pdf, other]

    eess.AS cs.LG cs.SD

    AMSS-Net: Audio Manipulation on User-Specified Sources with Textual Queries

    Authors: Woosung Choi, Minseok Kim, Marco A. Martínez Ramírez, Jaehwa Chung, Soonyoung Jung

    Abstract: This paper proposes a neural network that performs audio transformations to user-specified sources (e.g., vocals) of a given audio track according to a given description while preserving other sources not mentioned in the description. Audio Manipulation on a Specific Source (AMSS) is challenging because a sound object (i.e., a waveform sample or frequency bin) is `transparent'; it usually carries…

    Submitted 27 April, 2021; originally announced April 2021.

    Comments: 10 pages, 8 figures, 3 tables, under reviewing of ACMMM 21

  48. arXiv:2010.11631  [pdf, other]

    cs.SD cs.LG eess.AS

    LaSAFT: Latent Source Attentive Frequency Transformation for Conditioned Source Separation

    Authors: Woosung Choi, Minseok Kim, Jaehwa Chung, Soonyoung Jung

    Abstract: Recent deep-learning approaches have shown that Frequency Transformation (FT) blocks can significantly improve spectrogram-based single-source separation models by capturing frequency patterns. The goal of this paper is to extend the FT block to fit the multi-source task. We propose the Latent Source Attentive Frequency Transformation (LaSAFT) block to capture source-dependent frequency patterns.…

    Submitted 14 April, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: 5 pages, 3 figures, 2 tables. accepted to ICASSP 2021

  49. arXiv:2009.05210  [pdf]

    eess.SP

    A 6.3-Nanowatt-per-Channel 96-Channel Neural Spike Processor for a Movement-Intention-Decoding Brain-Computer-Interface Implant

    Authors: Zhewei Jiang, Jiangyi Li, Pavan K. Chundi, Sung Justin Kim, Minhao Yang, Joonseong Kang, Seungchul Jung, Sang Joon Kim, Mingoo Seok

    Abstract: This paper presents microwatt end-to-end neural signal processing hardware for deployment-stage real-time upper-limb movement intent decoding. This module features intercellular spike detection, sorting, and decoding operations for a 96-channel prosthetic implant. We design the algorithms for those operations to achieve minimal computation complexity while matching or advancing the accuracy of sta…

    Submitted 10 September, 2020; originally announced September 2020.

  50. arXiv:2008.06208  [pdf]

    eess.AS cs.CL cs.SD

    Adaptable Multi-Domain Language Model for Transformer ASR

    Authors: Taewoo Lee, Min-Joong Lee, Tae Gyoon Kang, Seokyeoung Jung, Minseok Kwon, Yeona Hong, Jungin Lee, Kyoung-Gu Woo, Ho-Gyeong Kim, Jiseung Jeong, Jihyun Lee, Hosik Lee, Young Sang Choi

    Abstract: We propose an adapter based multi-domain Transformer based language model (LM) for Transformer ASR. The model consists of a big size common LM and small size adapters. The model can perform multi-domain adaptation with only the small size adapters and its related layers. The proposed model can reuse the full fine-tuned LM which is fine-tuned using all layers of an original model. The proposed LM c…

    Submitted 10 February, 2021; v1 submitted 14 August, 2020; originally announced August 2020.

    Comments: This paper is accepted for presentation at IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP), 2021