Showing 1–17 of 17 results for author: Jang, I

  1. arXiv:2506.06732  [pdf, ps, other]

    eess.AS cs.AI eess.SP

    Neural Spectral Band Generation for Audio Coding

    Authors: Woongjib Choi, Byeong Hyeon Kim, Hyungseob Lim, Inseon Jang, Hong-Goo Kang

    Abstract: Audio bandwidth extension is the task of reconstructing missing high frequency components of bandwidth-limited audio signals, where bandwidth limitation is a common issue for audio signals due to several reasons, including channel capacity and data constraints. While conventional spectral band replication is a well-established parametric approach to audio bandwidth extension, the SBR usually entai…

    Submitted 7 June, 2025; originally announced June 2025.

    Comments: Accepted to Interspeech 2025

  2. arXiv:2502.01092  [pdf, other]

    cs.RO cs.CV eess.SY

    Enhancing Feature Tracking Reliability for Visual Navigation using Real-Time Safety Filter

    Authors: Dabin Kim, Inkyu Jang, Youngsoo Han, Sunwoo Hwang, H. Jin Kim

    Abstract: Vision sensors are extensively used for localizing a robot's pose, particularly in environments where global localization tools such as GPS or motion capture systems are unavailable. In many visual navigation systems, localization is achieved by detecting and tracking visual features or landmarks, which provide information about the sensor's relative pose. For reliable feature tracking and accurat…

    Submitted 3 February, 2025; originally announced February 2025.

    Comments: 7 pages, 6 figures, Accepted to 2025 IEEE International Conference on Robotics & Automation (ICRA 2025)

  3. arXiv:2408.09894  [pdf]

    eess.IV cs.AI cs.CV

    Preoperative Rotator Cuff Tear Prediction from Shoulder Radiographs using a Convolutional Block Attention Module-Integrated Neural Network

    Authors: Chris Hyunchul Jo, Jiwoong Yang, Byunghwan Jeon, Hackjoon Shim, Ikbeom Jang

    Abstract: Research question: We test whether a plain shoulder radiograph can be used together with deep learning methods to identify patients with rotator cuff tears as opposed to using an MRI in standard of care. Findings: By integrating convolutional block attention modules into a deep neural network, our model demonstrates high accuracy in detecting patients with rotator cuff tears, achieving an average…

    Submitted 19 August, 2024; originally announced August 2024.

  4. arXiv:2406.12632  [pdf, other]

    eess.IV cs.CV

    Cyclic 2.5D Perceptual Loss for Cross-Modal 3D Medical Image Synthesis: T1w MRI to Tau PET

    Authors: Junho Moon, Symac Kim, Haejun Chung, Ikbeom Jang

    Abstract: There is a demand for medical image synthesis or translation to generate synthetic images of missing modalities from available data. This need stems from challenges such as restricted access to high-cost imaging devices, government regulations, or failure to follow up with patients or study participants. In medical imaging, preserving high-level semantic features is often more critical than achiev…

    Submitted 15 May, 2025; v1 submitted 18 June, 2024; originally announced June 2024.

  5. Personalized Neural Speech Codec

    Authors: Inseon Jang, Haici Yang, Wootaek Lim, Seungkwon Beack, Minje Kim

    Abstract: In this paper, we propose a personalized neural speech codec, envisioning that personalization can reduce the model complexity or improve perceptual speech quality. Despite the common usage of speech codecs where only a single talker is involved on each side of the communication, personalizing a codec for the specific user has rarely been explored in the literature. First, we assume speakers can b…

    Submitted 31 March, 2024; originally announced April 2024.

    Journal ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, pp. 991-995

  6. arXiv:2311.08330  [pdf, other]

    eess.AS cs.SD

    Generative De-Quantization for Neural Speech Codec via Latent Diffusion

    Authors: Haici Yang, Inseon Jang, Minje Kim

    Abstract: In low-bitrate speech coding, end-to-end speech coding networks aim to learn compact yet expressive features and a powerful decoder in a single network. A challenging problem as such results in unwelcome complexity increase and inferior speech quality. In this paper, we propose to separate the representation learning and information reconstruction tasks. We leverage an end-to-end codec for learnin…

    Submitted 15 November, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Submitted to ICASSP 2024

  7. arXiv:2304.09507  [pdf, other]

    eess.IV cs.CV

    Self-supervised Image Denoising with Downsampled Invariance Loss and Conditional Blind-Spot Network

    Authors: Yeong Il Jang, Keuntek Lee, Gu Yong Park, Seyun Kim, Nam Ik Cho

    Abstract: There have been many image denoisers using deep neural networks, which outperform conventional model-based methods by large margins. Recently, self-supervised methods have attracted attention because constructing a large real noise dataset for supervised training is an enormous burden. The most representative self-supervised denoisers are based on blind-spot networks, which exclude the receptive f…

    Submitted 28 July, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: Accepted to ICCV 2023

  8. arXiv:2303.08005  [pdf, other]

    eess.AS cs.SD

    Native Multi-Band Audio Coding within Hyper-Autoencoded Reconstruction Propagation Networks

    Authors: Darius Petermann, Inseon Jang, Minje Kim

    Abstract: Spectral sub-bands do not portray the same perceptual relevance. In audio coding, it is therefore desirable to have independent control over each of the constituent bands so that bitrate assignment and signal reconstruction can be achieved efficiently. In this work, we present a novel neural audio coding network that natively supports a multi-band coding paradigm. Our model extends the idea of com…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Accepted to ICASSP 2023. For resources and examples, see https://saige.sice.indiana.edu/research-projects/HARP-Net/

  9. arXiv:2211.08715  [pdf, other]

    cs.SD cs.LG eess.AS

    Conditional variational autoencoder to improve neural audio synthesis for polyphonic music sound

    Authors: Seokjin Lee, Minhan Kim, Seunghyeon Shin, Daeho Lee, Inseon Jang, Wootaek Lim

    Abstract: Deep generative models for audio synthesis have recently been significantly improved. However, the task of modeling raw-waveforms remains a difficult problem, especially for audio waveforms and music signals. Recently, the realtime audio variational autoencoder (RAVE) method was developed for high-quality audio waveform synthesis. The RAVE method is based on the variational autoencoder and utilize…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: 5 pages, 6 figures

  10. arXiv:2205.12429  [pdf, other]

    eess.IV cs.CV

    Interaction of a priori Anatomic Knowledge with Self-Supervised Contrastive Learning in Cardiac Magnetic Resonance Imaging

    Authors: Makiya Nakashima, Inyeop Jang, Ramesh Basnet, Mitchel Benovoy, W. H. Wilson Tang, Christopher Nguyen, Deborah Kwon, Tae Hyun Hwang, David Chen

    Abstract: Training deep learning models on cardiac magnetic resonance imaging (CMR) can be a challenge due to the small amount of expert generated labels and inherent complexity of data source. Self-supervised contrastive learning (SSCL) has recently been shown to boost performance in several medical imaging tasks. However, it is unclear how much the pre-trained representation reflects the primary organ of…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: Under review at Machine Learning in Healthcare

  11. arXiv:2202.04823  [pdf, other]

    q-bio.QM cs.CV cs.LG eess.IV

    Decreasing Annotation Burden of Pairwise Comparisons with Human-in-the-Loop Sorting: Application in Medical Image Artifact Rating

    Authors: Ikbeom Jang, Garrison Danley, Ken Chang, Jayashree Kalpathy-Cramer

    Abstract: Ranking by pairwise comparisons has shown improved reliability over ordinal classification. However, as the annotations of pairwise comparisons scale quadratically, this becomes less practical when the dataset is large. We propose a method for reducing the number of pairwise comparisons required to rank by a quantitative metric, demonstrating the effectiveness of the approach in ranking medical im…

    Submitted 9 February, 2022; originally announced February 2022.

    Comments: 5 pages, 2 figures, NeurIPS Data-Centric AI Workshop 2021

    ACM Class: I.2.1

  12. arXiv:2112.06417  [pdf, other]

    eess.IV cs.CV

    LC-FDNet: Learned Lossless Image Compression with Frequency Decomposition Network

    Authors: Hochang Rhee, Yeong Il Jang, Seyun Kim, Nam Ik Cho

    Abstract: Recent learning-based lossless image compression methods encode an image in the unit of subimages and achieve comparable performances to conventional non-learning algorithms. However, these methods do not consider the performance drop in the high-frequency region, giving equal consideration to the low and high-frequency areas. In this paper, we propose a new lossless image compression method that…

    Submitted 12 December, 2021; originally announced December 2021.

  13. arXiv:2112.01629  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    Engineering AI Tools for Systematic and Scalable Quality Assessment in Magnetic Resonance Imaging

    Authors: Yukai Zou, Ikbeom Jang

    Abstract: A desire to achieve large medical imaging datasets keeps increasing as machine learning algorithms, parallel computing, and hardware technology evolve. Accordingly, there is a growing demand in pooling data from multiple clinical and academic institutes to enable large-scale clinical or translational research studies. Magnetic resonance imaging (MRI) is a frequently used, non-invasive imaging moda…

    Submitted 2 December, 2021; originally announced December 2021.

    Comments: 6 pages, 2 figures, NeurIPS Data-Centric AI Workshop 2021 (Virtual)

    ACM Class: I.2.0

  14. arXiv:2107.00353  [pdf, other]

    cs.RO eess.SY

    Stability and Robustness Analysis of Plug-Pulling using an Aerial Manipulator

    Authors: Jeonghyun Byun, Dongjae Lee, Hoseong Seo, Inkyu Jang, Jeongjun Choi, H. Jin Kim

    Abstract: In this paper, an autonomous aerial manipulation task of pulling a plug out of an electric socket is conducted, where maintaining the stability and robustness is challenging due to sudden disappearance of a large interaction force. The abrupt change in the dynamical model before and after the separation of the plug can cause destabilization or mission failure. To accomplish aerial plug-pulling, we…

    Submitted 5 July, 2021; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: to be presented in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 2021

  15. arXiv:2105.11681  [pdf, other]

    cs.LG cs.SD eess.AS

    Deep Neural Networks and End-to-End Learning for Audio Compression

    Authors: Daniela N. Rim, Inseon Jang, Heeyoul Choi

    Abstract: Recent achievements in end-to-end deep learning have encouraged the exploration of tasks dealing with highly structured data with unified deep network models. Having such models for compressing audio signals has been challenging since it requires discrete representations that are not easy to train with end-to-end backpropagation. In this paper, we present an end-to-end deep learning approach that…

    Submitted 13 July, 2021; v1 submitted 25 May, 2021; originally announced May 2021.

  16. arXiv:1911.01635  [pdf, other]

    eess.AS cs.SD

    Emotional speech synthesis with rich and granularized control

    Authors: Se-Yun Um, Sangshin Oh, Kyungguen Byun, Inseon Jang, Chunghyun Ahn, Hong-Goo Kang

    Abstract: This paper proposes an effective emotion control method for an end-to-end text-to-speech (TTS) system. To flexibly control the distinct characteristic of a target emotion category, it is essential to determine embedding vectors representing the TTS input. We introduce an inter-to-intra emotional distance ratio algorithm to the embedding vectors that can minimize the distance to the target emotion…

    Submitted 5 November, 2019; v1 submitted 5 November, 2019; originally announced November 2019.

    Comments: Submitted to ICASSP 2020

  17. arXiv:1909.10219  [pdf, other]

    eess.SY

    Efficient Multi-Agent Trajectory Planning with Feasibility Guarantee using Relative Bernstein Polynomial

    Authors: Jungwon Park, Junha Kim, Inkyu Jang, H. Jin Kim

    Abstract: This paper presents a new efficient algorithm which guarantees a solution for a class of multi-agent trajectory planning problems in obstacle-dense environments. Our algorithm combines the advantages of both grid-based and optimization-based approaches, and generates safe, dynamically feasible trajectories without suffering from an erroneous optimization setup such as imposing infeasible collision…

    Submitted 8 March, 2020; v1 submitted 23 September, 2019; originally announced September 2019.

    Comments: 7 pages, ICRA2020 under review