Search | arXiv e-print repository

arXiv:2503.12005 [pdf, other]

Dynamic IRS Allocation for Spectrum-Sharing MIMO Communication and Radar Systems

Authors: Daniyal Munir, Atta Ullah, Danish Mehmood Mughal, Min Young Chung, Hans D. Schotten

Abstract: This paper investigates the use of intelligent reflecting surfaces (IRS) to assist cellular communications and radar sensing operations in a communications and sensing setup. The IRS dynamically allocates reflecting elements to simultaneously localize a target and assist a user's communication. To achieve this, we propose a novel optimization framework that jointly addresses beamforming design and IRS element allocation. Specifically, we formulate a Weighted Minimum Mean Square Error (WMMSE)-based approach that iteratively optimizes the transmit and receive beamforming vectors, IRS phase shifts, and element allocation. The allocation mechanism adaptively balances the number of IRS elements dedicated to communication and sensing subsystems by leveraging the signal-to-noise-plus-interference-ratio (SINR) between the two. The proposed solution ensures efficient resource utilization while maintaining performance trade-offs. Numerical results demonstrate significant improvements in both communication and sensing SINRs under varying system parameters.

Submitted 15 March, 2025; originally announced March 2025.

Comments: Conference Paper

arXiv:2502.20304 [pdf, other]

Fast $\ell_1$-Regularized EEG Source Localization Using Variable Projection

Authors: Jack Michael Solomon, Rosemary Renaut, Matthias Chung

Abstract: Electroencephalograms (EEG) are invaluable for treating neurological disorders, however, mapping EEG electrode readings to brain activity requires solving a challenging inverse problem. Due to the time series data, the use of $\ell_1$ regularization quickly becomes intractable for many solvers, and, despite the reconstruction advantages of $\ell_1$ regularization, $\ell_2$-based approaches such as sLORETA are used in practice. In this work, we formulate EEG source localization as a graphical generalized elastic net inverse problem and present a variable projected algorithm (VPAL) suitable for fast EEG source localization. We prove convergence of this solver for a broad class of separable convex, potentially non-smooth functions subject to linear constraints and include a modification of VPAL that reconstructs time points in sequence, suitable for real-time reconstruction. Our proposed methods are compared to state-of-the-art approaches including sLORETA and other methods for $\ell_1$-regularized inverse problems.

Submitted 27 February, 2025; originally announced February 2025.

MSC Class: 65F10; 65F22; 65F2; 90C06

arXiv:2402.15539 [pdf, ps, other]

Speech Corpus for Korean Children with Autism Spectrum Disorder: Towards Automatic Assessment Systems

Authors: Seonwoo Lee, Jihyun Mun, Sunhee Kim, Minhwa Chung

Abstract: Despite the growing demand for digital therapeutics for children with Autism Spectrum Disorder (ASD), there is currently no speech corpus available for Korean children with ASD. This paper introduces a speech corpus specifically designed for Korean children with ASD, aiming to advance speech technologies such as pronunciation and severity evaluation. Speech recordings from speech and language evaluation sessions were transcribed, and annotated for articulatory and linguistic characteristics. Three speech and language pathologists rated these recordings for social communication severity (SCS) and pronunciation proficiency (PP) using a 3-point Likert scale. The total number of participants will be 300 for children with ASD and 50 for typically developing (TD) children. The paper also analyzes acoustic and linguistic features extracted from speech data collected and completed for annotation from 73 children with ASD and 9 TD children to investigate the characteristics of children with ASD and identify significant features that correlate with the clinical scores. The results reveal some speech and linguistic characteristics in children with ASD that differ from those in TD children or another subgroup of ASD categorized by clinical scores, demonstrating the potential for developing automatic assessment systems for SCS and PP.

Submitted 23 February, 2024; originally announced February 2024.

Comments: 11 pages, Accepted for LREC-COLING 2024

arXiv:2307.00385 [pdf, other]

Sulcal Pattern Matching with the Wasserstein Distance

Authors: Zijian Chen, Soumya Das, Moo K. Chung

Abstract: We present the unified computational framework for modeling the sulcal patterns of human brain obtained from the magnetic resonance images. The Wasserstein distance is used to align the sulcal patterns nonlinearly. These patterns are topologically different across subjects making the pattern matching a challenge. We work out the mathematical details and develop the gradient descent algorithms for estimating the deformation field. We further quantify the image registration performance. This method is applied in identifying the differences between male and female sulcal patterns.

Submitted 1 July, 2023; originally announced July 2023.

Comments: In press in IEEE ISBI

arXiv:2306.10821 [pdf, other]

Comparison of L2 Korean pronunciation error patterns from five L1 backgrounds by using automatic phonetic transcription

Authors: Eun Jung Yeo, Hyungshin Ryu, Jooyoung Lee, Sunhee Kim, Minhwa Chung

Abstract: This paper presents a large-scale analysis of L2 Korean pronunciation error patterns from five different language backgrounds, Chinese, Vietnamese, Japanese, Thai, and English, by using automatic phonetic transcription. For the analysis, confusion matrices are generated for each L1, by aligning canonical phone sequences and automatically transcribed phone sequences obtained from fine-tuned Wav2Vec2 XLS-R phone recognizer. Each value in the confusion matrices is compared to capture frequent common error patterns and to specify patterns unique to a certain language background. Using the Foreign Speakers' Voice Data of Korean for Artificial Intelligence Learning dataset, common error pattern types are found to be (1) substitutions of aspirated or tense consonants with plain consonants, (2) deletions of syllable-final consonants, and (3) substitutions of diphthongs with monophthongs. On the other hand, thirty-nine patterns including (1) syllable-final /l/ substitutions with /n/ for Vietnamese and (2) /ɯ/ insertions for Japanese are discovered as language-dependent.

Submitted 19 June, 2023; originally announced June 2023.

Comments: 5 pages, 2 figures, accepted to ICPhS 2023
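
The confusion-matrix step described in the abstract above can be illustrated with a minimal sketch. It assumes the canonical and recognized phone sequences have already been aligned, with "-" marking a gap; the function name, the gap symbol, and the sample pairs are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def confusion_counts(aligned_pairs):
    """Count (canonical, recognized) phone pairs from an existing alignment.

    "-" marks a gap: ("p", "-") is a deletion, ("-", "ɯ") is an insertion.
    """
    return Counter(aligned_pairs)


# Hypothetical aligned output for one utterance (illustrative only).
pairs = [("kʰ", "k"), ("a", "a"), ("l", "n"), ("-", "ɯ"), ("p", "-")]
counts = confusion_counts(pairs)

# Keep only the cells where the recognized phone differs from the canonical one.
errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
```

Summing such counters over all utterances of one L1 group would give that group's confusion matrix in sparse form; comparing cells across groups is then a matter of looking up the same key in each counter.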

arXiv:2305.18392 [pdf, other]

Speech Intelligibility Assessment of Dysarthric Speech by using Goodness of Pronunciation with Uncertainty Quantification

Authors: Eun Jung Yeo, Kwanghee Choi, Sunhee Kim, Minhwa Chung

Abstract: This paper proposes an improved Goodness of Pronunciation (GoP) that utilizes Uncertainty Quantification (UQ) for automatic speech intelligibility assessment for dysarthric speech. Current GoP methods rely heavily on neural network-driven overconfident predictions, which is unsuitable for assessing dysarthric speech due to its significant acoustic differences from healthy speech. To alleviate the problem, UQ techniques were used on GoP by 1) normalizing the phoneme prediction (entropy, margin, maxlogit, logit-margin) and 2) modifying the scoring function (scaling, prior normalization). As a result, prior-normalized maxlogit GoP achieves the best performance, with a relative increase of 5.66%, 3.91%, and 23.65% compared to the baseline GoP for English, Korean, and Tamil, respectively. Furthermore, phoneme analysis is conducted to identify which phoneme scores significantly correlate with intelligibility scores in each language.

Submitted 28 May, 2023; originally announced May 2023.

Comments: Accepted to Interspeech 2023
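
The four uncertainty measures named in the abstract above (entropy, margin, maxlogit, logit-margin) have common textbook definitions over a classifier's output. The sketch below computes those standard quantities from a vector of phoneme logits. How the paper folds them into the GoP score is not stated in the abstract, so the function name and the exact definitions here are assumptions.

```python
import math


def uncertainty_measures(logits):
    """Standard uncertainty measures for one frame of phoneme logits."""
    # Numerically stable softmax.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    top_probs = sorted(probs, reverse=True)
    top_logits = sorted(logits, reverse=True)
    return {
        # Shannon entropy of the posterior (natural log).
        "entropy": -sum(p * math.log(p) for p in probs if p > 0),
        # Gap between the two most probable classes.
        "margin": top_probs[0] - top_probs[1],
        # Largest raw logit, before softmax normalization.
        "maxlogit": top_logits[0],
        # Gap between the two largest raw logits.
        "logit_margin": top_logits[0] - top_logits[1],
    }


flat = uncertainty_measures([0.0, 0.0, 0.0, 0.0])
peaked = uncertainty_measures([5.0, 1.0, 0.0, 0.0])
```

A flat posterior gives maximal entropy and zero margin, while a peaked one gives low entropy and a large margin, which is the contrast an uncertainty-aware score would exploit.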

arXiv:2305.08002 [pdf, ps, other]

doi 10.1016/j.phycom.2023.102108

Proportional Fair Scheduling Using Water-Filling Technique for SC-FDMA Based D2D Communication

Authors: Syed Tariq Shah, Jaheon Gu, Syed Faraz Hasan, Min Young Chung

Abstract: The resource allocation in SC-FDMA is constrained by the condition that multiple subchannels should be allocated to a single user only if they are adjacent. Therefore, the scheduling scheme of a D2D-cellular system that uses SC-FDMA must also conform to the so-called adjacency constraint. This paper proposes a heuristic algorithm with low computational complexity that applies proportional fair (PF) scheduling in the D2D-cellular system. The proposed algorithm consists of two main phases: i) subchannel allocation and ii) adjustment of data rates, which are executed for both CUEs and DUEs. In the subchannel allocation phase for CUEs (or D2D pairs), the users' data rates are maximized via optimal power allocation to frequency-contiguous subchannels. In the second phase, a PF scheduling problem is solved to decide the modulation and coding scheme (MCS) of both CUEs and D2D pairs. Both phases of the proposed algorithm benefit from the Water-Filling (WF) technique. The simulation results suggest that the proposed scheme performs similarly to optimal PF scheduling from the perspective of users' data rate and their logarithmic sum. An additional benefit of the proposed scheme is its low computational overhead.

Submitted 2 June, 2023; v1 submitted 13 May, 2023; originally announced May 2023.
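
For readers unfamiliar with the water-filling technique mentioned in the abstract above, here is a minimal sketch of the classic form: power is split across subchannels to maximize the sum of log2(1 + gain * power / noise) under a total power budget. This is the textbook procedure, not the paper's scheduling algorithm; the function name, the unit noise default, and the sample gains are assumptions.

```python
def water_filling(gains, total_power, noise=1.0):
    """Classic water-filling power allocation over parallel subchannels.

    gains: positive channel power gains, one per subchannel.
    Returns the power assigned to each subchannel, in input order.
    """
    # "Floor height" of each subchannel: worse channels have higher floors.
    floors = sorted(noise / g for g in gains)

    # Try using the k best subchannels, dropping the worst one each time
    # the resulting water level would not cover its floor.
    for k in range(len(floors), 0, -1):
        level = (total_power + sum(floors[:k])) / k
        if level > floors[k - 1]:
            break

    # Each subchannel receives the depth of water above its floor.
    return [max(level - noise / g, 0.0) for g in gains]


powers = water_filling([2.0, 1.0, 0.1], total_power=1.0)
```

With these sample gains the weakest subchannel receives no power, and the budget is split between the two stronger ones so that floor plus power is equal on both.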

arXiv:2211.15950 [pdf, other]

Enhanced artificial intelligence-based diagnosis using CBCT with internal denoising: Clinical validation for discrimination of fungal ball, sinusitis, and normal cases in the maxillary sinus

Authors: Kyungsu Kim, Chae Yeon Lim, Joong Bo Shin, Myung Jin Chung, Yong Gi Jung

Abstract: The cone-beam computed tomography (CBCT) provides 3D volumetric imaging of a target with low radiation dose and cost compared with conventional computed tomography, and it is widely used in the detection of paranasal sinus disease. However, it lacks the sensitivity to detect soft tissue lesions owing to reconstruction constraints. Consequently, only physicians with expertise in CBCT reading can distinguish between inherent artifacts or noise and diseases, restricting the use of this imaging modality. The development of artificial intelligence (AI)-based computer-aided diagnosis methods for CBCT to overcome the shortage of experienced physicians has attracted substantial attention. However, advanced AI-based diagnosis addressing intrinsic noise in CBCT has not been devised, discouraging the practical use of AI solutions for CBCT. To address this issue, we propose an AI-based computer-aided diagnosis method using CBCT with a denoising module. This module is implemented before diagnosis to reconstruct the internal ground-truth full-dose scan corresponding to an input CBCT image and thereby improve the diagnostic performance.
The external validation results for the unified diagnosis of sinus fungal ball, chronic rhinosinusitis, and normal cases show that the proposed method improves the micro-, macro-average AUC, and accuracy by 7.4, 5.6, and 9.6% (from 86.2, 87.0, and 73.4 to 93.6, 92.6, and 83.0%), respectively, compared with a baseline while improving human diagnosis accuracy by 11% (from 71.7 to 83.0%), demonstrating technical differentiation and clinical effectiveness. This pioneering study on AI-based diagnosis using CBCT indicates denoising can improve diagnostic performance and reader interpretability in images from the sinonasal area, thereby providing a new approach and direction to radiographic image reconstruction regarding the development of AI-based diagnostic solutions.

Submitted 29 November, 2022; originally announced November 2022.

arXiv:2210.15387 [pdf, other]

Automatic Severity Classification of Dysarthric speech by using Self-supervised Model with Multi-task Learning

Authors: Eun Jung Yeo, Kwanghee Choi, Sunhee Kim, Minhwa Chung

Abstract: Automatic assessment of dysarthric speech is essential for sustained treatments and rehabilitation. However, obtaining atypical speech is challenging, often leading to data scarcity issues. To tackle the problem, we propose a novel automatic severity assessment method for dysarthric speech, using the self-supervised model in conjunction with multi-task learning. Wav2vec 2.0 XLS-R is jointly trained for two different tasks: severity classification and auxiliary automatic speech recognition (ASR). For the baseline experiments, we employ hand-crafted acoustic features and machine learning classifiers such as SVM, MLP, and XGBoost. Explored on the Korean dysarthric speech QoLT database, our model outperforms the traditional baseline methods, with a relative percentage increase of 1.25% for F1-score. In addition, the proposed model surpasses the model trained without ASR head, achieving 10.61% relative percentage improvements. Furthermore, we present how multi-task learning affects the severity classification performance by analyzing the latent representations and regularization effect.

Submitted 28 April, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

Comments: Accepted to ICASSP 2023

arXiv:2209.12942 [pdf]

Cross-lingual Dysarthria Severity Classification for English, Korean, and Tamil

Authors: Eun Jung Yeo, Kwanghee Choi, Sunhee Kim, Minhwa Chung

Abstract: This paper proposes a cross-lingual classification method for English, Korean, and Tamil, which employs both language-independent features and language-unique features. First, we extract thirty-nine features from diverse speech dimensions such as voice quality, pronunciation, and prosody. Second, feature selections are applied to identify the optimal feature set for each language. A set of shared features and a set of distinctive features are distinguished by comparing the feature selection results of the three languages. Lastly, automatic severity classification is performed, utilizing the two feature sets. Notably, the proposed method removes different features by languages to prevent the negative effect of unique features for other languages. Accordingly, eXtreme Gradient Boosting (XGBoost) algorithm is employed for classification, due to its strength in imputing missing data. In order to validate the effectiveness of our proposed method, two baseline experiments are conducted: experiments using the intersection set of mono-lingual feature sets (Intersection) and experiments using the union set of mono-lingual feature sets (Union). According to the experimental results, our method achieves better performance with a 67.14% F1 score, compared to 64.52% for the Intersection experiment and 66.74% for the Union experiment. Further, the proposed method attains better performances than mono-lingual classifications for all three languages, achieving 17.67%, 2.28%, 7.79% relative percentage increases for English, Korean, and Tamil, respectively.
The result specifies that commonly shared features and language-specific features must be considered separately for cross-language dysarthria severity classification.

Submitted 26 September, 2022; originally announced September 2022.

Comments: 9 pages, 4 figures, APSIPA 2022

arXiv:2207.11534 [pdf, other]

Comparative Validation of AI and non-AI Methods in MRI Volumetry to Diagnose Parkinsonian Syndromes

Authors: Joomee Song, Juyoung Hahm, Jisoo Lee, Chae Yeon Lim, Myung Jin Chung, Jinyoung Youn, Jin Whan Cho, Jong Hyeon Ahn, Kyung-Su Kim

Abstract: Automated segmentation and volumetry of brain magnetic resonance imaging (MRI) scans are essential for the diagnosis of Parkinson's disease (PD) and Parkinson's plus syndromes (P-plus). To enhance the diagnostic performance, we adopt deep learning (DL) models in brain segmentation and compared their performance with the gold-standard non-DL method. We collected brain MRI scans of healthy controls (n=105) and patients with PD (n=105), multiple systemic atrophy (n=132), and progressive supranuclear palsy (n=69) at Samsung Medical Center from January 2017 to December 2020. Using the gold-standard non-DL model, FreeSurfer (FS), we segmented six brain structures: midbrain, pons, caudate, putamen, pallidum, and third ventricle, and considered them as annotating data for DL models, the representative V-Net and UNETR. The Dice scores and area under the curve (AUC) for differentiating normal, PD, and P-plus cases were calculated. The segmentation times of V-Net and UNETR for the six brain structures per patient were 3.48 ± 0.17 and 48.14 ± 0.97 s, respectively, being at least 300 times faster than FS (15,735 ± 1.07 s). Dice scores of both DL models were sufficiently high (>0.85), and their AUCs for disease classification were superior to that of FS. For classification of normal vs. P-plus and PD vs. multiple systemic atrophy (cerebellar type), the DL models and FS showed AUCs above 0.8. DL significantly reduces the analysis time without compromising the performance of brain segmentation and differential diagnosis.
Our findings may contribute to the adoption of DL brain MRI segmentation in clinical settings and advance brain research.

Submitted 23 July, 2022; originally announced July 2022.

Comments: Joomee Song and Juyoung Hahm contributed equally to this work as the co-first author. Jong Hyeon Ahn and Kyung-Su Kim ([email protected]) contributed equally to this work as the co-corresponding author
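
The Dice score reported in the abstract above is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted segmentation A and a reference segmentation B. A minimal sketch follows, representing each segmentation as a set of voxel coordinates. The function name and the toy voxel sets are illustrative assumptions, not the paper's evaluation code.

```python
def dice_score(predicted, reference):
    """Dice overlap between two segmentations given as sets of voxel indices."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        # Two empty masks agree perfectly by convention.
        return 1.0
    overlap = len(predicted & reference)
    return 2.0 * overlap / (len(predicted) + len(reference))


# Toy example: three voxels each, two of them shared.
model_mask = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
reference_mask = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
score = dice_score(model_mask, reference_mask)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no voxels, so the ">0.85" threshold in the abstract indicates close agreement with the FreeSurfer labels.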

arXiv:2207.10324 [pdf, other]

Enhancing Generative Networks for Chest Anomaly Localization through Automatic Registration-Based Unpaired-to-Pseudo-Paired Training Data Translation

Authors: Kyungsu Kim, Seong Je Oh, Chae Yeon Lim, Ju Hwan Lee, Tae Uk Kim, Myung Jin Chung

Abstract: Image translation based on a generative adversarial network (GAN-IT) is a promising method for the precise localization of abnormal regions in chest X-ray images (AL-CXR) even without the pixel-level annotation. However, heterogeneous unpaired datasets undermine existing methods to extract key features and distinguish normal from abnormal cases, resulting in inaccurate and unstable AL-CXR. To address this problem, we propose an improved two-stage GAN-IT involving registration and data augmentation. For the first stage, we introduce an advanced deep-learning-based registration technique that virtually and reasonably converts unpaired data into paired data for learning registration maps, by sequentially utilizing linear-based global and uniform coordinate transformation and AI-based non-linear coordinate fine-tuning. This approach enables independent and complex coordinate transformation of each detailed location of the lung while recognizing the entire lung structure, thereby achieving higher registration performance with resolving inherent artifacts caused by unpaired conditions. For the second stage, we apply data augmentation to diversify anomaly locations by swapping the left and right lung regions on the uniform registered frames, further improving the performance by alleviating imbalance in data distribution showing left and right lung lesions. The proposed method is model agnostic and shows consistent AL-CXR performance improvement in representative AI models.
Therefore, we believe GAN-IT for AL-CXR can be clinically implemented by using our basis framework, even if learning data are scarce or difficult for the pixel-level disease annotation.

Submitted 15 June, 2024; v1 submitted 21 July, 2022; originally announced July 2022.

arXiv:2206.13504 [pdf, other]

AI-based computer-aided diagnostic system of chest digital tomography synthesis: Demonstrating comparative advantage with X-ray-based AI systems

Authors: Kyung-Su Kim, Ju Hwan Lee, Seong Je Oh, Myung Jin Chung

Abstract: Compared with chest X-ray (CXR) imaging, which is a single image projected from the front of the patient, chest digital tomosynthesis (CDTS) imaging can be more advantageous for lung lesion detection because it acquires multiple images projected from multiple angles of the patient. Various clinical comparative analysis and verification studies have been reported to demonstrate this, but there were no artificial intelligence (AI)-based comparative analysis studies. Existing AI-based computer-aided detection (CAD) systems for lung lesion diagnosis have been developed mainly based on CXR images; however, CAD-based on CDTS, which uses multi-angle images of patients in various directions, has not been proposed and verified for its usefulness compared to CXR-based counterparts. This study develops/tests a CDTS-based AI CAD system to detect lung lesions to demonstrate performance improvements compared to CXR-based AI CAD. We used multiple projection images as input for the CDTS-based AI model and a single-projection image as input for the CXR-based AI model to fairly compare and evaluate the performance between models. The proposed CDTS-based AI CAD system yielded sensitivities of 0.782 and 0.785 and accuracies of 0.895 and 0.837 for the performance of detecting tuberculosis and pneumonia, respectively, against normal subjects.
These results show higher performance than sensitivities of 0.728 and 0.698 and accuracies of 0.874 and 0.826 for detecting tuberculosis and pneumonia through the CXR-based AI CAD, which only uses a single projection image in the frontal direction. We found that CDTS-based AI CAD improved the sensitivity of tuberculosis and pneumonia by 5.4% and 8.7% respectively, compared to CXR-based AI CAD without loss of accuracy. Therefore, we comparatively prove that CDTS-based AI CAD technology can improve performance more than CXR, enhancing the clinical applicability of CDTS.

Submitted 18 June, 2022; originally announced June 2022.

Comments: Kyung-Su Kim, Ju Hwan Lee, and Seong Je Oh have contributed equally to this work as the co-first author. Kyung-Su Kim ([email protected]) and Myung Jin Chung ([email protected]) have contributed equally to this work as the co-corresponding author

arXiv:2206.13385 [pdf, other]

3D unsupervised anomaly detection and localization through virtual multi-view projection and reconstruction: Clinical validation on low-dose chest computed tomography

Authors: Kyung-Su Kim, Seong Je Oh, Ju Hwan Lee, Myung Jin Chung

Abstract: Computer-aided diagnosis for low-dose computed tomography (CT) based on deep learning has recently attracted attention as a first-line automatic testing tool because of its high accuracy and low radiation exposure. However, existing methods rely on supervised learning, imposing an additional burden to doctors for collecting disease data or annotating spatial labels for network training, consequently hindering their implementation. We propose a method based on a deep neural network for computer-aided diagnosis called virtual multi-view projection and reconstruction for unsupervised anomaly detection. Presumably, this is the first method that only requires data from healthy patients for training to identify three-dimensional (3D) regions containing any anomalies. The method has three key components. Unlike existing computer-aided diagnosis tools that use conventional CT slices as the network input, our method 1) improves the recognition of 3D lung structures by virtually projecting an extracted 3D lung region to obtain two-dimensional (2D) images from diverse views to serve as network inputs, 2) accommodates the input diversity gain for accurate anomaly detection, and 3) achieves 3D anomaly/disease localization through a novel 3D map restoration method using multiple 2D anomaly maps.
The proposed method based on unsupervised learning improves the patient-level anomaly detection by 10% (area under the curve, 0.959) compared with a gold standard based on supervised learning (area under the curve, 0.848), and it localizes the anomaly region with 93% accuracy, demonstrating its high performance.

Submitted 18 June, 2022; originally announced June 2022.

Comments: Kyung-Su Kim and Seong Je Oh have contributed equally to this work as the co-first author. Kyung-Su Kim ([email protected]) and Myung Jin Chung ([email protected]) have contributed equally to this work as the co-corresponding author

arXiv:2206.06730 [pdf, other]

Automated Precision Localization of Peripherally Inserted Central Catheter Tip through Model-Agnostic Multi-Stage Networks

Authors: Subin Park, Yoon Ki Cha, Soyoung Park, Kyung-Su Kim, Myung Jin Chung

Abstract: Peripherally inserted central catheters (PICCs) have been widely used as one of the representative central venous lines (CVCs) due to their long-term intravascular access with low infectivity. However, PICCs have a fatal drawback of a high frequency of tip mispositions, increasing the risk of puncture, embolism, and complications such as cardiac arrhythmias. To automatically and precisely detect it, various attempts have been made by using the latest deep learning (DL) technologies. However, even with these approaches, it is still practically difficult to determine the tip location because the multiple fragments phenomenon (MFP) occurs in the process of predicting and extracting the PICC line required before predicting the tip. This study aimed to develop a system generally applied to existing models and to restore the PICC line more exactly by removing the MFs of the model output, thereby precisely localizing the actual tip position for detecting its disposition. To achieve this, we proposed a multi-stage DL-based framework post-processing the PICC line extraction result of the existing technology. The performance was compared by each root mean squared error (RMSE) and MFP incidence rate according to whether or not MFCN is applied to five conventional models. In internal validation, when MFCN was applied to the existing single model, MFP was improved by an average of 45%. The RMSE was improved by over 63% from an average of 26.85 mm (17.16 to 35.80 mm) to 9.72 mm (9.37 to 10.98 mm).
In external validation, when MFCN was applied, the MFP incidence rate decreased by an average of 32% and the RMSE decreased by an average of 65%. Therefore, by applying the proposed MFCN, we observed the significant/consistent detection performance improvement of PICC tip location compared to the existing model.

Submitted 14 June, 2022; originally announced June 2022.

Comments: Subin Park and Yoon Ki Cha contributed equally to this work as co-first authors. Kyung-Su Kim and Myung Jin Chung contributed equally to this work as co-corresponding authors

arXiv:2109.03273 [pdf, other]

LuMaMi28: Real-Time Millimeter-Wave Massive MIMO Systems with Antenna Selection

Authors: MinKeun Chung, Liang Liu, Andreas Johansson, Sara Gunnarsson, Martin Nilsson, Zhinong Ying, Olof Zander, Kamal Samanta, Chris Clifton, Toshiyuki Koimori, Shinya Morita, Satoshi Taniguchi, Fredrik Tufvesson, Ove Edfors

Abstract: This paper presents LuMaMi28, a real-time 28 GHz massive multiple-input multiple-output (MIMO) testbed. In this testbed, the base station has 16 transceiver chains with a fully-digital beamforming architecture (with different pre-coding algorithms) and simultaneously supports multiple user equipments (UEs) with spatial multiplexing. The UEs are equipped with a beam-switchable antenna array for real-time antenna selection where the one with the highest channel magnitude, out of four pre-defined beams, is selected. For the beam-switchable antenna array, we consider two kinds of UE antennas, with different beam-width and different peak-gain. Based on this testbed, we provide measurement results for millimeter-wave (mmWave) massive MIMO performance in different real-life scenarios with static and mobile UEs. We explore the potential benefit of the mmWave massive MIMO systems with antenna selection based on measured channel data, and discuss the performance results through real-time measurements.

Submitted 7 September, 2021; originally announced September 2021.

Comments: 14 pages, 17 figures

arXiv:2105.13153 [pdf, other]

doi 10.1016/j.compbiomed.2022.105782

Cardiac Segmentation on CT Images through Shape-Aware Contour Attentions

Authors: Sanguk Park, Minyoung Chung

Abstract: Cardiac segmentation of atriums, ventricles, and myocardium in computed tomography (CT) images is an important first-line task for presymptomatic cardiovascular disease diagnosis. In several recent studies, deep learning models have shown significant breakthroughs in medical image segmentation tasks. Unlike other organs such as the lungs and liver, the cardiac organ consists of multiple substructures, i.e., ventricles, atriums, aortas, arteries, veins, and myocardium. These cardiac substructures are proximate to each other and have indiscernible boundaries (i.e., homogeneous intensity values), making it difficult for the segmentation network to focus on the boundaries between the substructures. In this paper, to improve the segmentation accuracy between proximate organs, we introduce a novel model to exploit shape and boundary-aware features. We primarily propose a shape-aware attention module that exploits distance regression, which can guide the model to focus on the edges between substructures so that it can outperform the conventional contour-based attention method. In the experiments, we used the Multi-Modality Whole Heart Segmentation dataset that has 20 CT cardiac images for training and validation, and 40 CT cardiac images for testing. The experimental results show that the proposed network produces more accurate results than state-of-the-art networks by improving the Dice similarity coefficient score by 4.97%.
Our proposed shape-aware contour attention mechanism demonstrates that distance transformation and boundary features improve the actual attention map to strengthen the responses in the boundary area. Moreover, our proposed method significantly reduces the false-positive responses of the final output, resulting in accurate segmentation.

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2103.05772 [pdf, other]

Introduction to Brain and Medical Images

Authors: Moo K. Chung

Abstract: This article is based on the first chapter of the book Chung (2013), where brain and medical images are introduced. The most widely used brain imaging modalities are magnetic resonance images (MRI), functional-MRI (fMRI) and diffusion tensor images (DTI). A brief introduction to each imaging modality is given. Further, we explain what kind of curve, volume and surface data can be extracted from each modality.

Submitted 9 March, 2021; originally announced March 2021.

arXiv:2012.01584 [pdf, other]

Millimeter-Wave Massive MIMO Testbed with Hybrid Beamforming

Authors: MinKeun Chung, Liang Liu, Andreas Johansson, Martin Nilsson, Olof Zander, Zhinong Ying, Fredrik Tufvesson, Ove Edfors

Abstract: Massive multiple-input multiple-output (MIMO) technology is vital in millimeter-wave (mmWave) bands to obtain large array gains. However, there are practical challenges, such as high hardware cost and power consumption in such systems. A promising solution to these problems is to adopt a hybrid beamforming architecture. This architecture has a much lower number of transceiver (TRx) chains than the total antenna number, resulting in cost- and energy-efficient systems. In this paper, we present a real-time mmWave (28 GHz) massive MIMO testbed with hybrid beamforming. This testbed has a 64-antenna/16-TRx unit for beam-selection, which can be expanded to larger array sizes in a modular way. For testing everything from baseband processing algorithms to scheduling and beam-selection in real propagation environments, we extend the capability of an existing 100-antenna/100-TRx massive MIMO testbed (below 6 GHz), built upon software-defined radio technology, to a flexible mmWave massive MIMO system.

Submitted 2 December, 2020; originally announced December 2020.

Comments: 54th Asilomar Conference on Signals, Systems, and Computers, Nov. 2020

arXiv:2007.09628 [pdf, other]

Phase-Noise Compensation for OFDM Systems Exploiting Coherence Bandwidth: Modeling, Algorithms, and Analysis

Authors: MinKeun Chung, Liang Liu, Ove Edfors

Abstract: Phase-noise (PN) estimation and compensation are crucial in millimeter-wave (mmWave) communication systems to achieve high reliability. The PN estimation, however, suffers from high computational complexity due to its fundamental characteristics, such as spectral spreading and fast-varying fluctuations. In this paper, we propose a new framework for low-complexity PN compensation in orthogonal frequency-division multiplexing systems. The proposed framework also includes a pilot allocation strategy to minimize its overhead. The key ideas are to exploit the coherence bandwidth of mmWave systems and to approximate the actual PN spectrum with its dominant components, resulting in a non-iterative solution by using linear minimum mean squared-error estimation. The proposed method obtains a reduction of more than 2.5x in total complexity, as compared to the existing methods. Furthermore, we derive closed-form expressions for normalized mean squared-errors (NMSEs) as a function of critical system parameters, which help in understanding the NMSE behavior in low and high signal-to-noise ratio regimes. Lastly, we study a trade-off between performance and pilot-overhead to provide insight into an appropriate approximation of the PN spectrum.

Submitted 27 September, 2021; v1 submitted 19 July, 2020; originally announced July 2020.

Comments: To appear in IEEE Transactions on Wireless Communications

arXiv:2001.04260 [pdf]

Improving Dysarthric Speech Intelligibility Using Cycle-consistent Adversarial Training

Authors: Seung Hee Yang, Minhwa Chung

Abstract: Dysarthria is a motor speech impairment affecting millions of people. Dysarthric speech can be far less intelligible than that of non-dysarthric speakers, causing significant communication difficulties. The goal of our work is to develop a model for dysarthric-to-healthy speech conversion using Cycle-consistent GAN. Using 18,700 dysarthric and 8,610 healthy control Korean utterances that were recorded for the purpose of automatic recognition of voice keyboard in a previous study, the generator is trained to transform dysarthric to healthy speech in the spectral domain, which is then converted back to speech. Objective evaluation using automatic speech recognition of the generated utterance on a held-out test set shows that the recognition performance is improved compared with the original dysarthric speech after performing adversarial training, as the absolute WER has been lowered by 33.4%. It demonstrates that the proposed GAN-based conversion method is useful for improving dysarthric speech intelligibility.

Submitted 9 January, 2020; originally announced January 2020.

Comments: To be Published on the 24th February in BIOSIGNALS 2020. arXiv admin note: text overlap with arXiv:1904.09407

arXiv:1909.00548 [pdf, other]

Resource Optimized Neural Architecture Search for 3D Medical Image Segmentation

Authors: Woong Bae, Seungho Lee, Yeha Lee, Beomhee Park, Minki Chung, Kyu-Hwan Jung

Abstract: Neural Architecture Search (NAS), a framework which automates the task of designing neural networks, has recently been actively studied in the field of deep learning. However, there are only a few NAS methods suitable for 3D medical image segmentation. Medical 3D images are generally very large; thus it is difficult to apply previous NAS methods due to their GPU computational burden and long training time. We propose a resource-optimized neural architecture search method which can be applied to 3D medical segmentation tasks in a short training time (1.39 days for 1GB dataset) using a small amount of computation power (one RTX 2080Ti, 10.8GB GPU memory). Excellent performance can also be achieved without retraining (fine-tuning), which is essential in most NAS methods. These advantages can be achieved by using a reinforcement learning-based controller with parameter sharing and focusing on the optimal search space configuration of macro search rather than micro search. Our experiments demonstrate that the proposed NAS method outperforms manually designed networks with state-of-the-art performance in 3D medical image segmentation.

Submitted 2 September, 2019; originally announced September 2019.

Comments: Accepted at MICCAI (International Conference on Medical Image Computing and Computer Assisted Intervention) 2019

arXiv:1807.00244 [pdf, other]

Automatic Identification of Twin Zygosity in Resting-State Functional MRI

Authors: Andrey Gritsenko, Martin A. Lindquist, Gregory R. Kirk, Moo K. Chung

Abstract: A key strength of twin studies arises from the fact that there are two types of twins, monozygotic and dizygotic, that share differing amounts of genetic information. Accurate differentiation of twin types allows efficient inference on genetic influences in a population. However, identification of zygosity is often prone to errors without genotyping. In this study, we propose a novel pairwise feature representation to classify the zygosity of twin pairs of resting state functional magnetic resonance images (rs-fMRI). For this, we project an fMRI signal to a set of basis functions and use the projection coefficients as the compact and discriminative feature representation of noisy fMRI. We encode the relationship between twins as the correlation between the new feature representations across brain regions. We employ hill climbing variable selection to identify brain regions that are the most genetically affected. The proposed framework was applied to 208 twin pairs and achieved 94.19% classification accuracy in automatically identifying the zygosity of paired images.

Submitted 26 October, 2018; v1 submitted 30 June, 2018; originally announced July 2018.

arXiv:1710.07849 [pdf, other]

Heat Kernel Smoothing in Irregular Image Domains

Authors: Moo K. Chung, Yanli Wang, Gurong Wu

Abstract: We present the discrete version of heat kernel smoothing on a graph data structure. The method is used to smooth data in irregularly shaped domains in 3D images. New statistical properties are derived. As an application, we show how to filter out data in the lung blood vessel trees obtained from computed tomography. The method can be further used in representing the complex vessel trees parametrically and extracting the skeleton representation of the trees.

Submitted 21 October, 2017; originally announced October 2017.

Showing 1–24 of 24 results for author: Chung, M