-
Leveraging Novel Ensemble Learning Techniques and Landsat Multispectral Data for Estimating Olive Yields in Tunisia
Authors:
Mohamed Kefi,
Tien Dat Pham,
Thin Nguyen,
Mark G. Tjoelker,
Viola Devasirvatham,
Kenichi Kashiwagi
Abstract:
Olive production is an important tree crop in Mediterranean climates. However, olive yield varies significantly due to climate change. Accurately estimating yield using remote sensing and machine learning remains a complex challenge. In this study, we developed a streamlined pipeline for olive yield estimation in the Kairouan and Sousse governorates of Tunisia. We extracted features from multispec…
▽ More
Olive production is an important tree crop in Mediterranean climates. However, olive yield varies significantly due to climate change. Accurately estimating yield using remote sensing and machine learning remains a complex challenge. In this study, we developed a streamlined pipeline for olive yield estimation in the Kairouan and Sousse governorates of Tunisia. We extracted features from multispectral reflectance bands, vegetation indices derived from Landsat-8 OLI and Landsat-9 OLI-2 satellite imagery, along with digital elevation model data. These spatial features were combined with ground-based field survey data to form a structured tabular dataset. We then developed an automated ensemble learning framework, implemented using AutoGluon to train and evaluate multiple machine learning models, select optimal combinations through stacking, and generate robust yield predictions using five-fold cross-validation. The results demonstrate strong predictive performance from both sensors, with Landsat-8 OLI achieving R2 = 0.8635 and RMSE = 1.17 tons per ha, and Landsat-9 OLI-2 achieving R2 = 0.8378 and RMSE = 1.32 tons per ha. This study highlights a scalable, cost-effective, and accurate method for olive yield estimation, with potential applicability across diverse agricultural regions globally.
△ Less
Submitted 25 May, 2025;
originally announced June 2025.
-
Brightness-Invariant Tracking Estimation in Tagged MRI
Authors:
Zhangxing Bian,
Shuwen Wei,
Xiao Liang,
Yuan-Chiao Lu,
Samuel W. Remedios,
Fangxu Xing,
Jonghye Woo,
Dzung L. Pham,
Aaron Carass,
Philip V. Bayly,
Jiachen Zhuo,
Ahmed Alshareef,
Jerry L. Prince
Abstract:
Magnetic resonance (MR) tagging is an imaging technique for noninvasively tracking tissue motion in vivo by creating a visible pattern of magnetization saturation (tags) that deforms with the tissue. Due to longitudinal relaxation and progression to steady-state, the tags and tissue brightnesses change over time, which makes tracking with optical flow methods error-prone. Although Fourier methods…
▽ More
Magnetic resonance (MR) tagging is an imaging technique for noninvasively tracking tissue motion in vivo by creating a visible pattern of magnetization saturation (tags) that deforms with the tissue. Due to longitudinal relaxation and progression to steady-state, the tags and tissue brightnesses change over time, which makes tracking with optical flow methods error-prone. Although Fourier methods can alleviate these problems, they are also sensitive to brightness changes as well as spectral spreading due to motion. To address these problems, we introduce the brightness-invariant tracking estimation (BRITE) technique for tagged MRI. BRITE disentangles the anatomy from the tag pattern in the observed tagged image sequence and simultaneously estimates the Lagrangian motion. The inherent ill-posedness of this problem is addressed by leveraging the expressive power of denoising diffusion probabilistic models to represent the probabilistic distribution of the underlying anatomy and the flexibility of physics-informed neural networks to estimate biologically-plausible motion. A set of tagged MR images of a gel phantom was acquired with various tag periods and imaging flip angles to demonstrate the impact of brightness variations and to validate our method. The results show that BRITE achieves more accurate motion and strain estimates as compared to other state of the art methods, while also being resistant to tag fading.
△ Less
Submitted 23 May, 2025;
originally announced May 2025.
-
ECLARE: Efficient cross-planar learning for anisotropic resolution enhancement
Authors:
Samuel W. Remedios,
Shuwen Wei,
Shuo Han,
Jinwei Zhang,
Aaron Carass,
Kurt G. Schilling,
Dzung L. Pham,
Jerry L. Prince,
Blake E. Dewey
Abstract:
In clinical imaging, magnetic resonance (MR) image volumes are often acquired as stacks of 2D slices with decreased scan times, improved signal-to-noise ratio, and image contrasts unique to 2D MR pulse sequences. While this is sufficient for clinical evaluation, automated algorithms designed for 3D analysis perform poorly on multi-slice 2D MR volumes, especially those with thick slices and gaps be…
▽ More
In clinical imaging, magnetic resonance (MR) image volumes are often acquired as stacks of 2D slices with decreased scan times, improved signal-to-noise ratio, and image contrasts unique to 2D MR pulse sequences. While this is sufficient for clinical evaluation, automated algorithms designed for 3D analysis perform poorly on multi-slice 2D MR volumes, especially those with thick slices and gaps between slices. Super-resolution (SR) methods aim to address this problem, but previous methods do not address all of the following: slice profile shape estimation, slice gap, domain shift, and non-integer or arbitrary upsampling factors. In this paper, we propose ECLARE (Efficient Cross-planar Learning for Anisotropic Resolution Enhancement), a self-SR method that addresses each of these factors. ECLARE uses a slice profile estimated from the multi-slice 2D MR volume, trains a network to learn the mapping from low-resolution to high-resolution in-plane patches from the same volume, and performs SR with anti-aliasing. We compared ECLARE to cubic B-spline interpolation, SMORE, and other contemporary SR methods. We used realistic and representative simulations so that quantitative performance against ground truth can be computed, and ECLARE outperformed all other methods in both signal recovery and downstream tasks. Importantly, as ECLARE does not use external training data it cannot suffer from domain shift between training and testing. Our code is open-source and available at https://www.github.com/sremedios/eclare.
△ Less
Submitted 21 May, 2025; v1 submitted 14 March, 2025;
originally announced March 2025.
-
AdaCS: Adaptive Normalization for Enhanced Code-Switching ASR
Authors:
The Chuong Chu,
Vu Tuan Dat Pham,
Kien Dao,
Hoang Nguyen,
Quoc Hung Truong
Abstract:
Intra-sentential code-switching (CS) refers to the alternation between languages that happens within a single utterance and is a significant challenge for Automatic Speech Recognition (ASR) systems. For example, when a Vietnamese speaker uses foreign proper names or specialized terms within their speech. ASR systems often struggle to accurately transcribe intra-sentential CS due to their training…
▽ More
Intra-sentential code-switching (CS) refers to the alternation between languages that happens within a single utterance and is a significant challenge for Automatic Speech Recognition (ASR) systems. For example, when a Vietnamese speaker uses foreign proper names or specialized terms within their speech. ASR systems often struggle to accurately transcribe intra-sentential CS due to their training on monolingual data and the unpredictable nature of CS. This issue is even more pronounced for low-resource languages, where limited data availability hinders the development of robust models. In this study, we propose AdaCS, a normalization model integrates an adaptive bias attention module (BAM) into encoder-decoder network. This novel approach provides a robust solution to CS ASR in unseen domains, thereby significantly enhancing our contribution to the field. By utilizing BAM to both identify and normalize CS phrases, AdaCS enhances its adaptive capabilities with a biased list of words provided during inference. Our method demonstrates impressive performance and the ability to handle unseen CS phrases across various domains. Experiments show that AdaCS outperforms previous state-of-the-art method on Vietnamese CS ASR normalization by considerable WER reduction of 56.2% and 36.8% on the two proposed test sets.
△ Less
Submitted 13 January, 2025;
originally announced January 2025.
-
Is Registering Raw Tagged-MR Enough for Strain Estimation in the Era of Deep Learning?
Authors:
Zhangxing Bian,
Ahmed Alshareef,
Shuwen Wei,
Junyu Chen,
Yuli Wang,
Jonghye Woo,
Dzung L. Pham,
Jiachen Zhuo,
Aaron Carass,
Jerry L. Prince
Abstract:
Magnetic Resonance Imaging with tagging (tMRI) has long been utilized for quantifying tissue motion and strain during deformation. However, a phenomenon known as tag fading, a gradual decrease in tag visibility over time, often complicates post-processing. The first contribution of this study is to model tag fading by considering the interplay between $T_1$ relaxation and the repeated application…
▽ More
Magnetic Resonance Imaging with tagging (tMRI) has long been utilized for quantifying tissue motion and strain during deformation. However, a phenomenon known as tag fading, a gradual decrease in tag visibility over time, often complicates post-processing. The first contribution of this study is to model tag fading by considering the interplay between $T_1$ relaxation and the repeated application of radio frequency (RF) pulses during serial imaging sequences. This is a factor that has been overlooked in prior research on tMRI post-processing. Further, we have observed an emerging trend of utilizing raw tagged MRI within a deep learning-based (DL) registration framework for motion estimation. In this work, we evaluate and analyze the impact of commonly used image similarity objectives in training DL registrations on raw tMRI. This is then compared with the Harmonic Phase-based approach, a traditional approach which is claimed to be robust to tag fading. Our findings, derived from both simulated images and an actual phantom scan, reveal the limitations of various similarity losses in raw tMRI and emphasize caution in registration tasks where image intensity changes over time.
△ Less
Submitted 30 January, 2024;
originally announced January 2024.
-
Automatic Tuning of Denoising Algorithms Parameters Without Ground Truth
Authors:
Arthur Floquet,
Sayantan Dutta,
Emmanuel Soubies,
Duong Hung Pham,
Denis Kouame
Abstract:
Denoising is omnipresent in image processing. It is usually addressed with algorithms relying on a set of hyperparameters that control the quality of the recovered image. Manual tuning of those parameters can be a daunting task, which calls for the development of automatic tuning methods. Given a denoising algorithm, the best set of parameters is the one that minimizes the error between denoised a…
▽ More
Denoising is omnipresent in image processing. It is usually addressed with algorithms relying on a set of hyperparameters that control the quality of the recovered image. Manual tuning of those parameters can be a daunting task, which calls for the development of automatic tuning methods. Given a denoising algorithm, the best set of parameters is the one that minimizes the error between denoised and ground-truth images. Clearly, this ideal approach is unrealistic, as the ground-truth images are unknown in practice. In this work, we propose unsupervised cost functions -- i.e., that only require the noisy image -- that allow us to reach this ideal gold standard performance. Specifically, the proposed approach makes it possible to obtain an average PSNR output within less than 1% of the best achievable PSNR.
△ Less
Submitted 18 January, 2024;
originally announced January 2024.
-
Towards an accurate and generalizable multiple sclerosis lesion segmentation model using self-ensembled lesion fusion
Authors:
Jinwei Zhang,
Lianrui Zuo,
Blake E. Dewey,
Samuel W. Remedios,
Dzung L. Pham,
Aaron Carass,
Jerry L. Prince
Abstract:
Automatic multiple sclerosis (MS) lesion segmentation using multi-contrast magnetic resonance (MR) images provides improved efficiency and reproducibility compared to manual delineation. Current state-of-the-art automatic MS lesion segmentation methods utilize modified U-Net-like architectures. However, in the literature, dedicated architecture modifications were always required to maximize their…
▽ More
Automatic multiple sclerosis (MS) lesion segmentation using multi-contrast magnetic resonance (MR) images provides improved efficiency and reproducibility compared to manual delineation. Current state-of-the-art automatic MS lesion segmentation methods utilize modified U-Net-like architectures. However, in the literature, dedicated architecture modifications were always required to maximize their performance. In addition, the best-performing methods have not proven to be generalizable to diverse test datasets with contrast variations and image artifacts. In this work, we developed an accurate and generalizable MS lesion segmentation model using the well-known U-Net architecture without further modification. A novel test-time self-ensembled lesion fusion strategy is proposed that not only achieved the best performance using the ISBI 2015 MS segmentation challenge data but also demonstrated robustness across various self-ensemble parameter choices. Moreover, equipped with instance normalization rather than batch normalization widely used in literature, the model trained on the ISBI challenge data generalized well on clinical test datasets from different scanners.
△ Less
Submitted 3 December, 2023;
originally announced December 2023.
-
Harmonization-enriched domain adaptation with light fine-tuning for multiple sclerosis lesion segmentation
Authors:
Jinwei Zhang,
Lianrui Zuo,
Blake E. Dewey,
Samuel W. Remedios,
Savannah P. Hays,
Dzung L. Pham,
Jerry L. Prince,
Aaron Carass
Abstract:
Deep learning algorithms utilizing magnetic resonance (MR) images have demonstrated cutting-edge proficiency in autonomously segmenting multiple sclerosis (MS) lesions. Despite their achievements, these algorithms may struggle to extend their performance across various sites or scanners, leading to domain generalization errors. While few-shot or one-shot domain adaptation emerges as a potential so…
▽ More
Deep learning algorithms utilizing magnetic resonance (MR) images have demonstrated cutting-edge proficiency in autonomously segmenting multiple sclerosis (MS) lesions. Despite their achievements, these algorithms may struggle to extend their performance across various sites or scanners, leading to domain generalization errors. While few-shot or one-shot domain adaptation emerges as a potential solution to mitigate generalization errors, its efficacy might be hindered by the scarcity of labeled data in the target domain. This paper seeks to tackle this challenge by integrating one-shot adaptation data with harmonized training data that incorporates labels. Our approach involves synthesizing new training data with a contrast akin to that of the test domain, a process we refer to as "contrast harmonization" in MRI. Our experiments illustrate that the amalgamation of one-shot adaptation data with harmonized training data surpasses the performance of utilizing either data source in isolation. Notably, domain adaptation using exclusively harmonized training data achieved comparable or even superior performance compared to one-shot adaptation. Moreover, all adaptations required only minimal fine-tuning, ranging from 2 to 5 epochs for convergence.
△ Less
Submitted 31 October, 2023;
originally announced October 2023.
-
Scalable, low-cost, and versatile system design for air pollution and traffic density monitoring and analysis
Authors:
Thinh Gia Tran,
Dat Thanh Vo,
Long Chau Tran,
Hoang Viet Pham,
Chuong Dinh Le,
An Dinh Le,
Duy Anh Pham,
Hien Bich Vo
Abstract:
Vietnam requires a sustainable urbanization, for which city sensing is used in planning and de-cision-making. Large cities need portable, scalable, and inexpensive digital technology for this purpose. End-to-end air quality monitoring companies such as AirVisual and Plume Air have shown their reliability with portable devices outfitted with superior air sensors. They are pricey, yet homeowners use…
▽ More
Vietnam requires a sustainable urbanization, for which city sensing is used in planning and de-cision-making. Large cities need portable, scalable, and inexpensive digital technology for this purpose. End-to-end air quality monitoring companies such as AirVisual and Plume Air have shown their reliability with portable devices outfitted with superior air sensors. They are pricey, yet homeowners use them to get local air data without evaluating the causal effect. Our air quality inspection system is scalable, reasonably priced, and flexible. Minicomputer of the sys-tem remotely monitors PMS7003 and BME280 sensor data through a microcontroller processor. The 5-megapixel camera module enables researchers to infer the causal relationship between traffic intensity and dust concentration. The design enables inexpensive, commercial-grade hardware, with Azure Blob storing air pollution data and surrounding-area imagery and pre-venting the system from physically expanding. In addition, by including an air channel that re-plenishes and distributes temperature, the design improves ventilation and safeguards electrical components. The gadget allows for the analysis of the correlation between traffic and air quali-ty data, which might aid in the establishment of sustainable urban development plans and poli-cies.
△ Less
Submitted 8 December, 2022;
originally announced December 2022.
-
Interpretable HER2 scoring by evaluating clinical Guidelines through a weakly supervised, constrained Deep Learning Approach
Authors:
Manh Dan Pham,
Cyprien Tilmant,
Stéphanie Petit,
Isabelle Salmon,
Saima Ben Hadj,
Rutger H. J. Fick
Abstract:
The evaluation of the Human Epidermal growth factor Receptor-2 (HER2) expression is an important prognostic biomarker for breast cancer treatment selection. However, HER2 scoring has notoriously high interobserver variability due to stain variations between centers and the need to estimate visually the staining intensity in specific percentages of tumor area. In this paper, focusing on the interpr…
▽ More
The evaluation of the Human Epidermal growth factor Receptor-2 (HER2) expression is an important prognostic biomarker for breast cancer treatment selection. However, HER2 scoring has notoriously high interobserver variability due to stain variations between centers and the need to estimate visually the staining intensity in specific percentages of tumor area. In this paper, focusing on the interpretability of HER2 scoring by a pathologist, we propose a semi-automatic, two-stage deep learning approach that directly evaluates the clinical HER2 guidelines defined by the American Society of Clinical Oncology/ College of American Pathologists (ASCO/CAP). In the first stage, we segment the invasive tumor over the user-indicated Region of Interest (ROI). Then, in the second stage, we classify the tumor tissue into four HER2 classes. For the classification stage, we use weakly supervised, constrained optimization to find a model that classifies cancerous patches such that the tumor surface percentage meets the guidelines specification of each HER2 class. We end the second stage by freezing the model and refining its output logits in a supervised way to all slide labels in the training set. To ensure the quality of our dataset's labels, we conducted a multi-pathologist HER2 scoring consensus. For the assessment of doubtful cases where no consensus was found, our model can help by interpreting its HER2 class percentages output. We achieve a performance of 0.78 in F1-score on the test set while keeping our model interpretable for the pathologist, hopefully contributing to interpretable AI models in digital pathology.
△ Less
Submitted 17 November, 2022;
originally announced November 2022.
-
Deep filter bank regression for super-resolution of anisotropic MR brain images
Authors:
Samuel W. Remedios,
Shuo Han,
Yuan Xue,
Aaron Carass,
Trac D. Tran,
Dzung L. Pham,
Jerry L. Prince
Abstract:
In 2D multi-slice magnetic resonance (MR) acquisition, the through-plane signals are typically of lower resolution than the in-plane signals. While contemporary super-resolution (SR) methods aim to recover the underlying high-resolution volume, the estimated high-frequency information is implicit via end-to-end data-driven training rather than being explicitly stated and sought. To address this, w…
▽ More
In 2D multi-slice magnetic resonance (MR) acquisition, the through-plane signals are typically of lower resolution than the in-plane signals. While contemporary super-resolution (SR) methods aim to recover the underlying high-resolution volume, the estimated high-frequency information is implicit via end-to-end data-driven training rather than being explicitly stated and sought. To address this, we reframe the SR problem statement in terms of perfect reconstruction filter banks, enabling us to identify and directly estimate the missing information. In this work, we propose a two-stage approach to approximate the completion of a perfect reconstruction filter bank corresponding to the anisotropic acquisition of a particular scan. In stage 1, we estimate the missing filters using gradient descent and in stage 2, we use deep networks to learn the mapping from coarse coefficients to detail coefficients. In addition, the proposed formulation does not rely on external training data, circumventing the need for domain shift correction. Under our approach, SR performance is improved particularly in "slice gap" scenarios, likely due to the constrained solution space imposed by the framework.
△ Less
Submitted 6 September, 2022;
originally announced September 2022.
-
Behavior Analysis and Design of Concrete-Filled Steel Circular-Tube Short Columns Subjected to Axial Compression
Authors:
Duc-Duy Pham,
Phu-Cuong Nguyen
Abstract:
In this paper, a new finite element (FE) model using ABAQUS software was developed to investigate the compressive behavior of Concrete-Filled Steel Circular-Tube (CFSCT) columns. Experimental studies indicated that the confinement offered by the circular steel tube in a CFSCT column increased both the strength and ductility of the filled concrete. Base on the database of 663 test results CFSCT col…
▽ More
In this paper, a new finite element (FE) model using ABAQUS software was developed to investigate the compressive behavior of Concrete-Filled Steel Circular-Tube (CFSCT) columns. Experimental studies indicated that the confinement offered by the circular steel tube in a CFSCT column increased both the strength and ductility of the filled concrete. Base on the database of 663 test results CFSCT columns under axial compression are collected from the available literature, a formula to determine the lateral confining pressures on concrete. Concrete-Damaged Plasticity Model (CDPM) and parameters are available in ABAQUS are used in the analysis. From results analysis, a proposed formula for predicting ultimate load by determining intensification and diminution for concrete and steel. The proposed formula is then compared with the FE model, the previous study, and the design code current in strength prediction of CFSCT columns under compression. The comparative result shows that the FE model, the proposed formula is more stable and accurate than the previous study and current standards when using material normal or high strength.
△ Less
Submitted 14 July, 2021;
originally announced July 2021.
-
Contrast Adaptive Tissue Classification by Alternating Segmentation and Synthesis
Authors:
Dzung L. Pham,
Yi-Yu Chou,
Blake E. Dewey,
Daniel S. Reich,
John A. Butman,
Snehashis Roy
Abstract:
Deep learning approaches to the segmentation of magnetic resonance images have shown significant promise in automating the quantitative analysis of brain images. However, a continuing challenge has been its sensitivity to the variability of acquisition protocols. Attempting to segment images that have different contrast properties from those within the training data generally leads to significantl…
▽ More
Deep learning approaches to the segmentation of magnetic resonance images have shown significant promise in automating the quantitative analysis of brain images. However, a continuing challenge has been its sensitivity to the variability of acquisition protocols. Attempting to segment images that have different contrast properties from those within the training data generally leads to significantly reduced performance. Furthermore, heterogeneous data sets cannot be easily evaluated because the quantitative variation due to acquisition differences often dwarfs the variation due to the biological differences that one seeks to measure. In this work, we describe an approach using alternating segmentation and synthesis steps that adapts the contrast properties of the training data to the input image. This allows input images that do not resemble the training data to be more consistently segmented. A notable advantage of this approach is that only a single example of the acquisition protocol is required to adapt to its contrast properties. We demonstrate the efficacy of our approaching using brain images from a set of human subjects scanned with two different T1-weighted volumetric protocols.
△ Less
Submitted 3 March, 2021;
originally announced March 2021.
-
A grid-point detection method based on U-net for a structured light system
Authors:
Dieuthuy Pham,
Minhtuan Ha,
Changyan Xiao
Abstract:
Accurate detection of the feature points of the projected pattern plays an extremely important role in one-shot 3D reconstruction systems, especially for the ones using a grid pattern. To solve this problem, this paper proposes a grid-point detection method based on U-net. A specific dataset is designed that includes the images captured with the two-shot imaging method and the ones acquired with t…
▽ More
Accurate detection of the feature points of the projected pattern plays an extremely important role in one-shot 3D reconstruction systems, especially for the ones using a grid pattern. To solve this problem, this paper proposes a grid-point detection method based on U-net. A specific dataset is designed that includes the images captured with the two-shot imaging method and the ones acquired with the one-shot imaging method. Among them, the images in the first group after labeled as the ground truth images and the images captured at the same pose with the one-shot method are cut into small patches with the size of 64x64 pixels then feed to the training set. The remaining of the images in the second group is the test set. The experimental results show that our method can achieve a better detecting performance with higher accuracy in comparison with the previous methods.
△ Less
Submitted 5 December, 2020;
originally announced December 2020.
-
Fast High Resolution Blood Flow Estimation and Clutter Rejection via an Alternating Optimization Problem
Authors:
Duong-Hung Pham,
Adrian Basarab,
Jean-Pierre Remenieras,
Paul Rodríguez,
Denis Kouamé
Abstract:
This paper introduces a computationally efficient technique for estimating high-resolution Doppler blood flow from an ultrafast ultrasound image sequence. More precisely, it consists in a new fast alternating minimization algorithm that implements a blind deconvolution method based on robust principal component analysis. Numerical investigation carried out on \textit{in vivo} data shows the effici…
▽ More
This paper introduces a computationally efficient technique for estimating high-resolution Doppler blood flow from an ultrafast ultrasound image sequence. More precisely, it consists in a new fast alternating minimization algorithm that implements a blind deconvolution method based on robust principal component analysis. Numerical investigation carried out on \textit{in vivo} data shows the efficiency of the proposed approach in comparison with state-of-the-art methods.
△ Less
Submitted 3 November, 2020;
originally announced November 2020.
-
A Novel Fast 3D Single Image Super-Resolution Algorithm
Authors:
Nwigbo Kenule Tuador,
Duong Hung Pham,
Jérôme Michetti,
Adrian Basarab,
Denis Kouamé
Abstract:
This paper introduces a novel computationally efficient method of solving the 3D single image super-resolution (SR) problem, i.e., reconstruction of a high-resolution volume from its low-resolution counterpart. The main contribution lies in the original way of handling simultaneously the associated decimation and blurring operators, based on their underlying properties in the frequency domain. In…
▽ More
This paper introduces a novel computationally efficient method of solving the 3D single image super-resolution (SR) problem, i.e., reconstruction of a high-resolution volume from its low-resolution counterpart. The main contribution lies in the original way of handling simultaneously the associated decimation and blurring operators, based on their underlying properties in the frequency domain. In particular, the proposed decomposition technique of the 3D decimation operator allows a straightforward implementation for Tikhonov regularization, and can be further used to take into consideration other regularization functions such as the total variation, enabling the computational cost of state-of-the-art algorithms to be considerably decreased. Numerical experiments carried out showed that the proposed approach outperforms existing 3D SR methods.
△ Less
Submitted 29 October, 2020;
originally announced October 2020.
-
Joint Blind Deconvolution and Robust Principal Component Analysis for Blood Flow Estimation in Medical Ultrasound Imaging
Authors:
Duong-Hung Pham,
Adrian Basarab,
Ilyess Zemmoura,
Jean-Pierre Remenieras,
Denis Kouame
Abstract:
This paper addresses the problem of high-resolution Doppler blood flow estimation from an ultrafast sequence of ultrasound images. Formulating the separation of clutter and blood components as an inverse problem has been shown in the literature to be a good alternative to spatio-temporal singular value decomposition (SVD)-based clutter filtering. In particular, a deconvolution step has recently be…
▽ More
This paper addresses the problem of high-resolution Doppler blood flow estimation from an ultrafast sequence of ultrasound images. Formulating the separation of clutter and blood components as an inverse problem has been shown in the literature to be a good alternative to spatio-temporal singular value decomposition (SVD)-based clutter filtering. In particular, a deconvolution step has recently been embedded in such a problem to mitigate the influence of the experimentally measured point spread function (PSF) of the imaging system. Deconvolution was shown in this context to improve the accuracy of the blood flow reconstruction. However, measuring the PSF requires non-trivial experimental setups. To overcome this limitation, we propose herein a blind deconvolution method able to estimate both the blood component and the PSF from Doppler data. Numerical experiments conducted on simulated and in vivo data demonstrate qualitatively and quantitatively the effectiveness of the proposed approach in comparison with the previous method based on experimentally measured PSF and two other state-of-the-art approaches.
△ Less
Submitted 10 July, 2020;
originally announced July 2020.
-
Recognition of 26 Degrees of Freedom of Hands Using Model-based approach and Depth-Color Images
Authors:
Cong Hoang Quach,
Minh Trien Pham,
Anh Viet Dang,
Dinh Tuan Pham,
Thuan Hoang Tran,
Manh Duong Phung
Abstract:
In this study, we present an model-based approach to recognize full 26 degrees of freedom of a human hand. Input data include RGB-D images acquired from a Kinect camera and a 3D model of the hand constructed from its anatomy and graphical matrices. A cost function is then defined so that its minimum value is achieved when the model and observation images are matched. To solve the optimization prob…
▽ More
In this study, we present an model-based approach to recognize full 26 degrees of freedom of a human hand. Input data include RGB-D images acquired from a Kinect camera and a 3D model of the hand constructed from its anatomy and graphical matrices. A cost function is then defined so that its minimum value is achieved when the model and observation images are matched. To solve the optimization problem in 26 dimensional space, the particle swarm optimization algorimth with improvements are used. In addition, parallel computation in graphical processing units (GPU) is utilized to handle computationally expensive tasks. Simulation and experimental results show that the system can recognize 26 degrees of freedom of hands with the processing time of 0.8 seconds per frame. The algorithm is robust to noise and the hardware requirement is simple with a single camera.
△ Less
Submitted 13 May, 2020;
originally announced May 2020.
-
High-Order Synchrosqueezing Transform for Multicomponent Signals Analysis -- With an Application to Gravitational-Wave Signal
Authors:
Duong-Hung Pham,
Sylvain Meignen
Abstract:
This study puts forward a generalization of the short-time Fourier-based Synchrosqueezing Transform using a new local estimate of instantaneous frequency. Such a technique enables not only to achieve a highly concentrated time-frequency representation for a wide variety of AM-FM multicomponent signals but also to reconstruct their modes with a high accuracy. Numerical investigation on synthetic an…
▽ More
This study puts forward a generalization of the short-time Fourier-based Synchrosqueezing Transform using a new local estimate of instantaneous frequency. Such a technique enables not only to achieve a highly concentrated time-frequency representation for a wide variety of AM-FM multicomponent signals but also to reconstruct their modes with a high accuracy. Numerical investigation on synthetic and gravitational-wave signals shows the efficiency of this new approach.
△ Less
Submitted 20 April, 2020;
originally announced April 2020.
-
Near real-time map building with multi-class image set labelling and classification of road conditions using convolutional neural networks
Authors:
Sheela Ramanna,
Cenker Sengoz,
Scott Kehler,
Dat Pham
Abstract:
Weather is an important factor affecting transportation and road safety. In this paper, we leverage state-of-the-art convolutional neural networks in labelling images taken by street and highway cameras located across across North America. Road camera snapshots were used in experiments with multiple deep learning frameworks to classify images by road condition. The training data for these experime…
▽ More
Weather is an important factor affecting transportation and road safety. In this paper, we leverage state-of-the-art convolutional neural networks in labelling images taken by street and highway cameras located across across North America. Road camera snapshots were used in experiments with multiple deep learning frameworks to classify images by road condition. The training data for these experiments used images labelled as dry, wet, snow/ice, poor, and offline. The experiments tested different configurations of six convolutional neural networks (VGG-16, ResNet50, Xception, InceptionResNetV2, EfficientNet-B0 and EfficientNet-B4) to assess their suitability to this problem. The precision, accuracy, and recall were measured for each framework configuration. In addition, the training sets were varied both in overall size and by size of individual classes. The final training set included 47,000 images labelled using the five aforementioned classes. The EfficientNet-B4 framework was found to be most suitable to this problem, achieving validation accuracy of 90.6%, although EfficientNet-B0 achieved an accuracy of 90.3% with half the execution time. It was observed that VGG-16 with transfer learning proved to be very useful for data acquisition and pseudo-labelling with limited hardware resources, throughout this project. The EfficientNet-B4 framework was then placed into a real-time production environment, where images could be classified in real-time on an ongoing basis. The classified images were then used to construct a map showing real-time road conditions at various camera locations across North America. The choice of these frameworks and our analysis take into account unique requirements of real-time map building functions. A detailed analysis of the process of semi-automated dataset labelling using these frameworks is also presented in this paper.
△ Less
Submitted 27 January, 2020;
originally announced January 2020.
-
CHAOS Challenge -- Combined (CT-MR) Healthy Abdominal Organ Segmentation
Authors:
A. Emre Kavur,
N. Sinem Gezer,
Mustafa Barış,
Sinem Aslan,
Pierre-Henri Conze,
Vladimir Groza,
Duc Duy Pham,
Soumick Chatterjee,
Philipp Ernst,
Savaş Özkan,
Bora Baydar,
Dmitry Lachinov,
Shuo Han,
Josef Pauli,
Fabian Isensee,
Matthias Perkonigg,
Rachana Sathish,
Ronnie Rajan,
Debdoot Sheet,
Gurbandurdy Dovletov,
Oliver Speck,
Andreas Nürnberger,
Klaus H. Maier-Hein,
Gözde Bozdağı Akar,
Gözde Ünal
, et al. (2 additional authors not shown)
Abstract:
Segmentation of abdominal organs has been a comprehensive, yet unresolved, research field for many years. In the last decade, intensive developments in deep learning (DL) have introduced new state-of-the-art segmentation systems. In order to expand the knowledge on these topics, the CHAOS - Combined (CT-MR) Healthy Abdominal Organ Segmentation challenge has been organized in conjunction with IEEE…
▽ More
Segmentation of abdominal organs has been a comprehensive, yet unresolved, research field for many years. In the last decade, intensive developments in deep learning (DL) have introduced new state-of-the-art segmentation systems. In order to expand the knowledge on these topics, the CHAOS - Combined (CT-MR) Healthy Abdominal Organ Segmentation challenge has been organized in conjunction with IEEE International Symposium on Biomedical Imaging (ISBI), 2019, in Venice, Italy. CHAOS provides both abdominal CT and MR data from healthy subjects for single and multiple abdominal organ segmentation. Five different but complementary tasks have been designed to analyze the capabilities of current approaches from multiple perspectives. The results are investigated thoroughly, compared with manual annotations and interactive methods. The analysis shows that the performance of DL models for single modality (CT / MR) can show reliable volumetric analysis performance (DICE: 0.98 $\pm$ 0.00 / 0.95 $\pm$ 0.01) but the best MSSD performance remain limited (21.89 $\pm$ 13.94 / 20.85 $\pm$ 10.63 mm). The performances of participating models decrease significantly for cross-modality tasks for the liver (DICE: 0.88 $\pm$ 0.15 MSSD: 36.33 $\pm$ 21.97 mm) and all organs (DICE: 0.85 $\pm$ 0.21 MSSD: 33.17 $\pm$ 38.93 mm). Despite contrary examples on different applications, multi-tasking DL models designed to segment all organs seem to perform worse compared to organ-specific ones (performance drop around 5\%). Besides, such directions of further research for cross-modality segmentation would significantly support real-world clinical applications. Moreover, having more than 1500 participants, another important contribution of the paper is the analysis on shortcomings of challenge organizations such as the effects of multiple submissions and peeking phenomena.
△ Less
Submitted 7 January, 2021; v1 submitted 17 January, 2020;
originally announced January 2020.
-
Adaptive neural network based dynamic surface control for uncertain dual arm robots
Authors:
Dung Tien Pham,
Thai Van Nguyen,
Hai Xuan Le,
Linh Nguyen,
Nguyen Huu Thai,
Tuan Anh Phan,
Hai Tuan Pham,
Anh Hoai Duong
Abstract:
The paper discusses an adaptive strategy to effectively control nonlinear manipulation motions of a dual arm robot (DAR) under system uncertainties including parameter variations, actuator nonlinearities and external disturbances. It is proposed that the control scheme is first derived from the dynamic surface control (DSC) method, which allows the robot's end-effectors to robustly track the desir…
▽ More
The paper discusses an adaptive strategy to effectively control nonlinear manipulation motions of a dual arm robot (DAR) under system uncertainties including parameter variations, actuator nonlinearities and external disturbances. It is proposed that the control scheme is first derived from the dynamic surface control (DSC) method, which allows the robot's end-effectors to robustly track the desired trajectories. Moreover, since exactly determining the DAR system's dynamics is impractical due to the system uncertainties, the uncertain system parameters are then proposed to be adaptively estimated by the use of the radial basis function network (RBFN). The adaptation mechanism is derived from the Lyapunov theory, which theoretically guarantees stability of the closed-loop control system. The effectiveness of the proposed RBFN-DSC approach is demonstrated by implementing the algorithm in a synthetic environment with realistic parameters, where the obtained results are highly promising.
△ Less
Submitted 8 May, 2019;
originally announced May 2019.
-
On Identification of Sparse Multivariable ARX Model: A Sparse Bayesian Learning Approach
Authors:
J. Jin,
Y. Yuan,
W. Pan,
D. L. T. Pham,
C. J. Tomlin,
A. Webb,
J. Goncalves
Abstract:
This paper begins with considering the identification of sparse linear time-invariant networks described by multivariable ARX models. Such models possess relatively simple structure thus used as a benchmark to promote further research. With identifiability of the network guaranteed, this paper presents an identification method that infers both the Boolean structure of the network and the internal…
▽ More
This paper begins with considering the identification of sparse linear time-invariant networks described by multivariable ARX models. Such models possess relatively simple structure thus used as a benchmark to promote further research. With identifiability of the network guaranteed, this paper presents an identification method that infers both the Boolean structure of the network and the internal dynamics between nodes. Identification is performed directly from data without any prior knowledge of the system, including its order. The proposed method solves the identification problem using Maximum a posteriori estimation (MAP) but with inseparable penalties for complexity, both in terms of element (order of nonzero connections) and group sparsity (network topology). Such an approach is widely applied in Compressive Sensing (CS) and known as Sparse Bayesian Learning (SBL). We then propose a novel scheme that combines sparse Bayesian and group sparse Bayesian to efficiently solve the problem. The resulted algorithm has a similar form of the standard Sparse Group Lasso (SGL) while with known noise variance, it simplifies to exact re-weighted SGL. The method and the developed toolbox can be applied to infer networks from a wide range of fields, including systems biology applications such as signaling and genetic regulatory networks.
△ Less
Submitted 30 September, 2016;
originally announced September 2016.