Search | arXiv e-print repository

DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover's Distance Improves Out-Of-Distribution Face Identification

Abstract: Face identification (FI) is ubiquitous and drives many high-stake decisions made by law enforcement. State-of-the-art FI approaches compare two images by taking the cosine similarity between their image embeddings. Yet, such an approach suffers from poor out-of-distribution (OOD) generalization to new types of images (e.g., when a query face is masked, cropped, or rotated) not included in the training set or the gallery. Here, we propose a re-ranking approach that compares two faces using the Earth Mover's Distance on the deep, spatial features of image patches. Our extra comparison stage explicitly examines image similarity at a fine-grained level (e.g., eyes to eyes) and is more robust to OOD perturbations and occlusions than traditional FI. Interestingly, without finetuning feature extractors, our method consistently improves the accuracy on all tested OOD queries: masked, cropped, rotated, and adversarial while obtaining similar results on in-distribution images.

Submitted 25 March, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

Comments: CVPR 2022
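
The two-stage idea in the DeepFace-EMD abstract above (a cosine-similarity shortlist followed by a patch-wise Earth Mover's Distance re-ranking) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes uniform patch weights and equal patch counts, in which case the optimal transport plan is a one-to-one matching, and it brute-forces that matching, which is only feasible for a handful of patches. The function names (`emd_uniform`, `rerank`), the mean-pooled stage-1 embedding, and the toy data are all hypothetical.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mean_vec(patches):
    """Pool patch features into one global embedding (assumed stage-1 feature)."""
    n = len(patches)
    return [sum(p[i] for p in patches) / n for i in range(len(patches[0]))]


def emd_uniform(patches_a, patches_b):
    """Exact EMD for equal-size patch sets with uniform weights.

    Ground cost is 1 - cosine similarity. With uniform weights the optimal
    plan is a permutation, so we search all permutations (toy sizes only).
    """
    n = len(patches_a)
    cost = [[1.0 - cosine(a, b) for b in patches_b] for a in patches_a]
    best = min(sum(cost[i][perm[i]] for i in range(n)) for perm in permutations(range(n)))
    return best / n


def rerank(query, gallery, k=2):
    """Stage 1: top-k by pooled cosine similarity. Stage 2: sort those by EMD."""
    q = mean_vec(query)
    shortlist = sorted(gallery, key=lambda name: cosine(q, mean_vec(gallery[name])), reverse=True)[:k]
    return sorted(shortlist, key=lambda name: emd_uniform(query, gallery[name]))


# Toy data: A holds the same patches as the query in a different order,
# B has the same pooled direction but different patches, C differs globally.
query = [[1.0, 0.0], [0.0, 1.0]]
gallery = {
    "A": [[0.0, 1.0], [1.0, 0.0]],
    "B": [[1.0, 1.0], [1.0, 1.0]],
    "C": [[1.0, 0.0], [1.0, 0.0]],
}
print(rerank(query, gallery, k=2))  # ['A', 'B']
```

In this toy case the pooled embeddings of A and B are indistinguishable to stage 1, and only the patch-level comparison separates them.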

arXiv:2112.02592 [pdf]

A New Rapid Test to End COVID

Authors: Mark Ralph Baker, Fleur Conway, Filippo Dal Ben, Elizabeth Lucinda Hawthorne, Licia Iacoviello, A Agodi, Saquib Mukhtar, Hai Phan, Yemurai Rabvukwa, Jessica R Rogge

Abstract: Despite 93.1% to 95.8% of the UK adult population having been vaccinated and currently 83.5% to 89.8% of adults having received at least two doses (1), and despite many households testing twice a week with lateral flow tests (2), R at the time of writing is 0.9 to 1.1, with a growth rate range for England of between -1% and +1% (3). Furthermore, up to 30% of infected individuals are going on to experience Long Covid (4). The crisis is far from over and as new variants of concern like Omicron spread, the situation is not under control, even in the highly vaccinated and tested UK and far less so in many countries. The problem is likely to be replicated in other countries with currently low infection levels as isolation is eased in future, even if these countries reach a high level of vaccination. Additionally, concerns have been raised about fall in immunity by 6 months after receiving the Pfizer and AstraZeneca vaccines (5), and the ability of Omicron to re-infect and cause illness in vaccinated people, with urgent booster jabs now being given to attempt to mitigate this. A solution to stop the spread of all variants of COVID-19 is needed now, and we present it here: CLDC, a rapid test that is 98%+ sensitive, low cost and scalable.

Submitted 5 December, 2021; originally announced December 2021.

Comments: 3 pages

arXiv:2111.08446 [pdf, ps, other]

doi 10.1088/1361-6579/ac6049

Automatic Sleep Staging of EEG Signals: Recent Development, Challenges, and Future Directions

Authors: Huy Phan, Kaare Mikkelsen

Abstract: Modern deep learning holds a great potential to transform clinical practice on human sleep. Teaching a machine to carry out routine tasks would be a tremendous reduction in workload for clinicians. Sleep staging, a fundamental step in sleep practice, is a suitable task for this and will be the focus in this article. Recently, automatic sleep staging systems have been trained to mimic manual scoring, leading to similar performance to human sleep experts, at least on scoring of healthy subjects. Despite tremendous progress, we have not seen automatic sleep scoring adopted widely in clinical environments. This review aims to give a shared view of the authors on the most recent state-of-the-art development in automatic sleep staging, the challenges that still need to be addressed, and the future directions for automatic sleep scoring to achieve clinical value.

Submitted 9 May, 2022; v1 submitted 3 November, 2021; originally announced November 2021.

Comments: This article has been published in the Physiological Measurement journal

arXiv:2111.08192 [pdf, other]

doi 10.1109/ICASSP43922.2022.9746132

SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays

Authors: Thi Ngoc Tho Nguyen, Douglas L. Jones, Karn N. Watcharasupat, Huy Phan, Woon-Seng Gan

Abstract: Polyphonic sound event localization and detection (SELD) has many practical applications in acoustic sensing and monitoring. However, the development of real-time SELD has been limited by the demanding computational requirement of most recent SELD systems. In this work, we introduce SALSA-Lite, a fast and effective feature for polyphonic SELD using microphone array inputs. SALSA-Lite is a lightweight variation of a previously proposed SALSA feature for polyphonic SELD. SALSA, which stands for Spatial Cue-Augmented Log-Spectrogram, consists of multichannel log-spectrograms stacked channelwise with the normalized principal eigenvectors of the spectrotemporally corresponding spatial covariance matrices. In contrast to SALSA, which uses eigenvector-based spatial features, SALSA-Lite uses normalized inter-channel phase differences as spatial features, allowing a 30-fold speedup compared to the original SALSA feature. Experimental results on the TAU-NIGENS Spatial Sound Events 2021 dataset showed that the SALSA-Lite feature achieved competitive performance compared to the full SALSA feature, and significantly outperformed the traditional feature set of multichannel log-mel spectrograms with generalized cross-correlation spectra. Specifically, using SALSA-Lite features increased localization-dependent F1 score and class-dependent localization recall by 15% and 5%, respectively, compared to using multichannel log-mel spectrograms with generalized cross-correlation spectra.

Submitted 4 May, 2022; v1 submitted 15 November, 2021; originally announced November 2021.

Comments: arXiv admin note: text overlap with arXiv:2110.00275

Journal ref: Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 716-720
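
The "normalized inter-channel phase differences" mentioned in the SALSA-Lite abstract above can be illustrated for a single time-frequency bin. The sketch below assumes one plausible normalization: the phase difference between a reference microphone and another microphone is scaled by -c / (2πf), which converts it into a path-length difference in meters that no longer depends on frequency. The abstract does not state the exact formula, so the scaling, the speed of sound value, and the function name `nipd` are assumptions for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def nipd(x_ref, x_other, freq_hz, c=SPEED_OF_SOUND):
    """Normalized inter-channel phase difference for one STFT bin.

    x_ref and x_other are complex STFT coefficients of the reference and
    the second microphone at the same time frame and frequency.
    The result is only unambiguous while the true phase difference stays
    within (-pi, pi], i.e. below the array's spatial aliasing frequency.
    """
    phase_diff = cmath.phase(x_ref.conjugate() * x_other)
    return -c * phase_diff / (2.0 * math.pi * freq_hz)


# Toy check: the second microphone receives the same 1 kHz tone 0.1 ms later.
freq = 1000.0
delay = 1e-4
x1 = complex(1.0, 0.0)
x2 = cmath.exp(-2j * math.pi * freq * delay)
print(nipd(x1, x2, freq))  # close to c * delay, the extra path length in meters
```

Because the scaled value depends on the delay rather than on the frequency, bins dominated by the same source share roughly the same feature value across the spectrum.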

arXiv:2111.07708 [pdf, ps, other]

doi 10.1140/epjc/s10052-022-10225-z

General one-loop formulas for $H\rightarrow f\bar{f}γ$ and its applications

Authors: Vo Van On, Dzung Tri Tran, Chi Linh Nguyen, Khiem Hong Phan

Abstract: We present general one-loop contributions to the decay processes $H\rightarrow f\bar{f}γ$, including all possible exchanges of additional heavy vector gauge bosons, heavy fermions, and charged (also neutral) scalar particles in the loop diagrams. As a result, the analytic results are valid in a wide class of models beyond the standard model. Analytic formulas for the form factors are expressed in terms of Passarino-Veltman functions in the standard notations of LoopTools. Hence, the decay rates can be computed numerically by using this package. The computations are then applied to the cases of the standard model, the $U(1)_{B-L}$ extension of the standard model, and the two Higgs doublet model. Phenomenological results of the decay processes for all the above models are studied. We observe that the effects of new physics make sizable contributions and that these can be probed at future colliders.

Submitted 15 November, 2021; originally announced November 2021.

Comments: 32 pages

Report number: DTU-2021-06

arXiv:2111.07698 [pdf, ps, other]

doi 10.1093/ptep/ptac012

One-loop contributions to the decay $H\rightarrow ν_l\barν_lγ$ in standard model revisited

Authors: Dzung Tri Tran, Khiem Hong Phan

Abstract: One-loop contributions to the decay $H\rightarrow ν_l\barν_lγ$ with $l=e, μ, τ$ within the standard model framework are revisited in this paper. We derive two representations for the form factors in this calculation. As a result, the computations are checked numerically not only by verifying the ultraviolet finiteness of the results but also by confirming the Ward identity of the amplitude. We find that the results are stable under varying ultraviolet cutoff parameters and satisfy the Ward identity. In the phenomenological results, all physical results are examined with the present input parameters. In particular, we study the partial decay widths for the decay channels in the cases of both a detected photon and an invisible photon. Differential decay widths are also generated as a function of the energy of the final photon.

Submitted 15 November, 2021; originally announced November 2021.

Comments: Talk given at the 46th Vietnam Conference on Theoretical Physics (VCTP-46)

Report number: DTU-2021-05

Journal ref: Prog Theor Exp Phys (2022)

arXiv:2111.06419 [pdf, other]

doi 10.3847/1538-4357/ac503b

No longer ballistic, not yet diffusive--the formation of cosmic ray small-scale anisotropies

Authors: Marco Kuhlen, Vo Hong Minh Phan, Philipp Mertsch

Abstract: The arrival directions of TeV-PeV cosmic rays are remarkably uniform due to the isotropization of their directions by scattering on turbulent magnetic fields. Small anisotropies can exist in standard diffusion models, however, only on the largest angular scales. Yet, high-statistics observatories like IceCube and HAWC have found significant deviations from isotropy down to small angular scales. Here, we explain the formation of small-scale anisotropies by considering pairs of cosmic rays that get correlated by their transport through the same realisation of the turbulent magnetic field. We argue that the formation of small-scale anisotropies is the reflection of the particular realisation of the turbulent magnetic field experienced by cosmic rays on time scales intermediate between the early, ballistic regime and the late, diffusive regime. We approach this problem in two different ways: First, we run test particle simulations in synthetic turbulence, covering for the first time the TV rigidities of observations with realistic turbulence parameters. Second, we extend the recently introduced mixing matrix approach and determine the steady-state angular power spectrum. Throughout, we adopt magneto-static, slab-like turbulence. We find excellent agreement between the predicted angular power spectra in both approaches over a large range of rigidities. In the future, measurements of small-scale anisotropies will be valuable in constraining the nature of the turbulent magnetic field in our Galactic neighborhood.

Submitted 11 November, 2021; originally announced November 2021.

Comments: 22 pages, 7 figures

Report number: TTK-21-43

arXiv:2110.13981 [pdf, other]

CHIP: CHannel Independence-based Pruning for Compact Neural Networks

Authors: Yang Sui, Miao Yin, Yi Xie, Huy Phan, Saman Zonouz, Bo Yuan

Abstract: Filter pruning has been widely used for neural network compression because of its enabled practical acceleration. To date, most of the existing filter pruning works explore the importance of filters via using intra-channel information. In this paper, starting from an inter-channel perspective, we propose to perform efficient filter pruning using Channel Independence, a metric that measures the correlations among different feature maps. The less independent feature map is interpreted as containing less useful information/knowledge, and hence its corresponding filter can be pruned without affecting model capacity. We systematically investigate the quantification metric, measuring scheme and sensitiveness/reliability of channel independence in the context of filter pruning. Our evaluation results for different models on various datasets show the superior performance of our approach. Notably, on CIFAR-10 dataset our solution can bring $0.90\%$ and $0.94\%$ accuracy increase over baseline ResNet-56 and ResNet-110 models, respectively, and meanwhile the model size and FLOPs are reduced by $42.8\%$ and $47.4\%$ (for ResNet-56) and $48.3\%$ and $52.1\%$ (for ResNet-110), respectively. On ImageNet dataset, our approach can achieve $40.8\%$ and $44.8\%$ storage and computation reductions, respectively, with $0.15\%$ accuracy increase over the baseline ResNet-50 model. The code is available at https://github.com/Eclipsess/CHIP_NeurIPS2021.

Submitted 3 April, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

Comments: Accepted by NeurIPS 2021. Model Compression, Channel Pruning, Filter Pruning, Deep Learning

arXiv:2110.11777 [pdf, other]

doi 10.3847/1538-4357/ac86d9

Modeling extinction and reddening effects by circumstellar dust in the Betelgeuse envelope in the presence of radiative torque disruption

Authors: Bao Truong, Le Ngoc Tram, Thiem Hoang, Nguyen Chau Giang, Pham Ngoc Diep, Dieu D. Nguyen, Nguyen Thi Phuong, Thuong D. Hoang, Nguyen Bich Ngoc, Nguyen Fuda, Hien Phan, Tuan Van Bui

Abstract: Circumstellar dust forms and evolves within the envelopes of evolved stars, including Asymptotic Giant Branch (AGB) and Red Supergiant (RSG) stars. The extinction of stellar light by circumstellar dust is vital for interpreting RSG/AGB observations and determining high-mass RSG progenitors of core-collapse supernovae. Nevertheless, circumstellar dust properties are not well understood. Modern understanding of dust evolution suggests that intense stellar radiation can radically change the dust properties across the circumstellar envelope through the RAdiative Torque Disruption (RAT-D) mechanism. In this paper, we study the impacts of RAT-D on the grain size distribution (GSD) of circumstellar dust and model its effects on photometric observations of $α$ Orionis (Betelgeuse). Due to the RAT-D effects, large grains formed in the dust formation zone are disrupted into smaller species of size $a < 0.5\,\mu\mathrm{m}$. Using the GSD constrained by the RAT-D effects, we model the visual extinction of background stars and Betelgeuse. We find that the extinction decreases at near-UV, optical, and infrared wavelengths while increasing at far-UV wavelengths. The resulting flux potentially reproduces the observation from the near-UV to near-IR range. Our results can be used to explain dust extinction and photometric observations toward other RSG/AGB stars.

Submitted 29 July, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

Comments: 21 pages, 18 figures, 2 tables, accepted to ApJ

arXiv:2110.09605 [pdf, other]

Neural Synthesis of Footsteps Sound Effects with Generative Adversarial Networks

Authors: Marco Comunità, Huy Phan, Joshua D. Reiss

Abstract: Footsteps are among the most ubiquitous sound effects in multimedia applications. There is substantial research into understanding the acoustic features and developing synthesis models for footstep sound effects. In this paper, we present a first attempt at adopting neural synthesis for this task. We implemented two GAN-based architectures and compared the results with real recordings as well as six traditional sound synthesis methods. Our architectures reached realism scores as high as recorded samples, showing encouraging results for the task at hand.

Submitted 10 December, 2021; v1 submitted 18 October, 2021; originally announced October 2021.

arXiv:2109.06089 [pdf, other]

doi 10.1140/epjc/s10052-022-10691-5

An explanation of experimental data of $(g-2)_{e,μ}$ in 3-3-1 models with inverse seesaw neutrinos

Authors: L. T. Hue, Khiem Hong Phan, T. Phong Nguyen, H. N. Long, H. T. Hung

Abstract: We show that the anomalous magnetic moment experimental data of muon and electron $(g-2)_{μ,e}$ can be explained simultaneously in simple extensions of the 3-3-1 models consisting of new heavy neutrinos and a singly charged Higgs boson. The heavy neutrinos generate active neutrino masses and mixing through the general seesaw mechanism. They also have non-zero Yukawa couplings with singly charged Higgs bosons and right-handed charged leptons, which result in large one-loop contributions known as chirally-enhanced ones. Numerical investigation confirms a conclusion indicated previously that these contributions are the key point to explain the large $(g-2)_{μ,e}$ data, provided that the inverse seesaw mechanism is necessary to allow both conditions that heavy neutrino masses are above few hundred GeV and non-unitary part of the active neutrino mixing matrix must be large enough.

Submitted 22 August, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

Comments: 43 pages, 6 figures. Numerical results and typos are corrected. New references are added

Journal ref: Eur. Phys. J. C 82 (2022) 722

arXiv:2109.04494 [pdf, other]

Erosion of Icy Interstellar Objects by Cosmic Rays and Implications for `Oumuamua

Authors: Vo Hong Minh Phan, Thiem Hoang, Abraham Loeb

Abstract: We study the destruction and modification of icy interstellar objects by cosmic rays and gas collisions. Using the cosmic-ray flux measured in the local interstellar medium as well as inferred from gamma-ray observations at the different galactocentric radii, we find that cosmic-ray erosion is significant for interstellar objects made of common types of ices. Interestingly, cosmic-ray heating might destroy icy interstellar objects very efficiently such that the initial size of an N$_2$ fragment as suggested by \citet{jackson2021} to explain the composition of `Oumuamua should be at least 0.5 km in size in order to survive the journey of about 0.5 Gyr in the ISM and might be even larger if it originated from a region with an enhanced cosmic-ray flux. This implies an initial N$_2$ mass that is at least an order of magnitude larger than the final value, exacerbating the N$_2$ mass budget deficiency for explaining `Oumuamua. The erosion time due to cosmic-ray heating and gas collisions also allows us to set approximate limits on the initial size for other types of icy interstellar objects, e.g. composed of CO, CO$_2$, or CH$_4$. For a given initial size, we constrain the maximum distance to the birth site for interstellar objects with different speeds. We also find that cosmic-ray and gas heating could entirely modify the ice structure before destroying interstellar objects.

Submitted 28 October, 2021; v1 submitted 9 September, 2021; originally announced September 2021.

Comments: 7 pages and 3 figures

Report number: TTK-21-35

arXiv:2108.10520 [pdf, other]

Improving Object Detection by Label Assignment Distillation

Authors: Chuong H. Nguyen, Thuy C. Nguyen, Tuan N. Tang, Nam L. H. Phan

Abstract: Label assignment in object detection aims to assign targets, foreground or background, to sampled regions in an image. Unlike labeling for image classification, this problem is not well defined due to the object's bounding box. In this paper, we investigate the problem from the perspective of distillation, an approach we call Label Assignment Distillation (LAD). Our initial motivation is very simple: we use a teacher network to generate labels for the student. This can be achieved in two ways: either using the teacher's prediction as the direct targets (soft label), or through the hard labels dynamically assigned by the teacher (LAD). Our experiments reveal that: (i) LAD is more effective than soft-label, but they are complementary. (ii) Using LAD, a smaller teacher can also improve a larger student significantly, while soft-label can't. We then introduce Co-learning LAD, in which two networks simultaneously learn from scratch and the roles of teacher and student are dynamically interchanged. Using PAA-ResNet50 as a teacher, our LAD techniques can improve detectors PAA-ResNet101 and PAA-ResNeXt101 to $46 \rm AP$ and $47.5\rm AP$ on the COCO test-dev set. With a stronger teacher PAA-SwinB, we improve the students PAA-ResNet50 to $43.7\rm AP$ by only 1x schedule training and standard setting, and PAA-ResNet101 to $47.9\rm AP$, significantly surpassing the current methods. Our source code and checkpoints are released at https://git.io/JrDZo.

Submitted 19 October, 2021; v1 submitted 24 August, 2021; originally announced August 2021.

Comments: To appear in WACV 2022

arXiv:2108.10211 [pdf, ps, other]

doi 10.1109/TBME.2022.3174680

Pediatric Automatic Sleep Staging: A comparative study of state-of-the-art deep learning methods

Authors: Huy Phan, Alfred Mertins, Mathias Baumert

Abstract: Background: Despite the tremendous progress recently made towards automatic sleep staging in adults, it is currently unknown if the most advanced algorithms generalize to the pediatric population, which displays distinctive characteristics in overnight polysomnography (PSG). Methods: To answer the question, in this work, we conduct a large-scale comparative study on the state-of-the-art deep learning methods for pediatric automatic sleep staging. Six different deep neural networks with diverging features are adopted to evaluate a sample of more than 1,200 children across a wide spectrum of obstructive sleep apnea (OSA) severity. Results: Our experimental results show that the individual performance of automated pediatric sleep stagers when evaluated on new subjects is equivalent to the expert-level one reported on adults. Combining the six stagers into ensemble models further boosts the staging accuracy, reaching an overall accuracy of 88.8%, a Cohen's kappa of 0.852, and a macro F1-score of 85.8%. At the same time, the ensemble models lead to reduced predictive uncertainty. The results also show that the studied algorithms and their ensembles are robust to concept drift when the training and test data were recorded seven months apart and after clinical intervention. Conclusion: However, we show that the improvements in the staging performance are not necessarily clinically significant although the ensemble models lead to more favorable clinical measures than the six standalone models. Significance: Detailed analyses further demonstrate "almost perfect" agreement between the automatic stagers to one another and their similar patterns on the staging errors, suggesting little room for improvement.

Submitted 10 May, 2022; v1 submitted 23 August, 2021; originally announced August 2021.

Comments: This article has been published in IEEE Transactions on Biomedical Engineering
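
The pediatric sleep staging abstract above reports overall accuracy, Cohen's kappa, and macro F1-score. As a reminder of how these three figures relate to a confusion matrix, here is a minimal sketch using the standard textbook definitions. The function name `staging_metrics` and the toy two-class matrix are hypothetical and have no connection to the study's data.

```python
def staging_metrics(conf):
    """Return (accuracy, Cohen's kappa, macro F1) from a confusion matrix.

    conf[i][j] is the number of epochs whose reference (expert) label is
    class i and whose predicted label is class j.
    """
    n_classes = len(conf)
    total = sum(sum(row) for row in conf)
    row_tot = [sum(conf[i]) for i in range(n_classes)]
    col_tot = [sum(conf[i][j] for i in range(n_classes)) for j in range(n_classes)]

    # Observed agreement: fraction of epochs on the diagonal.
    accuracy = sum(conf[i][i] for i in range(n_classes)) / total

    # Agreement expected by chance from the two marginal distributions.
    p_chance = sum(row_tot[i] * col_tot[i] for i in range(n_classes)) / total ** 2
    kappa = (accuracy - p_chance) / (1.0 - p_chance)

    # Per-class F1 = 2TP / (2TP + FP + FN) = 2TP / (row total + column total).
    f1_scores = []
    for i in range(n_classes):
        denom = row_tot[i] + col_tot[i]
        f1_scores.append(2.0 * conf[i][i] / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / n_classes

    return accuracy, kappa, macro_f1


# Toy two-class example: 8 of 10 epochs correct in each class.
print(staging_metrics([[8, 2], [2, 8]]))
```

Kappa discounts the agreement that the class marginals alone would produce, which is why it is reported alongside raw accuracy for imbalanced sleep stages.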

arXiv:2108.10045 [pdf, other]

doi 10.3847/1538-4357/ac5abf

Studying magnetic fields and dust in M17 using polarized thermal dust emission observed by SOFIA/HAWC+

Authors: Thuong Duc Hoang, Nguyen Bich Ngoc, Pham Ngoc Diep, Le Ngoc Tram, Thiem Hoang, Wanggi Lim, Dieu D. Nguyen, Ngan Le, Nguyen Thi Phuong, Nguyen Fuda, Tuan Van Bui, Kate Pattle, Gia Bao Truong Le, Hien Phan, Nguyen Chau Giang

Abstract: We report the highest spatial resolution measurement of magnetic fields in M17 using thermal dust polarization taken by SOFIA/HAWC+ centered at 154 $μ$m wavelength. Using the Davis-Chandrasekhar-Fermi method, we found the presence of strong magnetic fields of $980 \pm 230\;μ$G and $1665 \pm 885\;μ$G in lower-density (M17-N) and higher-density (M17-S) regions, respectively. The magnetic field morphology in M17-N possibly mimics the fields in gravitational collapse molecular cores while in M17-S the fields run perpendicular to the matter structure and display a pillar and an asymmetric hourglass shape. The mean values of the magnetic field strength are used to determine the Alfvénic Mach numbers ($\mathcal{M}_A$) of M17-N and M17-S, which turn out to be sub-Alfvénic, i.e., magnetic fields dominate turbulence. We calculate the mass-to-flux ratio, $λ$, and obtain $λ=0.07$ for M17-N and $0.28$ for M17-S. The sub-critical values of $λ$ are in agreement with the lack of massive stars formed in M17. To study dust physics, we analyze the relationship between the dust polarization fraction, $p$, and the thermal emission intensity, $I$, gas column density, $N({\rm H_2})$, and dust temperature, $T_{\rm d}$. The polarization fraction decreases with intensity as $I^{-α}$ with $α= 0.51$. The polarization fraction also decreases with increasing $N(\rm H_{2})$, which can be explained by the decrease of grain alignment by radiative torques (RATs) toward denser regions with a weaker radiation field and/or tangling of magnetic fields. The polarization fraction tends to increase with $T_{\rm d}$ first and then decreases when $T_{\rm d} > 50$ K. The latter feature, seen in M17-N, where the gas density changes slowly with $T_{\rm d}$, is consistent with the RAT disruption effect.

Submitted 12 November, 2021; v1 submitted 23 August, 2021; originally announced August 2021.

Comments: Submitted to ApJ

Report number: 929:27 (21pp)

Journal ref: The Astrophysical Journal 2022 April 10

arXiv:2107.02094 [pdf, other]

Cosmic Ray Small Scale Anisotropies in Slab Turbulence

Authors: Marco Kuhlen, Philipp Mertsch, Vo Hong Minh Phan

Abstract: In the standard picture of cosmic ray transport the propagation of charged cosmic rays through turbulent magnetic fields is described as a random walk with cosmic rays scattering on magnetic field turbulence. This is in good agreement with the highly isotropic cosmic ray arrival directions as this diffusion process effectively isotropizes the cosmic ray distribution. High-statistics observatories like IceCube and HAWC have however observed significant deviations from isotropy down to very small angular scales. This is in strong tension with this standard picture of cosmic ray propagation. While large scale multipoles arise naturally, for example due to the Earth's motion relative to the isotropic cosmic ray distribution, there is no intuitive mechanism to account for the observed anisotropies at smaller angular scales. By relaxing one of the standard assumptions of quasi linear theory and treating correlations between fluxes of cosmic rays from different directions explicitly we show that higher multipoles also are to be expected from particle propagation through turbulent magnetic fields. We present a first analytical calculation of the angular power spectrum assuming a physically motivated model of the magnetic field turbulence and find good agreement with numerical simulations.

Submitted 7 July, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

Comments: 8 pages, 3 figures; Proceedings of the 37th International Cosmic Ray Conference (ICRC2021)

Journal ref: PoS(ICRC2021)164

arXiv:2106.14466 [pdf, ps, other]

General one-loop contributions to the decay $H\rightarrow ν_l\barν_lγ$

Authors: Khiem Hong Phan, Dzung Tri Tran, Le Tho Hue

Abstract: General one-loop contributions to the decay amplitudes $H\rightarrow ν_l\barν_lγ$ are presented, considering all possible contributions of additional heavy vector gauge bosons, fermions, and charged (and also neutral) scalar particles appearing in the loop diagrams. Moreover, the results can be applied directly when extra neutrinos (apart from three ones in standard model) are taken into account in final states. Analytic results are presented in terms of Passarino-Veltman scalar functions which can be evaluated numerically using {\tt LoopTools}. In the standard model framework, these analytical results are generated and cross-checked with previous computations. We find that our results are well consistent with these computations. Within standard model limit, phenomenological results for the decay channels are also studied using the updated input parameters at the Large Hadron Collider.

Submitted 28 June, 2021; originally announced June 2021.

Comments: 19 pages, 10 figures, 3 Tables of data

Report number: DTU-2021-03

arXiv:2106.07568 [pdf]

Full interpretable machine learning in 2D with inline coordinates

Authors: Boris Kovalerchuk, Hoang Phan

Abstract: This paper proposed a new methodology for machine learning in 2-dimensional space (2-D ML) in inline coordinates. It is a full machine learning approach that does not require to deal with n-dimensional data in n-dimensional space. It allows discovering n-D patterns in 2-D space without loss of n-D information using graph representations of n-D data in 2-D. Specifically, it can be done with the inline based coordinates in different modifications, including static and dynamic ones. The classification and regression algorithms based on these inline coordinates were introduced. A successful case study based on a benchmark data demonstrated the feasibility of the approach. This approach helps to consolidate further a whole new area of full 2-D machine learning as a promising ML methodology. It has advantages of abilities to involve actively the end-users into the discovering of models and their justification. Another advantage is providing interpretable ML models.

Submitted 3 July, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

Comments: 8 pages, 20 figures

arXiv:2106.03776 [pdf, other]

CDN-MEDAL: Two-stage Density and Difference Approximation Framework for Motion Analysis

Authors: Synh Viet-Uyen Ha, Cuong Tien Nguyen, Hung Ngoc Phan, Nhat Minh Chung, Phuong Hoai Ha

Abstract: Background modeling and subtraction is a promising research area with a variety of applications for video surveillance. Recent years have witnessed a proliferation of effective learning-based deep neural networks in this area. However, the techniques have only provided limited descriptions of scenes' properties while requiring heavy computations, as their single-valued mapping functions are learned to approximate the temporal conditional averages of observed target backgrounds and foregrounds. On the other hand, statistical learning in imagery domains has been a prevalent approach with high adaptation to dynamic context transformation, notably using Gaussian Mixture Models (GMM) with its generalization capabilities. By leveraging both, we propose a novel method called CDN-MEDAL-net for background modeling and subtraction with two convolutional neural networks. The first architecture, CDN-GM, is grounded on an unsupervised GMM statistical learning strategy to describe observed scenes' salient features. The second one, MEDAL-net, implements a light-weighted pipeline of online video background subtraction. Our two-stage architecture is small, but it is very effective with rapid convergence to representations of intricate motion patterns. Our experiments show that the proposed approach is not only capable of effectively extracting regions of moving objects in unseen cases, but it is also very efficient.

Submitted 21 September, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

Comments: 13 pages, 5 figures, to be submitted to IEEE TMM

arXiv:2106.01782 [pdf, other]

Machine learning models for DOTA 2 outcomes prediction

Authors: Kodirjon Akhmedov, Anh Huy Phan

Abstract: Prediction of the real-time multiplayer online battle arena (MOBA) games' match outcome is one of the most important and exciting tasks in Esports analytical research. This research paper predominantly focuses on building predictive machine and deep learning models to identify the outcome of the Dota 2 MOBA game using the new method of multi-forward steps predictions. Three models were investigated and compared: Linear Regression (LR), Neural Networks (NN), and a type of recurrent neural network, Long Short-Term Memory (LSTM). In order to achieve the goals, we developed a data collecting python server using Game State Integration (GSI) to track the real-time data of the players. Once the exploratory feature analysis and tuning hyper-parameters were done, our models' experiments took place on different players with dissimilar backgrounds of playing experiences. The achieved accuracy scores depend on the multi-forward prediction parameters: linear regression reaches 69% in the worst case but 82% on average, while the deep learning models reach the highest average prediction accuracy, 88% for NN and 93% for LSTM.

Submitted 3 June, 2021; originally announced June 2021.

Comments: 11 pages, 12 figures, the paper will be published in IEEE Transactions on Games Journal

arXiv:2105.13083 [pdf, ps, other]

doi 10.1364/OE.427222

Valley-dependent Corner States in Honeycomb Photonic Crystal without Inversion Symmetry

Authors: Huyen Thanh Phan, Feng Liu, Katsunori Wakabayashi

Abstract: We study topological states of honeycomb photonic crystals in the absence of inversion symmetry using plane wave expansion and finite element methods. The breaking of inversion symmetry in the honeycomb lattice leads to contrasting topological valley indices, i.e., the valley-dependent Chern numbers in momentum space. We find that the topological corner states appear for 60$^\circ$ corners, but are absent for other corners, which can be understood as the sign flip of valley Chern number at the corner. Our results provide an experimentally feasible platform for exploring valley-dependent higher-order topology in photonic systems.

Submitted 27 May, 2021; originally announced May 2021.

Comments: 13 pages, 6 figures

Journal ref: Optics Express, vol. 29, pp. 18277-18290 (2021)

arXiv:2105.11043 [pdf, ps, other]

doi 10.1109/TBME.2022.3147187

SleepTransformer: Automatic Sleep Staging with Interpretability and Uncertainty Quantification

Authors: Huy Phan, Kaare Mikkelsen, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Maarten De Vos

Abstract: Background: Black-box skepticism is one of the main hindrances impeding deep-learning-based automatic sleep scoring from being used in clinical environments. Methods: Towards interpretability, this work proposes a sequence-to-sequence sleep-staging model, namely SleepTransformer. It is based on the transformer backbone and offers interpretability of the model's decisions at both the epoch and sequence level. We further propose a simple yet efficient method to quantify uncertainty in the model's decisions. The method, which is based on entropy, can serve as a metric for deferring low-confidence epochs to a human expert for further inspection. Results: Making sense of the transformer's self-attention scores for interpretability, at the epoch level, the attention scores are encoded as a heat map to highlight sleep-relevant features captured from the input EEG signal. At the sequence level, the attention scores are visualized as the influence of different neighboring epochs in an input sequence (i.e. the context) to recognition of a target epoch, mimicking the way manual scoring is done by human experts. Conclusion: Additionally, we demonstrate that SleepTransformer performs on par with existing methods on two databases of different sizes. Significance: Equipped with interpretability and the ability of uncertainty quantification, SleepTransformer holds promise for being integrated into clinical settings.

Submitted 26 January, 2022; v1 submitted 23 May, 2021; originally announced May 2021.

Comments: This article has been published in IEEE Transactions on Biomedical Engineering

arXiv:2105.00311 [pdf, other]

doi 10.1103/PhysRevLett.127.141101

Stochastic Fluctuations of Low-Energy Cosmic Rays and the Interpretation of Voyager Data

Authors: Vo Hong Minh Phan, Florian Schulze, Philipp Mertsch, Sarah Recchia, Stefano Gabici

Abstract: Data from the Voyager probes have provided us with the first measurement of cosmic ray intensities at MeV energies, an energy range which had previously not been explored. Simple extrapolations of models that fit data at GeV energies, e.g. from AMS-02, however, fail to reproduce the Voyager data in that the predicted intensities are too high. Oftentimes, this discrepancy is addressed by adding a break to the source spectrum or the diffusion coefficient in an ad hoc fashion, with a convincing physical explanation yet to be provided. Here, we argue that the discrete nature of cosmic ray sources, which is usually ignored, is instead a more likely explanation. We model the distribution of intensities expected from a statistical model of discrete sources and show that its expectation value is not representative, but has a spectral shape different from that for a typical configuration of sources. The Voyager proton and electron data are however compatible with the median of the intensity distribution. We stress that this model can explain the Voyager data without requiring any unphysical breaks.

Submitted 12 October, 2021; v1 submitted 1 May, 2021; originally announced May 2021.

Comments: 6 pages and 3 figures (supplemental material added)

Report number: TTK-21-14

arXiv:2104.05948 [pdf]

doi 10.1103/PhysRevMaterials.5.094404

Table-like magnetocaloric effect and enhanced refrigerant capacity in EuO1-δ thin films

Authors: P. Lampen, R. Madhogaria, N. S. Bingham, M. H. Phan, P. M. S. Monteiro, N. -J. Steinke, A. Ionescu, C. H. W. Barnes, H. Srikanth

Abstract: An approach to adjusting the conduction band population for tuning the magnetic and magnetocaloric response of EuO1-δ thin films through control of oxygen vacancies (δ = 0, 0.025, and 0.09) is presented. The films each showed a paramagnetic to ferromagnetic transition around 65 K, with an additional magnetic ordering transition at higher temperatures in the oxygen deficient samples. All transitions are observed to be of second order. A maximum magnetic entropy change of 6.4 J/kg K over a field change of 2 T with a refrigerant capacity of 223 J/kg was found in the sample with δ = 0, and in all cases the refrigerant capacities of the thin films under study were found to exceed that reported for bulk EuO. Adjusting the oxygen content was shown to produce table-like magnetocaloric effects, desirable for ideal Ericsson-cycle magnetic refrigeration. These films are thus excellent candidates for small-scale magnetic cooling technology in the liquid nitrogen temperature range.

Submitted 13 April, 2021; originally announced April 2021.

Journal ref: Phys. Rev. Materials 5, 094404 (2021)

arXiv:2104.05460 [pdf, ps, other]

An adaptive splitting algorithm for the sum of three operators

Authors: Minh N. Dao, Hung M. Phan

Abstract: Splitting algorithms for finding a zero of sum of operators often involve multiple steps which are referred to as forward or backward steps. Forward steps are the explicit use of the operators and backward steps involve the operators implicitly via their resolvents. In this paper, we study an adaptive splitting algorithm for finding a zero of the sum of three operators. We assume that two of the operators are generalized monotone and their resolvents are computable, while the other operator is cocoercive but its resolvent is missing or costly to compute. Our splitting algorithm adapts new parameters to the generalized monotonicity of the operators and, at the same time, combines appropriate forward and backward steps to guarantee convergence to a solution of the problem.

Submitted 12 April, 2021; originally announced April 2021.

arXiv:2104.04567 [pdf, other]

Light-weight sleep monitoring: electrode distance matters more than placement for automatic scoring

Authors: Kaare B. Mikkelsen, Huy Phan, Mike L. Rank, Martin C. Hemmsen, Maarten de Vos, Preben Kidmose

Abstract: Modern sleep monitoring development is shifting towards the use of unobtrusive sensors combined with algorithms for automatic sleep scoring. Many different combinations of wet and dry electrodes, ear-centered, forehead-mounted or headband-inspired designs have been proposed, alongside an ever growing variety of machine learning algorithms for automatic sleep scoring. In this paper, we compare 13 different, realistic sensor setups derived from the same data set and analysed with the same pipeline. We find that all setups which include both a lateral and an EOG derivation show similar, state-of-the-art performance, with average Cohen's kappa values of at least 0.80. This indicates that electrode distance, rather than position, is important for accurate sleep scoring. Finally, based on the results presented, we argue that with the current competitive performance of automated staging approaches, there is an urgent need for establishing an improved benchmark beyond current single human rater scoring.

Submitted 13 April, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

Comments: 8 pages, 8 figures

arXiv:2103.14248 [pdf, ps, other]

One-loop $W$ boson contributions to the decay $H\rightarrow Zγ$ in the general $R_ξ$ gauge

Authors: Dzung Tri Tran, Le Tho Hue, Khiem Hong Phan

Abstract: One-loop $W$ boson contributions to the decay $H\rightarrow Zγ$ in the general $R_ξ$ gauge are presented. The analytical results are expressed in terms of well-known Passarino-Veltman functions whose numerical evaluations can be generated using {\tt LoopTools}. In the limit $d\rightarrow 4$, we have shown that these analytical results are independent of the unphysical parameter $ξ$ and consistent with previous results. The gauge parameter independence is also checked numerically for consistency. Our results are also stable for different values of $ξ=0, 1, 100,$ and $ξ\rightarrow \infty$.

Submitted 30 June, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

Comments: 15 pages, 1 Table of data

Report number: DTU-2021-02

arXiv:2103.10045 [pdf, ps, other]

One-loop form factors for $H\rightarrow γ^*γ^*$ in $R_ξ$ gauge

Authors: Khiem Hong Phan, Dzung Tri Tran

Abstract: In this paper, we present general one-loop form factors for $H\rightarrow γ^* γ^*$ in $R_ξ$ gauge, considering all cases of two on-shell, one on-shell and two off-shell for final photons. The calculations are performed in the standard model and in arbitrary beyond-the-standard-model theories in which charged scalar particles may be exchanged in one-loop diagrams. Analytic results for the form factors are shown in general forms which are expressed in terms of the Passarino-Veltman functions. We also confirm the results of previous computations which are available for the case of two on-shell photons. The $ξ$-independence of the result is also discussed. We find that the numerical results are stable for $ξ=0,1$ and $ξ\rightarrow \infty$.

Submitted 18 March, 2021; originally announced March 2021.

Comments: 19 pages, one Table of data

Report number: DTU-2021-01

arXiv:2103.07159 [pdf, other]

An Adaptive Alternating Direction Method of Multipliers

Authors: Sedi Bartz, Rubén Campoy, Hung M. Phan

Abstract: The alternating direction method of multipliers (ADMM) is a powerful splitting algorithm for linearly constrained convex optimization problems. In view of its popularity and applicability, a growing attention is drawn towards the ADMM in nonconvex settings. Recent studies of minimization problems for nonconvex functions include various combinations of assumptions on the objective function including, in particular, a Lipschitz gradient assumption. We consider the case where the objective is the sum of a strongly convex function and a weakly convex function. To this end we present and study an adaptive version of the ADMM which incorporates generalized notions of convexity and varying penalty parameters adapted to the convexity constants of the functions. We prove convergence of the scheme under natural assumptions. To this end we employ the recent adaptive Douglas--Rachford algorithm by revisiting the well known duality relation between the classical ADMM and the Douglas--Rachford splitting algorithm, generalizing this connection to our setting. We illustrate our approach by relating and comparing to alternatives, and by numerical experiments on a signal denoising problem.

Submitted 17 August, 2022; v1 submitted 12 March, 2021; originally announced March 2021.

MSC Class: 47H05; 47N10; 47J25; 49M27; 65K15

arXiv:2103.04451 [pdf, other]

doi 10.1016/j.dam.2015.02.006

On the Termination of Some Biclique Operators on Multipartite Graphs

Authors: Christophe Crespelle, Matthieu Latapy, Thi Ha Duong Phan

Abstract: We define a new graph operator, called the weak-factor graph, which comes from the context of complex network modelling. The weak-factor operator is close to the well-known clique-graph operator but it rather operates in terms of bicliques in a multipartite graph. We address the problem of the termination of the series of graphs obtained by iteratively applying the weak-factor operator starting from a given input graph. As for the clique-graph operator, it turns out that some graphs give rise to series that do not terminate. Therefore, we design a slight variation of the weak-factor operator, called clean-factor, and prove that its associated series terminates for all input graphs. In addition, we show that the multipartite graph on which the series terminates has a very nice combinatorial structure: we exhibit a bijection between its vertices and the chains of the inclusion order on the intersections of the maximal cliques of the input graph.

Submitted 7 March, 2021; originally announced March 2021.

Journal ref: Discrete Applied Mathematics 195, 2015

arXiv:2103.04447 [pdf, ps, other]

doi 10.1007/978-3-642-17458-2_1

Termination of Multipartite Graph Series Arising from Complex Network Modelling

Authors: Matthieu Latapy, Thi Ha Duong Phan, Christophe Crespelle, Thanh Qui Nguyen

Abstract: An intense activity is nowadays devoted to the definition of models capturing the properties of complex networks. Among the most promising approaches, it has been proposed to model these graphs via their clique incidence bipartite graphs. However, this approach has, until now, severe limitations resulting from its incapacity to reproduce a key property of this object: the overlapping nature of cliques in complex networks. In order to get rid of these limitations we propose to encode the structure of clique overlaps in a network thanks to a process consisting in iteratively factorising the maximal bicliques between the upper level and the other levels of a multipartite graph. We show that the most natural definition of this factorising process leads to infinite series for some instances. Our main result is to design a restriction of this process that terminates for any arbitrary graph. Moreover, we show that the resulting multipartite graph has remarkable combinatorial properties and is closely related to another fundamental combinatorial object. Finally, we show that, in practice, this multipartite graph is computationally tractable and has a size that makes it suitable for complex network modelling.

Submitted 7 March, 2021; originally announced March 2021.

Comments: Published in LNCS, proceedings of the 4th International Conference on Combinatorial Optimization and Applications (COCOA), 2010

arXiv:2103.02420 [pdf, ps, other]

Multi-view Audio and Music Classification

Authors: Huy Phan, Huy Le Nguyen, Oliver Y. Chén, Lam Pham, Philipp Koch, Ian McLoughlin, Alfred Mertins

Abstract: We propose in this work a multi-view learning approach for audio and music classification. Considering four typical low-level representations (i.e. different views) commonly used for audio and music recognition tasks, the proposed multi-view network consists of four subnetworks, each handling one input type. The learned embeddings in the subnetworks are then concatenated to form the multi-view embedding for classification, similar to a simple concatenation network. However, apart from the joint classification branch, the network also maintains four classification branches on the single-view embeddings of the subnetworks. A novel method is then proposed to keep track of the learning behavior on the classification branches and adapt their weights to proportionally blend their gradients for network training. The weights are adapted in such a way that learning on a branch that is generalizing well will be encouraged whereas learning on a branch that is overfitting will be slowed down. Experiments on three different audio and music classification tasks show that the proposed multi-view network not only outperforms the single-view baselines but also is superior to the multi-view baselines based on concatenation and late fusion.

Submitted 3 March, 2021; originally announced March 2021.

Comments: Accepted to ICASSP 2021

arXiv:2102.03814 [pdf, other]

doi 10.1109/TBME.2021.3137184

MIN2Net: End-to-End Multi-Task Learning for Subject-Independent Motor Imagery EEG Classification

Authors: Phairot Autthasan, Rattanaphon Chaisaen, Thapanun Sudhawiyangkul, Phurin Rangpong, Suktipol Kiatthaveephong, Nat Dilokthanakul, Gun Bhakdisongkhram, Huy Phan, Cuntai Guan, Theerawit Wilaiprasitporn

Abstract: Advances in the motor imagery (MI)-based brain-computer interfaces (BCIs) allow control of several applications by decoding neurophysiological phenomena, which are usually recorded by electroencephalography (EEG) using a non-invasive technique. Despite great advances in MI-based BCI, EEG rhythms are specific to a subject and vary over time. These issues pose significant challenges to enhancing the classification performance, especially in a subject-independent manner. To overcome these challenges, we propose MIN2Net, a novel end-to-end multi-task learning approach to tackle this task. We integrate deep metric learning into a multi-task autoencoder to learn a compact and discriminative latent representation from EEG and perform classification simultaneously. This approach reduces the complexity in pre-processing and results in significant performance improvement on EEG classification. Experimental results in a subject-independent manner show that MIN2Net outperforms the state-of-the-art techniques, achieving an F1-score improvement of 6.72% and 2.23% on the SMR-BCI and OpenBMI datasets, respectively. We demonstrate that MIN2Net improves discriminative information in the latent representation. This study indicates the possibility and practicality of using this model to develop MI-based BCI applications for new users without the need for calibration.

Submitted 7 January, 2022; v1 submitted 7 February, 2021; originally announced February 2021.

Journal ref: IEEE Transactions on Biomedical Engineering 2021

arXiv:2102.01763 [pdf]

Optimization of the high-frequency magnetoimpedance response in melt-extracted Co-rich microwires through novel multiple-step Joule heating

Authors: O. Thiabgoh, T. Eggers, C. Albrecht, V. O. Jimenez, H. Shen, S. D. Jiang, J. F. Sun, D. S. Lam, V. D. Lam, M. H. Phan

Abstract: The optimization of high frequency giant magnetoimpedance (GMI) effect and its magnetic field sensitivity in melt-extracted Co69.25Fe4.25Si13B12.5Nb1 amorphous microwires, through a multi-step Joule annealing (MSA) technique, was systematically studied. The surface morphology, microstructure, surface magnetic property, and high frequency GMI response of the Co-rich microwires were explored using scanning electron microscopy (SEM), magneto-optical Kerr effect (MOKE) magnetometry, transmission electron microscopy (TEM), and impedance analyzer, respectively. An initial dc current (idc) of 20 mA, which was then increased by 20 mA at every time-step (10 min) up to 300 mA, was applied to the microwires. The MSA of 20 mA to 100 mA remarkably improved the GMI ratio and its field sensitivity up to 760% (1.75 times that of the as-prepared) and 925%/Oe (more than 17.92 times that of the as-prepared) at an operating frequency of 20 MHz, respectively. Our study indicates that the MSA technique can enhance the microstructures and the surface magnetic domain structures of the Co-rich magnetic microwires, giving rise to the GMI enhancement. This technique is suitable for improving the GMI sensitivity at small magnetic fields, which is highly promising for biomedical sensing and healthcare monitoring.

Submitted 2 February, 2021; originally announced February 2021.

arXiv:2012.15029 [pdf, other]

VinDr-CXR: An open dataset of chest X-rays with radiologist's annotations

Authors: Ha Q. Nguyen, Khanh Lam, Linh T. Le, Hieu H. Pham, Dat Q. Tran, Dung B. Nguyen, Dung D. Le, Chi M. Pham, Hang T. T. Tong, Diep H. Dinh, Cuong D. Do, Luu T. Doan, Cuong N. Nguyen, Binh T. Nguyen, Que V. Nguyen, Au D. Hoang, Hien N. Phan, Anh T. Nguyen, Phuong H. Ho, Dat T. Ngo, Nghia T. Nguyen, Nhan T. Nguyen, Minh Dao, Van Vu

Abstract: Most of the existing chest X-ray datasets include labels from a list of findings without specifying their locations on the radiographs. This limits the development of machine learning algorithms for the detection and localization of chest abnormalities. In this work, we describe a dataset of more than 100,000 chest X-ray scans that were retrospectively collected from two major hospitals in Vietnam. Out of this raw data, we release 18,000 images that were manually annotated by a total of 17 experienced radiologists with 22 local labels of rectangles surrounding abnormalities and 6 global labels of suspected diseases. The released dataset is divided into a training set of 15,000 and a test set of 3,000. Each scan in the training set was independently labeled by 3 radiologists, while each scan in the test set was labeled by the consensus of 5 radiologists. We designed and built a labeling platform for DICOM images to facilitate these annotation procedures. All images are made publicly available (https://www.physionet.org/content/vindr-cxr/1.0.0/) in DICOM format along with the labels of both the training set and the test set.

Submitted 20 March, 2022; v1 submitted 29 December, 2020; originally announced December 2020.

Comments: 11 pages, under review by Nature Scientific Data

arXiv:2012.13699 [pdf, ps, other]

Inception-Based Network and Multi-Spectrogram Ensemble Applied For Predicting Respiratory Anomalies and Lung Diseases

Authors: Lam Pham, Huy Phan, Ross King, Alfred Mertins, Ian McLoughlin

Abstract: This paper presents an inception-based deep neural network for detecting lung diseases using respiratory sound input. Recordings of respiratory sound collected from patients are firstly transformed into spectrograms where both spectral and temporal information are well presented, referred to as front-end feature extraction. These spectrograms are then fed into the proposed network, referred to as back-end classification, for detecting whether patients suffer from lung-relevant diseases. Our experiments, conducted over the ICBHI benchmark meta-dataset of respiratory sound, achieve competitive ICBHI scores of 0.53/0.45 and 0.87/0.85 regarding respiratory anomaly and disease detection, respectively.

Submitted 26 December, 2020; originally announced December 2020.

arXiv:2011.07859 [pdf, other]

A General Network Architecture for Sound Event Localization and Detection Using Transfer Learning and Recurrent Neural Network

Authors: Thi Ngoc Tho Nguyen, Ngoc Khanh Nguyen, Huy Phan, Lam Pham, Kenneth Ooi, Douglas L. Jones, Woon-Seng Gan

Abstract: The polyphonic sound event detection and localization (SELD) task is challenging because it is difficult to jointly optimize sound event detection (SED) and direction-of-arrival (DOA) estimation in the same network. We propose a general network architecture for SELD in which the SELD network comprises sub-networks that are pretrained to solve SED and DOA estimation independently, and a recurrent layer that combines the SED and DOA estimation outputs into SELD outputs. The recurrent layer does the alignment between the sound classes and DOAs of sound events while being unaware of how these outputs are produced by the upstream SED and DOA estimation algorithms. This simple network architecture is compatible with different existing SED and DOA estimation algorithms. It is highly practical since the sub-networks can be improved independently. The experimental results using the DCASE 2020 SELD dataset show that the performance of our proposed network architecture, using different SED and DOA estimation algorithms and different audio formats, is competitive with other state-of-the-art SELD algorithms. The source code for the proposed SELD network architecture is available on GitHub.

Submitted 16 November, 2020; originally announced November 2020.

arXiv:2010.09132 [pdf, ps, other]

Self-Attention Generative Adversarial Network for Speech Enhancement

Authors: Huy Phan, Huy Le Nguyen, Oliver Y. Chén, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins

Abstract: Existing generative adversarial networks (GANs) for speech enhancement solely rely on the convolution operation, which may obscure temporal dependencies across the sequence input. To remedy this issue, we propose a self-attention layer adapted from non-local attention, coupled with the convolutional and deconvolutional layers of a speech enhancement GAN (SEGAN) using raw signal input. Further, we empirically study the effect of placing the self-attention layer at the (de)convolutional layers with varying layer indices as well as at all of them when memory allows. Our experiments show that introducing self-attention to SEGAN leads to consistent improvement across the objective evaluation metrics of enhancement performance. Furthermore, applying at different (de)convolutional layers does not significantly alter performance, suggesting that it can be conveniently applied at the highest-level (de)convolutional layer with the smallest memory overhead.

Submitted 6 February, 2021; v1 submitted 18 October, 2020; originally announced October 2020.

Comments: 46th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2021). Source code is available at http://github.com/pquochuy/sasegan

arXiv:2009.05527 [pdf, ps, other]

On Multitask Loss Function for Audio Event Detection and Localization

Authors: Huy Phan, Lam Pham, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins

Abstract: Audio event localization and detection (SELD) has been commonly tackled using multitask models. Such a model usually consists of a multi-label event classification branch with sigmoid cross-entropy loss for event activity detection and a regression branch with mean squared error loss for direction-of-arrival estimation. In this work, we propose a multitask regression model, in which both (multi-label) event detection and localization are formulated as regression problems and use the mean squared error loss homogeneously for model training. We show that the common combination of heterogeneous loss functions causes the network to underfit the data whereas the homogeneous mean squared error loss leads to better convergence and performance. Experiments on the development and validation sets of the DCASE 2020 SELD task demonstrate that the proposed system also outperforms the DCASE 2020 SELD baseline across all the detection and localization metrics, reducing the overall SELD error (the combined metric) by approximately 10% absolute.

Submitted 11 September, 2020; originally announced September 2020.

Comments: Accepted for publication in DCASE 2020 Workshop

arXiv:2009.02935 [pdf, other]

UIT-HSE at WNUT-2020 Task 2: Exploiting CT-BERT for Identifying COVID-19 Information on the Twitter Social Network

Authors: Khiem Vinh Tran, Hao Phu Phan, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Abstract: Recently, COVID-19 has affected a variety of real-life aspects of the world and led to dreadful consequences. More and more tweets about COVID-19 have been shared publicly on Twitter. However, the plurality of those tweets is uninformative, which makes it challenging to build automatic systems that detect the informative ones for useful AI applications. In this paper, we present our results at the W-NUT 2020 Shared Task 2: Identification of Informative COVID-19 English Tweets. In particular, we propose a simple but effective approach using transformer-based models based on COVID-Twitter-BERT (CT-BERT) with different fine-tuning techniques. As a result, we achieve an F1-score of 90.94%, placing third on the leaderboard of this task, which attracted 56 submitted teams in total.

Submitted 13 November, 2020; v1 submitted 7 September, 2020; originally announced September 2020.

Comments: Accepted by 2020 The 6th Workshop on Noisy User-generated Text (W-NUT) - EMNLP 2020

Journal ref: https://www.aclweb.org/anthology/2020.wnut-1.53/

arXiv:2008.08748 [pdf, other]

DPMC: Weighted Model Counting by Dynamic Programming on Project-Join Trees

Authors: Jeffrey M. Dudek, Vu H. N. Phan, Moshe Y. Vardi

Abstract: We propose a unifying dynamic-programming framework to compute exact literal-weighted model counts of formulas in conjunctive normal form. At the center of our framework are project-join trees, which specify efficient project-join orders to apply additive projections (variable eliminations) and joins (clause multiplications). In this framework, model counting is performed in two phases. First, the planning phase constructs a project-join tree from a formula. Second, the execution phase computes the model count of the formula, employing dynamic programming as guided by the project-join tree. We empirically evaluate various methods for the planning phase and compare constraint-satisfaction heuristics with tree-decomposition tools. We also investigate the performance of different data structures for the execution phase and compare algebraic decision diagrams with tensors. We show that our dynamic-programming model-counting framework DPMC is competitive with the state-of-the-art exact weighted model counters cachet, c2d, d4, and miniC2D.

Submitted 19 August, 2020; originally announced August 2020.

Comments: Full version of paper at CP 2020 (26th International Conference on Principles and Practice of Constraint Programming)
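
The project-and-join idea described in the abstract above can be illustrated with a small sketch. This is not the DPMC implementation: it assumes clauses in DIMACS-style integer form, stores each factor as a plain Python dictionary rather than an algebraic decision diagram or tensor, and uses a flat elimination order in place of a project-join tree. All function names are hypothetical.

```python
from itertools import product


def brute_force_wmc(clauses, weights, variables):
    """Reference count: sum the weight of every satisfying assignment."""
    total = 0.0
    for bits in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, bits))
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            w = 1.0
            for v in variables:
                w *= weights[v if assign[v] else -v]
            total += w
    return total


def clause_factor(clause):
    """0/1 table over the clause's variables (1 where the clause holds)."""
    vs = tuple(sorted({abs(l) for l in clause}))
    table = {}
    for bits in product([False, True], repeat=len(vs)):
        a = dict(zip(vs, bits))
        table[bits] = 1.0 if any(a[abs(l)] == (l > 0) for l in clause) else 0.0
    return vs, table


def join(f, g):
    """Join: pointwise product of two factors over the union of variables."""
    (fv, ft), (gv, gt) = f, g
    vs = tuple(sorted(set(fv) | set(gv)))
    table = {}
    for bits in product([False, True], repeat=len(vs)):
        a = dict(zip(vs, bits))
        table[bits] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return vs, table


def project(f, var, weights):
    """Additive projection: sum out `var`, weighting each of its literals."""
    fv, ft = f
    vs = tuple(v for v in fv if v != var)
    table = {}
    for bits, val in ft.items():
        a = dict(zip(fv, bits))
        key = tuple(a[v] for v in vs)
        table[key] = table.get(key, 0.0) + weights[var if a[var] else -var] * val
    return vs, table


def eliminate_wmc(clauses, weights, order):
    """Join only the factors that mention a variable, then project it out."""
    factors = [clause_factor(c) for c in clauses]
    for var in order:
        touching = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        if not touching:
            # Variable in no clause: it contributes w(var) + w(-var).
            factors.append(((), {(): weights[var] + weights[-var]}))
            continue
        joined = touching[0]
        for f in touching[1:]:
            joined = join(joined, f)
        factors.append(project(joined, var, weights))
    result = 1.0
    for _, table in factors:
        result *= table[()]
    return result
```

Both functions return the same count for any elimination order; the order only changes how large the intermediate tables become, which is the cost a planning phase tries to keep small.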

arXiv:2007.10428 [pdf]

The large magnetocaloric effect and refrigerant capacity in nanocrystalline/amorphous Gd$_3$Ni/Gd$_{65}$Ni$_{35}$ composite microwires

Authors: Y. F. Wang, Y. Y. Yu, H. Belliveau, N. T. M. Duc, H. X. Shen, J. F. Sun, J. S. Liu, F. X. Qin, S. C. Yu, H. Srikanth, M. H. Phan

Abstract: We report on a novel class of nanocrystalline/amorphous Gd$_3$Ni/Gd$_{65}$Ni$_{35}$ composite microwires, which was created directly by melt-extraction through controlled solidification. X-ray diffraction (XRD) and transmission electron microscopy (TEM) confirmed the formation of a biphase nanocrystalline/amorphous structure in these wires. Magnetic and magnetocaloric experiments indicate the large magnetic entropy change (-$Δ$SM ~9.64 J/kg K) and the large refrigerant capacity (RC ~742.1 J/kg) around the Curie temperature of ~120 K for a field change of 5 T. These values are ~1.5 times larger relative to its bulk counterpart, and are superior to other candidate materials being considered for active magnetic refrigeration in the liquid nitrogen temperature range.

Submitted 26 July, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

arXiv:2007.05492 [pdf, other]

doi 10.1109/TPAMI.2021.3070057

XSleepNet: Multi-View Sequential Model for Automatic Sleep Staging

Authors: Huy Phan, Oliver Y. Chén, Minh C. Tran, Philipp Koch, Alfred Mertins, Maarten De Vos

Abstract: Automating sleep staging is vital to scale up sleep assessment and diagnosis to serve millions experiencing sleep deprivation and disorders and enable longitudinal sleep monitoring in home environments. Learning from raw polysomnography signals and their derived time-frequency image representations has been prevalent. However, learning from multi-view inputs (e.g., both the raw signals and the time-frequency images) for sleep staging is difficult and not well understood. This work proposes a sequence-to-sequence sleep staging model, XSleepNet, that is capable of learning a joint representation from both raw signals and time-frequency images. Since different views may generalize or overfit at different rates, the proposed network is trained such that the learning pace on each view is adapted based on their generalization/overfitting behavior. In simple terms, the learning on a particular view is speeded up when it is generalizing well and slowed down when it is overfitting. View-specific generalization/overfitting measures are computed on-the-fly during the training course and used to derive weights to blend the gradients from different views. As a result, the network is able to retain the representation power of different views in the joint features which represent the underlying distribution better than those learned by each individual view alone. Furthermore, the XSleepNet architecture is principally designed to gain robustness to the amount of training data and to increase the complementarity between the input views. Experimental results on five databases of different sizes show that XSleepNet consistently outperforms the single-view baselines and the multi-view baseline with a simple fusion strategy. Finally, XSleepNet also outperforms prior sleep staging methods and improves previous state-of-the-art results on the experimental databases.

Submitted 31 March, 2021; v1 submitted 8 July, 2020; originally announced July 2020.

Comments: This article has been published in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

arXiv:2007.04070 [pdf, other]

Learning Neural Textual Representations for Citation Recommendation

Authors: Binh Thanh Kieu, Inigo Jauregi Unanue, Son Bao Pham, Hieu Xuan Phan, Massimo Piccardi

Abstract: With the rapid growth of the scientific literature, manually selecting appropriate citations for a paper is becoming increasingly challenging and time-consuming. While several approaches for automated citation recommendation have been proposed in recent years, effective document representations for citation recommendation are still elusive to a large extent. For this reason, in this paper we propose a novel approach to citation recommendation which leverages a deep sequential representation of the documents (Sentence-BERT) cascaded with Siamese and triplet networks in a submodular scoring function. To the best of our knowledge, this is the first approach to combine deep representations and submodular selection for a task of citation recommendation. Experiments have been carried out using a popular benchmark dataset - the ACL Anthology Network corpus - and evaluated against baselines and a state-of-the-art approach using metrics such as the MRR and F1-at-k score. The results show that the proposed approach has been able to outperform all the compared approaches in every measured metric.

Submitted 8 July, 2020; originally announced July 2020.

Comments: Accepted in ICPR 2020

arXiv:2006.14540 [pdf, other]

Graph Convolutional Neural Networks for analysis of EEG signals, BCI application

Authors: Mirfarid Musavian Ghazani, Anh Huy Phan

Abstract: Decoding brain signals has gained much attention and has found many applications in recent years. Brain-computer interfaces, which communicate with and control external devices using the user's intentions, occupy an emerging field with the potential of changing the world, with diverse applications from rehabilitation to human augmentation. That said, brain signal analysis, and EEG brain signal analysis in particular, is a challenging task. With the advances and achievements of deep learning in solving problems using only raw data, a few attempts have been made in recent years to apply deep learning to EEG, among other types of brain signals. In this study, we propose a novel loss function, called DeepCSP, that extends the classical Common Spatial Patterns to a nonlinear, differentiable module serving as a loss function that enforces linearly separable latent representations of EEG signals belonging to different classes, in an end-to-end manner on raw signals and without the need for extensive feature engineering. Building on recent generalizations of deep learning methods to arbitrarily structured graphs and on the introduced loss, we propose two lightweight models to decode EEG signals and carry out experiments to show their performance.

Submitted 16 June, 2020; originally announced June 2020.

Comments: 11 pages, 5 figures

arXiv:2006.01413 [pdf]

Resolving Class Imbalance in Object Detection with Weighted Cross Entropy Losses

Authors: Trong Huy Phan, Kazuma Yamamoto

Abstract: Object detection is an important task in computer vision which serves many real-world applications such as autonomous driving, surveillance and robotics. Along with the rapid growth of large-scale data, numerous state-of-the-art generalized object detectors (e.g. Faster R-CNN, YOLO, SSD) were developed in the past decade. Despite continual efforts in model modification and improvement in training strategies to boost detection accuracy, there are still limitations in the performance of detectors when it comes to specialized datasets with uneven object class distributions. This originates from the common usage of the Cross Entropy loss function for the object classification sub-task, which simply ignores the frequency of appearance of each object class during training and thus results in lower accuracies for object classes with fewer samples. Class imbalance in general machine learning has been widely studied; however, little attention has been paid to the subject in object detection. In this paper, we propose to explore and overcome this problem by applying several weighted variants of the Cross Entropy loss, for example Balanced Cross Entropy, Focal Loss and Class-Balanced Loss Based on Effective Number of Samples, to our object detector. Experiments with BDD100K (a highly class-imbalanced driving database acquired from on-vehicle cameras capturing mostly Car-class objects and other minority object classes such as Bus, Person and Motor) have shown better class-wise performance of the detector trained with the aforementioned loss functions.

Submitted 2 June, 2020; originally announced June 2020.
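
The weighted cross-entropy variants named in the abstract above follow widely used formulas, sketched here for a single sample in plain Python. This is an illustration under assumptions, not the authors' detector code: the function names, the default `gamma`, `alpha` and `beta` values, and the list-based inputs are hypothetical choices.

```python
import math


def weighted_cross_entropy(probs, target, class_weights):
    """Cross entropy of one sample, scaled by the weight of its true class."""
    return -class_weights[target] * math.log(probs[target])


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss: down-weights samples the model already classifies well.

    With gamma = 0 and alpha = 1 this reduces to plain cross entropy.
    """
    p_t = probs[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def class_balanced_weights(samples_per_class, beta=0.999):
    """Weights from the effective number of samples: (1 - beta) / (1 - beta**n)."""
    return [(1.0 - beta) / (1.0 - beta ** n) for n in samples_per_class]
```

A rare class receives a larger class-balanced weight than a frequent one, so passing the output of `class_balanced_weights` as `class_weights` to `weighted_cross_entropy` raises the loss contribution of minority-class samples.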

arXiv:2005.14506 [pdf, other]

Deep convolutional tensor network

Authors: Philip Blagoveschensky, Anh Huy Phan

Abstract: Neural networks have achieved state of the art results in many areas, supposedly due to parameter sharing, locality, and depth. Tensor networks (TNs) are linear algebraic representations of quantum many-body states based on their entanglement structure. TNs have found use in machine learning. We devise a novel TN based model called Deep convolutional tensor network (DCTN) for image classification, which has parameter sharing, locality, and depth. It is based on the Entangled plaquette states (EPS) TN. We show how EPS can be implemented as a backpropagatable layer. We test DCTN on MNIST, FashionMNIST, and CIFAR10 datasets. A shallow DCTN performs well on MNIST and FashionMNIST and has a small parameter count. Unfortunately, depth increases overfitting and thus decreases test accuracy. Also, DCTN of any depth performs badly on CIFAR10 due to overfitting. It is to be determined why. We discuss how the hyperparameters of DCTN affect its training and overfitting.

Submitted 14 November, 2020; v1 submitted 29 May, 2020; originally announced May 2020.

Comments: 14 pages, 18 figures, to be published in the proceedings of NeurIPS 2020 Quantum tensor networks in machine learning workshop

ACM Class: I.5.1

arXiv:2005.12962 [pdf, other]

A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems

Authors: Huy Kinh Phan, Viet Lam Phung, Tuan Anh Dinh, Bao Quoc Nguyen

Abstract: In recent years, statistical parametric speech synthesis (SPSS) systems have been widely utilized in many interactive speech-based systems (e.g., Amazon's Alexa, Bose's headphones). To select a suitable SPSS system, both speech quality and performance efficiency (e.g., decoding time) must be taken into account. In the paper, we compared four popular Vietnamese SPSS techniques using: 1) hidden Markov models (HMM), 2) deep neural networks (DNN), 3) generative adversarial networks (GAN), and 4) end-to-end (E2E) architectures, which consist of Tacotron 2 and the WaveGlow vocoder, in terms of speech quality and performance efficiency. We showed that the E2E systems accomplished the best quality, but required the power of a GPU to achieve real-time performance. We also showed that the HMM-based system had inferior speech quality, but it was the most efficient system. Surprisingly, the E2E systems were more efficient than the DNN and GAN in inference on GPU. Also surprisingly, the GAN-based system did not outperform the DNN in terms of quality.

Submitted 26 May, 2020; originally announced May 2020.

Comments: 9 pages, submitted to KSE 2020

arXiv:2005.06715 [pdf]

doi 10.1109/LRA.2020.3005127

Towards the Long-Endurance Flight of an Insect-Inspired, Tailless, Two-Winged, Flapping-Wing Flying Robot

Authors: Hoang Vu Phan, Steven Aurecianus, Thi Kim Loan Au, Taesam Kang, Hoon Cheol Park

Abstract: A hover-capable insect-inspired flying robot that can remain long in the air has shown its potential use for both confined indoor and outdoor applications to complete assigned tasks. In this letter, we report improvements in the flight endurance of our 15.8 g robot, named KUBeetle-S, using a low-voltage power source. The robot is equipped with a simple but effective control mechanism that can modulate the stroke plane for attitude stabilization and control. Due to the demand for extended flight, we performed a series of experiments on the lift generation and power requirement of the robot with different stroke amplitudes and wing areas. We show that a larger wing with less inboard wing area improves the lift-to-power ratio and produces a peak lift-to-weight ratio of 1.34 at 3.7 V application. Flight tests show that the robot employing the selected wing could hover for 8.8 minutes. Moreover, the robot could perform maneuvers in any direction, fly outdoors, and carry payload, demonstrating its ability to enter the next phase of autonomous flight.

Submitted 1 July, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

Comments: Accepted for publication in IEEE Robotics and Automation Letters (RA-L) and the Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. 8 pages, 12 figures

arXiv:2005.06305 [pdf, other]

Binarizing MobileNet via Evolution-based Searching

Authors: Hai Phan, Zechun Liu, Dang Huynh, Marios Savvides, Kwang-Ting Cheng, Zhiqiang Shen

Abstract: Binary Neural Networks (BNNs), known to be among the most effectively compact network architectures, have achieved great outcomes in visual tasks. Designing efficient binary architectures is not trivial due to the binary nature of the network. In this paper, we propose the use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet, a compact network with separable depth-wise convolution. Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs), assuming an approximately optimal trade-off between computational cost and model accuracy. Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution while optimizing the model performance in terms of complexity and latency. The approach is threefold. First, we train strong baseline binary networks with a wide range of random group combinations at each convolutional layer. This set-up gives the binary neural networks a capability of preserving essential information through layers. Second, to find a good set of hyperparameters for group convolutions we make use of the evolutionary search which leverages the exploration of efficient 1-bit models. Lastly, these binary models are trained from scratch in a usual manner to achieve the final binary model. Various experiments on ImageNet are conducted to show that following our construction guideline, the final model achieves 60.09% Top-1 accuracy and outperforms the state-of-the-art CI-BCNN with the same computational cost.

Submitted 15 May, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

Comments: Accepted by CVPR2020

Showing 151–200 of 322 results for author: Phan, H