-
Deep Plug-and-Play Prior for Hyperspectral Image Restoration
Authors:
Zeqiang Lai,
Kaixuan Wei,
Ying Fu
Abstract:
Deep-learning-based hyperspectral image (HSI) restoration methods have gained great popularity for their remarkable performance but often demand expensive network retraining whenever the specifics of the task change. In this paper, we propose to restore HSIs in a unified approach with an effective plug-and-play method, which can jointly retain the flexibility of optimization-based methods and utilize the powerful representation capability of deep neural networks. Specifically, we first develop a new deep HSI denoiser leveraging gated recurrent convolution units, short- and long-term skip connections, and an augmented noise level map to better exploit the abundant spatio-spectral information within HSIs. It therefore leads to state-of-the-art performance on HSI denoising under both Gaussian and complex noise settings. Then, the proposed denoiser is inserted into the plug-and-play framework as a powerful implicit HSI prior to tackle various HSI restoration tasks. Through extensive experiments on HSI super-resolution, compressed sensing, and inpainting, we demonstrate that our approach often achieves superior performance, which is competitive with or even better than the state-of-the-art on each task, via a single model without any task-specific training.
Submitted 17 September, 2022;
originally announced September 2022.
-
Tensor Completion via Tensor Train Based Low-Rank Quotient Geometry under a Preconditioned Metric
Authors:
Jian-Feng Cai,
Wen Huang,
Haifeng Wang,
Ke Wei
Abstract:
This paper investigates the low-rank tensor completion problem, which is about recovering a tensor from partially observed entries. We consider this problem in the tensor train format and extend the preconditioned metric from the matrix case to the tensor case. The first-order and second-order quotient geometry of the manifold of fixed tensor train rank tensors under this metric is studied in detail. Algorithms, including Riemannian gradient descent, Riemannian conjugate gradient, and Riemannian Gauss-Newton, have been proposed for the tensor completion problem based on the quotient geometry. It has also been shown that the Riemannian Gauss-Newton method on the quotient geometry is equivalent to the Riemannian Gauss-Newton method on the embedded geometry with a specific retraction. Empirical evaluations on random instances as well as on function-related tensors show that the proposed algorithms are competitive with other existing algorithms in terms of recovery ability, convergence performance, and reconstruction quality.
Submitted 18 April, 2023; v1 submitted 11 September, 2022;
originally announced September 2022.
-
Decentralized Natural Policy Gradient with Variance Reduction for Collaborative Multi-Agent Reinforcement Learning
Authors:
Jinchi Chen,
Jie Feng,
Weiguo Gao,
Ke Wei
Abstract:
This paper studies a policy optimization problem arising from collaborative multi-agent reinforcement learning in a decentralized setting where agents communicate with their neighbors over an undirected graph to maximize the sum of their cumulative rewards. A novel decentralized natural policy gradient method, dubbed Momentum-based Decentralized Natural Policy Gradient (MDNPG), is proposed, which incorporates natural gradient, momentum-based variance reduction, and gradient tracking into the decentralized stochastic gradient ascent framework. The $\mathcal{O}(n^{-1}ε^{-3})$ sample complexity for MDNPG to converge to an $ε$-stationary point has been established under standard assumptions, where $n$ is the number of agents. It indicates that MDNPG can achieve the optimal convergence rate for decentralized policy gradient methods and possesses a linear speedup in contrast to centralized optimization methods. Moreover, superior empirical performance of MDNPG over other state-of-the-art algorithms has been demonstrated by extensive numerical experiments.
Submitted 5 September, 2022;
originally announced September 2022.
-
Let Me Check the Examples: Enhancing Demonstration Learning via Explicit Imitation
Authors:
Sirui Wang,
Kaiwen Wei,
Hongzhi Zhang,
Yuntao Li,
Wei Wu
Abstract:
Demonstration learning aims to guide the prompt prediction by providing answered demonstrations in few-shot settings. Despite achieving promising results, existing work only concatenates the answered examples as demonstrations to the prompt template (including the raw context) without any additional operation, neglecting the prompt-demonstration dependencies. Besides, prior research found that randomly replacing the labels of demonstrations marginally hurts performance, illustrating that the model could not properly learn the knowledge brought by the demonstrations. Inspired by the human learning process, in this paper, we introduce Imitation DEMOnstration Learning (Imitation-Demo) to strengthen demonstration learning via explicitly imitating human review behaviour, which includes: (1) a contrastive learning mechanism to concentrate on similar demonstrations, and (2) a demonstration-label re-prediction method to consolidate known knowledge. Experimental results show that our proposed method achieves state-of-the-art performance on 11 out of 14 classification corpora. Further studies also show that Imitation-Demo strengthens the association between prompt and demonstrations, which could provide the basis for exploring how demonstration learning works.
Submitted 31 August, 2022;
originally announced September 2022.
-
Experimental study of secure quantum key distribution with source and detection imperfections
Authors:
Ye Chen,
Chunfeng Huang,
Zihao Chen,
Wenjie He,
Chengxian Zhang,
Shihai Sun,
Kejin Wei
Abstract:
The quantum key distribution (QKD), guaranteed by the principle of quantum physics, is a promising solution for future secure information and communication technology. However, device imperfections compromise the security of real-life QKD systems, restricting the wide deployment of QKD. This study reports a decoy-state BB84 QKD experiment that considers both source and detection imperfections. In particular, we achieved a rigorous finite-key security bound over fiber links of up to 75 km by applying a systematic performance analysis. Furthermore, our study considers more device imperfections than most previous experiments, and the proposed theory can be extended to other discrete-variable QKD systems. These features constitute a crucial step toward securing QKD with imperfect practical devices.
Submitted 7 August, 2022;
originally announced August 2022.
-
Constraints on Spin-Spin-Velocity-Dependent Interaction
Authors:
Wei Ji,
Weipeng Li,
Pavel Fadeev,
Filip Ficek,
Jianan Qin,
Kai Wei,
Yong-Chun Liu,
Dmitry Budker
Abstract:
The existence of exotic spin-dependent forces may shed light on new physics beyond the Standard Model. We utilize two iron-shielded SmCo$_5$ electron-spin sources and two optically pumped magnetometers to search for an exotic long-range spin-spin-velocity-dependent force. The orientations of spin sources and magnetometers are optimized such that the exotic force is enhanced and common-mode noise is effectively subtracted. We set a direct limit on the proton-electron interaction in the force range from 1\,cm to 1\,km. Our experiment represents an improvement of more than ten orders of magnitude over previous works.
Submitted 21 November, 2022; v1 submitted 1 August, 2022;
originally announced August 2022.
-
ME-GAN: Learning Panoptic Electrocardio Representations for Multi-view ECG Synthesis Conditioned on Heart Diseases
Authors:
Jintai Chen,
Kuanlun Liao,
Kun Wei,
Haochao Ying,
Danny Z. Chen,
Jian Wu
Abstract:
Electrocardiogram (ECG) is a widely used non-invasive diagnostic tool for heart diseases. Many studies have devised ECG analysis models (e.g., classifiers) to assist diagnosis. As an upstream task, researchers have built generative models to synthesize ECG data, which are beneficial for providing training samples, privacy protection, and annotation reduction. However, previous generative methods for ECG often neither synthesized multi-view data nor dealt with heart disease conditions. In this paper, we propose a novel disease-aware generative adversarial network for multi-view ECG synthesis called ME-GAN, which attains panoptic electrocardio representations conditioned on heart diseases and projects the representations onto multiple standard views to yield ECG signals. Since ECG manifestations of heart diseases are often localized in specific waveforms, we propose a new "mixup normalization" to inject disease information precisely into suitable locations. In addition, we propose a view discriminator to revert disordered ECG views into a pre-determined order, supervising the generator to obtain ECG representing correct view characteristics. Besides, a new metric, rFID, is presented to assess the quality of the synthesized ECG signals. Comprehensive experiments verify that our ME-GAN performs well on multi-view ECG signal synthesis with trusty morbid manifestations.
Submitted 29 May, 2023; v1 submitted 21 July, 2022;
originally announced July 2022.
-
Approximation Theory of Total Variation Minimization for Data Completion
Authors:
Jian-Feng Cai,
Jae Kyu Choi,
Ke Wei
Abstract:
Total variation (TV) minimization is one of the most important techniques in modern signal/image processing, and has a wide range of applications. While there are numerous recent works on the restoration guarantee of TV minimization in the framework of compressed sensing, there are few works on the restoration guarantee of the restoration from partial observations. This paper analyzes the error of TV-based restoration from random entrywise samples. In particular, we estimate the error between the underlying original data and the approximate solution that interpolates (or approximates with an error bound depending on the noise level) the given data and has the minimal TV seminorm among all possible solutions. Finally, we further connect the error estimate for the discrete model to the sparse gradient restoration problem and to the approximation to the underlying function from which the underlying true data comes.
Submitted 15 July, 2022;
originally announced July 2022.
-
A three-dimensional generalization of QRT maps
Authors:
Jaume Alonso,
Yuri B. Suris,
Kangning Wei
Abstract:
We propose a geometric construction of three-dimensional birational maps that preserve two pencils of quadrics. The maps act as compositions of involutions, which, in turn, act along the straight line generators of the quadrics of the first pencil and are defined by the intersections with quadrics of the second pencil. On each quadric of the first pencil, the maps act as two-dimensional QRT maps.
While these maps are of a pretty high degree in general, we find geometric conditions which guarantee that the degree is reduced to 3. The resulting degree 3 maps are illustrated by two known and two novel Kahan-type discretizations of three-dimensional Nambu systems, including the Euler top and the Zhukovski-Volterra gyrostat with two non-vanishing components of the gyrostatic momentum.
Submitted 11 June, 2023; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Optimizing nonadiabatic geometric quantum gates against off-resonance error by dynamical correction in a silicon-based spin qubit
Authors:
Liu-Jun Guo,
Hai Xu,
Zi-Yu Fang,
Tao Chen,
Kejin Wei,
Chengxian Zhang
Abstract:
Geometric quantum gates are performed by using the geometric phase, making them particularly robust to the pulse amplitude error due to the intrinsic global property. However, in many systems, such as the silicon-based spin qubits, the off-resonance error is the dominant noise, which can cause dephasing and is always difficult to deal with for a geometric gate. Thus, how to deal with the off-resonance error is very significant for the application of the geometric gates. A recent work in \emph{Phy. Rev. Appl. 16, 044005 (2021)} reveals that by inserting two $π$-pulse dynamically corrected sequences into the evolution paths, the holonomic quantum gate is effective in suppressing the pulse amplitude error; however, it is still useless for combating the off-resonance error. Inspired by this work, we combine the techniques of dynamical correction and path design. Surprisingly, we find that by picking a specific evolution path with only one $π$-pulse dynamically corrected sequence inserted, the obtained optimized geometric gate is robust to the off-resonance error, assuming the noise is static. Further, by calculating the filter function considering the realistic $1/f$-type noise in silicon, the related results show that the performance of the optimized geometric gate can also surpass both the conventional geometric gate and the naive dynamical gate constructed without using the geometric phase. Our results indicate that dynamical correction is a powerful tool to improve the geometric gate.
Submitted 8 January, 2023; v1 submitted 10 July, 2022;
originally announced July 2022.
-
Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR
Authors:
Kun Wei,
Yike Zhang,
Sining Sun,
Lei Xie,
Long Ma
Abstract:
Leveraging context information is an intuitive idea to improve performance on conversational automatic speech recognition (ASR). Previous works usually adopt recognized hypotheses of historical utterances as preceding context, which may bias the current recognized hypothesis due to the inevitable historical recognition errors. To avoid this problem, we propose an audio-textual cross-modal representation extractor to learn contextual representations directly from preceding speech. Specifically, it consists of two modal-related encoders, extracting high-level latent features from speech and the corresponding text, and a cross-modal encoder, which aims to learn the correlation between speech and text. We randomly mask some input tokens and input sequences of each modality. Then a token-missing or modal-missing prediction with a modal-level CTC loss on the cross-modal encoder is performed. Thus, the model captures not only the bi-directional context dependencies in a specific modality but also relationships between different modalities. Then, during the training of the conversational ASR system, the extractor is frozen to extract the textual representation of preceding speech, and this representation is used as context fed to the ASR decoder through an attention mechanism. The effectiveness of the proposed approach is validated on several Mandarin conversation corpora, and a character error rate (CER) reduction of up to 16% is achieved on the MagicData dataset.
Submitted 3 July, 2022;
originally announced July 2022.
-
Improving Transformer-based Conversational ASR by Inter-Sentential Attention Mechanism
Authors:
Kun Wei,
Pengcheng Guo,
Ning Jiang
Abstract:
Transformer-based models have demonstrated their effectiveness in automatic speech recognition (ASR) tasks and even shown superior performance over the conventional hybrid framework. The main idea of Transformers is to capture the long-range global context within an utterance by self-attention layers. However, for scenarios like conversational speech, such utterance-level modeling will neglect contextual dependencies that span across utterances. In this paper, we propose to explicitly model the inter-sentential information in a Transformer based end-to-end architecture for conversational speech recognition. Specifically, for the encoder network, we capture the contexts of previous speech and incorporate such historic information into current input by a context-aware residual attention mechanism. For the decoder, the prediction of current utterance is also conditioned on the historic linguistic information through a conditional decoder framework. We show the effectiveness of our proposed method on several open-source dialogue corpora and the proposed method consistently improved the performance from the utterance-level Transformer-based ASR models.
Submitted 2 July, 2022;
originally announced July 2022.
-
Siamese Contrastive Embedding Network for Compositional Zero-Shot Learning
Authors:
Xiangyu Li,
Xu Yang,
Kun Wei,
Cheng Deng,
Muli Yang
Abstract:
Compositional Zero-Shot Learning (CZSL) aims to recognize unseen compositions formed from states and objects seen during training. Since the same state may vary in visual appearance when entangled with different objects, CZSL is still a challenging task. Some methods recognize state and object with two trained classifiers, ignoring the impact of the interaction between object and state; other methods try to learn the joint representation of the state-object compositions, leading to the domain gap between seen and unseen composition sets. In this paper, we propose a novel Siamese Contrastive Embedding Network (SCEN) (Code: https://github.com/XDUxyLi/SCEN-master) for unseen composition recognition. Considering the entanglement between state and object, we embed the visual feature into a Siamese Contrastive Space to capture prototypes of them separately, alleviating the interaction between state and object. In addition, we design a State Transition Module (STM) to increase the diversity of training compositions, improving the robustness of the recognition model. Extensive experiments indicate that our method significantly outperforms the state-of-the-art approaches on three challenging benchmark datasets, including the recently proposed C-QGA dataset.
Submitted 29 June, 2022;
originally announced June 2022.
-
Magnetic Ordering in GdAuAl$_4$Ge$_2$ and TbAuAl$_4$Ge$_2$: layered compounds with triangular lanthanide nets
Authors:
Keke Feng,
Ian Andreas Leahy,
Olatunde Oladehin,
Kaya Wei,
Minhyea Lee,
Ryan Baumbach
Abstract:
We report the synthesis of the entire $Ln$AuAl$_4$Ge$_2$ ($Ln$ = Y, Pr, Nd, Sm, Gd, Tb, Dy, Ho, Er, and Tm) series and focus on the magnetic properties of GdAuAl$_4$Ge$_2$ and TbAuAl$_4$Ge$_2$. Temperature and magnetic field dependent magnetization, heat capacity, and electrical resistivity measurements reveal that both compounds exhibit several magnetically ordered states at low temperatures, with evidence for magnetic fluctuations extending into the paramagnetic temperature region. For magnetic fields applied in the $ab$-plane there are several ordered state regions that are associated with metamagnetic phase transitions, consistent with there being multiple nearly degenerate ground states. Despite Gd being an isotropic $S$-state ion and Tb having an anisotropic $J$-state, there are similarities in the phase diagrams for the two compounds, suggesting that factors such as the symmetry of the crystalline lattice, which features well separated triangular planes of lanthanide ions, or the Ruderman-Kittel-Kasuya-Yosida interaction as defined by the Fermi surface topography control the magnetism. We also point out similarities to other centrosymmetric compounds that host skyrmion lattices such as Gd$_2$PdSi$_3$, and propose that the $Ln$AuAl$_4$Ge$_2$ family of compounds are of interest as reservoirs for complex magnetism and electronic behaviors such as the topological Hall effect.
Submitted 27 May, 2022;
originally announced May 2022.
-
A neural prosody encoder for end-ro-end dialogue act classification
Authors:
Kai Wei,
Dillon Knox,
Martin Radfar,
Thanh Tran,
Markus Muller,
Grant P. Strimel,
Nathan Susanj,
Athanasios Mouchtaris,
Maurizio Omologo
Abstract:
Dialogue act classification (DAC) is a critical task for spoken language understanding in dialogue systems. Prosodic features such as energy and pitch have been shown to be useful for DAC. Despite their importance, little research has explored neural approaches to integrate prosodic features into end-to-end (E2E) DAC models which infer dialogue acts directly from audio signals. In this work, we propose an E2E neural architecture that takes into account the need for characterizing prosodic phenomena co-occurring at different levels inside an utterance. A novel part of this architecture is a learnable gating mechanism that assesses the importance of prosodic features and selectively retains core information necessary for E2E DAC. Our proposed model improves DAC accuracy by 1.07% absolute across three publicly available benchmark datasets.
Submitted 11 May, 2022;
originally announced May 2022.
-
Re-evaluation of the isoscalar mixing angle within selected mesonic nonets
Authors:
Xue-Chao Feng,
Ke-Wei Wei
Abstract:
Based on the relations from the meson-meson mass mixing matrix, the mixing angles of isoscalar state have been re-evaluated via mass relations and latest experimental results. The results in the present work are compared with the values from different theoretical models, meanwhile, the quarkonia content of isoscalar state are presented. In order to check the validity of analysis, some predictions on the decays of the isoscalar state are presented. These predictions may be useful for the phenomenological analysis for meson nonet in future experiments.
Submitted 11 May, 2022;
originally announced May 2022.
-
Intelligent Reflection Enabling Technologies for Integrated and Green Internet-of-Everything Beyond 5G: Communication, Sensing, and Security
Authors:
Wei Shi,
Wei Xu,
Xiaohu You,
Chunming Zhao,
Kejun Wei
Abstract:
Internet-of-Everything (IoE) has gradually been recognized as an integral part of future wireless networks. In IoE, there can be an ultra-massive number of smart devices of various types to be served, imposing multi-dimensional requirements on wireless communication, sensing, and security. In this article, we provide a tutorial overview of the promising intelligent reflection communication (IRC) technologies, including reconfigurable intelligent surface (RIS) and ambient backscatter communication (AmBC), to support the requirements of IoE applications beyond the fifth-generation (5G) wireless communication network. Specifically, we elaborate on the benefits of IRC-assisted IoE in the context of the space-air-ground integrated communications and green communications, which are regarded as key features of supporting future IoE application from society and industries. Furthermore, we envision that the IRC-assisted communication and sensing can mutually benefit each other and articulate multiple ways of enhancing the security in IoE by the IRC. Numerical results help illustrate the importance of the IRC in unfavorable secrecy environments. Finally, open research issues and challenges about the IRC-assisted IoE are presented.
Submitted 6 May, 2022;
originally announced May 2022.
-
Distributed Neural Precoding for Hybrid mmWave MIMO Communications with Limited Feedback
Authors:
Kai Wei,
Jindan Xu,
Wei Xu,
Ning Wang,
Dong Chen
Abstract:
Hybrid precoding is a cost-efficient technique for millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) communications. This paper proposes a deep learning approach by using a distributed neural network for hybrid analog-and-digital precoding design with limited feedback. The proposed distributed neural precoding network, called DNet, is committed to achieving two objectives. First, the DNet realizes channel state information (CSI) compression with a distributed architecture of neural networks, which enables practical deployment on multiple users. Specifically, this neural network is composed of multiple independent sub-networks with the same structure and parameters, which reduces both the number of training parameters and network complexity. Secondly, DNet learns the calculation of hybrid precoding from reconstructed CSI from limited feedback. Different from existing black-box neural network design, the DNet is specifically designed according to the data form of the matrix calculation of hybrid precoding. Simulation results show that the proposed DNet significantly improves the performance up to nearly 50% compared to traditional limited feedback precoding methods under the tests with various CSI compression ratios.
Submitted 18 April, 2022;
originally announced April 2022.
-
Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding
Authors:
Xuandi Fu,
Feng-Ju Chang,
Martin Radfar,
Kai Wei,
Jing Liu,
Grant P. Strimel,
Kanthashree Mysore Sathyendra
Abstract:
End-to-end Spoken Language Understanding (E2E SLU) has attracted increasing interest due to its advantages of joint optimization and low latency when compared to traditionally cascaded pipelines. Existing E2E SLU models usually follow a two-stage configuration where an Automatic Speech Recognition (ASR) network first predicts a transcript which is then passed to a Natural Language Understanding (NLU) module through an interface to infer semantic labels, such as intent and slot tags. This design, however, does not consider the NLU posterior while making transcript predictions, nor correct the NLU prediction error immediately by considering the previously predicted word-pieces. In addition, the NLU model in the two-stage system is not streamable, as it must wait for the audio segments to complete processing, which ultimately impacts the latency of the SLU system. In this work, we propose a streamable multi-task semantic transducer model to address these considerations. Our proposed architecture predicts ASR and NLU labels auto-regressively and uses a semantic decoder to ingest both previously predicted word-pieces and slot tags while aggregating them through a fusion network. Using an industry scale SLU and a public FSC dataset, we show the proposed model outperforms the two-stage E2E SLU model for both ASR and NLU metrics.
Submitted 1 April, 2022;
originally announced April 2022.
-
Exclusive $π^{-}$ Electroproduction off the Neutron in Deuterium in the Resonance Region
Authors:
Y. Tian,
R. W. Gothe,
V. I. Mokeev,
G. Hollis,
M. J. Amaryan,
W. R. Armstrong,
H. Atac,
H. Avakian,
L. Barion,
M. Battaglieri,
I. Bedlinskiy,
B. Benkel,
F. Benmokhtar,
A. Bianconi,
L. Biondo,
A. Biselli,
F. Bossù,
S. Boiarinov,
M. Bondì,
K. T. Brinkmann,
W. J. Briscoe,
S. Bueltmann,
D. Bulumulla,
V. D. Burkert,
R. Capobianco
, et al. (118 additional authors not shown)
Abstract:
New results for the exclusive and quasi-free cross sections off neutrons bound in deuterium $γ_vn(p) \rightarrow pπ^{-} (p)$ are presented over a wide final state hadron angle range with a kinematic coverage of the invariant mass ($W$) up to 1.825 GeV and the virtual photon four-momentum transfer squared ($Q^{2}$) from 0.4 to 1.0 GeV$^2$. The exclusive structure functions were extracted and their Legendre moments were obtained. Final-state-interaction contributions have been kinematically separated from the extracted quasi-free cross sections off bound neutrons solely based on the analysis of the experimental data. These new results will serve as long-awaited input for phenomenological analyses to extract the $Q^{2}$ evolution of previously unavailable $n \to N^{*}$ electroexcitation amplitudes and to improve state-of-the-art models of neutrino scattering off nuclei by augmenting the already available results from free protons.
Submitted 11 January, 2023; v1 submitted 31 March, 2022;
originally announced March 2022.
-
Theory-guided investigation on magnetic evolution of MnPt$_{5-x}$Pd$_x$P and discovery of anti-CeCoIn$_5$-type ferromagnetic MnPd$_5$P
Authors:
Ranuri S. Dissanayaka Mudiyanselage,
Chang-Jong Kang,
Kaya Wei,
Zhixue Shu,
Tai Kong,
Ryan Baumbach,
Gabriel Kotliar,
Weiwei Xie
Abstract:
We report the magnetic changes from canted antiferromagnetic to ferromagnetic orderings in anti-115-type MnPt$_{5-x}$Pd$_x$P ($x$ = 1, 2, 2.5, 3, 4, and 5) and the discovery of a new rare-earth-free ferromagnet, MnPd$_5$P, by both theoretical prediction and experimental investigation. The family of compounds was synthesized using a high-temperature solid-state method and characterized to crystallize in the anti-CeCoIn$_5$ type with the space group P4/mmm, exhibiting a two-dimensional layered structural feature. The magnetic property measurements indicate that the compounds ordered from canted A-type antiferromagnet in MnPt$_5$P to ferromagnet above room temperature with varying degrees of coercivity and magnetic moments in MnPd$_5$P by reducing the spin-orbit coupling. The results of the MnPt$_{5-x}$Pd$_x$P have been analyzed in comparison to the other candidates of the 151 family of Mn(Pt/Pd)$_5$(P/As) to understand the complex structure-magnetism relationships.
Submitted 28 March, 2022;
originally announced March 2022.
-
New Constraints on Exotic Spin-Velocity-Dependent Interactions
Authors:
Kai Wei,
Wei Ji,
Changbo Fu,
Arne Wickenbrock,
Jiancheng Fang,
Victor Flambaum,
Dmitry Budker
Abstract:
Experimental searches for new, "fifth" forces are attracting a lot of attention because they make it possible to test theoretical extensions to the standard model. Here, we report a new experimental search for possible fifth forces, specifically spin-and-velocity dependent forces, by using a K-Rb-$^{21}$Ne co-magnetometer and a tungsten ring featuring a high nucleon density. Taking advantage of the high sensitivity of the co-magnetometer, the pseudomagnetic field from the fifth force is measured to be $<7$\,aT. This sets new limits on coupling constants for the neutron-nucleon and proton-nucleon interactions in the range of $\ge 0.1$ m. The coupling constant limits are established to be $|g_V^n|<6.6\times 10^{-11}$ and $|g_V^p|<3.0\times 10^{-10}$, which are more than one order of magnitude tighter than astronomical and cosmological limits on the coupling between a new gauge boson such as Z$'$ and standard model particles.
Submitted 14 March, 2022;
originally announced March 2022.
-
Trusted AI in Multi-agent Systems: An Overview of Privacy and Security for Distributed Learning
Authors:
Chuan Ma,
Jun Li,
Kang Wei,
Bo Liu,
Ming Ding,
Long Yuan,
Zhu Han,
H. Vincent Poor
Abstract:
Motivated by the advancing computational capacity of distributed end-user equipments (UEs), as well as the increasing concerns about sharing private data, there has been considerable recent interest in machine learning (ML) and artificial intelligence (AI) that can be processed on distributed UEs. Specifically, in this paradigm, parts of an ML process are outsourced to multiple distributed UEs, and then the processed ML information is aggregated on a certain level at a central server, which turns a centralized ML process into a distributed one, and brings about significant benefits. However, this new distributed ML paradigm raises new risks of privacy and security issues. In this paper, we provide a survey of the emerging security and privacy risks of distributed ML from a unique perspective of information exchange levels, which are defined according to the key steps of an ML process, i.e.: i) the level of preprocessed data, ii) the level of learning models, iii) the level of extracted knowledge and, iv) the level of intermediate results. We explore and analyze the potential of threats for each information exchange level based on an overview of the current state-of-the-art attack mechanisms, and then discuss the possible defense methods against such threats. Finally, we complete the survey by providing an outlook on the challenges and possible directions for future research in this critical area.
Submitted 9 August, 2023; v1 submitted 18 February, 2022;
originally announced February 2022.
-
Conversational Speech Recognition By Learning Conversation-level Characteristics
Authors:
Kun Wei,
Yike Zhang,
Sining Sun,
Lei Xie,
Long Ma
Abstract:
Conversational automatic speech recognition (ASR) is the task of recognizing conversational speech that includes multiple speakers. Unlike sentence-level ASR, conversational ASR can naturally take advantage of specific characteristics of conversation, such as role preference and topical coherence. This paper proposes a conversational ASR model which explicitly learns conversation-level characteristics under the prevalent end-to-end neural framework. The highlights of the proposed model are twofold. First, a latent variational module (LVM) is attached to a conformer-based encoder-decoder ASR backbone to learn role preference and topical coherence. Second, a topic model is specifically adopted to bias the outputs of the decoder to words in the predicted topics. Experiments on two Mandarin conversational ASR tasks show that the proposed model achieves a maximum 12% relative character error rate (CER) reduction.
Submitted 17 February, 2022; v1 submitted 15 February, 2022;
originally announced February 2022.
-
Vertical Federated Learning: Challenges, Methodologies and Experiments
Authors:
Kang Wei,
Jun Li,
Chuan Ma,
Ming Ding,
Sha Wei,
Fan Wu,
Guihai Chen,
Thilina Ranbaduge
Abstract:
Recently, federated learning (FL) has emerged as a promising distributed machine learning (ML) technology, owing to the advancing computational and sensing capacities of end-user devices, albeit amid increasing concerns about users' privacy. As a special architecture in FL, vertical FL (VFL) is capable of constructing a hyper ML model by embracing sub-models from different clients. These sub-models are trained locally on vertically partitioned data with distinct attributes. Therefore, the design of VFL is fundamentally different from that of conventional FL, raising new and unique research issues. In this paper, we aim to discuss key challenges in VFL with effective solutions, and conduct experiments on real-life datasets to shed light on these issues. Specifically, we first propose a general framework for VFL, and highlight the key differences between VFL and conventional FL. Then, we discuss research challenges rooted in VFL systems under four aspects, i.e., security and privacy risks, expensive computation and communication costs, possible structural damage caused by model splitting, and system heterogeneity. Afterwards, we develop solutions to address the aforementioned challenges, and conduct extensive experiments to showcase the effectiveness of our proposed solutions.
Submitted 5 August, 2024; v1 submitted 9 February, 2022;
originally announced February 2022.
-
Beam-Recoil Transferred Polarization in $K^+Y$ Electroproduction in the Nucleon Resonance Region with CLAS12
Authors:
D. S. Carman,
A. D'Angelo,
L. Lanza,
V. I. Mokeev,
K. P. Adhikari,
M. J. Amaryan,
W. R. Armstrong,
H. Atac,
H. Avakian,
C. Ayerbe Gayoso,
N. A. Baltzell,
L. Barion,
M. Battaglieri,
I. Bedlinskiy,
B. Benkel,
A. Bianconi,
A. S. Biselli,
M. Bondi,
S. Boiarinov,
F. Bossu,
W. J. Briscoe,
S. Bueltmann,
D. Bulumulla,
V. D. Burkert,
R. Capobianco
, et al. (116 additional authors not shown)
Abstract:
Beam-recoil transferred polarizations for the exclusive electroproduction of $K^+Λ$ and $K^+Σ^0$ final states from an unpolarized proton target have been measured using the CLAS12 spectrometer at Jefferson Laboratory. The measurements at beam energies of 6.535~GeV and 7.546~GeV span the range of four-momentum transfer $Q^2$ from 0.3 to 4.5~GeV$^2$ and invariant energy $W$ from 1.6 to 2.4~GeV, while covering the full center-of-mass angular range of the $K^+$. These new data extend the existing hyperon polarization data from CLAS in a similar kinematic range but from a significantly larger dataset. They represent an important addition to the world data, allowing for better exploration of the reaction mechanism in strangeness production processes, for further understanding of the spectrum and structure of excited nucleon states, and for improved insight into the strong interaction in the regime of non-perturbative dynamics.
Submitted 7 February, 2022;
originally announced February 2022.
-
$b$-hadron spectroscopy study based on the similarity of double bottom baryon and bottom meson
Authors:
Bing Chen,
Si-Qiang Luo,
Ke-Wei Wei,
Xiang Liu
Abstract:
The dynamical similarity which exists between the $λ$-mode excited $bbq$ baryons ($q$ refers to the $u$, $d$, and $s$ quarks) and the $\bar{b}q$ mesons inspires us to carry out a combined study of their spectroscopy. In this work, the masses and strong decays of these low-lying $b\bar{q}$ and $bbq$ states are studied by the same theoretical methods, and the dynamical similarity which is implied in their mass spectra and strong decays is also discussed. The recently discovered $\bar{b}q$ states, including the $B_J(5840)$, $B_J(5970)$, $B_{sJ}(6064)$, and $B_{sJ}(6114)$, are analyzed. According to our result, the $B_J(5840)$ could be assigned as a 2$S$ state, while the $B_J(5970)$ could be regarded as a member of the $1D(2^-,~3^-)_{j_q=5/2}$ doublet. The $B_{sJ}(6064)$ and $B_{sJ}(6114)$ are probably $D$-wave states. In particular, they could be explained as members of the $1D(1^-,~2^-)_{j_q=3/2}$ and $1D(2^-,~3^-)_{j_q=5/2}$ doublets, respectively. The predicted masses and decay properties of other unknown $\bar{b}q/bbq$ states may provide useful clues for future experiments.
Submitted 21 April, 2022; v1 submitted 14 January, 2022;
originally announced January 2022.
-
Experimental secure quantum key distribution in presence of polarization-dependent loss
Authors:
Chunfeng Huang,
Ye Chen,
Long Jin,
Minming Geng,
Junwei Wang,
Zhenrong Zhang,
Kejin Wei
Abstract:
Quantum key distribution (QKD) is theoretically secure using the principle of quantum mechanics; therefore, QKD is a promising solution for the future of secure communication. Although several experimental demonstrations of QKD have been reported, they have not considered the polarization-dependent loss in state preparation in the key-rate estimation. In this study, we experimentally characterized polarization-dependent loss in realistic state-preparation devices and verified that a considerable PDL exists in fiber- and silicon-based polarization modulators. Hence, the security of such QKD systems is compromised because of the secure key rate overestimation. Furthermore, we report a decoy-state BB84 QKD experiment considering polarization-dependent loss. Finally, we achieved rigorous finite-key security bound over up to 75 km fiber links by applying a recently proposed security proof. This study considers more realistic source flaws than most previous experiments; thus, it is crucial toward a secure QKD with imperfect practical devices.
Submitted 3 January, 2022;
originally announced January 2022.
-
On Distinctive Properties of Universal Perturbations
Authors:
Sung Min Park,
Kuo-An Wei,
Kai Xiao,
Jerry Li,
Aleksander Madry
Abstract:
We identify properties of universal adversarial perturbations (UAPs) that distinguish them from standard adversarial perturbations. Specifically, we show that targeted UAPs generated by projected gradient descent exhibit two human-aligned properties: semantic locality and spatial invariance, which standard targeted adversarial perturbations lack. We also demonstrate that UAPs contain significantly less signal for generalization than standard adversarial perturbations -- that is, UAPs leverage non-robust features to a smaller extent than standard adversarial perturbations.
Submitted 31 December, 2021;
originally announced December 2021.
-
Polarized Structure Function $σ_{LT'}$ from $π^0 p$ Electroproduction Data in the Resonance Region at $0.4$ GeV$^2 < Q^2 < 1.0$ GeV$^2$
Authors:
E. L. Isupov,
V. D. Burkert,
A. A. Golubenko,
K. Joo,
N. S. Markov,
V. I. Mokeev,
L. C. Smith,
W. R. Armstrong,
H. Atac,
H. Avakian,
N. A. Baltzell,
L. Barion,
M. Battaglieri,
I. Bedlinskiy,
F. Benmokhtar,
A. Bianconi,
L. Biondo,
A. S. Biselli,
M. Bondi,
F. Bossù,
W. J. Briscoe,
W. K. Brooks,
D. Bulumulla,
R. A. Capobianco,
D. S. Carman
, et al. (116 additional authors not shown)
Abstract:
The first results on the $σ_{LT'}$ structure function in exclusive $π^0p$ electroproduction at invariant masses of the final state of 1.5 GeV $<$ $W$ $<$ 1.8 GeV and in the range of photon virtualities 0.4 GeV$^2 < Q^2 < 1.0$ GeV$^2$ were obtained from data on beam spin asymmetries and differential cross sections measured with the CLAS detector at Jefferson Lab. The Legendre moments determined from the $σ_{LT'}$ structure function have demonstrated sensitivity to the contributions from the nucleon resonances in the second and third resonance regions. These new data on the beam spin asymmetries in $π^0p$ electroproduction extend the opportunities for the extraction of the nucleon resonance electroexcitation amplitudes in the mass range above 1.6 GeV.
Submitted 14 December, 2021;
originally announced December 2021.
-
Attentive Contextual Carryover for Multi-Turn End-to-End Spoken Language Understanding
Authors:
Kai Wei,
Thanh Tran,
Feng-Ju Chang,
Kanthashree Mysore Sathyendra,
Thejaswi Muniyappa,
Jing Liu,
Anirudh Raju,
Ross McGowan,
Nathan Susanj,
Ariya Rastrow,
Grant P. Strimel
Abstract:
Recent years have seen significant advances in end-to-end (E2E) spoken language understanding (SLU) systems, which directly predict intents and slots from spoken audio. While dialogue history has been exploited to improve conventional text-based natural language understanding systems, current E2E SLU approaches have not yet incorporated such critical contextual signals in multi-turn and task-oriented dialogues. In this work, we propose a contextual E2E SLU model architecture that uses a multi-head attention mechanism over encoded previous utterances and dialogue acts (actions taken by the voice assistant) of a multi-turn dialogue. We detail alternative methods to integrate these contexts into the state-of-the-art recurrent and transformer-based models. When applied to a large de-identified dataset of utterances collected by a voice assistant, our method reduces average word and semantic error rates by 10.8% and 12.6%, respectively. We also present results on a publicly available dataset and show that our method significantly improves performance over a noncontextual baseline.
Submitted 13 December, 2021;
originally announced December 2021.
-
Universal Deep Network for Steganalysis of Color Image based on Channel Representation
Authors:
Kangkang Wei,
Weiqi Luo,
Shunquan Tan,
Jiwu Huang
Abstract:
Up to now, most existing steganalytic methods are designed for grayscale images, and they are not suitable for color images that are widely used in current social networks. In this paper, we design a universal color image steganalysis network (called UCNet) in spatial and JPEG domains. The proposed method includes preprocessing, convolutional, and classification modules. To preserve the steganographic artifacts in each color channel, in preprocessing module, we firstly separate the input image into three channels according to the corresponding embedding spaces (i.e. RGB for spatial steganography and YCbCr for JPEG steganography), and then extract the image residuals with 62 fixed high-pass filters, finally concatenate all truncated residuals for subsequent analysis rather than adding them together with normal convolution like existing CNN-based steganalyzers. To accelerate the network convergence and effectively reduce the number of parameters, in convolutional module, we carefully design three types of layers with different shortcut connections and group convolution structures to further learn high-level steganalytic features. In classification module, we employ a global average pooling and fully connected layer for classification. We conduct extensive experiments on ALASKA II to demonstrate that the proposed method can achieve state-of-the-art results compared with the modern CNN-based steganalyzers (e.g., SRNet and J-YeNet) in both spatial and JPEG domains, while keeping relatively few memory requirements and training time. Furthermore, we also provide necessary descriptions and many ablation experiments to verify the rationality of the network design.
Submitted 23 November, 2021;
originally announced November 2021.
-
An Inexact Riemannian Proximal Gradient Method
Authors:
Wen Huang,
Ke Wei
Abstract:
This paper considers the problem of minimizing the sum of a differentiable function and a nonsmooth function on a Riemannian manifold. In recent years, the proximal gradient method and its variants have been generalized to the Riemannian setting for solving such problems. Different approaches to generalizing the proximal mapping to the Riemannian setting lead to different versions of Riemannian proximal gradient methods. However, their convergence analyses all rely on solving their Riemannian proximal mapping exactly, which is either too expensive or impracticable. In this paper, we study the convergence of an inexact Riemannian proximal gradient method. It is proven that if the proximal mapping is solved sufficiently accurately, then the global convergence and local convergence rate based on the Riemannian Kurdyka-Łojasiewicz property can be guaranteed. Moreover, practical conditions on the accuracy for solving the Riemannian proximal mapping are provided. As a byproduct, the proximal gradient method on the Stiefel manifold proposed in [CMSZ2020] can be viewed as the inexact Riemannian proximal gradient method provided the proximal mapping is solved to a certain accuracy. Finally, numerical experiments on sparse principal component analysis are conducted to test the proposed practical conditions.
Submitted 14 November, 2021;
originally announced November 2021.
-
Improved Xception with Dual Attention Mechanism and Feature Fusion for Face Forgery Detection
Authors:
Hao Lin,
Weiqi Luo,
Kangkang Wei,
Minglin Liu
Abstract:
With the rapid development of deep learning technology, more and more face forgeries by deepfake are widely spread on social media, causing serious social concern. Face forgery detection has become a research hotspot in recent years, and many related methods have been proposed. For images with low quality and/or diverse sources, however, the detection performance of existing methods is still far from satisfactory. In this paper, we propose an improved Xception with dual attention mechanism and feature fusion for face forgery detection. Different from the middle flow in the original Xception model, we try to catch different high-semantic features of the face images using different levels of convolution, and introduce the convolutional block attention module and feature fusion to refine and reorganize those high-semantic features. In the exit flow, we employ the self-attention mechanism and depthwise separable convolution to learn the global information and local information of the fused features separately to improve the classification ability of the proposed model. Experimental results evaluated on three Deepfake datasets demonstrate that the proposed method outperforms Xception as well as other related methods both in effectiveness and generalization ability.
Submitted 28 September, 2021;
originally announced September 2021.
-
Measurement of charged-pion production in deep-inelastic scattering off nuclei with the CLAS detector
Authors:
S. Moran,
R. Dupre,
H. Hakobyan,
M. Arratia,
W. K. Brooks,
A. Borquez,
A. El Alaoui,
L. El Fassi,
K. Hafidi,
R. Mendez,
T. Mineeva,
S. J. Paul,
M. J. Amaryan,
Giovanni Angelini,
Whitney R. Armstrong,
H. Atac,
N. A. Baltzell,
L. Barion,
M. Bashkanov,
M. Battaglieri,
I. Bedlinskiy,
Fatiha Benmokhtar,
A. Bianconi,
L. Biondo,
A. S. Biselli
, et al. (119 additional authors not shown)
Abstract:
Background: Energetic quarks in nuclear DIS propagate through the nuclear medium. Processes that are believed to occur inside nuclei include quark energy loss through medium-stimulated gluon bremsstrahlung and intra-nuclear interactions of forming hadrons. More data are required to gain a more complete understanding of these effects. Purpose: To test the theoretical models of parton transport and hadron formation, we compared their predictions for the nuclear and kinematic dependence of pion production in nuclei. Methods: We have measured charged-pion production in semi-inclusive DIS off D, C, Fe, and Pb using the CLAS detector and the CEBAF 5.014 GeV electron beam. We report results on the nuclear-to-deuterium multiplicity ratio for $π^{+}$ and $π^{-}$ as a function of energy transfer, four-momentum transfer, and pion energy fraction or transverse momentum - the first three-dimensional study of its kind. Results: The $π^{+}$ multiplicity ratio is found to depend strongly on the pion fractional energy $z$, and reaches minimum values of $0.67\pm0.03$, $0.43\pm0.02$, and $0.27\pm0.01$ for the C, Fe, and Pb targets, respectively. The $z$ dependences of the multiplicity ratios for $π^{+}$ and $π^{-}$ are equal within uncertainties for C and Fe targets but show differences at the level of 10$\%$ for the Pb-target data. The results are qualitatively described by the GiBUU transport model, as well as with a model based on hadron absorption, but are in tension with calculations based on nuclear fragmentation functions. Conclusions: These precise results will strongly constrain the kinematic and flavor dependence of nuclear effects in hadron production, probing an unexplored kinematic region. They will help to reveal how the nucleus reacts to a fast quark, thereby shedding light on its color structure, transport properties, and on the mechanisms of the hadronization process.
Submitted 13 January, 2022; v1 submitted 21 September, 2021;
originally announced September 2021.
-
Is Attention Better Than Matrix Decomposition?
Authors:
Zhengyang Geng,
Meng-Hao Guo,
Hongxu Chen,
Xia Li,
Ke Wei,
Zhouchen Lin
Abstract:
As an essential ingredient of modern deep learning, attention mechanism, especially self-attention, plays a vital role in the global correlation discovery. However, is hand-crafted attention irreplaceable when modeling the global context? Our intriguing finding is that self-attention is not better than the matrix decomposition (MD) model developed 20 years ago regarding the performance and computational cost for encoding the long-distance dependencies. We model the global context issue as a low-rank recovery problem and show that its optimization algorithms can help design global information blocks. This paper then proposes a series of Hamburgers, in which we employ the optimization algorithms for solving MDs to factorize the input representations into sub-matrices and reconstruct a low-rank embedding. Hamburgers with different MDs can perform favorably against the popular global context module self-attention when carefully coping with gradients back-propagated through MDs. Comprehensive experiments are conducted in the vision tasks where it is crucial to learn the global context, including semantic segmentation and image generation, demonstrating significant improvements over self-attention and its variants.
Submitted 28 December, 2021; v1 submitted 9 September, 2021;
originally announced September 2021.
-
First-time measurement of Timelike Compton Scattering
Authors:
P. Chatagnon,
S. Niccolai,
S. Stepanyan,
M. J. Amaryan,
G. Angelini,
W. R. Armstrong,
H. Atac,
C. Ayerbe Gayoso,
N. A. Baltzell,
L. Barion,
M. Bashkanov,
M. Battaglieri,
I. Bedlinskiy,
F. Benmokhtar,
A. Bianconi,
L. Biondo,
A. S. Biselli,
M. Bondi,
F. Bossù,
S. Boiarinov,
W. J. Briscoe,
W. K. Brooks,
D. Bulumulla,
V. D. Burkert,
D. S. Carman
, et al. (124 additional authors not shown)
Abstract:
We present the first measurement of the Timelike Compton Scattering process, $γp\to p^\prime γ^* (γ^*\to e^+e^-) $, obtained with the CLAS12 detector at Jefferson Lab. The photon beam polarization and the decay lepton angular asymmetries are reported in the range of timelike photon virtualities $2.25<Q^{\prime 2}<9$ GeV$^2$, squared momentum transferred $0.1<-t<0.8$ GeV$^2$, and average total center-of-mass energy squared ${s}=14.5$ GeV$^2$. The photon beam polarization asymmetry, similar to the beam-spin asymmetry in Deeply Virtual Compton Scattering, is sensitive to the imaginary part of the Compton Form Factors and provides a way to test the universality of the Generalized Parton Distributions. The angular asymmetry of the decay leptons accesses the real part of the Compton Form Factors and thus the D-term in the parametrization of the Generalized Parton Distributions.
Submitted 26 August, 2021;
originally announced August 2021.
-
Implicit Regularization and Entrywise Convergence of Riemannian Optimization for Low Tucker-Rank Tensor Completion
Authors:
Haifeng Wang,
Jinchi Chen,
Ke Wei
Abstract:
This paper is concerned with the low Tucker-rank tensor completion problem, which is about reconstructing a tensor $ T \in\mathbb{R}^{n\times n \times n}$ of low multilinear rank from partially observed entries. Riemannian optimization algorithms are a class of efficient methods for this problem, but the theoretical convergence analysis is still lacking. In this manuscript, we establish the entrywise convergence of the vanilla Riemannian gradient method for low Tucker-rank tensor completion under the nearly optimal sampling complexity $O(n^{3/2})$. Meanwhile, the implicit regularization phenomenon of the algorithm has also been revealed. As far as we know, this is the first work that has shown the entrywise convergence and implicit regularization property of a non-convex method for low Tucker-rank tensor completion. The analysis relies on the leave-one-out technique, and some of the technical results developed in the paper might be of broader interest in investigating the properties of other non-convex methods for this problem.
Submitted 2 August, 2023; v1 submitted 17 August, 2021;
originally announced August 2021.
-
SR-HetGNN: Session-based Recommendation with Heterogeneous Graph Neural Network
Authors:
Jinpeng Chen,
Haiyang Li,
Xudong Zhang,
Fan Zhang,
Senzhang Wang,
Kaimin Wei,
Jiaqi Ji
Abstract:
The Session-Based Recommendation System aims to predict the user's next click based on their previous session sequence. The current studies generally learn user preferences according to the transitions of items in the user's session sequence. However, other effective information in the session sequence, such as user profiles, is largely ignored, which may leave the model unable to learn the user's specific preferences. In this paper, we propose SR-HetGNN, a novel session recommendation method that uses a heterogeneous graph neural network (HetGNN) to learn session embeddings and capture the specific preferences of anonymous users. Specifically, SR-HetGNN first constructs heterogeneous graphs containing various types of nodes according to the session sequence, which can capture the dependencies among items, users, and sessions. Second, HetGNN captures the complex transitions between items and learns the item embeddings containing user information. Finally, local and global session embeddings are combined with the attentional network to obtain the final session embedding, considering the influence of users' long and short-term preferences. SR-HetGNN is shown to be superior to the existing state-of-the-art session-based recommendation methods through extensive experiments on two large real-world datasets, Diginetica and Tmall.
Submitted 5 October, 2023; v1 submitted 12 August, 2021;
originally announced August 2021.
-
Improved $Λp$ Elastic Scattering Cross Sections Between 0.9 and 2.0 GeV/c and Connections to the Neutron Star Equation of State
Authors:
CLAS Collaboration,
J. Rowley,
N. Compton,
C. Djalali,
K. Hicks,
J. Price,
N. Zachariou,
K. P. Adhikari,
W. R. Armstrong,
H. Atac,
L. Baashen,
L. Barion,
M. Bashkanov,
M. Battaglieri,
I. Bedlinskiy,
F. Benmokhtar,
A. Bianconi,
L. Biondo,
A. S. Biselli,
M. Bondi,
F. Bossu,
S. Boiarinov,
W. J. Briscoe,
W. K. Brooks,
D. Bulumulla
, et al. (121 additional authors not shown)
Abstract:
Strange matter is believed to exist in the cores of neutron stars based on simple kinematics. If this is true, then hyperon-nucleon interactions will play a significant part in the neutron star equation of state (EOS). Yet, compared to other elastic scattering processes, there is very little data on $Λ$-$N$ scattering. This experiment utilized the CLAS detector to study the $Λp \rightarrow Λp$ elastic scattering cross section in the incident $Λ$ momentum range 0.9-2.0 GeV/c. This is the first data on this reaction in several decades. The new cross sections have significantly better accuracy and precision than the existing world data, and the techniques developed here can also be used in future experiments.
Submitted 6 August, 2021;
originally announced August 2021.
-
Physics-based Noise Modeling for Extreme Low-light Photography
Authors:
Kaixuan Wei,
Ying Fu,
Yinqiang Zheng,
Jiaolong Yang
Abstract:
Enhancing the visibility in extreme low-light environments is a challenging task. Under nearly lightless conditions, existing image denoising methods could easily break down due to significantly low SNR. In this paper, we systematically study the noise statistics in the imaging pipeline of CMOS photosensors, and formulate a comprehensive noise model that can accurately characterize the real noise structures. Our novel model considers the noise sources caused by digital camera electronics which are largely overlooked by existing methods yet have significant influence on raw measurement in the dark. It provides a way to decouple the intricate noise structure into different statistical distributions with physical interpretations. Moreover, our noise model can be used to synthesize realistic training data for learning-based low-light denoising algorithms. In this regard, although promising results have been shown recently with deep convolutional neural networks, the success heavily depends on abundant noisy-clean image pairs for training, which are tremendously difficult to obtain in practice. Generalizing their trained models to images from new devices is also problematic. Extensive experiments on multiple low-light denoising datasets -- including a newly collected one in this work covering various devices -- show that a deep neural network trained with our proposed noise formation model can reach surprisingly high accuracy. The results are on par with or sometimes even outperform training with paired real data, opening a new door to real-world extreme low-light photography.
Submitted 4 August, 2021;
originally announced August 2021.
-
Dynamic Proximal Unrolling Network for Compressive Imaging
Authors:
Yixiao Yang,
Ran Tao,
Kaixuan Wei,
Ying Fu
Abstract:
Compressive imaging aims to recover a latent image from under-sampled measurements, which is a seriously ill-posed inverse problem. Recently, deep neural networks have been applied to this problem with superior results, owing to the learned advanced image priors. These approaches, however, require training separate models for different imaging modalities and sampling ratios, leading to overfitting to specific settings. In this paper, a dynamic proximal unrolling network (dubbed DPUNet) is proposed, which can handle a variety of measurement matrices via one single model without retraining. Specifically, DPUNet can exploit both the embedded observation model via gradient descent and imposed image priors by learned dynamic proximal operators, achieving joint reconstruction. A key component of DPUNet is a dynamic proximal mapping module, whose parameters can be dynamically adjusted at the inference stage so that it adapts to different imaging settings. Experimental results demonstrate that the proposed DPUNet can effectively handle multiple compressive imaging modalities under varying sampling ratios and noise levels via only one trained model, and outperform the state-of-the-art approaches.
Submitted 25 October, 2021; v1 submitted 22 July, 2021;
originally announced July 2021.
-
A Novel Strategy for GaN-on-Diamond Device with a High Thermal Boundary Conductance
Authors:
Fengwen Mu,
Bin Xu,
Xinhua Wang,
Runhua Gao,
Sen Huang,
Ke Wei,
Kai Takeuchi,
Xiaojuan Chen,
Haibo Yin,
Dahai Wang,
Jiahan Yu,
Tadatomo Suga,
Junichiro Shiomi,
Xinyu Liu
Abstract:
To achieve high device performance and high reliability for the gallium nitride (GaN)-based high electron mobility transistors (HEMTs), efficient heat dissipation is important but remains challenging. Enormous efforts have been made to transfer a GaN device layer onto a diamond substrate with a high thermal conductivity by bonding. In this work, two GaN-diamond bonded composites are prepared via modified surface activated bonding (SAB) at room temperature with silicon interlayers of different thicknesses (15 nm and 22 nm). Before and after a post-annealing process at 800 °C, the thermal boundary conductance (TBC) across the bonded interface including the interlayer and the stress of the GaN layer are investigated by time-domain thermoreflectance and Raman spectroscopy, respectively. After bonding, the 15 nm Si interlayer achieved a higher TBC. The post-annealing significantly increased the TBC of both interfaces, while the TBC of the 22 nm silicon interlayer increased more and became higher than that of the 15 nm interlayer. A detailed investigation of the microstructure and composition of the interfaces was carried out to understand the difference in interfacial thermal conduction. The obtained stress was no more than 230 MPa both before and after the annealing, and this high thermal stability of the bonded composites indicates that the room temperature bonding can realize a GaN-on-diamond template suitable for further epitaxial growth or device process. This work brings a novel strategy of SAB followed by high-temperature annealing to fabricate a GaN-on-diamond device with a high TBC.
Submitted 22 July, 2021;
originally announced July 2021.
-
Low-Latency Federated Learning over Wireless Channels with Differential Privacy
Authors:
Kang Wei,
Jun Li,
Chuan Ma,
Ming Ding,
Cailian Chen,
Shi Jin,
Zhu Han,
H. Vincent Poor
Abstract:
In federated learning (FL), model training is distributed over clients and local models are aggregated by a central server. The performance of uploaded models in such situations can vary widely due to imbalanced data distributions, potential demands on privacy protections, and quality of transmissions. In this paper, we aim to minimize FL training delay over wireless channels, constrained by overall training performance as well as each client's differential privacy (DP) requirement. We solve this problem in the framework of multi-agent multi-armed bandit (MAMAB) to deal with the situation where there are multiple clients confronting different unknown transmission environments, e.g., channel fading and interferences. Specifically, we first transform the long-term constraints on both training performance and each client's DP into a virtual queue based on the Lyapunov drift technique. Then, we convert the MAMAB to a max-min bipartite matching problem at each communication round, by estimating rewards with the upper confidence bound (UCB) approach. More importantly, we propose two efficient solutions to this matching problem, i.e., modified Hungarian algorithm and greedy matching with a better alternative (GMBA), in which the first one can achieve the optimal solution with a high complexity while the second one approaches a better trade-off by enabling a verified low-complexity with little performance loss. In addition, we develop an upper bound on the expected regret of this MAMAB based FL framework, which shows a linear growth over the logarithm of communication rounds, justifying its theoretical feasibility. Extensive experiments are conducted to validate the effectiveness of our proposed algorithms, and the impacts of various parameters on the FL performance over wireless edge networks are also discussed.
Submitted 11 September, 2021; v1 submitted 20 June, 2021;
originally announced June 2021.
-
Quantum crosstalk cancellation for fast entangling gates and improved multi-qubit performance
Authors:
K. X. Wei,
E. Magesan,
I. Lauer,
S. Srinivasan,
D. F. Bogorin,
S. Carnevale,
G. A. Keefe,
Y. Kim,
D. Klaus,
W. Landers,
N. Sundaresan,
C. Wang,
E. J. Zhang,
M. Steffen,
O. E. Dial,
D. C. McKay,
A. Kandala
Abstract:
Quantum computers built with superconducting artificial atoms already stretch the limits of their classical counterparts. While the lowest energy states of these artificial atoms serve as the qubit basis, the higher levels are responsible for both a host of attractive gate schemes as well as generating undesired interactions. In particular, when coupling these atoms to generate entanglement, the higher levels cause shifts in the computational levels that lead to unwanted $ZZ$ quantum crosstalk. Here, we present a novel technique to manipulate the energy levels and mitigate this crosstalk via a simultaneous AC Stark effect on coupled qubits. This breaks a fundamental deadlock between qubit-qubit coupling and crosstalk, leading to a 90 ns CNOT with a gate error of (0.19 $\pm$ 0.02) $\%$ and the demonstration of a novel CZ gate with fixed-coupling single-junction transmon qubits. Furthermore, we show a definitive improvement in circuit performance with crosstalk cancellation over seven qubits, demonstrating the scalability of the technique. This work paves the way for superconducting hardware with faster gates and greatly improved multi-qubit circuit fidelities.
Submitted 1 June, 2021;
originally announced June 2021.
-
Blockchain Assisted Federated Learning over Wireless Channels: Dynamic Resource Allocation and Client Scheduling
Authors:
Xiumei Deng,
Jun Li,
Chuan Ma,
Kang Wei,
Long Shi,
Ming Ding,
Wen Chen,
H. Vincent Poor
Abstract:
The blockchain technology has been extensively studied to enable distributed and tamper-proof data processing in federated learning (FL). Most existing blockchain assisted FL (BFL) frameworks have employed a third-party blockchain network to decentralize the model aggregation process. However, decentralized model aggregation is vulnerable to pooling and collusion attacks from the third-party blockchain network. Driven by this issue, we propose a novel BFL framework that features the integration of training and mining at the client side. To optimize the learning performance of FL, we propose to maximize the long-term time average (LTA) training data size under a constraint of LTA energy consumption. To this end, we formulate a joint optimization problem of training client selection and resource allocation (i.e., the transmit power and computation frequency at the client side), and solve the long-term mixed integer non-linear programming based on a Lyapunov technique. In particular, the proposed dynamic resource allocation and client scheduling (DRACS) algorithm can achieve a trade-off of [$\mathcal{O}(1/V)$, $\mathcal{O}(\sqrt{V})$] to balance the maximization of the LTA training data size and the minimization of the LTA energy consumption with a control parameter $V$. Our experimental results show that the proposed DRACS algorithm achieves better learning accuracy than benchmark client scheduling strategies with limited time or energy consumption.
Submitted 31 October, 2022; v1 submitted 31 May, 2021;
originally announced May 2021.
-
Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design
Authors:
Chuan Ma,
Jun Li,
Ming Ding,
Kang Wei,
Wen Chen,
H. Vincent Poor
Abstract:
Owing to the low communication costs and privacy-promoting capabilities, Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients. However, with the distributed architecture, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training. In this paper, we model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk. Specifically, we first investigate the impact on the models caused by unreliable clients by deriving a convergence upper bound on the loss function based on the gradient descent updates. Our theoretical bounds reveal that with a fixed amount of total computational resources, there exists an optimal number of local training iterations in terms of convergence performance. We further design a novel defensive mechanism, named deep neural network based secure aggregation (DeepSA). Our experimental results validate our theoretical analysis. In addition, the effectiveness of DeepSA is verified by comparing with other state-of-the-art defensive mechanisms.
Submitted 31 July, 2021; v1 submitted 10 May, 2021;
originally announced May 2021.
-
Involutions of Halphen Pencils of Index 2 and Discrete Integrable Systems
Authors:
Kangning Wei
Abstract:
We constructed involutions for a Halphen pencil of index 2, and proved that the birational mapping corresponding to the autonomous reduction of the elliptic Painlevé equation for the same pencil can be obtained as the composition of two such involutions.
Submitted 7 May, 2021;
originally announced May 2021.
-
Simple Quantum Key Distribution using a Stable Transmitter-Receiver Scheme
Authors:
Di Ma,
Xin Liu,
Chunfeng Huang,
Huasheng Chen,
Huanbin Lin,
Kejin Wei
Abstract:
Quantum Key Distribution (QKD) is a technology that allows secure key exchange between two distant users. Widespread adoption of QKD requires the development of simple, low-cost, and stable systems. However, implementation of the current QKD requires a complex self-alignment process during the initial stage and additional hardware to compensate for environmental disturbances. In this study, we present the implementation of a simple QKD system based on a stable transmitter-receiver scheme, which simplifies the self-alignment and is robust enough to withstand environmental disturbances. In the stability test, the implemented system remains stable for 48 hours and exhibits an average quantum bit error rate of less than 1\% without any feedback control. The scheme is also tested over a fiber spool, obtaining a stable and secure finite key rate of 7.32k bits per second over a fiber spool extending up to 75 km. The demonstrated long-term stability and obtained secure key rate show that our method of implementation is a promising alternative for practical QKD systems, in particular for CubeSat platforms and satellite applications.
Submitted 15 April, 2021;
originally announced April 2021.
-
Fast quantum state reconstruction via accelerated non-convex programming
Authors:
Junhyung Lyle Kim,
George Kollias,
Amir Kalev,
Ken X. Wei,
Anastasios Kyrillidis
Abstract:
We propose a new quantum state reconstruction method that combines ideas from compressed sensing, non-convex optimization, and acceleration methods. The algorithm, called Momentum-Inspired Factored Gradient Descent (\texttt{MiFGD}), extends the applicability of quantum tomography to larger systems. Despite being a non-convex method, \texttt{MiFGD} converges \emph{provably} close to the true density matrix at an accelerated linear rate, in the absence of experimental and statistical noise, and under common assumptions. With this manuscript, we present the method, prove its convergence property and provide Frobenius norm bound guarantees with respect to the true density matrix. From a practical point of view, we benchmark the algorithm performance with respect to other existing methods, in both synthetic and real experiments performed on an IBM quantum processing unit. We find that the proposed algorithm performs orders of magnitude faster than state-of-the-art approaches, with the same or better accuracy. In both synthetic and real experiments, we observed accurate and robust reconstruction, despite experimental and statistical noise in the tomographic data. Finally, we provide ready-to-use code for state tomography of multi-qubit systems.
Submitted 23 March, 2022; v1 submitted 14 April, 2021;
originally announced April 2021.