-
SzCORE as a benchmark: report from the seizure detection challenge at the 2025 AI in Epilepsy and Neurological Disorders Conference
Authors:
Jonathan Dan,
Amirhossein Shahbazinia,
Christodoulos Kechris,
David Atienza
Abstract:
Reliable automatic seizure detection from long-term EEG remains a challenge, as current machine learning models often fail to generalize across patients or clinical settings. Manual EEG review remains the clinical standard, underscoring the need for robust models and standardized evaluation. To rigorously assess algorithm performance, we organized a challenge using a private dataset of continuous EEG recordings from 65 subjects (4,360 hours). Expert neurophysiologists annotated the data, providing ground truth for seizure events. Participants were required to detect seizure onset and duration, with evaluation based on event-based metrics, including sensitivity, precision, F1-score, and false positives per day. The SzCORE framework ensured standardized evaluation. The primary ranking criterion was the event-based F1-score, reflecting clinical relevance by balancing sensitivity and false positives. The challenge received 30 submissions from 19 teams, with 28 algorithms evaluated. Results revealed wide variability in performance, with a top F1-score of 43% (sensitivity 37%, precision 45%), highlighting the ongoing difficulty of seizure detection. The challenge also revealed a gap between reported performance and real-world evaluation, emphasizing the importance of rigorous benchmarking. Compared to previous challenges and commercial systems, the best-performing algorithm in this contest showed improved performance. Importantly, the challenge platform now supports continuous benchmarking, enabling reproducible research, integration of new datasets, and clinical evaluation of seizure detection algorithms using a standardized framework.
Submitted 19 May, 2025;
originally announced May 2025.
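The event-based metrics named in this abstract (sensitivity, precision, F1-score, false positives per day) can be illustrated with a minimal sketch. This is not the SzCORE implementation: the any-overlap matching rule, the function names, and the example intervals are assumptions made here for illustration only.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def event_scores(reference, hypothesis, record_hours):
    """Score detected events against annotated events (times in seconds).

    Assumption: a reference event counts as detected if any hypothesis
    event overlaps it; a hypothesis event overlapping no reference event
    is a false positive.
    """
    tp = sum(any(overlaps(r, h) for h in hypothesis) for r in reference)
    fn = len(reference) - tp
    fp = sum(not any(overlaps(h, r) for r in reference) for h in hypothesis)

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else float("nan")
    fp_per_day = fp / (record_hours / 24.0)
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
        "fp_per_day": fp_per_day,
    }


# Hypothetical 48-hour recording with two annotated seizures.
reference = [(100, 160), (4000, 4060)]
hypothesis = [(110, 150), (2000, 2030), (9000, 9010)]
scores = event_scores(reference, hypothesis, record_hours=48)
```

In this toy case one of the two annotated events is detected and two detections are spurious, so precision is penalized even though each false alarm is only a few seconds long.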
-
Time series saliency maps: explaining models across multiple domains
Authors:
Christodoulos Kechris,
Jonathan Dan,
David Atienza
Abstract:
Traditional saliency map methods, popularized in computer vision, highlight individual points (pixels) of the input that contribute the most to the model's output. However, in time series they offer limited insight, as semantically meaningful features are often found in other domains. We introduce Cross-domain Integrated Gradients, a generalization of Integrated Gradients. Our method enables feature attributions on any domain that can be formulated as an invertible, differentiable transformation of the time domain. Crucially, our derivation extends the original Integrated Gradients into the complex domain, enabling frequency-based attributions. We provide the necessary theoretical guarantees, namely, path independence and completeness. Our approach reveals interpretable, problem-specific attributions that time-domain methods cannot capture, on three real-world tasks: wearable sensor heart rate extraction, electroencephalography-based seizure detection, and zero-shot time-series forecasting. We release an open-source TensorFlow/PyTorch library to enable plug-and-play cross-domain explainability for time-series models. These results demonstrate the ability of cross-domain integrated gradients to provide semantically meaningful insights in time-series models that are impossible with traditional time-domain saliency.
Submitted 19 May, 2025;
originally announced May 2025.
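For readers unfamiliar with the baseline method that this abstract generalizes, the sketch below shows standard, real-valued Integrated Gradients and checks the completeness property numerically. It does not implement the cross-domain or complex-valued extension, and it does not use the authors' library; the toy model `f`, its analytic gradient, and all names are assumptions for illustration.

```python
import numpy as np


def integrated_gradients(grad_fn, x, baseline, steps=64):
    """Approximate Integrated Gradients along the straight path baseline -> x.

    grad_fn maps an input vector to the gradient of the model output with
    respect to that input. The path integral is approximated with a
    midpoint Riemann sum over `steps` points.
    """
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    alphas = (np.arange(steps) + 0.5) / steps          # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)  # shape (steps, dim)
    avg_grad = np.mean([grad_fn(p) for p in path], axis=0)
    return (x - baseline) * avg_grad


# Toy differentiable "model" (hypothetical): weighted sum of squares.
w = np.array([0.5, -1.0, 2.0])


def f(x):
    return float(np.sum(w * x**2))


def grad_f(x):
    return 2.0 * w * x


x = np.array([1.0, 2.0, -1.0])
baseline = np.zeros(3)
attributions = integrated_gradients(grad_f, x, baseline)

# Completeness: attributions sum to f(x) - f(baseline).
completeness_gap = abs(attributions.sum() - (f(x) - f(baseline)))
```

Because the gradient of this toy model is linear along the path, the midpoint rule is exact and the completeness gap is at floating-point level; for a real network the gap shrinks as `steps` grows.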
-
TransparentGS: Fast Inverse Rendering of Transparent Objects with Gaussians
Authors:
Letian Huang,
Dongwei Ye,
Jialin Dan,
Chengzhi Tao,
Huiwen Liu,
Kun Zhou,
Bo Ren,
Yuanqi Li,
Yanwen Guo,
Jie Guo
Abstract:
The emergence of neural and Gaussian-based radiance field methods has led to considerable advancements in novel view synthesis and 3D object reconstruction. Nonetheless, specular reflection and refraction continue to pose significant challenges due to the instability and incorrect overfitting of radiance fields to high-frequency light variations. Currently, even 3D Gaussian Splatting (3D-GS), as a powerful and efficient tool, falls short in recovering transparent objects with nearby contents due to the existence of apparent secondary ray effects. To address this issue, we propose TransparentGS, a fast inverse rendering pipeline for transparent objects based on 3D-GS. The main contributions are three-fold. Firstly, an efficient representation of transparent objects, transparent Gaussian primitives, is designed to enable specular refraction through a deferred refraction strategy. Secondly, we leverage Gaussian light field probes (GaussProbe) to encode both ambient light and nearby contents in a unified framework. Thirdly, a depth-based iterative probes query (IterQuery) algorithm is proposed to reduce the parallax errors in our probe-based framework. Experiments demonstrate the speed and accuracy of our approach in recovering transparent objects from complex environments, as well as several applications in computer graphics and vision.
Submitted 1 May, 2025; v1 submitted 25 April, 2025;
originally announced April 2025.
-
Treatment Effect Estimation for Exponential Family Outcomes using Neural Networks with Targeted Regularization
Authors:
Jiahong Li,
Zeqin Yang,
Jiayi Dan,
Jixing Xu,
Zhichao Zou,
Peng Zhen,
Jiecheng Guo
Abstract:
Neural Networks (NNs) have become a natural choice for treatment effect estimation due to their strong approximation capabilities. Nevertheless, designing NN-based estimators with desirable properties, such as low bias and double robustness, remains a significant challenge. A common approach to address this is targeted regularization, which modifies the objective function of NNs. However, existing works on targeted regularization are limited to Gaussian-distributed outcomes, significantly restricting their applicability in real-world scenarios. In this work, we aim to bridge this gap by extending this framework to the broader class of exponential family outcomes. Specifically, we first derive the von-Mises expansion of the Average Dose function of Canonical Functions (ADCF), which shows how to construct a doubly robust estimator with good properties. Based on this, we develop an NN-based estimator for ADCF by generalizing functional targeted regularization to exponential families, and provide the corresponding theoretical convergence rate. Extensive experimental results demonstrate the effectiveness of our proposed model.
Submitted 11 February, 2025;
originally announced February 2025.
-
Quantifying Climate Change Impacts on Renewable Energy Generation: A Super-Resolution Recurrent Diffusion Model
Authors:
Xiaochong Dong,
Jun Dan,
Yingyun Sun,
Yang Liu,
Xuemin Zhang,
Shengwei Mei
Abstract:
Driven by global climate change and the ongoing energy transition, the coupling between power supply capabilities and meteorological factors has become increasingly significant. Over the long term, accurately quantifying the power generation of renewable energy under the influence of climate change is essential for the development of sustainable power systems. However, due to interdisciplinary differences in data requirements, climate data often lacks the necessary hourly resolution to capture the short-term variability and uncertainties of renewable energy resources. To address this limitation, a super-resolution recurrent diffusion model (SRDM) has been developed to enhance the temporal resolution of climate data and model the short-term uncertainty. The SRDM incorporates a pre-trained decoder and a denoising network, and generates long-term, high-resolution climate data through a recurrent coupling mechanism. The high-resolution climate data is then converted into power values using a mechanism model, enabling the simulation of wind and photovoltaic (PV) power generation on future long-term scales. Case studies were conducted in the Ejina region of Inner Mongolia, China, using fifth-generation reanalysis (ERA5) and coupled model intercomparison project (CMIP6) data under two climate pathways: SSP126 and SSP585. The results demonstrate that the SRDM outperforms existing generative models in generating super-resolution climate data. Furthermore, the research highlights the estimation biases introduced when low-resolution climate data is used for power conversion.
Submitted 24 March, 2025; v1 submitted 15 December, 2024;
originally announced December 2024.
-
Cough-E: A multimodal, privacy-preserving cough detection algorithm for the edge
Authors:
Stefano Albini,
Lara Orlandic,
Jonathan Dan,
Jérôme Thevenot,
Tomas Teijeiro,
Denisa Andreea Constantinescu,
David Atienza
Abstract:
Continuous cough monitors can greatly aid doctors in home monitoring and treatment of respiratory diseases. Although many algorithms have been proposed, they still face limitations in data privacy and short-term monitoring. Edge-AI offers a promising solution by processing privacy-sensitive data near the source, but challenges arise in deploying resource-intensive algorithms on constrained devices. From a suitable selection of audio and kinematic signals, our methodology aims at the optimal selection of features via Recursive Feature Elimination with Cross-Validation (RFECV), which exploits the explainability of the selected XGB model. Additionally, it analyzes the use of Mel spectrogram features, instead of the more common MFCC. Moreover, a set of hyperparameters for a multimodal implementation of the classifier is explored. Finally, it evaluates the performance based on clinically relevant event-based metrics. We apply our methodology to develop Cough-E, an energy-efficient, multimodal and edge AI cough detection algorithm. It exploits audio and kinematic data in two distinct classifiers, jointly cooperating for a balanced energy and performance trade-off. We demonstrate that our algorithm can be executed in real-time on an ARM Cortex M33 microcontroller. Cough-E achieves a 70.56% energy saving when compared to the audio-only approach, at the cost of a 1.26% relative performance drop, resulting in a 0.78 F1-score. Both Cough-E and the edge-aware model optimization methodology are publicly available as open-source code. This approach demonstrates the benefits of the proposed hardware-aware methodology to enable privacy-preserving cough monitors on the edge, paving the way to efficient cough monitoring.
Submitted 31 October, 2024;
originally announced October 2024.
-
FaceChain-FACT: Face Adapter with Decoupled Training for Identity-preserved Personalization
Authors:
Cheng Yu,
Haoyu Xie,
Lei Shang,
Yang Liu,
Jun Dan,
Liefeng Bo,
Baigui Sun
Abstract:
In the field of human-centric personalized image generation, the adapter-based method obtains the ability to customize and generate portraits by text-to-image training on facial data. This allows for identity-preserved personalization without additional fine-tuning in inference. Although there are improvements in efficiency and fidelity, there is often a significant performance decrease in text-following ability, controllability, and diversity of generated faces compared to the base model. In this paper, we find that the performance degradation is attributable to the failure to decouple identity features from other attributes during extraction, as well as the failure to decouple the portrait generation training from the overall generation task. To address these issues, we propose the Face Adapter with deCoupled Training (FACT) framework, focusing on both model architecture and training strategy. To decouple identity features from others, we leverage a transformer-based face-export encoder and harness fine-grained identity features. To decouple the portrait generation training, we propose Face Adapting Increment Regularization (FAIR), which effectively constrains the effect of face adapters on the facial region, preserving the generative ability of the base model. Additionally, we incorporate a face condition drop and shuffle mechanism, combined with curriculum learning, to enhance facial controllability and diversity. As a result, FACT solely learns identity preservation from training data, thereby minimizing the impact on the original text-to-image capabilities of the base model. Extensive experiments show that FACT has both controllability and fidelity in both text-to-image generation and inpainting solutions for portrait generation.
Submitted 25 October, 2024; v1 submitted 16 October, 2024;
originally announced October 2024.
-
TopoFR: A Closer Look at Topology Alignment on Face Recognition
Authors:
Jun Dan,
Yang Liu,
Jiankang Deng,
Haoyu Xie,
Siyuan Li,
Baigui Sun,
Shan Luo
Abstract:
The field of face recognition (FR) has undergone significant advancements with the rise of deep learning. Recently, the success of unsupervised learning and graph neural networks has demonstrated the effectiveness of data structure information. Considering that the FR task can leverage large-scale training data, which intrinsically contains significant structure information, we aim to investigate how to encode such critical structure information into the latent space. As revealed by our observations, directly aligning the structure information between the input and latent spaces inevitably suffers from an overfitting problem, leading to a structure collapse phenomenon in the latent space. To address this problem, we propose TopoFR, a novel FR model that leverages a topological structure alignment strategy called PTSA and a hard sample mining strategy named SDE. Concretely, PTSA uses persistent homology to align the topological structures of the input and latent spaces, effectively preserving the structure information and improving the generalization performance of the FR model. To mitigate the impact of hard samples on the latent space structure, SDE accurately identifies hard samples by automatically computing a structure damage score (SDS) for each sample, and directs the model to prioritize optimizing these samples. Experimental results on popular face benchmarks demonstrate the superiority of our TopoFR over the state-of-the-art methods. Code and models are available at: https://github.com/modelscope/facechain/tree/main/face_module/TopoFR.
Submitted 14 October, 2024;
originally announced October 2024.
-
Spectral-GS: Taming 3D Gaussian Splatting with Spectral Entropy
Authors:
Letian Huang,
Jie Guo,
Jialin Dan,
Ruoyu Fu,
Shujie Wang,
Yuanqi Li,
Yanwen Guo
Abstract:
Recently, 3D Gaussian Splatting (3D-GS) has achieved impressive results in novel view synthesis, demonstrating high fidelity and efficiency. However, it easily exhibits needle-like artifacts, especially when increasing the sampling rate. Mip-Splatting tries to remove these artifacts with a 3D smoothing filter for frequency constraints and a 2D Mip filter for approximated supersampling. Unfortunately, it tends to produce over-blurred results, and sometimes needle-like Gaussians still persist. Our spectral analysis of the covariance matrix during optimization and densification reveals that current 3D-GS lacks shape awareness, relying instead on spectral radius and view positional gradients to determine splitting. As a result, needle-like Gaussians with small positional gradients and low spectral entropy fail to split and overfit high-frequency details. Furthermore, the filters used in both 3D-GS and Mip-Splatting reduce the spectral entropy and increase the condition number when zooming in to synthesize novel views, causing view inconsistencies and more pronounced artifacts. Our Spectral-GS, based on spectral analysis, introduces 3D shape-aware splitting and 2D view-consistent filtering strategies, effectively addressing these issues, enhancing 3D-GS's capability to represent high-frequency details without noticeable artifacts, and achieving high-quality photorealistic rendering.
Submitted 15 October, 2024; v1 submitted 19 September, 2024;
originally announced September 2024.
-
Don't Think It Twice: Exploit Shift Invariance for Efficient Online Streaming Inference of CNNs
Authors:
Christodoulos Kechris,
Jonathan Dan,
Jose Miranda,
David Atienza
Abstract:
Deep learning time-series processing often relies on convolutional neural networks with overlapping windows. This overlap allows the network to produce an output faster than the window length. However, it introduces additional computations. This work explores the potential to optimize computational efficiency during inference by exploiting convolution's shift-invariance properties to skip the calculation of layer activations between successive overlapping windows. Although convolutions are shift-invariant, zero-padding and pooling operations, widely used in such networks, are not efficient and complicate efficient streaming inference. We introduce StreamiNNC, a strategy to deploy Convolutional Neural Networks for online streaming inference. We explore the adverse effects of zero padding and pooling on the accuracy of streaming inference, deriving theoretical error upper bounds for pooling during streaming. We address these limitations by proposing signal padding and pooling alignment and provide guidelines for designing and deploying models for StreamiNNC. We validate our method on simulated data and on three real-world biomedical signal processing applications. StreamiNNC achieves a low deviation between streaming output and normal inference for all three networks (2.03-3.55% NRMSE). This work demonstrates that it is possible to linearly speed up the inference of streaming CNNs processing overlapping windows, negating the additional computation typically incurred by overlapping windows.
Submitted 6 August, 2024;
originally announced August 2024.
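The core idea in this abstract, reusing activations shared by successive overlapping windows, can be sketched for a single unpadded 1D convolution. This is not StreamiNNC itself: it covers one layer only, with no padding or pooling (the cases the abstract identifies as problematic), and every name below is an assumption for illustration.

```python
import numpy as np


def conv_valid(x, kernel):
    """Unpadded ("valid") 1D cross-correlation, as used in CNN layers."""
    return np.correlate(x, kernel, mode="valid")


def streaming_conv(signal, kernel, window, hop):
    """Layer outputs for successive overlapping windows, reusing shared work.

    Each new window is the previous one shifted by `hop` samples, so all but
    the last `hop` outputs are already known from the previous window.
    Requires hop + len(kernel) - 1 <= window.
    """
    k = len(kernel)
    outputs, cache = [], None
    for start in range(0, len(signal) - window + 1, hop):
        win = signal[start:start + window]
        if cache is None:
            cache = conv_valid(win, kernel)        # full computation, once
        else:
            tail = win[-(hop + k - 1):]            # new samples plus context
            new = conv_valid(tail, kernel)         # exactly `hop` new outputs
            cache = np.concatenate([cache[hop:], new])
        outputs.append(cache.copy())
    return outputs


rng = np.random.default_rng(0)
signal = rng.standard_normal(256)
kernel = rng.standard_normal(7)
window, hop = 64, 8

streamed = streaming_conv(signal, kernel, window, hop)
# Reference: recompute every window from scratch.
full = [conv_valid(signal[s:s + window], kernel)
        for s in range(0, len(signal) - window + 1, hop)]
```

With these settings each streamed step computes 8 outputs instead of 58, and the results match full recomputation up to floating-point rounding. Zero padding breaks this equivalence because the padded edge values depend on the window position.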
-
DC is all you need: describing ReLU from a signal processing standpoint
Authors:
Christodoulos Kechris,
Jonathan Dan,
Jose Miranda,
David Atienza
Abstract:
Non-linear activation functions are crucial in Convolutional Neural Networks. However, until now they have not been well described in the frequency domain. In this work, we study the spectral behavior of ReLU, a popular activation function. We use the ReLU's Taylor expansion to derive its frequency domain behavior. We demonstrate that ReLU introduces higher frequency oscillations in the signal and a constant DC component. Furthermore, we investigate the importance of this DC component, where we demonstrate that it helps the model extract meaningful features related to the input frequency content. We accompany our theoretical derivations with experiments and real-world examples. First, we numerically validate our frequency response model. Then we observe ReLU's spectral behavior on two example models and a real-world one. Finally, we experimentally investigate the role of the DC component introduced by ReLU in the CNN's representations. Our results indicate that the DC helps to converge to a weight configuration that is close to the initial random weights.
Submitted 11 May, 2025; v1 submitted 23 July, 2024;
originally announced July 2024.
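The claim that ReLU adds a DC component and higher-frequency content can be checked numerically on a pure sine. The sketch below is an independent illustration, not the paper's frequency response model; the signal parameters and function name are assumptions.

```python
import numpy as np

n_samples = 1000
cycles = 5                                  # whole periods in the record
t = np.arange(n_samples) / n_samples
x = np.sin(2 * np.pi * cycles * t)          # zero-mean input tone
y = np.maximum(x, 0.0)                      # ReLU


def amplitude_spectrum(sig):
    """One-sided amplitude spectrum (Nyquist-bin scaling ignored for brevity)."""
    spec = np.abs(np.fft.rfft(sig)) / len(sig)
    spec[1:] *= 2.0
    return spec


amp_x = amplitude_spectrum(x)
amp_y = amplitude_spectrum(y)

dc_in, dc_out = amp_x[0], amp_y[0]
second_harmonic_in = amp_x[2 * cycles]
second_harmonic_out = amp_y[2 * cycles]
```

The input has no DC term and no energy at twice its frequency. After ReLU the DC term is close to 1/π and the second harmonic is close to 2/(3π), matching the Fourier series of a half-wave rectified sine.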
-
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation
Authors:
Mushui Liu,
Yuhang Ma,
Yang Zhen,
Jun Dan,
Yunlong Yu,
Zeng Zhao,
Zhipeng Hu,
Bai Liu,
Changjie Fan
Abstract:
Diffusion models have exhibited substantial success in text-to-image generation. However, they often encounter challenges when dealing with complex and dense prompts involving multiple objects, attribute binding, and long descriptions. In this paper, we propose a novel framework called LLM4GEN, which enhances the semantic understanding of text-to-image diffusion models by leveraging the representation of Large Language Models (LLMs). It can be seamlessly incorporated into various diffusion models as a plug-and-play component. A specially designed Cross-Adapter Module (CAM) integrates the original text features of text-to-image models with LLM features, thereby enhancing text-to-image generation. Additionally, to facilitate and correct entity-attribute relationships in text prompts, we develop an entity-guided regularization loss to further improve generation performance. We also introduce DensePrompts, which contains 7,000 dense prompts to provide a comprehensive evaluation for the text-to-image generation task. Experiments indicate that LLM4GEN significantly improves the semantic alignment of SD1.5 and SDXL, demonstrating increases of 9.69% and 12.90% in color on T2I-CompBench, respectively. Moreover, it surpasses existing models in terms of sample quality, image-text alignment, and human evaluation.
Submitted 27 August, 2024; v1 submitted 30 June, 2024;
originally announced July 2024.
-
How to Count Coughs: An Event-Based Framework for Evaluating Automatic Cough Detection Algorithm Performance
Authors:
Lara Orlandic,
Jonathan Dan,
Jerome Thevenot,
Tomas Teijeiro,
Alain Sauty,
David Atienza
Abstract:
Chronic cough disorders are widespread and challenging to assess because they rely on subjective patient questionnaires about cough frequency. Wearable devices running Machine Learning (ML) algorithms are promising for quantifying daily coughs, providing clinicians with objective metrics to track symptoms and evaluate treatments. However, there is a mismatch between state-of-the-art metrics for cough counting algorithms and the information relevant to clinicians. Most works focus on distinguishing cough from non-cough samples, which does not directly provide clinically relevant outcomes such as the number of cough events or their temporal patterns. In addition, typical metrics such as specificity and accuracy can be biased by class imbalance. We propose using event-based evaluation metrics aligned with clinical guidelines on significant cough counting endpoints. We use an ML classifier to illustrate the shortcomings of traditional sample-based accuracy measurements, highlighting their variance due to dataset class imbalance and sample window length. We also present an open-source event-based evaluation framework to test algorithm performance in identifying cough events and rejecting false positives. We provide examples and best practice guidelines in event-based cough counting as a necessary first step to assess algorithm performance with clinical relevance.
Submitted 3 June, 2024;
originally announced June 2024.
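The point that sample-based accuracy and specificity can be biased by class imbalance is easy to reproduce with a toy example. The window counts below are hypothetical and are not taken from the paper or its framework.

```python
def sample_metrics(y_true, y_pred):
    """Sample-based accuracy, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical recording: 10 cough windows among 1000 windows in total.
y_true = [1] * 10 + [0] * 990
# A degenerate detector that never reports a cough.
y_pred = [0] * 1000

metrics = sample_metrics(y_true, y_pred)
```

This detector reaches 99% accuracy and perfect specificity while finding none of the coughs, which is why the abstract argues for event-based metrics instead.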
-
Symmetry quantification and segmentation in STEM imaging through Zernike moments
Authors:
Jiadong Dan,
Cheng Zhang,
Xiaoxu Zhao,
N. Duane Loh
Abstract:
We present a method using Zernike moments for quantifying rotational and reflectional symmetries in scanning transmission electron microscopy (STEM) images, aimed at improving structural analysis of materials at the atomic scale. This technique is robust against common imaging noise and is potentially suited for low-dose imaging and identifying quantum defects. We showcase its utility in the unsupervised segmentation of polytypes in a twisted bilayer TaS$_2$, enabling accurate differentiation of structural phases and monitoring transitions caused by electron beam effects. This approach enhances the analysis of structural variations in crystalline materials, marking a notable advancement in the characterization of structures in materials science.
Submitted 27 May, 2024;
originally announced May 2024.
-
CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation
Authors:
Mushui Liu,
Jun Dan,
Ziqian Lu,
Yunlong Yu,
Yingming Li,
Xi Li
Abstract:
Due to the large-scale image size and object variations, current CNN-based and Transformer-based approaches for remote sensing image semantic segmentation are either suboptimal at capturing long-range dependencies or limited by high computational complexity. In this paper, we propose CM-UNet, comprising a CNN-based encoder for extracting local image features and a Mamba-based decoder for aggregating and integrating global information, facilitating efficient semantic segmentation of remote sensing images. Specifically, a CSMamba block is introduced to build the core segmentation decoder, which employs channel and spatial attention as the gate activation condition of the vanilla Mamba to enhance the feature interaction and global-local information fusion. Moreover, to further refine the output features from the CNN encoder, a Multi-Scale Attention Aggregation (MSAA) module is employed to merge the different scale features. By integrating the CSMamba block and MSAA module, CM-UNet effectively captures the long-range dependencies and multi-scale global contextual information of large-scale remote-sensing images. Experimental results obtained on three benchmarks indicate that the proposed CM-UNet outperforms existing methods in various performance metrics. The code is available at https://github.com/XiaoBuL/CM-UNet.
Submitted 17 May, 2024;
originally announced May 2024.
-
KID-PPG: Knowledge Informed Deep Learning for Extracting Heart Rate from a Smartwatch
Authors:
Christodoulos Kechris,
Jonathan Dan,
Jose Miranda,
David Atienza
Abstract:
Accurate extraction of heart rate from photoplethysmography (PPG) signals remains challenging due to motion artifacts and signal degradation. Although deep learning methods trained as a data-driven inference problem offer promising solutions, they often underutilize existing knowledge from the medical and signal processing community. In this paper, we address three shortcomings of deep learning models: motion artifact removal, degradation assessment, and physiologically plausible analysis of the PPG signal. We propose KID-PPG, a knowledge-informed deep learning model that integrates expert knowledge through adaptive linear filtering, deep probabilistic inference, and data augmentation. We evaluate KID-PPG on the PPGDalia dataset, achieving an average mean absolute error of 2.85 beats per minute, surpassing existing reproducible methods. Our results demonstrate a significant performance improvement in heart rate tracking through the incorporation of prior knowledge into deep learning models. This approach shows promise in enhancing various biomedical applications by incorporating existing expert knowledge in deep learning models.
Submitted 9 October, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio
Authors:
Chao Xu,
Yang Liu,
Jiazheng Xing,
Weida Wang,
Mingze Sun,
Jun Dan,
Tianxin Huang,
Siyuan Li,
Zhi-Qi Cheng,
Ying Tai,
Baigui Sun
Abstract:
In this paper, we abstract the process of people hearing speech, extracting meaningful cues, and creating various dynamically audio-consistent talking faces, termed Listening and Imagining, into the task of high-fidelity diverse talking faces generation from a single audio. Specifically, it involves two critical challenges: one is to effectively decouple identity, content, and emotion from entangled audio, and the other is to maintain intra-video diversity and inter-video consistency. To tackle the issues, we first dig out the intricate relationships among facial factors and simplify the decoupling process, tailoring a Progressive Audio Disentanglement for accurate facial geometry and semantics learning, where each stage incorporates a customized training module responsible for a specific factor. Secondly, to achieve visually diverse and audio-synchronized animation solely from input audio within a single model, we introduce the Controllable Coherent Frame generation, which involves the flexible integration of three trainable adapters with frozen Latent Diffusion Models (LDMs) to focus on maintaining facial geometry and semantics, as well as texture and temporal coherence between frames. In this way, we inherit high-quality diverse generation from LDMs while significantly improving their controllability at a low training cost. Extensive experiments demonstrate the flexibility and effectiveness of our method in handling this paradigm. The codes will be released at https://github.com/modelscope/facechain.
Submitted 31 March, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
PRCL: Probabilistic Representation Contrastive Learning for Semi-Supervised Semantic Segmentation
Authors:
Haoyu Xie,
Changqi Wang,
Jian Zhao,
Yang Liu,
Jun Dan,
Chong Fu,
Baigui Sun
Abstract:
Contrastive learning has driven major advances in Semi-Supervised Semantic Segmentation (S4). However, due to limited annotations, the guidance on unlabeled images is generated by the model itself, which inevitably introduces noise and disturbs the unsupervised training process. To address this issue, we propose a robust contrastive-based S4 framework, termed the Probabilistic Representation Contrastive Learning (PRCL) framework, to enhance the robustness of the unsupervised training process. We model the pixel-wise representation as Probabilistic Representations (PR) via a multivariate Gaussian distribution and tune the contribution of the ambiguous representations to tolerate the risk of inaccurate guidance in contrastive learning. Furthermore, we introduce Global Distribution Prototypes (GDP) by gathering all PRs throughout the whole training process. Since the GDP contains the information of all representations with the same class, it is robust to instantaneous noise in representations and bears the intra-class variance of representations. In addition, we generate Virtual Negatives (VNs) based on GDP to take part in the contrastive learning process. Extensive experiments on two public benchmarks demonstrate the superiority of our PRCL framework.
Submitted 28 February, 2024;
originally announced February 2024.
-
SzCORE: A Seizure Community Open-source Research Evaluation framework for the validation of EEG-based automated seizure detection algorithms
Authors:
Jonathan Dan,
Una Pale,
Alireza Amirshahi,
William Cappelletti,
Thorir Mar Ingolfsson,
Xiaying Wang,
Andrea Cossettini,
Adriano Bernini,
Luca Benini,
Sándor Beniczky,
David Atienza,
Philippe Ryvlin
Abstract:
The need for high-quality automated seizure detection algorithms based on electroencephalography (EEG) becomes ever more pressing with the increasing use of ambulatory and long-term EEG monitoring. Heterogeneity in validation methods of these algorithms influences the reported results and makes comprehensive evaluation and comparison challenging. This heterogeneity concerns in particular the choice of datasets, evaluation methodologies, and performance metrics. In this paper, we propose a unified framework designed to establish standardization in the validation of EEG-based seizure detection algorithms. Based on existing guidelines and recommendations, the framework introduces a set of recommendations and standards related to datasets, file formats, EEG data input content, seizure annotation input and output, cross-validation strategies, and performance metrics. We also propose the 10-20 seizure detection benchmark, a machine-learning benchmark based on public datasets converted to a standardized format. This benchmark defines the machine-learning task as well as reporting metrics. We illustrate the use of the benchmark by evaluating a set of existing seizure detection algorithms. The SzCORE (Seizure Community Open-source Research Evaluation) framework and benchmark are made publicly available along with an open-source software library to facilitate research use, while enabling rigorous evaluation of the clinical significance of the algorithms, fostering a collective effort to more optimally detect seizures to improve the lives of people with epilepsy.
Submitted 8 March, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
LasTGL: An Industrial Framework for Large-Scale Temporal Graph Learning
Authors:
Jintang Li,
Jiawang Dan,
Ruofan Wu,
Jing Zhou,
Sheng Tian,
Yunfei Liu,
Baokun Wang,
Changhua Meng,
Weiqiang Wang,
Yuchang Zhu,
Liang Chen,
Zibin Zheng
Abstract:
Over the past few years, graph neural networks (GNNs) have become powerful and practical tools for learning on (static) graph-structured data. However, many real-world applications, such as social networks and e-commerce, involve temporal graphs where nodes and edges are dynamically evolving. Temporal graph neural networks (TGNNs) have progressively emerged as an extension of GNNs to address time-evolving graphs and have gradually become a trending research topic in both academia and industry. Advancing research and application in such an emerging field necessitates the development of new tools to compose TGNN models and unify their different schemes for dealing with temporal graphs. In this work, we introduce LasTGL, an industrial framework that integrates unified and extensible implementations of common temporal graph learning algorithms for various advanced tasks. The purpose of LasTGL is to provide the essential building blocks for solving temporal graph learning tasks, focusing on the guiding principles of user-friendliness and quick prototyping on which PyTorch is based. In particular, LasTGL provides comprehensive temporal graph datasets, TGNN models and utilities along with well-documented tutorials, making it suitable for both absolute beginners and expert deep learning practitioners.
Submitted 30 November, 2023; v1 submitted 28 November, 2023;
originally announced November 2023.
-
Hetero$^2$Net: Heterophily-aware Representation Learning on Heterogeneous Graphs
Authors:
Jintang Li,
Zheng Wei,
Jiawang Dan,
Jing Zhou,
Yuchang Zhu,
Ruofan Wu,
Baokun Wang,
Zhang Zhen,
Changhua Meng,
Hong Jin,
Zibin Zheng,
Liang Chen
Abstract:
Real-world graphs are typically complex, exhibiting heterogeneity in the global structure, as well as strong heterophily within local neighborhoods. While a growing body of literature has revealed the limitations of common graph neural networks (GNNs) in handling homogeneous graphs with heterophily, little work has been conducted on investigating the heterophily properties in the context of heterogeneous graphs. To bridge this research gap, we identify the heterophily in heterogeneous graphs using metapaths and propose two practical metrics to quantitatively describe the levels of heterophily. Through in-depth investigations on several real-world heterogeneous graphs exhibiting varying levels of heterophily, we have observed that heterogeneous graph neural networks (HGNNs), which inherit many mechanisms from GNNs designed for homogeneous graphs, fail to generalize to heterogeneous graphs with heterophily or low level of homophily. To address the challenge, we present Hetero$^2$Net, a heterophily-aware HGNN that incorporates both masked metapath prediction and masked label prediction tasks to effectively and flexibly handle both homophilic and heterophilic heterogeneous graphs. We evaluate the performance of Hetero$^2$Net on five real-world heterogeneous graph benchmarks with varying levels of heterophily. The results demonstrate that Hetero$^2$Net outperforms strong baselines in the semi-supervised node classification task, providing valuable insights into effectively handling more complex heterogeneous graphs.
Submitted 17 October, 2023;
originally announced October 2023.
-
Self-supervision meets kernel graph neural models: From architecture to augmentations
Authors:
Jiawang Dan,
Ruofan Wu,
Yunpeng Liu,
Baokun Wang,
Changhua Meng,
Tengfei Liu,
Tianyi Zhang,
Ningtao Wang,
Xing Fu,
Qi Li,
Weiqiang Wang
Abstract:
Graph representation learning has now become the de facto standard when handling graph-structured data, with the framework of message-passing graph neural networks (MPNNs) being the most prevalent algorithmic tool. Despite their popularity, MPNNs suffer from several drawbacks, such as limited transparency and expressivity. Recently, the idea of designing neural models on graphs using the theory of graph kernels has emerged as a more transparent and sometimes more expressive alternative to MPNNs, known as kernel graph neural networks (KGNNs). Research on KGNNs is still nascent, leaving several open challenges ranging from algorithmic design to adaptation to other learning paradigms such as self-supervised learning. In this paper, we improve the design and learning of KGNNs. Firstly, we extend the algorithmic formulation of KGNNs by allowing a more flexible graph-level similarity definition that encompasses former proposals like the random walk graph kernel, as well as providing a smoother optimization objective that alleviates the need to introduce combinatorial learning procedures. Secondly, we enhance KGNNs through the lens of self-supervision by developing a novel structure-preserving graph data augmentation method called latent graph augmentation (LGA). Finally, we perform extensive empirical evaluations to demonstrate the efficacy of our proposed mechanisms. Experimental results over benchmark datasets suggest that our proposed model achieves performance comparable to, and sometimes better than, state-of-the-art graph representation learning frameworks with or without self-supervision on graph classification tasks. Comparisons against other previously established graph data augmentation methods verify that the proposed LGA augmentation scheme captures better semantics of graph-level invariance.
Submitted 17 October, 2023;
originally announced October 2023.
-
TransFace: Calibrating Transformer Training for Face Recognition from a Data-Centric Perspective
Authors:
Jun Dan,
Yang Liu,
Haoyu Xie,
Jiankang Deng,
Haoran Xie,
Xuansong Xie,
Baigui Sun
Abstract:
Vision Transformers (ViTs) have demonstrated powerful representation ability in various visual tasks thanks to their intrinsic data-hungry nature. However, we unexpectedly find that ViTs perform poorly when applied to face recognition (FR) scenarios with extremely large datasets. We investigate the reasons for this phenomenon and discover that the existing data augmentation approach and hard sample mining strategy are incompatible with ViT-based FR backbones, because they are not tailored to preserve face structural information or to leverage the information in each local token. To remedy these problems, this paper proposes a superior FR model called TransFace, which employs a patch-level data augmentation strategy named DPAP and a hard sample mining strategy named EHSM. Specifically, DPAP randomly perturbs the amplitude information of dominant patches to expand sample diversity, which effectively alleviates the overfitting problem in ViTs. EHSM utilizes the information entropy in the local tokens to dynamically adjust the importance weight of easy and hard samples during training, leading to a more stable prediction. Experiments on several benchmarks demonstrate the superiority of our TransFace. Code and models are available at https://github.com/DanJun6737/TransFace.
Submitted 19 August, 2023;
originally announced August 2023.
-
A multiscale generative model to understand disorder in domain boundaries
Authors:
Jiadong Dan,
Moaz Waqar,
Ivan Erofeev,
Kui Yao,
John Wang,
Stephen J. Pennycook,
N. Duane Loh
Abstract:
A continuing challenge in atomic resolution microscopy is to identify significant structural motifs and their assembly rules in synthesized materials with limited observations. Here we propose and validate a simple and effective hybrid generative model capable of predicting unseen domain boundaries in a potassium sodium niobate thin film from only a small number of observations, without expensive first-principles calculation. Our results demonstrate that complicated domain boundary structures can arise from simple interpretable local rules, played out probabilistically. We also found new significant tileable boundary motifs and evidence that our system creates domain boundaries with the highest entropy. More broadly, our work shows that simple yet interpretable machine learning models can help us describe and understand the nature and origin of disorder in complex materials.
Submitted 22 May, 2023;
originally announced May 2023.
-
GUARD: Graph Universal Adversarial Defense
Authors:
Jintang Li,
Jie Liao,
Ruofan Wu,
Liang Chen,
Zibin Zheng,
Jiawang Dan,
Changhua Meng,
Weiqiang Wang
Abstract:
Graph convolutional networks (GCNs) have been shown to be vulnerable to small adversarial perturbations, which poses a severe threat and largely limits their applications in security-critical scenarios. To mitigate such a threat, considerable research efforts have been devoted to increasing the robustness of GCNs against adversarial attacks. However, current defense approaches are typically designed to protect GCNs from untargeted adversarial attacks and focus on overall performance, making it challenging to protect important local nodes from more powerful targeted adversarial attacks. Additionally, a trade-off between robustness and performance is often made in existing research. Such limitations highlight the need for developing an effective and efficient approach that can defend local nodes against targeted attacks, without compromising the overall performance of GCNs. In this work, we present a simple yet effective method, named Graph Universal Adversarial Defense (GUARD). Unlike previous works, GUARD protects each individual node from attacks with a universal defensive patch, which is generated once and can be applied to any node (node-agnostic) in a graph. GUARD is fast, straightforward to implement without any change to the network architecture or any additional parameters, and is broadly applicable to any GCN. Extensive experiments on four benchmark datasets demonstrate that GUARD significantly improves robustness for several established GCNs against multiple adversarial attacks and outperforms state-of-the-art defense methods by large margins.
Submitted 12 August, 2023; v1 submitted 20 April, 2022;
originally announced April 2022.
-
Grouped Variable Selection for Generalized Eigenvalue Problems
Authors:
Jonathan Dan,
Simon Geirnaert,
Alexander Bertrand
Abstract:
Many problems require the selection of a subset of variables from a full set of optimization variables. An exhaustive search over all possible subsets of variables is, however, computationally prohibitive, necessitating more efficient but potentially suboptimal search strategies. We focus on sparse variable selection for generalized Rayleigh quotient optimization and generalized eigenvalue problems. Such problems often arise in the signal processing field, e.g., in the design of optimal data-driven filters. We extend and generalize existing work on convex optimization-based variable selection using semidefinite relaxations toward group-sparse variable selection using the $\ell_{1,\infty}$-norm. This group-sparsity allows, for instance, sensor selection for spatio-temporal (instead of purely spatial) filters, and the selection of variables based on multiple generalized eigenvectors instead of only the dominant one. Furthermore, we extensively compare our method to state-of-the-art methods for sensor selection for spatio-temporal filter design in a simulated sensor network setting. The results show that both the proposed algorithm and the backward greedy selection method best approximate the exhaustive solution. However, the backward greedy selection has more specific failure cases, in particular for ill-conditioned covariance matrices. As such, the proposed algorithm is the most robust currently available method for group-sparse variable selection in generalized eigenvalue problems.
Submitted 26 January, 2022; v1 submitted 28 May, 2021;
originally announced May 2021.
-
Learning Motifs and their Hierarchies in Atomic Resolution Microscopy
Authors:
Jiadong Dan,
Xiaoxu Zhao,
Shoucong Ning,
Jiong Lu,
Kian Ping Loh,
N. Duane Loh,
Stephen J. Pennycook
Abstract:
Progress in functional materials discovery has been accelerated by advances in high-throughput materials synthesis and by the development of high-throughput computation. However, a complementary robust and high-throughput structural characterization framework is still lacking. New methods and tools in the field of machine learning suggest that a highly automated high-throughput structural characterization framework based on atomic-level imaging can establish the crucial statistical link between structure and macroscopic properties. Here we develop a machine learning framework towards this goal. Our framework captures local structural features in images with Zernike polynomials, an approach that is demonstrably noise-robust, flexible, and accurate. These features are then classified into readily interpretable structural motifs with a hierarchical active learning scheme powered by a novel unsupervised two-stage relaxed clustering scheme. We have successfully demonstrated the accuracy and efficiency of the proposed methodology by mapping a full spectrum of structural defects, including point defects, line defects, and planar defects in scanning transmission electron microscopy (STEM) images of various 2D materials, with greatly improved separability over existing methods. Our techniques can be easily and flexibly applied to other types of microscopy data with complex features, providing a solid foundation for automatic, multiscale feature analysis with high veracity.
Submitted 29 November, 2021; v1 submitted 23 May, 2020;
originally announced May 2020.
-
IMAC: Impulsive-mitigation adaptive sparse channel estimation based on Gaussian-mixture model
Authors:
Tingping Zhang,
Jingpei Dan,
Guan Gui
Abstract:
Broadband frequency-selective fading channels are usually inherently sparse. By exploiting this sparsity, adaptive sparse channel estimation (ASCE) methods, e.g., reweighted L1-norm least mean square (RL1-LMS), can bring a performance gain if the additive noise satisfies the Gaussian assumption. In real communication environments, however, channel estimation performance is often degraded by unexpected non-Gaussian noise, which includes conventional Gaussian noise and impulsive interference. To design stable communication systems, it is therefore urgent to develop advanced channel estimation methods that remove the impulsive interference and exploit channel sparsity simultaneously. In this paper, a robust impulsive-mitigation adaptive sparse channel estimation (IMAC) method is proposed to solve the aforementioned technical issues. Specifically, the non-Gaussian noise model is first described by a Gaussian mixture model (GMM). Secondly, the cost function of the reweighted L1-norm penalized least absolute error standard (RL1-LAE) algorithm is constructed. Then, the RL1-LAE algorithm is derived to realize the IMAC method. Finally, representative simulation results are provided to corroborate the studies.
Submitted 2 March, 2015;
originally announced March 2015.
-
Conditions for supersonic bent Marshak waves
Authors:
Qiang Xu,
Xiao-dong Ren,
Jing Li,
Jia-kun Dan,
Kun-lun Wang,
Shao-tong Zhou
Abstract:
The supersonic radiation diffusion approximation is a useful way to study radiation transport. Considering the bent Marshak wave theory in two dimensions and a constant source temperature, we obtain the supersonic radiation diffusion conditions, namely a Mach number $M>8(1+\sqrt{\epsilon})/3$ and an optical depth $\tau>1$. A large Mach number requires a high temperature, while a large optical depth requires a low temperature. Only when the source temperature is in a proper range can these conditions be satisfied. Assuming the material opacity and the specific internal energy depend on the temperature and the density through a power law, for a given density, these conditions correspond to a region in source temperature and sample length. This supersonic diffusion region involves both a lower and an upper limit on the source temperature, while that in one dimension gives only a lower limit. Taking $\rm SiO_2$ and Au as examples, we show the supersonic region numerically.
Submitted 15 October, 2014;
originally announced October 2014.
-
Significance of self magnetic field in long-distance collimation of laser-generated electron beams
Authors:
Shi Chen,
Jiaofeng Huang,
Yifei Niu,
Jiakun Dan,
Ziyu Chen,
Jianfeng Li
Abstract:
Long-distance collimation of fast electron beams generated by laser-metallic-wire targets has been observed in recent experiments, but the mechanism behind this phenomenon remains unclear. In this work, we investigate in detail the laser-wire interaction processes with a simplified model and Classical Trajectory Monte Carlo simulations, and demonstrate the significance of the self magnetic fields of the beams in the long-distance collimation. Good agreement of simulated image plate patterns with various experiments and detailed analysis of electron trajectories show that the self magnetic fields provide a restoring force that is critical for the beam collimation. By studying the wire-length dependence of beam divergence in certain experiments, we clarify that the role of the metallic wire is to balance the space-charge effect and thus maintain the collimation.
Submitted 9 October, 2014;
originally announced October 2014.
-
Magnetic Generation due to Mass Difference between Charge Carriers
Authors:
Shi Chen,
JiaKun Dan,
ZiYu Chen,
JianFeng Li
Abstract:
The possibility of spontaneous magnetization due to the "asymmetry in mass" of charge carriers in a system is investigated. Analysis shows that when the masses of positive and negative charge carriers are identical, no magnetization is predicted. However, if the masses of the two species are different, a spontaneous magnetic field would appear, either due to the equipartition of magnetic energy or due to fluctuations together with a feedback mechanism. The conditions for magnetization to occur are also obtained, in the form of an n-T phase diagram. The theory proposed here, if confirmed by future observations and/or experiments, would provide new insight into the origin of magnetic fields in the universe.
Submitted 31 October, 2013;
originally announced November 2013.