-
ReStNet: A Reusable & Stitchable Network for Dynamic Adaptation on IoT Devices
Authors:
Maoyu Wang,
Yao Lu,
Jiaqi Nie,
Zeyu Wang,
Yun Lin,
Qi Xuan,
Guan Gui
Abstract:
With the rapid development of deep learning, a growing number of pre-trained models have become publicly available. However, deploying these fixed models in real-world IoT applications is challenging because different devices possess heterogeneous computational and memory resources, making it impossible to deploy a single model across all platforms. Although traditional compression methods, such as pruning, quantization, and knowledge distillation, can improve efficiency, they become inflexible once applied and cannot adapt to changing resource constraints. To address these issues, we propose ReStNet, a Reusable and Stitchable Network that dynamically constructs a hybrid network by stitching two pre-trained models together. Implementing ReStNet requires addressing several key challenges, including how to select the optimal stitching points, determine the stitching order of the two pre-trained models, and choose an effective fine-tuning strategy. To systematically address these challenges and adapt to varying resource constraints, ReStNet determines the stitching point by calculating layer-wise similarity via Centered Kernel Alignment (CKA). It then constructs the hybrid model by retaining early layers from a larger-capacity model and appending deeper layers from a smaller one. To facilitate efficient deployment, only the stitching layer is fine-tuned. This design enables rapid adaptation to changing budgets while fully leveraging available resources. Moreover, ReStNet supports both homogeneous (CNN-CNN, Transformer-Transformer) and heterogeneous (CNN-Transformer) stitching, allowing different model families to be combined flexibly. Extensive experiments on multiple benchmarks demonstrate that ReStNet achieves flexible accuracy-efficiency trade-offs at runtime while significantly reducing training cost.
Submitted 8 June, 2025;
originally announced June 2025.
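The ReStNet abstract above mentions layer-wise similarity computed via Centered Kernel Alignment (CKA). As a rough illustration only, the following sketch computes the commonly used linear variant of CKA between two activation matrices (rows are samples, columns are features). The function names are hypothetical, the choice of the linear kernel is an assumption, and nothing here is taken from the authors' implementation.

```python
import math

def center_columns(m):
    """Subtract the column mean from every entry of a row-major matrix."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]

def cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b for row-major a (n x p) and b (n x q)."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(x * y for x, y in zip(col_a, col_b))
            total += dot * dot
    return total

def linear_cka(x, y):
    """Linear CKA: ||Yc^T Xc||_F^2 / (||Xc^T Xc||_F * ||Yc^T Yc||_F).

    Assumes neither input has all-constant columns (otherwise the
    denominator is zero).
    """
    xc, yc = center_columns(x), center_columns(y)
    numerator = cross_frobenius_sq(xc, yc)
    denominator = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    return numerator / denominator
```

A score close to 1 indicates that two layers produce similar representations on the same inputs, which is the kind of signal a stitching-point search could compare across candidate layer pairs.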
-
Better Reasoning with Less Data: Enhancing VLMs Through Unified Modality Scoring
Authors:
Mingjie Xu,
Andrew Estornell,
Hongzheng Yang,
Yuzhi Zhao,
Zhaowei Zhu,
Qi Xuan,
Jiaheng Wei
Abstract:
The application of visual instruction tuning and other post-training techniques has significantly enhanced the capabilities of Large Language Models (LLMs) in visual understanding, enriching Vision-Language Models (VLMs) with more comprehensive visual language datasets. However, the effectiveness of VLMs is highly dependent on large-scale, high-quality datasets that ensure precise recognition and accurate reasoning. Two key challenges hinder progress: (1) noisy alignments between images and the corresponding text, which lead to misinterpretation, and (2) ambiguous or misleading text, which obscures visual content. To address these challenges, we propose SCALE (Single modality data quality and Cross modality Alignment Evaluation), a novel quality-driven data selection pipeline for VLM instruction tuning datasets. Specifically, SCALE integrates a cross-modality assessment framework that first assigns each data entry to its appropriate vision-language task, generates general and task-specific captions (covering scenes, objects, style, etc.), and evaluates the alignment, clarity, task rarity, text coherence, and image clarity of each entry based on the generated captions. We reveal that: (1) current unimodal quality assessment methods evaluate one modality while overlooking the rest, which can underestimate samples essential for specific tasks and discard the lower-quality instances that help build model robustness; and (2) appropriately generated image captions provide an efficient way to transfer the image-text multimodal task into a unified text modality.
Submitted 10 June, 2025;
originally announced June 2025.
-
FCOS: A Two-Stage Recoverable Model Pruning Framework for Automatic Modulation Recognition
Authors:
Yao Lu,
Tengfei Ma,
Zeyu Wang,
Zhuangzhi Chen,
Dongwei Xu,
Yun Lin,
Qi Xuan,
Guan Gui
Abstract:
With the rapid development of wireless communications and the growing complexity of digital modulation schemes, traditional manual modulation recognition methods struggle to extract reliable signal features and meet real-time requirements in modern scenarios. Recently, deep learning-based Automatic Modulation Recognition (AMR) approaches have greatly improved classification accuracy. However, their large model sizes and high computational demands hinder deployment on resource-constrained devices. Model pruning provides a general approach to reduce model complexity, but existing weight, channel, and layer pruning techniques each present a trade-off between compression rate, hardware acceleration, and accuracy preservation. To this end, in this paper, we introduce FCOS, a novel Fine-to-COarse two-Stage pruning framework that combines channel-level pruning with layer-level collapse diagnosis to achieve extreme compression, high performance and efficient inference. In the first stage of FCOS, hierarchical clustering and parameter fusion are applied to channel weights to achieve channel-level pruning. Then a Layer Collapse Diagnosis (LaCD) module uses linear probing to identify layers that have collapsed due to the high channel compression ratio and removes them. Experiments on multiple AMR benchmarks demonstrate that FCOS outperforms existing channel and layer pruning methods. Specifically, FCOS achieves 95.51% FLOPs reduction and 95.31% parameter reduction while still maintaining performance close to the original ResNet56, with only a 0.46% drop in accuracy on Sig2019-12. Code is available at https://github.com/yaolu-zjut/FCOS.
Submitted 27 May, 2025;
originally announced May 2025.
-
GUARD: Generation-time LLM Unlearning via Adaptive Restriction and Detection
Authors:
Zhijie Deng,
Chris Yuhao Liu,
Zirui Pang,
Xinlei He,
Lei Feng,
Qi Xuan,
Zhaowei Zhu,
Jiaheng Wei
Abstract:
Large Language Models (LLMs) have demonstrated strong capabilities in memorizing vast amounts of knowledge across diverse domains. However, the ability to selectively forget specific knowledge is critical for ensuring the safety and compliance of deployed models. Existing unlearning efforts typically fine-tune the model with resources such as forget data, retain data, and a calibration model. These additional gradient steps blur the decision boundary between forget and retain knowledge, so unlearning often comes at the expense of overall performance. To avoid the negative impact of fine-tuning, it would be better to unlearn solely at inference time by safely guarding the model against generating responses related to the forget target, without destroying the fluency of text generation. In this work, we propose Generation-time Unlearning via Adaptive Restriction and Detection (GUARD), a framework that enables dynamic unlearning during LLM generation. Specifically, we first employ a prompt classifier to detect unlearning targets and extract the corresponding forbidden token. We then dynamically penalize and filter candidate tokens during generation using a combination of token matching and semantic matching, effectively preventing the model from leaking the forgotten content. Experimental results on copyright content unlearning tasks over the Harry Potter dataset and the MUSE benchmark, as well as entity unlearning tasks on the TOFU dataset, demonstrate that GUARD achieves strong forget quality across various tasks while causing almost no degradation to the LLM's general capabilities, striking an excellent trade-off between forgetting and utility.
Submitted 19 May, 2025;
originally announced May 2025.
-
Adaptive Substructure-Aware Expert Model for Molecular Property Prediction
Authors:
Tianyi Jiang,
Zeyu Wang,
Shanqing Yu,
Qi Xuan
Abstract:
Molecular property prediction is essential for applications such as drug discovery and toxicity assessment. While Graph Neural Networks (GNNs) have shown promising results by modeling molecules as molecular graphs, their reliance on data-driven learning limits their ability to generalize, particularly in the presence of data imbalance and diverse molecular substructures. Existing methods often overlook the varying contributions of different substructures to molecular properties, treating them uniformly. To address these challenges, we propose ASE-Mol, a novel GNN-based framework that leverages a Mixture-of-Experts (MoE) approach for molecular property prediction. ASE-Mol incorporates BRICS decomposition and significant substructure awareness to dynamically identify positive and negative substructures. By integrating a MoE architecture, it reduces the adverse impact of negative motifs while improving adaptability to positive motifs. Experimental results on eight benchmark datasets demonstrate that ASE-Mol achieves state-of-the-art performance, with significant improvements in both accuracy and interpretability.
Submitted 8 April, 2025;
originally announced April 2025.
-
Hierarchical Local-Global Feature Learning for Few-shot Malicious Traffic Detection
Authors:
Songtao Peng,
Lei Wang,
Wu Shuai,
Hao Song,
Jiajun Zhou,
Shanqing Yu,
Qi Xuan
Abstract:
With the rapid growth of internet traffic, malicious network attacks have become increasingly frequent and sophisticated, posing significant threats to global cybersecurity. Traditional detection methods, including rule-based and machine learning-based approaches, struggle to accurately identify emerging threats, particularly in scenarios with limited samples. While recent advances in few-shot learning have partially addressed the data scarcity issue, existing methods still exhibit high false positive rates and lack the capability to effectively capture crucial local traffic patterns. In this paper, we propose HLoG, a novel hierarchical few-shot malicious traffic detection framework that leverages both local and global features extracted from network sessions. HLoG employs a sliding-window approach to segment sessions into phases, capturing fine-grained local interaction patterns through hierarchical bidirectional GRU encoding, while simultaneously modeling global contextual dependencies. We further design a session similarity assessment module that integrates local similarity with global self-attention-enhanced representations, achieving accurate and robust few-shot traffic classification. Comprehensive experiments on three meticulously reconstructed datasets demonstrate that HLoG significantly outperforms existing state-of-the-art methods. Particularly, HLoG achieves superior recall rates while substantially reducing false positives, highlighting its effectiveness and practical value in real-world cybersecurity applications.
Submitted 1 April, 2025;
originally announced April 2025.
-
Unveiling Latent Information in Transaction Hashes: Hypergraph Learning for Ethereum Ponzi Scheme Detection
Authors:
Junhao Wu,
Yixin Yang,
Chengxiang Jin,
Silu Mu,
Xiaolei Qian,
Jiajun Zhou,
Shanqing Yu,
Qi Xuan
Abstract:
With the widespread adoption of Ethereum, financial frauds such as Ponzi schemes have become increasingly rampant in the blockchain ecosystem, posing significant threats to the security of account assets. Existing Ethereum fraud detection methods typically model account transactions as graphs, but this approach primarily focuses on binary transactional relationships between accounts, failing to adequately capture the complex multi-party interaction patterns inherent in Ethereum. To address this, we propose a hypergraph modeling method for Ponzi scheme detection in Ethereum, called HyperDet. Specifically, we treat transaction hashes as hyperedges that connect all the relevant accounts involved in a transaction. Additionally, we design a two-step hypergraph sampling strategy to significantly reduce computational complexity. Furthermore, we introduce a dual-channel detection module, including the hypergraph detection channel and the hyper-homo graph detection channel, to be compatible with existing detection methods. Experimental results show that, compared to traditional homogeneous graph-based methods, the hyper-homo graph detection channel achieves significant performance improvements, demonstrating the superiority of hypergraphs in Ponzi scheme detection. This research offers innovations for modeling complex relationships in blockchain data.
Submitted 27 March, 2025;
originally announced March 2025.
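The HyperDet abstract above describes treating each transaction hash as a hyperedge that connects every account involved in that transaction. A minimal sketch of that grouping step follows, assuming the input is a list of `(tx_hash, sender, receiver)` transfer records. The record layout and the function name `build_hyperedges` are illustrative assumptions, not the authors' data format or code.

```python
from collections import defaultdict

def build_hyperedges(transfers):
    """Group transfer records by transaction hash.

    transfers: iterable of (tx_hash, sender, receiver) tuples (assumed layout).
    Returns {tx_hash: set of accounts touched by that transaction}.
    """
    hyperedges = defaultdict(set)
    for tx_hash, sender, receiver in transfers:
        # One hash may cover several transfers, so the hyperedge
        # accumulates every account that appears under it.
        hyperedges[tx_hash].update((sender, receiver))
    return dict(hyperedges)
```

Unlike an ordinary edge, a hyperedge built this way can link more than two accounts, which is what lets it represent multi-party interactions within a single transaction.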
-
MCLRL: A Multi-Domain Contrastive Learning with Reinforcement Learning Framework for Few-Shot Modulation Recognition
Authors:
Dongwei Xu,
Yutao Zhu,
Yao Lu,
Youpeng Feng,
Yun Lin,
Qi Xuan
Abstract:
With the rapid advancements in wireless communication technology, automatic modulation recognition (AMR) plays a critical role in ensuring communication security and reliability. However, numerous challenges, including higher performance demands, difficulty in data acquisition under specific scenarios, limited sample size, and low-quality labeled data, hinder its development. Few-shot learning (FSL) offers an effective solution by enabling models to achieve satisfactory performance with only a limited number of labeled samples. While most FSL techniques are applied in the field of computer vision, they are not directly applicable to wireless signal processing. This study does not propose a new FSL-specific signal model but introduces a framework called MCLRL. This framework combines multi-domain contrastive learning with reinforcement learning. Multi-domain representations of signals enhance feature richness, while integrating contrastive learning and reinforcement learning architectures enables the extraction of deep features for classification. In downstream tasks, the model achieves excellent performance using only a few samples and minimal training cycles. Experimental results show that the MCLRL framework effectively extracts key features from signals, performs well in FSL tasks, and maintains flexibility in signal model selection.
Submitted 26 February, 2025;
originally announced February 2025.
-
Mixture of Decoupled Message Passing Experts with Entropy Constraint for General Node Classification
Authors:
Xuanze Chen,
Jiajun Zhou,
Jinsong Chen,
Shanqing Yu,
Qi Xuan
Abstract:
The varying degrees of homophily and heterophily in real-world graphs persistently constrain the universality of graph neural networks (GNNs) for node classification. Adopting a data-centric perspective, this work reveals an inherent preference of different graphs towards distinct message encoding schemes: homophilous graphs favor local propagation, while heterophilous graphs exhibit preference for flexible combinations of propagation and transformation. To address this, we propose GNNMoE, a universal node classification framework based on the Mixture-of-Experts (MoE) mechanism. The framework first constructs diverse message-passing experts through recombination of fine-grained encoding operators, then designs soft and hard gating layers to allocate the most suitable expert networks for each node's representation learning, thereby enhancing both model expressiveness and adaptability to diverse graphs. Furthermore, considering that soft gating might introduce encoding noise in homophilous scenarios, we introduce an entropy constraint to guide sharpening of soft gates, achieving organic integration of weighted combination and Top-K selection. Extensive experiments demonstrate that GNNMoE significantly outperforms mainstream GNNs, heterophilous GNNs, and graph transformers in both node classification performance and universality across diverse graph datasets.
Submitted 11 February, 2025;
originally announced February 2025.
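The GNNMoE abstract above refers to soft gating (a weighted combination of experts), hard Top-K selection, and an entropy constraint that sharpens the soft gates. The sketch below only illustrates those three generic ingredients on a plain list of gate scores. The function names are hypothetical, and how the paper combines them into a loss or a layer is not shown here.

```python
import math

def softmax(scores):
    """Soft gate: turn raw expert scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(weights):
    """Shannon entropy of the gate weights; lower means a sharper gate."""
    return -sum(w * math.log(w) for w in weights if w > 0)

def top_k_gate(weights, k):
    """Hard gate: keep the k largest weights, renormalise, zero the rest."""
    top = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]
    total = sum(weights[i] for i in top)
    return [weights[i] / total if i in top else 0.0 for i in range(len(weights))]
```

Penalising the entropy of the soft weights pushes them toward a few dominant experts, so a sufficiently sharpened soft gate behaves much like a Top-K selection.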
-
An effective method for profiling core-periphery structures in complex networks
Authors:
Jiaqi Nie,
Qi Xuan,
Dehong Gao,
Zhongyuan Ruan
Abstract:
Profiling core-periphery structures in networks has attracted significant attention, leading to the development of various methods. Among these, the rich-core method is distinguished for being entirely parameter-free and scalable to large networks. However, the cores it identifies are not always structurally cohesive, as they may lack high link density. Here, we propose an improved method building upon the rich-core framework. Instead of relying on node degree, our approach incorporates both the node's coreness $k$ and its centrality within the $k$-core. We apply the approach to twelve real-world networks, and find that the cores identified are generally denser compared to those derived from the rich-core method. Additionally, we demonstrate that the proposed method provides a natural way for identifying an exceptionally dense core, i.e., a clique, which often approximates or even matches the maximum clique in many real-world networks. Furthermore, we extend the method to multiplex networks, and show its effectiveness in identifying dense multiplex cores across several well-studied datasets. Our study may offer valuable insights into exploring the meso-scale properties of complex networks.
Submitted 16 April, 2025; v1 submitted 11 February, 2025;
originally announced February 2025.
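The core-periphery abstract above relies on each node's coreness $k$, i.e., the largest $k$ for which the node belongs to the $k$-core. The following sketch computes coreness with the standard peeling procedure (repeatedly remove a minimum-degree node). It covers only this building block, not the proposed core-profiling method, and the adjacency-dictionary input format is an assumption.

```python
def coreness(adj):
    """Return {node: core number} for an undirected graph.

    adj: {node: set of neighbours}, assumed symmetric and without self-loops.
    Simple O(n^2) peeling, written for clarity rather than speed.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])  # core numbers never decrease while peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core
```

For a triangle with one extra pendant node attached, the three triangle nodes get coreness 2 and the pendant node gets coreness 1.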
-
Multi-view Correlation-aware Network Traffic Detection on Flow Hypergraph
Authors:
Jiajun Zhou,
Wentao Fu,
Hao Song,
Shanqing Yu,
Qi Xuan,
Xiaoniu Yang
Abstract:
As the Internet rapidly expands, the increasing complexity and diversity of network activities pose significant challenges to effective network governance and security regulation. Network traffic, which serves as a crucial data carrier of network activities, has become indispensable in this process. Network traffic detection aims to monitor, analyze, and evaluate the data flows transmitted across the network to ensure network security and optimize performance. However, existing network traffic detection methods generally suffer from several limitations: 1) a narrow focus on characterizing traffic features from a single perspective; 2) insufficient exploration of discriminative features for different traffic; 3) poor generalization to different traffic scenarios. To address these issues, we propose a multi-view correlation-aware framework named FlowID for network traffic detection. FlowID captures multi-view traffic features via temporal and interaction awareness, while a hypergraph encoder further explores higher-order relationships between flows. To overcome the challenges of data imbalance and label scarcity, we design a dual-contrastive proxy task, enhancing the framework's ability to differentiate between various traffic flows through traffic-to-traffic and group-to-group contrast. Extensive experiments on five real-world datasets demonstrate that FlowID significantly outperforms existing methods in accuracy, robustness, and generalization across diverse network scenarios, particularly in detecting malicious traffic.
Submitted 15 January, 2025;
originally announced January 2025.
-
Efficient Parallel Genetic Algorithm for Perturbed Substructure Optimization in Complex Network
Authors:
Shanqing Yu,
Meng Zhou,
Jintao Zhou,
Minghao Zhao,
Yidan Song,
Yao Lu,
Zeyu Wang,
Qi Xuan
Abstract:
Evolutionary computing, particularly the genetic algorithm (GA), is a combinatorial optimization method inspired by natural selection and the transmission of genetic information, which is widely used to identify optimal solutions to complex problems through simulated programming and iteration. Due to its strong adaptability, flexibility, and robustness, GA has shown significant performance and potential in perturbed substructure optimization (PSSO), an important graph mining problem that achieves its goals by modifying network structures. However, the efficiency and practicality of GA-based PSSO face enormous challenges due to the complexity and diversity of application scenarios. While some research has explored acceleration frameworks in evolutionary computing, their performance on PSSO remains limited due to a lack of scenario generalizability. To this end, this paper is the first to present the GA-based PSSO Acceleration framework (GAPA), which simplifies the GA development process and supports distributed acceleration. Specifically, it reconstructs the genetic operation and designs a development framework for efficient parallel acceleration. Meanwhile, GAPA includes an extensible library that optimizes and accelerates 10 PSSO algorithms, covering 4 crucial tasks for graph mining. Comprehensive experiments on 18 datasets across 4 tasks and 10 algorithms effectively demonstrate the superiority of GAPA, achieving an average of 4x the acceleration of Evox. The repository is available at https://github.com/NetAlsGroup/GAPA.
Submitted 30 December, 2024;
originally announced December 2024.
-
CoF: Coarse to Fine-Grained Image Understanding for Multi-modal Large Language Models
Authors:
Yeyuan Wang,
Dehong Gao,
Bin Li,
Rujiao Long,
Lei Yi,
Xiaoyan Cai,
Libin Yang,
Jinxia Zhang,
Shanqing Yu,
Qi Xuan
Abstract:
The impressive performance of Large Language Models (LLMs) has prompted researchers to develop Multi-modal LLMs (MLLMs), which have shown great potential for various multi-modal tasks. However, current MLLMs often struggle to effectively address fine-grained multi-modal challenges. We argue that this limitation is closely linked to the models' visual grounding capabilities. The restricted spatial awareness and perceptual acuity of visual encoders frequently lead to interference from irrelevant background information in images, causing the models to overlook subtle but crucial details. As a result, achieving fine-grained regional visual comprehension becomes difficult. In this paper, we break down multi-modal understanding into two stages, from Coarse to Fine (CoF). In the first stage, we prompt the MLLM to locate the approximate area of the answer. In the second stage, we further enhance the model's focus on relevant areas within the image through visual prompt engineering, adjusting attention weights of pertinent regions. This, in turn, improves both visual grounding and overall performance in downstream tasks. Our experiments show that this approach significantly boosts the performance of baseline models, demonstrating notable generalization and effectiveness. Our CoF approach is available online at https://github.com/Gavin001201/CoF.
Submitted 22 December, 2024;
originally announced December 2024.
-
Mixture of Experts Meets Decoupled Message Passing: Towards General and Adaptive Node Classification
Authors:
Xuanze Chen,
Jiajun Zhou,
Shanqing Yu,
Qi Xuan
Abstract:
Graph neural networks excel at graph representation learning but struggle with heterophilous data and long-range dependencies. Graph transformers address these issues through self-attention, yet face scalability and noise challenges on large-scale graphs. To overcome these limitations, we propose GNNMoE, a universal model architecture for node classification. This architecture flexibly combines fine-grained message-passing operations with a mixture-of-experts mechanism to build feature encoding blocks. Furthermore, by incorporating soft and hard gating layers to assign the most suitable expert networks to each node, we enhance the model's expressive power and adaptability to different graph types. In addition, we introduce adaptive residual connections and an enhanced FFN module into GNNMoE, further improving the expressiveness of node representation. Extensive experimental results demonstrate that GNNMoE performs exceptionally well across various types of graph data, effectively alleviating the over-smoothing issue and global noise, enhancing model robustness and adaptability, while also ensuring computational efficiency on large-scale graphs.
Submitted 11 February, 2025; v1 submitted 11 December, 2024;
originally announced December 2024.
-
Reassessing Layer Pruning in LLMs: New Insights and Methods
Authors:
Yao Lu,
Hao Cheng,
Yujie Fang,
Zeyu Wang,
Jiaheng Wei,
Dongwei Xu,
Qi Xuan,
Xiaoniu Yang,
Zhaowei Zhu
Abstract:
Although large language models (LLMs) have achieved remarkable success across various domains, their considerable scale necessitates substantial computational resources, posing significant challenges for deployment in resource-constrained environments. Layer pruning, as a simple yet effective compression method, removes layers of a model directly, reducing computational overhead. However, what are the best practices for layer pruning in LLMs? Are sophisticated layer selection metrics truly effective? Does the LoRA (Low-Rank Adaptation) family, widely regarded as a leading method for pruned model fine-tuning, truly meet expectations when applied to post-pruning fine-tuning? To answer these questions, we dedicate thousands of GPU hours to benchmarking layer pruning in LLMs and gaining insights across multiple dimensions. Our results demonstrate that a simple approach, i.e., pruning the final 25% of layers followed by fine-tuning the lm_head and the remaining last three layers, yields remarkably strong performance. Following this guide, we prune Llama-3.1-8B-It and obtain a model that outperforms many popular LLMs of similar size, such as ChatGLM2-6B, Vicuna-7B-v1.5, Qwen1.5-7B and Baichuan2-7B. We release the optimal model weights on Huggingface, and the code is available on GitHub.
Submitted 23 November, 2024;
originally announced November 2024.
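The layer-pruning abstract above recommends pruning the final 25% of layers and then fine-tuning the lm_head together with the last three remaining layers. The bookkeeping behind that recipe can be sketched as below. The function name `pruning_plan`, the rounding-down of the pruned count, and the 32-layer figure in the usage note are illustrative assumptions, not details confirmed by the abstract.

```python
def pruning_plan(num_layers, prune_fraction=0.25, num_trainable=3):
    """Return (kept_layer_indices, trainable_layer_indices).

    The final `prune_fraction` of layers is dropped (count rounded down,
    an assumption), and the last `num_trainable` kept layers are marked
    for fine-tuning. The lm_head would be fine-tuned as well; it is not
    a numbered layer, so it does not appear in these lists.
    """
    num_pruned = int(num_layers * prune_fraction)
    kept = list(range(num_layers - num_pruned))
    trainable = kept[-num_trainable:] if num_trainable > 0 else []
    return kept, trainable
```

For a hypothetical 32-layer model, this keeps layers 0 through 23 and marks layers 21, 22, and 23 as trainable, with every other kept layer left frozen.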
-
RedTest: Towards Measuring Redundancy in Deep Neural Networks Effectively
Authors:
Yao Lu,
Peixin Zhang,
Jingyi Wang,
Lei Ma,
Xiaoniu Yang,
Qi Xuan
Abstract:
Deep learning has revolutionized computing in many real-world applications, arguably due to its remarkable performance and extreme convenience as an end-to-end solution. However, deep learning models can be costly to train and to use, especially large-scale models, making it necessary to optimize the original overly complicated models into smaller ones in scenarios with limited resources such as mobile applications or simply for resource saving. The key question in such model optimization is how to effectively identify and measure the redundancy in a deep learning model structure. While several common metrics exist in the popular model optimization techniques to measure the performance of models after optimization, they cannot quantify the degree of remaining redundancy. To address the problem, we present a novel testing approach, i.e., RedTest, which proposes a novel testing metric called Model Structural Redundancy Score (MSRS) to quantitatively measure the degree of redundancy in a deep learning model structure. We first show that MSRS is effective in both revealing and assessing the redundancy issues in many state-of-the-art models, which urgently calls for model optimization. Then, we utilize MSRS to assist deep learning model developers in two practical application scenarios: 1) in Neural Architecture Search, we design a novel redundancy-aware algorithm to guide the search for the optimal model structure and demonstrate its effectiveness by comparing it to existing standard NAS practice; 2) in the pruning of large-scale pre-trained models, we prune the redundant layers of pre-trained models with the guidance of layer similarity to derive less redundant ones of much smaller size. Extensive experimental results demonstrate that removing such redundancy has a negligible effect on the model utility.
Submitted 15 November, 2024;
originally announced November 2024.
-
Lateral Movement Detection via Time-aware Subgraph Classification on Authentication Logs
Authors:
Jiajun Zhou,
Jiacheng Yao,
Xuanze Chen,
Shanqing Yu,
Qi Xuan,
Xiaoniu Yang
Abstract:
Lateral movement is a crucial component of advanced persistent threat (APT) attacks in networks. Attackers exploit security vulnerabilities in internal networks or IoT devices, expanding their control after initial infiltration to steal sensitive data or carry out other malicious activities, posing a serious threat to system security. Existing research suggests that attackers generally employ seemingly unrelated operations to mask their malicious intentions, thereby evading existing lateral movement detection methods and hiding their intrusion traces. In this regard, we analyze host authentication log data from a graph perspective and propose a multi-scale lateral movement detection framework called LMDetect. The main workflow of this framework proceeds as follows: 1) Construct a heterogeneous multigraph from host authentication log data to strengthen the correlations among internal system entities; 2) Design a time-aware subgraph generator to extract subgraphs centered on authentication events from the heterogeneous authentication multigraph; 3) Design a multi-scale attention encoder that leverages both local and global attention to capture hidden anomalous behavior patterns in the authentication subgraphs, thereby achieving lateral movement detection. Extensive experiments on two real-world authentication log datasets demonstrate the effectiveness and superiority of our framework in detecting lateral movement behaviors.
Submitted 15 November, 2024;
originally announced November 2024.
-
Double Whammy: Stealthy Data Manipulation aided Reconstruction Attack on Graph Federated Learning
Authors:
Jinyin Chen,
Minying Ma,
Haibin Zheng,
Qi Xuan
Abstract:
Recent research has constructed successful graph reconstruction attacks (GRAs) on graph federated learning (GFL). However, these attacks still face challenges in effectiveness and stealth. To address these issues, we propose the first Data Manipulation aided Reconstruction attack on GFL, dubbed DMan4Rec. The malicious client manipulates its locally collected data to enhance the stealing of private graph information from benign clients, thereby inflicting a double whammy on GFL. It differs from previous work in three respects: (1) effectiveness - to fully utilize the sparsity and feature smoothness of the graph, novel penalty terms are designed that adapt to diverse similarity functions for connected and unconnected node pairs, and label smoothing is incorporated on top of the original cross-entropy loss; (2) scalability - DMan4Rec is capable of both white-box and black-box attacks via training a supervised model to infer the posterior probabilities obtained from limited queries; (3) stealthiness - by manipulating the malicious client's node features, it can maintain the overall graph structure's invariance and conceal the attack. Comprehensive experiments on four real datasets and three GNN models demonstrate that DMan4Rec achieves state-of-the-art (SOTA) attack performance, e.g., the attack AUC and precision improve by 9.2% and 10.5%, respectively, compared with the SOTA baselines. In particular, DMan4Rec achieves an AUC score and a precision score of up to 99.59% and 99.56%, respectively, in the black-box setting. Moreover, the complete overlap of the distribution graphs supports the stealthiness of the attack. Besides, DMan4Rec still defeats defensive GFL, which signals a new threat to GFL.
Submitted 5 November, 2024;
originally announced November 2024.
-
Rethinking Graph Transformer Architecture Design for Node Classification
Authors:
Jiajun Zhou,
Xuanze Chen,
Chenxuan Xie,
Yu Shanqing,
Qi Xuan,
Xiaoniu Yang
Abstract:
Graph Transformer (GT), as a special type of Graph Neural Network (GNN), utilizes multi-head attention to facilitate high-order message passing. However, this also imposes several limitations in node classification applications: 1) nodes are susceptible to global noise; 2) self-attention computation cannot scale well to large graphs. In this work, we conduct extensive observational experiments to explore the adaptability of the GT architecture in node classification tasks and draw several conclusions: the current multi-head self-attention module in GT is completely replaceable, while the feed-forward neural network module proves to be valuable. Based on this, we decouple the propagation (P) and transformation (T) of GNNs and explore a powerful GT architecture, named GNNFormer, which is based on the P/T combination message passing and adapted for node classification in both homophilous and heterophilous scenarios. Extensive experiments on 12 benchmark datasets demonstrate that our proposed GT architecture can effectively adapt to node classification tasks without being affected by global noise and computational efficiency limitations.
Submitted 14 October, 2024;
originally announced October 2024.
-
Network Anomaly Traffic Detection via Multi-view Feature Fusion
Authors:
Song Hao,
Wentao Fu,
Xuanze Chen,
Chengxiang Jin,
Jiajun Zhou,
Shanqing Yu,
Qi Xuan
Abstract:
Traditional anomalous traffic detection methods are based on single-view analysis, which has obvious limitations in dealing with complex attacks and encrypted communications. In this regard, we propose a Multi-view Feature Fusion (MuFF) method for network anomaly traffic detection. MuFF models the temporal and interactive relationships of packets in network traffic based on the temporal and interactive viewpoints respectively. It learns temporal and interactive features. These features are then fused from different perspectives for anomaly traffic detection. Extensive experiments on six real traffic datasets show that MuFF has excellent performance in network anomalous traffic detection, which makes up for the shortcomings of detection under a single perspective.
Submitted 12 September, 2024;
originally announced September 2024.
-
Photonic time-delayed reservoir computing based on lithium niobate microring resonators
Authors:
Yuan Wang,
Ming Li,
Mingyi Gao,
Chang-Ling Zou,
Chun-Hua Dong,
Xiaoniu Yang,
Qi Xuan,
HongLiang Ren
Abstract:
On-chip micro-ring resonators (MRRs) have been proposed for constructing delay reservoir computing (RC) systems, offering a highly scalable, high-density computational architecture that is easy to manufacture. However, most proposed RC schemes have utilized passive integrated optical components based on silicon-on-insulator (SOI), and RC systems based on lithium niobate on insulator (LNOI) have not yet been reported. The nonlinear optical effects exhibited by lithium niobate microphotonic devices introduce new possibilities for RC design. In this work, we design an RC scheme based on a series-coupled MRR array, leveraging the unique interplay between thermo-optic nonlinearity and photorefractive effects in lithium niobate. We first demonstrate the existence of three regions defined by wavelength detuning between the primary LNOI micro-ring resonator and the coupled micro-ring array, where one region achieves an optimal balance between nonlinearity and high memory capacity at extremely low input energy, leading to superior computational performance. We then discuss in detail the impact of each ring's nonlinearity and the system's symbol duration on performance. Finally, we design a wavelength-division multiplexing (WDM) based multi-task parallel computing scheme, showing that the computational performance for multiple tasks matches that of single-task computations.
Submitted 24 August, 2024;
originally announced August 2024.
-
Social contagion under hybrid interactions
Authors:
Xincheng Shu,
Man Yang,
Zhongyuan Ruan,
Qi Xuan
Abstract:
Threshold-driven models and game theory are two fundamental paradigms for describing human interactions in social systems. However, in mimicking social contagion processes, models that simultaneously incorporate these two mechanisms have been largely overlooked. Here, we study a general model that integrates hybrid interaction forms by assuming that a part of nodes in a network are driven by the threshold mechanism, while the remaining nodes exhibit imitation behavior governed by their rationality (under the game-theoretic framework). Our results reveal that the spreading dynamics are determined by the payoff of adoption. For positive payoffs, increasing the density of highly rational nodes can promote the adoption process, accompanied by a double phase transition. The degree of rationality can regulate the spreading speed, with less rational imitators slowing down the spread. We further find that the results are opposite for negative payoffs of adoption. This model may provide valuable insights into understanding the complex dynamics of social contagion phenomena in real-world social networks.
Submitted 20 October, 2024; v1 submitted 9 August, 2024;
originally announced August 2024.
-
MDM: Advancing Multi-Domain Distribution Matching for Automatic Modulation Recognition Dataset Synthesis
Authors:
Dongwei Xu,
Jiajun Chen,
Yao Lu,
Tianhao Xia,
Qi Xuan,
Wei Wang,
Yun Lin,
Xiaoniu Yang
Abstract:
Recently, deep learning technology has been successfully introduced into Automatic Modulation Recognition (AMR) tasks. However, the success of deep learning is attributed to training on large-scale datasets. Such a large amount of data puts heavy pressure on storage, transmission and model training. To reduce the amount of data, some researchers have proposed data distillation, which aims to compress large training data into smaller synthetic datasets while maintaining performance. While numerous data distillation techniques have been developed within the realm of image processing, the unique characteristics of signals set them apart. Signals exhibit distinct features across various domains, necessitating specialized approaches for their analysis and processing. To this end, a novel dataset distillation method, Multi-domain Distribution Matching (MDM), is proposed. MDM employs the Discrete Fourier Transform (DFT) to translate time-domain signals into the frequency domain, and then uses a model to compute distribution matching losses between the synthetic and real datasets, considering both the time and frequency domains. Ultimately, these two losses are integrated to update the synthetic dataset. We conduct extensive experiments on three AMR datasets. Experimental results show that, compared with baseline methods, our method achieves better performance under the same compression ratio. Furthermore, we conduct cross-architecture generalization experiments on several models, and the experimental results show that our synthetic datasets can generalize well on other unseen models.
Submitted 5 August, 2024;
originally announced August 2024.
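The two-domain matching idea described in the MDM abstract above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract says a model computes the matching losses, whereas this sketch swaps in raw samples (time domain) and DFT magnitudes (frequency domain) as stand-in features. The weighted sum used to combine the two losses and the names `matching_loss`, `multi_domain_loss`, and `freq_weight` are likewise assumptions.

```python
import cmath


def dft(signal):
    """Plain O(n^2) Discrete Fourier Transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def mean_vector(vectors):
    """Element-wise mean of equally sized feature vectors."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def matching_loss(real, synthetic):
    """Squared distance between the mean feature vectors of two sets."""
    mean_real, mean_syn = mean_vector(real), mean_vector(synthetic)
    return sum((a - b) ** 2 for a, b in zip(mean_real, mean_syn))


def multi_domain_loss(real, synthetic, freq_weight=1.0):
    """Time-domain loss plus weighted frequency-domain loss (assumed combination)."""
    time_loss = matching_loss(real, synthetic)
    # Stand-in frequency features: DFT magnitude spectrum (an assumption).
    real_freq = [[abs(c) for c in dft(s)] for s in real]
    syn_freq = [[abs(c) for c in dft(s)] for s in synthetic]
    return time_loss + freq_weight * matching_loss(real_freq, syn_freq)


real_signals = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
synthetic_signals = [[0.5, 0.5, -0.5, -0.5]]
print(multi_domain_loss(real_signals, synthetic_signals))
```

In a distillation loop, the synthetic signals would be updated to reduce this combined loss; the optimizer and the feature model are left out here.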
-
Enhancing Ethereum Fraud Detection via Generative and Contrastive Self-supervision
Authors:
Chenxiang Jin,
Jiajun Zhou,
Chenxuan Xie,
Shanqing Yu,
Qi Xuan,
Xiaoniu Yang
Abstract:
The rampant fraudulent activities on Ethereum hinder the healthy development of the blockchain ecosystem, necessitating the reinforcement of regulations. However, multiple imbalances involving account interaction frequencies and interaction types in the Ethereum transaction environment pose significant challenges to data mining-based fraud detection research. To address this, we first propose the concept of meta-interactions to refine interaction behaviors in Ethereum, and based on this, we present a dual self-supervision enhanced Ethereum fraud detection framework, named Meta-IFD. This framework initially introduces a generative self-supervision mechanism to augment the interaction features of accounts, followed by a contrastive self-supervision mechanism to differentiate various behavior patterns, and ultimately characterizes the behavioral representations of accounts and mines potential fraud risks through multi-view interaction feature learning. Extensive experiments on real Ethereum datasets demonstrate the effectiveness and superiority of our framework in detecting common Ethereum fraud behaviors such as Ponzi schemes and phishing scams. Additionally, the generative module can effectively alleviate the interaction distribution imbalance in Ethereum data, while the contrastive module significantly enhances the framework's ability to distinguish different behavior patterns. The source code will be available at https://github.com/GISec-Team/Meta-IFD.
Submitted 20 December, 2024; v1 submitted 1 August, 2024;
originally announced August 2024.
-
Exploring agent interaction patterns in the comment sections of fake and real news
Authors:
Kailun Zhu,
Songtao Peng,
Jiaqi Nie,
Zhongyuan Ruan,
Shanqing Yu,
Qi Xuan
Abstract:
User comments on social media have been recognized as a crucial factor in distinguishing between fake and real news, with many studies focusing on the textual content of user reactions. However, the interactions among agents in the comment sections for fake and real news have not been fully explored. In this study, we analyze a dataset comprising both fake and real news from Reddit to investigate agent interaction patterns, considering both the network structure and the sentiment of the nodes. Our findings reveal that (i) comments on fake news are more likely to form groups, and (ii) compared to fake news, where users generate more negative sentiment, real news tends to elicit more neutral and positive sentiments. Additionally, nodes with similar sentiments cluster together more tightly than anticipated. From a dynamic perspective, we find that the sentiment distribution among nodes stabilizes early and remains stable over time. These findings have both theoretical and practical implications, particularly for the early detection of real and fake news within social networks.
Submitted 11 October, 2024; v1 submitted 6 July, 2024;
originally announced July 2024.
-
Dual-view Aware Smart Contract Vulnerability Detection for Ethereum
Authors:
Jiacheng Yao,
Maolin Wang,
Wanqi Chen,
Chengxiang Jin,
Jiajun Zhou,
Shanqing Yu,
Qi Xuan
Abstract:
The wide application of Ethereum technology has brought technological innovation to traditional industries. As one of Ethereum's core applications, smart contracts utilize diverse contract codes to meet various functional needs and have gained widespread use. However, the non-tamperability of smart contracts, coupled with vulnerabilities caused by natural flaws or human errors, has brought unprecedented challenges to blockchain security. Therefore, in order to ensure the healthy development of blockchain technology and the stability of the blockchain community, it is particularly important to study the vulnerability detection techniques for smart contracts. In this paper, we propose a Dual-view Aware Smart Contract Vulnerability Detection Framework named DVDet. The framework initially converts the source code and bytecode of smart contracts into weighted graphs and control flow sequences, capturing potential risk features from these two perspectives and integrating them for analysis, ultimately achieving effective contract vulnerability detection. Comprehensive experiments on the Ethereum dataset show that our method outperforms others in detecting vulnerabilities.
Submitted 29 June, 2024;
originally announced July 2024.
-
A Generic Layer Pruning Method for Signal Modulation Recognition Deep Learning Models
Authors:
Yao Lu,
Yutao Zhu,
Yuqi Li,
Dongwei Xu,
Yun Lin,
Qi Xuan,
Xiaoniu Yang
Abstract:
With the successful application of deep learning in communications systems, deep neural networks are becoming the preferred method for signal classification. Although these models yield impressive results, they often come with high computational complexity and large model sizes, which hinders their practical deployment in communication systems. To address this challenge, we propose a novel layer pruning method. Specifically, we decompose the model into several consecutive blocks, each containing consecutive layers with similar semantics. Then, we identify layers that need to be preserved within each block based on their contribution. Finally, we reassemble the pruned blocks and fine-tune the compact model. Extensive experiments on five datasets demonstrate the efficiency and effectiveness of our method over a variety of state-of-the-art baselines, including layer pruning and channel pruning methods.
Submitted 12 June, 2024;
originally announced June 2024.
-
Node Injection Attack Based on Label Propagation Against Graph Neural Network
Authors:
Peican Zhu,
Zechen Pan,
Keke Tang,
Xiaodong Cui,
Jinhuan Wang,
Qi Xuan
Abstract:
Graph Neural Network (GNN) has achieved remarkable success in various graph learning tasks, such as node classification, link prediction and graph classification. The key to the success of GNN lies in its effective structure information representation through neighboring aggregation. However, the attacker can easily perturb the aggregation process through injecting fake nodes, which reveals that GNN is vulnerable to the graph injection attack. Existing graph injection attack methods primarily focus on damaging the classical feature aggregation process while overlooking the neighborhood aggregation process via label propagation. To bridge this gap, we propose the label-propagation-based global injection attack (LPGIA) which conducts the graph injection attack on the node classification task. Specifically, we analyze the aggregation process from the perspective of label propagation and transform the graph injection attack problem into a global injection label specificity attack problem. To solve this problem, LPGIA utilizes a label propagation-based strategy to optimize the combinations of the nodes connected to the injected node. Then, LPGIA leverages the feature mapping to generate malicious features for injected nodes. In extensive experiments against representative GNNs, LPGIA outperforms the previous best-performing injection attack method in various datasets, demonstrating its superiority and transferability.
Submitted 29 May, 2024;
originally announced May 2024.
-
Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction
Authors:
Zeyu Wang,
Tianyi Jiang,
Yao Lu,
Xiaoze Bao,
Shanqing Yu,
Bin Wei,
Qi Xuan
Abstract:
Recently, few-shot molecular property prediction (FSMPP) has garnered increasing attention. Despite impressive breakthroughs achieved by existing methods, they often overlook the inherent many-to-many relationships between molecules and properties, which limits their performance. For instance, similar substructures of molecules can inspire the exploration of new compounds. Additionally, the relationships between properties can be quantified, with highly related properties providing more information for exploring the target property than weakly related ones. To this end, this paper proposes a novel meta-learning FSMPP framework (KRGTS), which comprises the Knowledge-enhanced Relation Graph module and the Task Sampling module. The knowledge-enhanced relation graph module constructs the molecule-property multi-relation graph (MPMRG) to capture the many-to-many relationships between molecules and properties. The task sampling module includes a meta-training task sampler and an auxiliary task sampler, responsible for scheduling the meta-training process and sampling highly related auxiliary tasks, respectively, thereby achieving efficient meta-knowledge learning and reducing noise introduction. Empirically, extensive experiments on five datasets demonstrate the superiority of KRGTS over a variety of state-of-the-art methods. The code is available at https://github.com/Vencent-Won/KRGTS-public.
Submitted 24 May, 2024;
originally announced May 2024.
-
Facilitating Feature and Topology Lightweighting: An Ethereum Transaction Graph Compression Method for Malicious Account Detection
Authors:
Jiajun Zhou,
Xuanze Chen,
Shengbo Gong,
Chenkai Hu,
Chengxiang Jin,
Shanqing Yu,
Qi Xuan
Abstract:
Ethereum has become one of the primary global platforms for cryptocurrency, playing an important role in promoting the diversification of the financial ecosystem. However, the relative lag in regulation has led to a proliferation of malicious activities in Ethereum, posing a serious threat to fund security. Existing regulatory methods usually detect malicious accounts through feature engineering or large-scale transaction graph mining. However, due to the immense scale of transaction data and malicious attacks, these methods suffer from inefficiency and low robustness during data processing and anomaly detection. In this regard, we propose an Ethereum Transaction Graph Compression method named TGC4Eth, which assists malicious account detection by lightweighting both features and topology of the transaction graph. At the feature level, we select transaction features based on their low importance to improve the robustness of the subsequent detection models against feature evasion attacks; at the topology level, we employ focusing and coarsening processes to compress the structure of the transaction graph, thereby improving both data processing and inference efficiency of detection models. Extensive experiments demonstrate that TGC4Eth significantly improves the computational efficiency of existing detection models while preserving the connectivity of the transaction graph. Furthermore, TGC4Eth enables existing detection models to maintain stable performance and exhibit high robustness against feature evasion attacks.
Submitted 1 July, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
-
Improving Network Degree Correlation by Degree-preserving Rewiring
Authors:
Shuo Zou,
Bo Zhou,
Qi Xuan
Abstract:
Degree correlation is a crucial measure in networks, significantly impacting network topology and dynamical behavior. The degree sequence of a network is a significant characteristic, and altering network degree correlation through degree-preserving rewiring poses an interesting problem. In this paper, we define the problem of maximizing network degree correlation through a finite number of rewirings and use the assortativity coefficient to measure it. We analyze the changes in assortativity coefficient under degree-preserving rewiring and establish its relationship with the s-metric. Under our assumptions, we prove the problem to be monotonic and submodular, leading to the proposal of the GA method to enhance network degree correlation. By formulating an integer programming model, we demonstrate that the GA method can effectively approximate the optimal solution and validate its superiority over other baseline methods through experiments on three types of real-world networks. Additionally, we introduce three heuristic rewiring strategies, EDA, TA and PEA, and demonstrate their applicability to different types of networks. Furthermore, we extend our investigation to explore the impact of these rewiring strategies on several spectral robustness metrics based on the adjacency matrix. Finally, we examine the robustness of various centrality metrics in the network while enhancing network degree correlation using the GA method.
Submitted 11 April, 2024;
originally announced April 2024.
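The two ingredients named in the abstract above, the assortativity coefficient and degree-preserving rewiring, can be illustrated with a short sketch. This is not the paper's GA method: it only shows one double-edge swap and its effect on the coefficient, using Newman's standard formula for undirected graphs. The function names and the two-star example graph are assumptions chosen for illustration.

```python
def degrees(edges):
    """Map each node to its degree in an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def assortativity(edges):
    """Degree assortativity coefficient (Pearson correlation of end degrees).

    Undefined (division by zero) for regular graphs, where all degrees match.
    """
    deg = degrees(edges)
    m = len(edges)
    prod = sum(deg[u] * deg[v] for u, v in edges) / m
    mean = sum(0.5 * (deg[u] + deg[v]) for u, v in edges) / m
    sq = sum(0.5 * (deg[u] ** 2 + deg[v] ** 2) for u, v in edges) / m
    return (prod - mean ** 2) / (sq - mean ** 2)


def rewire(edges, i, j):
    """Swap edges (a,b),(c,d) into (a,c),(b,d); every node keeps its degree.

    Returns None if the swap would create a self-loop or a duplicate edge.
    """
    (a, b), (c, d) = edges[i], edges[j]
    new_pairs = [(a, c), (b, d)]
    existing = {frozenset(e) for e in edges}
    if len({a, b, c, d}) < 4 or any(frozenset(p) in existing for p in new_pairs):
        return None
    result = list(edges)
    result[i], result[j] = new_pairs
    return result


# Two disjoint stars: every edge joins a degree-3 hub to a degree-1 leaf.
two_stars = [(0, 1), (0, 2), (0, 3), (4, 5), (4, 6), (4, 7)]
rewired = rewire(two_stars, 0, 3)  # connects the two hubs and two leaves
print(assortativity(two_stars), assortativity(rewired))
```

Joining the two hubs raises the coefficient from -1 to about -0.33 while leaving the degree sequence unchanged, which is the kind of move a rewiring strategy searches for.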
-
Exploring the Impact of Dataset Bias on Dataset Distillation
Authors:
Yao Lu,
Jianyang Gu,
Xuguang Chen,
Saeed Vahidian,
Qi Xuan
Abstract:
Dataset Distillation (DD) is a promising technique to synthesize a smaller dataset that preserves essential information from the original dataset. This synthetic dataset can serve as a substitute for the original large-scale one, and help alleviate the training workload. However, current DD methods typically operate under the assumption that the dataset is unbiased, overlooking potential bias issues within the dataset itself. To fill this gap, we systematically investigate the influence of dataset bias on DD. To the best of our knowledge, this is the first such exploration in the DD domain. Given that there are no suitable biased datasets for DD, we first construct two biased datasets, CMNIST-DD and CCIFAR10-DD, to establish a foundation for subsequent analysis. Then we utilize existing DD methods to generate synthetic datasets on CMNIST-DD and CCIFAR10-DD, and evaluate their performance following the standard process. Experiments demonstrate that biases present in the original dataset significantly impact the performance of the synthetic dataset in most cases, which highlights the necessity of identifying and mitigating biases in the original datasets during DD. Finally, we reformulate DD within the context of a biased dataset. Our code and biased datasets are available at https://github.com/yaolu-zjut/Biased-DD.
Submitted 24 March, 2024;
originally announced March 2024.
-
A Federated Parameter Aggregation Method for Node Classification Tasks with Different Graph Network Structures
Authors:
Hao Song,
Jiacheng Yao,
Zhengxi Li,
Shaocong Xu,
Shibo Jin,
Jiajun Zhou,
Chenbo Fu,
Qi Xuan,
Shanqing Yu
Abstract:
Over the past few years, federated learning has become widely used in various classical machine learning fields because of its ability to train collaboratively on data from multiple sources without compromising privacy. However, in the area of graph neural networks, the nodes and network structures of graphs held by clients are different in many practical applications, and aggregation methods that directly share model gradients cannot be applied to this scenario. Therefore, this work proposes a federated aggregation method, FLGNN, applicable to various graph federation scenarios, and investigates the aggregation effect of parameter sharing at each layer of the graph neural network model. The effectiveness of FLGNN is verified by experiments on real datasets. Additionally, to assess the privacy security of FLGNN, this paper designs membership inference attack experiments and differential privacy defense experiments. The results show that FLGNN exhibits good robustness, and the success rate of privacy theft is further reduced by adding differential privacy defense methods.
Submitted 24 March, 2024;
originally announced March 2024.
-
Backfire Effect Reveals Early Controversy in Online Media
Authors:
Songtao Peng,
Tao Jin,
Kailun Zhu,
Qi Xuan,
Yong Min
Abstract:
The rapid development of online media has significantly facilitated the public's information consumption, knowledge acquisition, and opinion exchange. However, it has also led to more violent conflicts in online discussions. Therefore, controversy detection has become important for computational and social sciences. Previous research on detection methods has primarily focused on larger datasets and more complex computational models but has rarely examined the underlying mechanisms of conflict, particularly the psychological motivations behind them. In this paper, we present evidence that conflicting posts tend to have a high proportion of "ascending gradient of likes", i.e., replies get more likes than comments. Additionally, there is a gradient in the number of replies between neighboring tiers as well. We develop two new gradient features and demonstrate that they consistently enhance controversy detection models. Further, multiple evaluation algorithms are used to compare structural, interactive, and textual features with the new features across multiple Chinese and English media. The results show that, in general, gradient features differ significantly between controversial and non-controversial posts and are more important than other features. We further discuss the mechanism by which the ascending gradient emerges, suggesting that it is related to the "backfire effect" in ideological conflicts, which has received recent attention. The features formed by this psychological mechanism also show excellent detection performance in application scenarios where only a small amount of hot or early information is considered. Our findings can provide a new perspective for online conflict behavior analysis and early detection.
Submitted 30 May, 2025; v1 submitted 5 March, 2024;
originally announced March 2024.
-
A Feasible Method for Constrained Derivative-Free Optimization
Authors:
Melody Qiming Xuan,
Jorge Nocedal
Abstract:
This paper explores a method for solving constrained optimization problems when the derivatives of the objective function are unavailable, while the derivatives of the constraints are known. We allow the objective and constraint function to be nonconvex. The method constructs a quadratic model of the objective function via interpolation and computes a step by minimizing this model subject to the original constraints in the problem and a trust region constraint. The step computation requires the solution of a general nonlinear program, which is economically feasible when the constraints and their derivatives are very inexpensive to compute compared to the objective function. The paper includes a summary of numerical results that highlight the method's promising potential.
Submitted 19 February, 2024;
originally announced February 2024.
-
Multi-Modal Representation Learning for Molecular Property Prediction: Sequence, Graph, Geometry
Authors:
Zeyu Wang,
Tianyi Jiang,
Jinhuan Wang,
Qi Xuan
Abstract:
Molecular property prediction refers to the task of labeling molecules with some biochemical properties, playing a pivotal role in the drug discovery and design process. Recently, with the advancement of machine learning, deep learning-based molecular property prediction has emerged as a solution to the resource-intensive nature of traditional methods, garnering significant attention. Among them, molecular representation learning is the key factor for molecular property prediction performance, and many sequence-based, graph-based, and geometry-based methods have been proposed. However, the majority of existing studies focus solely on one modality for learning molecular representations, failing to comprehensively capture molecular characteristics and information. In this paper, a novel multi-modal representation learning model, called SGGRL, which integrates the sequence, graph, and geometry characteristics, is proposed for molecular property prediction. Specifically, we design a fusion layer to fuse the representations of different modalities. Furthermore, to ensure consistency across modalities, SGGRL is trained to maximize the similarity of representations for the same molecule while minimizing similarity for different molecules. To verify the effectiveness of SGGRL, seven molecular datasets and several baselines are used for evaluation and comparison. The experimental results demonstrate that SGGRL outperforms the baselines in most cases. This further underscores the capability of SGGRL to comprehensively capture molecular information. Overall, the proposed SGGRL model showcases its potential to revolutionize molecular property prediction by leveraging multi-modal representation learning to extract diverse and comprehensive molecular insights. Our code is released at https://github.com/Vencent-Won/SGGRL.
Submitted 8 January, 2024; v1 submitted 6 January, 2024;
originally announced January 2024.
-
MAD-MulW: A Multi-Window Anomaly Detection Framework for BGP Security Events
Authors:
Songtao Peng,
Yiping Chen,
Xincheng Shu,
Wu Shuai,
Shenhao Fang,
Zhongyuan Ruan,
Qi Xuan
Abstract:
In recent years, various international security events have occurred frequently and interacted between real society and cyberspace. Traditional traffic monitoring mainly focuses on the local anomalous status of events due to a large amount of data. BGP-based event monitoring makes it possible to perform differential analysis of international events. For many existing traffic anomaly detection methods, we have observed that the window-based noise reduction strategy effectively improves the success rate of time series anomaly detection. Motivated by this observation, we propose an unsupervised anomaly detection model, MAD-MulW, which incorporates a multi-window serial framework. Firstly, we design the W-GAT module to adaptively update the sample weights within the window and retain the updated information of the trailing sample, which not only reduces the outlier samples' noise but also avoids the space consumption of data scale expansion. Then, the W-LAT module based on predictive reconstruction both captures the trend of sample fluctuations over a certain period of time and increases the interclass variation through the reconstruction of the predictive sample. Our model has been experimentally validated on multiple BGP anomalous events with an average F1 score of over 90%, which demonstrates the significant improvement effect of the stage windows and adaptive strategy on the efficiency and stability of the timing model.
Submitted 18 December, 2023;
originally announced December 2023.
-
Deep Learning-Based Frequency Offset Estimation
Authors:
Tao Chen,
Shilian Zheng,
Jiawei Zhu,
Qi Xuan,
Xiaoniu Yang
Abstract:
In wireless communication systems, the asynchronization of the oscillators in the transmitter and the receiver along with the Doppler shift due to relative movement may lead to the presence of carrier frequency offset (CFO) in the received signals. Estimation of CFO is crucial for subsequent processing such as coherent demodulation. In this brief, we demonstrate the utilization of deep learning for CFO estimation by employing a residual network (ResNet) to learn and extract signal features from the raw in-phase (I) and quadrature (Q) components of the signals. We use multiple modulation schemes in the training set to make the trained model adaptable to multiple modulations or even new signals. In comparison to the commonly used traditional CFO estimation methods, our proposed IQ-ResNet method exhibits superior performance across various scenarios including different oversampling ratios, various signal lengths, and different channels.
Submitted 8 November, 2023;
originally announced November 2023.
-
Augmenting Radio Signals with Wavelet Transform for Deep Learning-Based Modulation Recognition
Authors:
Tao Chen,
Shilian Zheng,
Kunfeng Qiu,
Luxin Zhang,
Qi Xuan,
Xiaoniu Yang
Abstract:
The use of deep learning for radio modulation recognition has become prevalent in recent years. This approach automatically extracts high-dimensional features from large datasets, facilitating the accurate classification of modulation schemes. However, in real-world scenarios, it may not be feasible to gather sufficient training data in advance. Data augmentation is a method used to increase the diversity and quantity of the training dataset and to reduce data sparsity and imbalance. In this paper, we propose data augmentation methods that replace the detail coefficients obtained by discrete wavelet transform decomposition and then reconstruct the signal, generating new samples that expand the training set. Different generation methods are used to produce the replacement sequences. Simulation results indicate that our proposed methods significantly outperform the other augmentation methods.
Submitted 7 November, 2023;
originally announced November 2023.
-
RK-core: An Established Methodology for Exploring the Hierarchical Structure within Datasets
Authors:
Yao Lu,
Yutian Huang,
Jiaqi Nie,
Zuohui Chen,
Qi Xuan
Abstract:
Recently, the field of machine learning has undergone a transition from model-centric to data-centric. The advancements in diverse learning tasks have been propelled by the accumulation of more extensive datasets, subsequently facilitating the training of larger models on these datasets. However, these datasets remain relatively under-explored. To this end, we introduce a pioneering approach known as RK-core to gain a deeper understanding of the intricate hierarchical structure within datasets. Across several benchmark datasets, we find that samples with low coreness values appear less representative of their respective categories, and conversely, those with high coreness values exhibit greater representativeness. Correspondingly, samples with high coreness values make a more substantial contribution to the performance in comparison to those with low coreness values. Building upon this, we further employ RK-core to analyze the hierarchical structure of samples selected by different coreset selection methods. Remarkably, we find that a high-quality coreset should exhibit hierarchical diversity instead of solely opting for representative samples. The code is available at https://github.com/yaolu-zjut/Kcore.
Submitted 10 October, 2023;
originally announced October 2023.
-
BufferSearch: Generating Black-Box Adversarial Texts With Lower Queries
Authors:
Wenjie Lv,
Zhen Wang,
Yitao Zheng,
Zhehua Zhong,
Qi Xuan,
Tianyi Chen
Abstract:
Machine learning security has recently become a prominent topic in the natural language processing (NLP) area. Existing black-box adversarial attacks suffer from prohibitively high model querying complexity, which makes them easy for anti-attack monitors to detect. Meanwhile, how to eliminate redundant model queries is rarely explored. In this paper, we propose a query-efficient approach, BufferSearch, to effectively attack general intelligent NLP systems with a minimal number of querying requests. In general, BufferSearch makes use of historical information and conducts statistical tests to avoid incurring model queries frequently. Numerically, we demonstrate the effectiveness of BufferSearch on various benchmark text-classification experiments by achieving competitive attacking performance with a significant reduction in query quantity. Furthermore, BufferSearch performs multiple times better than competitors within a restricted query budget. Our work establishes a strong benchmark for the future study of query efficiency in NLP adversarial attacks.
Submitted 14 October, 2023;
originally announced October 2023.
-
Attacking The Assortativity Coefficient Under A Rewiring Strategy
Authors:
Shuo Zou,
Bo Zhou,
Qi Xuan
Abstract:
Degree correlation is an important characteristic of networks, which is usually quantified by the assortativity coefficient. However, concerns arise about changes to the assortativity coefficient of a network when networks suffer from adversarial attacks. In this paper, we analyze the factors that affect the assortativity coefficient and study the optimization problem of maximizing or minimizing the assortativity coefficient (r) in rewired networks with $k$ pairs of edges. We propose a greedy algorithm and formulate the optimization problem using integer programming to obtain the optimal solution for this problem. Through experiments, we demonstrate the reasonableness and effectiveness of our proposed algorithm. For example, rewiring 10% of the edges in the ER network improves the assortativity coefficient by 60%.
Submitted 13 October, 2023;
originally announced October 2023.
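To make the rewiring operation described in the abstract above concrete, here is a minimal Python sketch. It is not the authors' greedy algorithm: it only computes the assortativity coefficient r as the Pearson correlation of the degrees at the two ends of each edge, and applies a single degree-preserving swap. The function names `degree_assortativity` and `rewire`, and the toy graph, are illustrative assumptions.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var  # undefined (division by zero) when all degrees are equal


def rewire(edges, i, j):
    """Replace edges (a, b) and (c, d) with (a, c) and (b, d).

    Every node keeps its degree. The caller is assumed to pick a pair that
    creates no self-loop or duplicate edge (not checked here).
    """
    a, b = edges[i]
    c, d = edges[j]
    kept = [e for k, e in enumerate(edges) if k not in (i, j)]
    return kept + [(a, c), (b, d)]


# Toy graph (hypothetical): two degree-2 hubs, 0 and 1, each with two leaves.
edges = [(0, 2), (0, 3), (1, 4), (1, 5)]
rewired = rewire(edges, 0, 2)  # now hub 0 links to hub 1, and leaf 2 to leaf 4

r_before = degree_assortativity(edges)
r_after = degree_assortativity(rewired)
print(r_before, r_after)
```

On this toy graph every edge initially joins a hub to a leaf, so r is -1; after the single swap one hub-hub and one leaf-leaf edge exist and r rises to 0 while all degrees stay the same.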
-
Can pre-trained models assist in dataset distillation?
Authors:
Yao Lu,
Xuguang Chen,
Yuchen Zhang,
Jianyang Gu,
Tianle Zhang,
Yifan Zhang,
Xiaoniu Yang,
Qi Xuan,
Kai Wang,
Yang You
Abstract:
Dataset Distillation (DD) is a prominent technique that encapsulates knowledge from a large-scale original dataset into a small synthetic dataset for efficient training. Meanwhile, Pre-trained Models (PTMs) function as knowledge repositories, containing extensive information from the original dataset. This naturally raises a question: Can PTMs effectively transfer knowledge to synthetic datasets, guiding DD accurately? To this end, we conduct preliminary experiments, confirming the contribution of PTMs to DD. Afterwards, we systematically study different options in PTMs, including initialization parameters, model architecture, training epoch and domain knowledge, revealing that: 1) Increasing model diversity enhances the performance of synthetic datasets; 2) Sub-optimal models can also assist in DD and outperform well-trained ones in certain cases; 3) Domain-specific PTMs are not mandatory for DD, but a reasonable domain match is crucial. Finally, by selecting optimal options, we significantly improve the cross-architecture generalization over baseline DD methods. We hope our work will facilitate researchers to develop better DD techniques. Our code is available at https://github.com/yaolu-zjut/DDInterpreter.
Submitted 4 October, 2023;
originally announced October 2023.
-
Multi-triplet Feature Augmentation for Ponzi Scheme Detection in Ethereum
Authors:
Chengxiang Jin,
Jiajun Zhou,
Shengbo Gong,
Chenxuan Xie,
Qi Xuan
Abstract:
Blockchain technology revolutionizes the Internet, but also poses increasing risks, particularly in cryptocurrency finance. On the Ethereum platform, Ponzi schemes, phishing scams, and a variety of other frauds emerge. Existing Ponzi scheme detection approaches based on heterogeneous transaction graph modeling leverage semantic information between node (account) pairs to establish connections, overlooking the semantic attributes inherent to the edges (interactions). To overcome this, we construct heterogeneous Ethereum interaction graphs with multiple triplet interaction patterns to better depict the real Ethereum environment. Based on this, we design a new framework named multi-triplet augmented heterogeneous graph neural network (MAHGNN) for Ponzi scheme detection. We introduce the Conditional Variational Auto Encoder (CVAE) to capture the semantic information of different triplet interaction patterns, which facilitates the characterization of account features. Extensive experiments demonstrate that MAHGNN is capable of addressing the problem of multi-edge interactions in heterogeneous Ethereum interaction graphs and achieving state-of-the-art performance in Ponzi scheme detection.
Submitted 1 October, 2023;
originally announced October 2023.
-
Photonic time-delayed reservoir computing based on series coupled microring resonators with high memory capacity
Authors:
Yijia Li,
Ming Li,
MingYi Gao,
Chang-Ling Zou,
Chun-Hua Dong,
Jin Lu,
Yali Qin,
XiaoNiu Yang,
Qi Xuan,
Hongliang Ren
Abstract:
On-chip microring resonators (MRRs) have been proposed to construct the time-delayed reservoir computing (RC), which offers promising configurations available for computation with high scalability, high-density computing, and easy fabrication. A single MRR, however, is inadequate to supply enough memory for the computational task with diverse memory requirements. Large memory needs are met by the MRR with optical feedback waveguide, but at the expense of its large footprint. In the structure, the ultra-long optical feedback waveguide substantially limits the scalable photonic RC integrated designs. In this paper, a time-delayed RC is proposed by utilizing a silicon-based nonlinear MRR in conjunction with an array of linear MRRs. These linear MRRs possess a high quality factor, providing sufficient memory capacity for the entire system. We quantitatively analyze and assess the proposed RC structure's performance on three classical tasks with diverse memory requirements, i.e., the Narma 10, Mackey-Glass, and Santa Fe chaotic timeseries prediction tasks. The proposed system exhibits comparable performance to the MRR with an ultra-long optical feedback waveguide-based system when it comes to handling the Narma 10 task, which requires a significant memory capacity. Nevertheless, the overall length of these linear MRRs is significantly smaller, by three orders of magnitude, compared to the ultra-long feedback waveguide in the MRR with optical feedback waveguide-based system. The compactness of this structure has significant implications for the scalability and seamless integration of photonic RC.
Submitted 30 August, 2023;
originally announced August 2023.
-
MONA: An Efficient and Scalable Strategy for Targeted k-Nodes Collapse
Authors:
Yuqian Lv,
Bo Zhou,
Jinhuan Wang,
Shanqing Yu,
Qi Xuan
Abstract:
The concept of k-core plays an important role in measuring the cohesiveness and engagement of a network, and recent studies have shown the vulnerability of k-core under adversarial attacks. However, few researchers have concentrated on the vulnerability of individual nodes within the k-core. Therefore, in this paper, we study the Targeted k-Nodes Collapse Problem (TNsCP), which focuses on removing a minimal-size set of edges to make multiple target k-nodes collapse. For this purpose, we first propose a novel algorithm named MOD for candidate reduction. Then we introduce an efficient strategy named MONA, based on MOD, to address TNsCP. Extensive experiments validate the effectiveness and scalability of MONA compared to several baselines. An open-source implementation is available at https://github.com/Yocenly/MONA.
Submitted 18 August, 2023;
originally announced August 2023.
-
Epidemic spreading under game-based self-quarantine behaviors: The different effects of local and global information
Authors:
Zegang Huang,
Xincheng Shu,
Qi Xuan,
Zhongyuan Ruan
Abstract:
During the outbreak of an epidemic, individuals may modify their behaviors in response to external (including local and global) infection-related information. However, the difference between local and global information in influencing the spread of diseases remains inadequately explored. Here we study a simple epidemic model that incorporates the game-based self-quarantine behavior of individuals, taking into account the influence of local infection status, global disease prevalence and node heterogeneity (non-identical degree distribution). Our findings reveal that local information can effectively contain an epidemic, even with only a small proportion of individuals opting for self-quarantine. On the other hand, global information can cause the infection evolution curves to shake during the declining phase of an epidemic, owing to the synchronous release of nodes with the same degree from the quarantined state. In contrast, the releasing pattern under local information appears to be more random. This shaking phenomenon can be observed in various types of networks with different characteristics. Moreover, it is found that under the proposed game-epidemic framework, a disease spreads with more difficulty in heterogeneous networks than in homogeneous networks, which differs from conventional epidemic models.
Submitted 17 July, 2024; v1 submitted 4 August, 2023;
originally announced August 2023.
-
PathMLP: Smooth Path Towards High-order Homophily
Authors:
Jiajun Zhou,
Chenxuan Xie,
Shengbo Gong,
Jiaxu Qian,
Shanqing Yu,
Qi Xuan,
Xiaoniu Yang
Abstract:
Real-world graphs exhibit increasing heterophily, where nodes no longer tend to be connected to nodes with the same label, challenging the homophily assumption of classical graph neural networks (GNNs) and impeding their performance. Intriguingly, from the observation of heterophilous data, we notice that certain high-order information exhibits higher homophily, which motivates us to involve high-order information in node representation learning. However, common practices in GNNs acquire high-order information mainly by increasing model depth and altering message-passing mechanisms, which, albeit effective to a certain extent, suffer from three shortcomings: 1) over-smoothing due to excessive model depth and propagation times; 2) high-order information is not fully utilized; 3) low computational efficiency. In this regard, we design a similarity-based path sampling strategy to capture smooth paths containing high-order homophily. Then we propose a lightweight model based on multi-layer perceptrons (MLP), named PathMLP, which can encode messages carried by paths via simple transformation and concatenation operations, and effectively learn node representations in heterophilous graphs through adaptive path aggregation. Extensive experiments demonstrate that our method outperforms baselines on 16 out of 20 datasets, underlining its effectiveness and superiority in alleviating the heterophily problem. In addition, our method is immune to over-smoothing and has high computational efficiency. The source code will be available at https://github.com/Graph4Sec-Team/PathMLP.
Submitted 21 August, 2024; v1 submitted 23 June, 2023;
originally announced June 2023.
-
Subgraph Networks Based Contrastive Learning
Authors:
Jinhuan Wang,
Jiafei Shao,
Zeyu Wang,
Shanqing Yu,
Qi Xuan,
Xiaoniu Yang
Abstract:
Graph contrastive learning (GCL), as a self-supervised learning method, can solve the problem of annotated data scarcity. It mines explicit features in unannotated graphs to generate favorable graph representations for downstream tasks. Most existing GCL methods focus on the design of graph augmentation strategies and mutual information estimation operations. Graph augmentation produces augmented views by graph perturbations. These views preserve a locally similar structure and exploit explicit features. However, these methods have not considered the interaction existing in subgraphs. To explore the impact of substructure interactions on graph representations, we propose a novel framework called subgraph network-based contrastive learning (SGNCL). SGNCL applies a subgraph network generation strategy to produce augmented views. This strategy converts the original graph into an Edge-to-Node mapping network with both topological and attribute features. The single-shot augmented view is a first-order subgraph network that mines the interaction between nodes, node-edge, and edges. In addition, we also investigate the impact of the second-order subgraph augmentation on mining graph structure interactions, and further, propose a contrastive objective that fuses the first-order and second-order subgraph information. We compare SGNCL with classical and state-of-the-art graph contrastive learning methods on multiple benchmark datasets of different domains. Extensive experiments show that SGNCL achieves competitive or better performance (top three) on all datasets in unsupervised learning settings. Furthermore, SGNCL achieves the best average gain of 6.9% in transfer learning compared to the best method. Finally, experiments also demonstrate that mining substructure interactions has positive implications for graph contrastive learning.
Submitted 30 March, 2024; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Clarify Confused Nodes via Separated Learning
Authors:
Jiajun Zhou,
Shengbo Gong,
Xuanze Chen,
Chenxuan Xie,
Shanqing Yu,
Qi Xuan,
Xiaoniu Yang
Abstract:
Graph neural networks (GNNs) have achieved remarkable advances in graph-oriented tasks. However, real-world graphs invariably contain a certain proportion of heterophilous nodes, challenging the homophily assumption of traditional GNNs and hindering their performance. Most existing studies continue to design generic models with shared weights between heterophilous and homophilous nodes. Despite the incorporation of high-order messages or multi-channel architectures, these efforts often fall short. A minority of studies attempt to train different node groups separately but suffer from inappropriate separation metrics and low efficiency. In this paper, we first propose a new metric, termed Neighborhood Confusion (NC), to facilitate a more reliable separation of nodes. We observe that node groups with different levels of NC values exhibit certain differences in intra-group accuracy and visualized embeddings. These observations pave the way for Neighborhood Confusion-guided Graph Convolutional Network (NCGCN), in which nodes are grouped by their NC values and accept intra-group weight sharing and message passing. Extensive experiments on both homophilous and heterophilous benchmarks demonstrate that our framework can effectively separate nodes and yield significant performance improvement compared to the latest methods. The source code will be available at https://github.com/GISec-Team/NCGNN.
Submitted 3 February, 2025; v1 submitted 4 June, 2023;
originally announced June 2023.