dyGRASS: Dynamic Spectral Graph Sparsification via Localized Random Walks on GPUs

Authors: Yihang Yuan, Ali Aghdaei, Zhuo Feng

Abstract: This work presents dyGRASS, an efficient dynamic algorithm for spectral sparsification of large undirected graphs that undergo streaming edge insertions and deletions. At its core, dyGRASS employs a random-walk-based method to efficiently estimate node-to-node distances in both the original graph (for decremental update) and its sparsifier (for incremental update). For incremental updates, dyGRASS enables the identification of spectrally critical edges among the updates to capture the latest structural changes. For decremental updates, dyGRASS facilitates the recovery of important edges from the original graph back into the sparsifier. To further enhance computational efficiency, dyGRASS employs a GPU-based non-backtracking random walk scheme that allows multiple walkers to operate simultaneously across various target updates. This parallelization significantly improves both the performance and scalability of the proposed dyGRASS framework. Our comprehensive experimental evaluations reveal that dyGRASS achieves approximately a 10x speedup compared to the state-of-the-art incremental sparsification (inGRASS) algorithm while eliminating the setup overhead and improving solution quality in incremental spectral sparsification tasks. Moreover, dyGRASS delivers high efficiency and superior solution quality for fully dynamic graph sparsification, accommodating both edge insertions and deletions across a diverse range of graph instances originating from integrated circuit simulations, finite element analysis, and social networks.

Submitted 6 May, 2025; v1 submitted 5 May, 2025; originally announced May 2025.
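
The abstract above mentions non-backtracking random walks used to estimate node-to-node distances. The listing gives no implementation details, so the following is only a minimal CPU sketch of a generic non-backtracking walk, not the authors' GPU scheme. The function names, the adjacency-dict representation, and the "fewest hops seen by any walker" estimate are all assumptions made for illustration.

```python
import random


def non_backtracking_walk(adj, start, max_steps, rng=None):
    """Walk up to max_steps hops from start, never returning
    immediately to the node the walker just came from."""
    rng = rng or random.Random()
    path = [start]
    prev, cur = None, start
    for _ in range(max_steps):
        # Exclude the previous node so the walk cannot backtrack.
        candidates = [n for n in adj[cur] if n != prev]
        if not candidates:
            break  # dead end: the only way out is backwards
        nxt = rng.choice(candidates)
        path.append(nxt)
        prev, cur = cur, nxt
    return path


def steps_to_reach(adj, source, target, max_steps, walkers, seed=0):
    """Hypothetical distance proxy: the fewest hops in which any of
    several independent walkers reaches target, or None if none does."""
    rng = random.Random(seed)
    best = None
    for _ in range(walkers):
        path = non_backtracking_walk(adj, source, max_steps, rng)
        if target in path:
            hops = path.index(target)
            best = hops if best is None else min(best, hops)
    return best


# A 6-node cycle: each node is linked to its two ring neighbours.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(steps_to_reach(cycle, source=0, target=3, max_steps=5, walkers=4))
```

On a cycle the walker can only continue in the direction of its first step, so every walker reaches the opposite node in three hops. In a sketch like this the walkers are independent of one another, which is the property that makes such walks natural to run in parallel.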

arXiv:2503.21840 [pdf]

Vision Language Models versus Machine Learning Models Performance on Polyp Detection and Classification in Colonoscopy Images

Authors: Mohammad Amin Khalafi, Seyed Amir Ahmad Safavi-Naini, Ameneh Salehi, Nariman Naderi, Dorsa Alijanzadeh, Pardis Ketabi Moghadam, Kaveh Kavosi, Negar Golestani, Shabnam Shahrokh, Soltanali Fallah, Jamil S Samaan, Nicholas P. Tatonetti, Nicholas Hoerter, Girish Nadkarni, Hamid Asadzadeh Aghdaei, Ali Soroush

Abstract: Introduction: This study provides a comprehensive performance assessment of vision-language models (VLMs) against established convolutional neural networks (CNNs) and classic machine learning models (CMLs) for computer-aided detection (CADe) and computer-aided diagnosis (CADx) of colonoscopy polyp images. Method: We analyzed 2,258 colonoscopy images with corresponding pathology reports from 428 patients. We preprocessed all images using standardized techniques (resizing, normalization, and augmentation) and implemented a rigorous comparative framework evaluating 11 distinct models: ResNet50, 4 CMLs (random forest, support vector machine, logistic regression, decision tree), two specialized contrastive vision language encoders (CLIP, BiomedCLIP), and three general-purpose VLMs (GPT-4, Gemini-1.5-Pro, Claude-3-Opus). Our performance assessment focused on two clinical tasks: polyp detection (CADe) and classification (CADx). Result: In polyp detection, ResNet50 achieved the best performance (F1: 91.35%, AUROC: 0.98), followed by BiomedCLIP (F1: 88.68%, AUROC: [AS1] ). GPT-4 demonstrated comparable effectiveness to traditional machine learning approaches (F1: 81.02%, AUROC: [AS2] ), outperforming other general-purpose VLMs. For polyp classification, performance rankings remained consistent but with lower overall metrics. ResNet50 maintained the highest efficacy (weighted F1: 74.94%), while GPT-4 demonstrated moderate capability (weighted F1: 41.18%), significantly exceeding other VLMs (Claude-3-Opus weighted F1: 25.54%, Gemini-1.5-Pro weighted F1: 6.17%).
Conclusion: CNNs remain superior for both CADx and CADe tasks. However, VLMs like BiomedCLIP and GPT-4 may be useful for polyp detection tasks where training CNNs is not feasible.

Submitted 27 March, 2025; originally announced March 2025.

Comments: Code is available at: https://github.com/aminkhalafi/CML-vs-LLM-on-Polyp-Detection. CoI: AlSo serves on the advisory board and holds equity in Virgo Surgical Solutions. The other authors declare no conflicts of interest. Data

MSC Class: 92C50; 68T50 ACM Class: J.3

arXiv:2410.10875 [pdf, ps, other]

SHyPar: A Spectral Coarsening Approach to Hypergraph Partitioning

Authors: Hamed Sajadinia, Ali Aghdaei, Zhuo Feng

Abstract: State-of-the-art hypergraph partitioners utilize a multilevel paradigm to construct progressively coarser hypergraphs across multiple layers, guiding cut refinements at each level of the hierarchy. Traditionally, these partitioners employ heuristic methods for coarsening and do not consider the structural features of hypergraphs. In this work, we introduce a multilevel spectral framework, SHyPar, for partitioning large-scale hypergraphs by leveraging hyperedge effective resistances and flow-based community detection techniques. Inspired by the latest theoretical spectral clustering frameworks, such as HyperEF and HyperSF, SHyPar aims to decompose large hypergraphs into multiple subgraphs with few inter-partition hyperedges (cut size). A key component of SHyPar is a flow-based local clustering scheme for hypergraph coarsening, which incorporates a max-flow-based algorithm to produce clusters with substantially improved conductance. Additionally, SHyPar utilizes an effective resistance-based rating function for merging nodes that are strongly connected (coupled). Compared with existing state-of-the-art hypergraph partitioning methods, our extensive experimental results on real-world VLSI designs demonstrate that SHyPar can more effectively partition hypergraphs, achieving state-of-the-art solution quality.

Submitted 6 July, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

Comments: 13 pages, 10 figures, 6 tables

arXiv:2407.07358 [pdf, other]

SGM-PINN: Sampling Graphical Models for Faster Training of Physics-Informed Neural Networks

Authors: John Anticev, Ali Aghdaei, Wuxinlin Cheng, Zhuo Feng

Abstract: SGM-PINN is a graph-based importance sampling framework to improve the training efficacy of Physics-Informed Neural Networks (PINNs) on parameterized problems. By applying a graph decomposition scheme to an undirected Probabilistic Graphical Model (PGM) built from the training dataset, our method generates node clusters encoding conditional dependence between training samples. Biasing sampling towards more important clusters allows smaller mini-batches and training datasets, improving training speed and accuracy. We additionally fuse an efficient robustness metric with residual losses to determine regions requiring additional sampling. Experiments demonstrate the advantages of the proposed framework, achieving $3\times$ faster convergence compared to prior state-of-the-art sampling methods.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2406.10500 [pdf, other]

Geodesic Distance Between Graphs: A Spectral Metric for Assessing the Stability of Graph Neural Networks

Authors: Soumen Sikder Shuvo, Ali Aghdaei, Zhuo Feng

Abstract: This paper presents a spectral framework for assessing the generalization and stability of Graph Neural Networks (GNNs) by introducing a Graph Geodesic Distance (GGD) metric. For two different graphs with the same number of nodes, our framework leverages a spectral graph matching procedure to find node correspondence so that the geodesic distance between them can be subsequently computed by solving a generalized eigenvalue problem associated with their Laplacian matrices. For graphs with different sizes, a resistance-based spectral graph coarsening scheme is introduced to reduce the size of the bigger graph while preserving the original spectral properties. We show that the proposed GGD metric can effectively quantify dissimilarities between two graphs by encapsulating their differences in key structural (spectral) properties, such as effective resistances between nodes, cuts, the mixing time of random walks, etc. Through extensive experiments comparing with the state-of-the-art metrics, such as the latest Tree-Mover's Distance (TMD) metric, the proposed GGD metric shows significantly improved performance for stability evaluation of GNNs especially when only partial node features are available.

Submitted 4 October, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

arXiv:2402.16990 [pdf, other]

inGRASS: Incremental Graph Spectral Sparsification via Low-Resistance-Diameter Decomposition

Authors: Ali Aghdaei, Zhuo Feng

Abstract: This work presents inGRASS, a novel algorithm designed for incremental spectral sparsification of large undirected graphs. The proposed inGRASS algorithm is highly scalable and parallel-friendly, having a nearly-linear time complexity for the setup phase and the ability to update the spectral sparsifier in $O(\log N)$ time for each incremental change made to the original graph with $N$ nodes. A key component in the setup phase of inGRASS is a multilevel resistance embedding framework introduced for efficiently identifying spectrally-critical edges and effectively detecting redundant ones, which is achieved by decomposing the initial sparsifier into many node clusters with bounded effective-resistance diameters leveraging a low-resistance-diameter decomposition (LRD) scheme. The update phase of inGRASS exploits low-dimensional node embedding vectors for efficiently estimating the importance and uniqueness of each newly added edge. As demonstrated through extensive experiments, inGRASS achieves up to over $200 \times$ speedups while retaining comparable solution quality in incremental spectral sparsification of graphs obtained from various datasets, such as circuit simulations, finite element analysis, and social networks.

Submitted 5 September, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: Accepted on DAC 2024

arXiv:2402.08653 [pdf, other]

SAGMAN: Stability Analysis of Graph Neural Networks on the Manifolds

Authors: Wuxinlin Cheng, Chenhui Deng, Ali Aghdaei, Zhiru Zhang, Zhuo Feng

Abstract: Modern graph neural networks (GNNs) can be sensitive to changes in the input graph structure and node features, potentially resulting in unpredictable behavior and degraded performance. In this work, we introduce a spectral framework known as SAGMAN for examining the stability of GNNs. This framework assesses the distance distortions that arise from the nonlinear mappings of GNNs between the input and output manifolds: when two nearby nodes on the input manifold are mapped (through a GNN model) to two distant ones on the output manifold, it implies a large distance distortion and thus a poor GNN stability. We propose a distance-preserving graph dimension reduction (GDR) approach that utilizes spectral graph embedding and probabilistic graphical models (PGMs) to create low-dimensional input/output graph-based manifolds for meaningful stability analysis. Our empirical evaluations show that SAGMAN effectively assesses the stability of each node when subjected to various edge or feature perturbations, offering a scalable approach for evaluating the stability of GNNs, extending to applications within recommendation systems. Furthermore, we illustrate its utility in downstream tasks, notably in enhancing GNN stability and facilitating adversarial targeted attacks.

Submitted 9 October, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

arXiv:2210.14813 [pdf, other]

HyperEF: Spectral Hypergraph Coarsening by Effective-Resistance Clustering

Authors: Ali Aghdaei, Zhuo Feng

Abstract: This paper introduces a scalable algorithmic framework (HyperEF) for spectral coarsening (decomposition) of large-scale hypergraphs by exploiting hyperedge effective resistances. Motivated by the latest theoretical framework for low-resistance-diameter decomposition of simple graphs, HyperEF aims at decomposing large hypergraphs into multiple node clusters with only a few inter-cluster hyperedges. The key component in HyperEF is a nearly-linear time algorithm for estimating hyperedge effective resistances, which allows incorporating the latest diffusion-based non-linear quadratic operators defined on hypergraphs. To achieve good runtime scalability, HyperEF searches within the Krylov subspace (or approximate eigensubspace) for identifying the nearly-optimal vectors for approximating the hyperedge effective resistances. In addition, a node weight propagation scheme for multilevel spectral hypergraph decomposition has been introduced for achieving even greater node coarsening ratios. When compared with state-of-the-art hypergraph partitioning (clustering) methods, extensive experiment results on real-world VLSI designs show that HyperEF can more effectively coarsen (decompose) hypergraphs without losing key structural (spectral) properties of the original hypergraphs, while achieving over $70\times$ runtime speedups over hMetis and $20\times$ speedups over HyperSF.

Submitted 3 December, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

Comments: Accepted on ICCAD 2022

arXiv:2108.07901 [pdf, other]

HyperSF: Spectral Hypergraph Coarsening via Flow-based Local Clustering

Authors: Ali Aghdaei, Zhiqiang Zhao, Zhuo Feng

Abstract: Hypergraphs allow modeling problems with multi-way high-order relationships. However, the computational cost of most existing hypergraph-based algorithms can be heavily dependent upon the input hypergraph sizes. To address the ever-increasing computational challenges, graph coarsening can be potentially applied for preprocessing a given hypergraph by aggressively aggregating its vertices (nodes). However, state-of-the-art hypergraph partitioning (clustering) methods that incorporate heuristic graph coarsening techniques are not optimized for preserving the structural (global) properties of hypergraphs. In this work, we propose an efficient spectral hypergraph coarsening scheme (HyperSF) for well preserving the original spectral (structural) properties of hypergraphs. Our approach leverages a recent strongly-local max-flow-based clustering algorithm for detecting the sets of hypergraph vertices that minimize ratio cut. To further improve the algorithm efficiency, we propose a divide-and-conquer scheme by leveraging spectral clustering of the bipartite graphs corresponding to the original hypergraphs. Our experimental results for a variety of hypergraphs extracted from real-world VLSI design benchmarks show that the proposed hypergraph coarsening algorithm can significantly improve the multi-way conductance of hypergraph clustering as well as runtime efficiency when compared with existing state-of-the-art algorithms.

Submitted 21 December, 2021; v1 submitted 17 August, 2021; originally announced August 2021.

Comments: Accepted by ICCAD 2021

arXiv:1906.06388 [pdf, other]

Intelligent Anomaly Detection and Mitigation in Data Centers

Authors: Ashkan Aghdai, Kang Xi, H. Jonathan Chao

Abstract: Data centers play a key role in today's Internet. Cloud applications are mainly hosted on multi-tenant warehouse-scale data centers. Anomalies pose a serious threat to data centers' operations. If not controlled properly, a simple anomaly can spread throughout the data center, resulting in a cascading failure. Amazon AWS had been affected by such incidents recently. Although some solutions are proposed to detect anomalies and prevent cascading failures, they mainly rely on application-specific metrics and case-based diagnosis to detect the anomalies. Given the variety of applications on a multi-tenant data center, proposed solutions are not capable of detecting anomalies in a timely manner. In this paper we design an application-agnostic anomaly detection scheme. More specifically, our design uses a highly distributed data mining scheme over network-level traffic metrics to detect anomalies. Once anomalies are detected, simple actions are taken to mitigate the damage. This ensures that errors are confined and prevents cascading failures before administrators intervene.

Submitted 14 June, 2019; originally announced June 2019.

arXiv:1905.05258 [pdf, other]

Enabling Mobility in LTE-Compatible Mobile-edge Computing with Programmable Switches

Authors: Ashkan Aghdai, Yang Xu, Mark Huang, David H. Dai, H. Jonathan Chao

Abstract: Network softwarization triggered a new wave of innovation in modern network design. The next generation of mobile networks embraces this trend. Mobile-edge computing (MEC) is a key part of emerging mobile networks that enables ultra-low latency mission-critical applications such as vehicle-to-vehicle communication. MEC aims at bringing delay-sensitive applications closer to the radio access network to enable ultra-low latency for users and decrease the back-haul pressure on mobile service providers. However, there are no practical solutions to enable mobility at MEC where connections are no longer anchored to the core network and serving applications are supposed to move as their users move. We propose the mobile-edge gateway (MEGW) to address this gap. MEGW enables mobility for MEC applications transparently and without requiring any modifications to existing protocols and applications. MEGW supports mobility by reconstructing mobile users' location via listening to the LTE control plane in addition to using two-stage location-dependent traffic steering for edge connections. Networks can incrementally upgrade to support MEC by upgrading some IP routers to programmable switches that run MEGW. We have implemented MEGW using the P4 language and verified its compatibility with existing LTE networks in a testbed running a reference LTE protocol stack. Furthermore, using packet-level simulations we show that the two-stage traffic steering algorithm reduces the number of application migrations and simplifies service provisioning.

Submitted 13 May, 2019; originally announced May 2019.

arXiv:1811.09731 [pdf, other]

doi 10.1109/NFV-SDN47374.2019.9040109

In-network Congestion-aware Load Balancing at Transport Layer

Authors: Ashkan Aghdai, Michael I. -C. Wang, Yang Xu, Charles H. -P. Wen, H. Jonathan Chao

Abstract: Load balancing at the transport layer is an important function in data centers, content delivery networks, and mobile networks, where per-connection consistency (PCC) has to be met for optimal performance. Cloud-native L4 load balancers are commonly deployed as virtual network functions (VNFs) and are a critical forwarding element in modern cloud infrastructure. We identify load imbalance among service instances as the main cause of additional processing delay caused by transport-layer load balancers. Existing transport-layer load balancers rely on one of two methods: host-level traffic redirection, which may add as much as 12.48% additional traffic to underlying networks, or connection tracking, which consumes a considerable amount of memory in load balancers. Both of these methods result in inefficient usage of network resources. We propose the in-network congestion-aware load balancer (INCAB) to achieve even load distribution among service instances and optimal network resource usage in addition to meeting the PCC requirement. We show that INCAB is capable of identifying and monitoring each instance's most-utilized resource and can improve the load distribution among all service instances. INCAB utilizes a Bloom filter and an ultra-compact connection table for in-network flow distribution. Furthermore, it does not rely on end hosts for traffic redirection. Our flow-level simulations show that INCAB improves flows' average completion time by 31.97% compared to stateless solutions.

Submitted 13 June, 2019; v1 submitted 23 November, 2018; originally announced November 2018.
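
The INCAB abstract names a Bloom filter as one of its building blocks for in-network flow distribution. The listing does not describe how INCAB sizes or hashes its filter, so the block below is a generic textbook Bloom filter, offered only as a reminder of the data structure. The class name, the SHA-256-based hashing, and the flow-key string format are assumptions for illustration, not details from the paper.

```python
import hashlib


class BloomFilter:
    """Textbook Bloom filter: set membership with possible false
    positives but no false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


# Hypothetical flow keys; a real switch would hash packet header fields.
seen = BloomFilter(num_bits=4096, num_hashes=3)
seen.add("10.0.0.1:51000->10.0.1.1:80")
print("10.0.0.1:51000->10.0.1.1:80" in seen)  # a key that was added is always found
```

The appeal of the structure in a memory-constrained forwarding element is that membership costs a fixed number of bits regardless of key size; the price is that a lookup for a key that was never added can occasionally return true.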

arXiv:1806.08455 [pdf, other]

Spotlight: Scalable Transport Layer Load Balancing for Data Center Networks

Authors: Ashkan Aghdai, Cing-Yu Chu, Yang Xu, David H. Dai, Jun Xu, H. Jonathan Chao

Abstract: Load balancing plays a vital role in modern data centers to distribute traffic among instances of network functions or services. State-of-the-art load balancers such as Silkroad dispatch traffic obliviously without considering the real-time utilization of service instances and therefore can lead to uneven load distribution and suboptimal performance. In this paper, we design and implement Spotlight, a scalable and distributed load balancing architecture that maintains connection-to-instance mapping consistency at the edge of data center networks. Spotlight uses a new stateful flow dispatcher which periodically polls instances' load and dispatches incoming connections to instances in proportion to their available capacity. Our design utilizes a distributed control plane and in-band flow dispatching and thus scales horizontally in data center networks. Through extensive flow-level simulation and packet-level experiments on a testbed, we demonstrate that compared to existing methods Spotlight distributes the traffic more efficiently and has near-optimum performance in terms of overall service utilization. Moreover, Spotlight is not sensitive to the utilization polling interval and therefore can be implemented with a low polling frequency to reduce the amount of control traffic. Indeed, Spotlight achieves the mentioned performance improvements using an O(100ms) polling interval.

Submitted 23 February, 2019; v1 submitted 21 June, 2018; originally announced June 2018.
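
The Spotlight abstract describes dispatching incoming connections to instances in proportion to their available capacity. As a rough illustration of that idea only, the sketch below picks an instance by weighted random choice over the most recently polled capacities. The function name, the dictionary of capacities, and the use of random sampling are assumptions; the abstract does not say how Spotlight realizes the proportional split.

```python
import random


def pick_instance(available_capacity, rng=random):
    """Choose an instance with probability proportional to its
    available capacity (as last reported by polling)."""
    instances = [name for name, cap in available_capacity.items() if cap > 0]
    if not instances:
        raise ValueError("no instance has spare capacity")
    weights = [available_capacity[name] for name in instances]
    return rng.choices(instances, weights=weights, k=1)[0]


# Hypothetical polled capacities: "a" has three times the headroom of "b".
polled = {"a": 3.0, "b": 1.0, "c": 0.0}
rng = random.Random(42)
tally = {"a": 0, "b": 0, "c": 0}
for _ in range(10_000):
    tally[pick_instance(polled, rng)] += 1
print(tally)
```

Over many connections "a" should receive roughly three quarters of the traffic and "c" none. Per-connection consistency, which the abstract also requires, would need the chosen mapping to be remembered for the lifetime of each connection; that part is outside this sketch.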

arXiv:1705.09999 [pdf, ps, other]

doi 10.1109/NFV-SDN.2017.8169825

Design of a Hybrid Modular Switch

Authors: Ashkan Aghdai, Yang Xu, H. Jonathan Chao

Abstract: Network Function Virtualization (NFV) shed new light on the design, deployment, and management of cloud networks. Many network functions such as firewalls, load balancers, and intrusion detection systems can be virtualized by servers. However, network operators often have to sacrifice programmability in order to achieve high throughput, especially at networks' edge where complex network functions are required. Here, we design, implement, and evaluate Hybrid Modular Switch (HyMoS). The hybrid hardware/software switch is designed to meet requirements for modern-day NFV applications in providing high throughput with a high degree of programmability. HyMoS utilizes P4-compatible Network Interface Cards (NICs), the PCI Express interface, and the CPU to act as line cards, switch fabric, and fabric controller respectively. In our implementation of HyMoS, the PCI Express interface is turned into a non-blocking switch fabric with a throughput of hundreds of Gigabits per second. Compared to existing NFV infrastructure, HyMoS offers modularity in hardware and software as well as a higher degree of programmability by supporting a superset of the P4 language.

Submitted 28 May, 2017; originally announced May 2017.

Showing 1–14 of 14 results for author: Aghdai, A