
Evaluating Loss Functions for Graph Neural Networks: Towards Pretraining and Generalization

Authors: Khushnood Abbas, Ruizhe Hou, Zhou Wengang, Dong Shi, Niu Ling, Satyaki Nan, Alireza Abbasi

Abstract: Graph Neural Networks (GNNs) have become useful for learning on non-Euclidean data. However, their best performance depends on choosing the right model architecture and the training objective, also called the loss function. Researchers have studied these parts separately, but no large-scale evaluation has looked at how GNN models and many loss functions work together across different tasks. To fix this, we ran a thorough study that included seven well-known GNN architectures and a large group of 30 single and mixed loss functions. Our evaluation spanned three distinct real-world datasets, assessing performance in both inductive and transductive settings using 21 comprehensive evaluation metrics. From these extensive results (detailed in supplementary information 1 & 2), we analyzed the top ten model-loss combinations for each metric based on their average rank. Our findings reveal that, especially for the inductive case: 1) Hybrid loss functions generally yield superior and more robust performance compared to single loss functions, indicating the benefit of multi-objective optimization. 2) The GIN architecture always showed the highest average performance, especially with Cross-Entropy loss. 3) Although some combinations had lower overall average ranks, models such as GAT, particularly with certain hybrid losses, demonstrated specialized strengths, achieving the most top-1 results among the individual metrics, which points to subtle strengths for particular task demands. 4) On the other hand, the MPNN architecture typically lagged behind in the scenarios in which it was tested.

Submitted 16 June, 2025; originally announced June 2025.

Comments: ACM single column 633 pages

arXiv:2503.09446 [pdf, other]

Sparse Autoencoder as a Zero-Shot Classifier for Concept Erasing in Text-to-Image Diffusion Models

Authors: Zhihua Tian, Sirun Nan, Ming Xu, Shengfang Zhai, Wenjie Qu, Jian Liu, Kui Ren, Ruoxi Jia, Jiaheng Zhang

Abstract: Text-to-image (T2I) diffusion models have achieved remarkable progress in generating high-quality images but also raise people's concerns about generating harmful or misleading content. While extensive approaches have been proposed to erase unwanted concepts without requiring retraining from scratch, they inadvertently degrade performance on normal generation tasks. In this work, we propose Interpret then Deactivate (ItD), a novel framework to enable precise concept removal in T2I diffusion models while preserving overall performance. ItD first employs a sparse autoencoder (SAE) to interpret each concept as a combination of multiple features. By permanently deactivating the specific features associated with target concepts, we repurpose SAE as a zero-shot classifier that identifies whether the input prompt includes target concepts, allowing selective concept erasure in diffusion models. Moreover, we demonstrate that ItD can be easily extended to erase multiple concepts without requiring further training. Comprehensive experiments across celebrity identities, artistic styles, and explicit content demonstrate ItD's effectiveness in eliminating targeted concepts without interfering with normal concept generation. Additionally, ItD is also robust against adversarial prompts designed to circumvent content filters. Code is available at: https://github.com/NANSirun/Interpret-then-deactivate.

Submitted 18 March, 2025; v1 submitted 12 March, 2025; originally announced March 2025.

Comments: 25 pages

arXiv:2310.17145 [pdf]

doi 10.1063/5.0216883

Unveiling microstructural damage for leakage current degradation in SiC Schottky diode after heavy ions irradiation under 200 V

Authors: Xiaoyu Yan, Pengfei Zhai, Chen Yang, Shiwei Zhao, Shuai Nan, Peipei Hu, Teng Zhang, Qiyu Chen, Lijun Xu, Zongzhen Li, Jie Liu

Abstract: Single-event burnout and single-event leakage current (SELC) in SiC power devices induced by heavy ions severely limit their space application, and the underlying mechanism is still unclear. One fundamental problem is lack of high-resolution characterization of radiation damage in the irradiated SiC power devices, which is a crucial indicator of the related mechanism. In this letter, high-resolution transmission electron microscopy (TEM) was used to characterize the radiation damage in the 1437.6 MeV 181Ta-irradiated SiC junction barrier Schottky diode under 200 V. The amorphous radiation damage with about 52 nm in diameter and 121 nm in length at the Schottky metal (Ti)-semiconductor (SiC) interface was observed. More importantly, in the damage site the atomic mixing of Ti, Si, and C was identified by electron energy loss spectroscopy and high-angle annular dark-field scanning TEM. It indicates that the melting of the Ti-SiC interface induced by localized Joule heating is responsible for the amorphization and the formation of titanium silicide, titanium carbide, or ternary phases. These modifications at nanoscale in turn cause the localized degradation of the Schottky contact, resulting in the permanent increase in leakage current. This experimental study provides very valuable clues to thorough understanding of the SELC mechanism in SiC diode.

Submitted 7 March, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

Comments: 4 pages,4 figures

Journal ref: Applied Physics Letters, 125, 042103 (2024)

arXiv:1806.09756 [pdf]

doi 10.1016/j.nimb.2019.07.024

Fine structure of swift heavy ion track in rutile TiO2

Authors: Pengfei Zhai, Shuai Nan, Lijun Xu, Weixing Li, Zongzhen Li, Peipei Hu, Jian Zeng, Shengxia Zhang, Youmei Sun, Jie Liu

Abstract: We report on the first observation of fine structure of latent tracks in rutile TiO2, which changes from cylinder to dumbbell-shape and then to sandglass-shape as a function of the ion path length. Based on inelastic thermal spike model, we show that Hagen-Poiseuille flow of molten phase produces the hillocks on surface and the void-rich zone near surface after epitaxial recrystallization due to material deficit, while at a deep depth, the lack of efficient outflow and recrystallization result in the absence of tracks. We propose that core-shell duration of transient molten phase induced by swift heavy ion and parabolic distribution of fluid velocity are radial-dependent. Moreover, the various morphologies of tracks are a consequence of the molten phase outflow and recrystallization during rapid cooling down. Our perspective provides a new interpretation in the track formation.

Submitted 14 June, 2019; v1 submitted 25 June, 2018; originally announced June 2018.

arXiv:1509.01090 [pdf, ps, other]

Tiling sets and spectral sets over finite fields

Authors: C. Aten, B. Ayachi, E. Bau, D. FitzPatrick, A. Iosevich, H. Liu, A. Lott, I. MacKinnon, S. Maimon, S. Nan, J. Pakianathan, G. Petridis, C. Rojas Mena, A. Sheikh, T. Tribone, J. Weill, C. Yu

Abstract: We study tiling and spectral sets in vector spaces over prime fields. The classical Fuglede conjecture in locally compact abelian groups says that a set is spectral if and only if it tiles by translation. This conjecture was disproved by T. Tao in Euclidean spaces of dimensions 5 and higher, using constructions over prime fields (in vector spaces over finite fields of prime order) and lifting them to the Euclidean setting. Over prime fields, when the dimension of the vector space is less than or equal to $2$ it has recently been proven that the Fuglede conjecture holds (see \cite{IMP15}). In this paper we study this question in higher dimensions over prime fields and provide some results and counterexamples. In particular we prove the existence of spectral sets which do not tile in $\mathbb{Z}_p^5$ for all odd primes $p$ and $\mathbb{Z}_p^4$ for all odd primes $p$ such that $p \equiv 3 \text{ mod } 4$. Although counterexamples in low dimensional groups over cyclic rings $\mathbb{Z}_n$ were previously known, they were usually for non-prime $n$ or a small, sporadic set of primes $p$ rather than general constructions. This paper is a result of a Research Experience for Undergraduates program run at the University of Rochester during the summer of 2015 by A. Iosevich, J. Pakianathan and G. Petridis.

Submitted 3 September, 2015; originally announced September 2015.

MSC Class: 46S10; 52C22; 05D99

arXiv:0906.3057 [pdf]

GRB 090423: Marking the Death of a Massive Star at z=8.2

Authors: Lin Lin, Liang En Wei, Zhang Shuang Nan

Abstract: GRB 090423 is the new high-z record holder of Gamma-ray bursts (GRBs) with z ~ 8.2. We present a detailed analysis of both the spectral and temporal features of GRB 090423 observed with Swift/BAT and Fermi/GBM. We find that the T90 observed with BAT in the 15-150 keV band is 13.2 s, corresponding to ~ 1.4 s at z=8.2. It once again gives rise to an issue whether the progenitors of high-z GRBs are massive stars or mergers since the discovery of GRB 080913 at z=6.7. In comparison with T90 distribution in the burst frame of current redshift-known GRB sample, we find that it is marginally grouped into the long group (Type II GRBs). The spectrum observed with both BAT and GBM is well fitted by a power-law with exponential cutoff, which yields an Ep = 50.4 ± 7.0 keV. The event well satisfies the Amati-relation for the Type II GRBs within their 3 sigma uncertainty range. Our results indicate that this event would be produced by the death of a massive star. Based on the Amati-relation, we derive its distance modulus, which follows the Hubble diagram of the concordance cosmology model at a redshift of ~8.2.

Submitted 16 June, 2009; originally announced June 2009.

Comments: accepted for publication in Science in China G

Showing 1–6 of 6 results for author: Nan, S