Showing 1–18 of 18 results for author: Gasteiger, J

Searching in archive cs.
  1. arXiv:2502.17420

    cs.LG cs.AI cs.CL

    The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence

    Authors: Tom Wollschläger, Jannes Elstner, Simon Geisler, Vincent Cohen-Addad, Stephan Günnemann, Johannes Gasteiger

    Abstract: The safety alignment of large language models (LLMs) can be circumvented through adversarially crafted inputs, yet the mechanisms by which these attacks bypass safety barriers remain poorly understood. Prior work suggests that a single refusal direction in the model's activation space determines whether an LLM refuses a request. In this study, we propose a novel gradient-based approach to represen…

    Submitted 24 February, 2025; originally announced February 2025.

  2. arXiv:2502.17254

    cs.LG

    REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective

    Authors: Simon Geisler, Tom Wollschläger, M. H. I. Abdalla, Vincent Cohen-Addad, Johannes Gasteiger, Stephan Günnemann

    Abstract: To circumvent the alignment of large language models (LLMs), current optimization-based adversarial attacks usually craft adversarial prompts by maximizing the likelihood of a so-called affirmative response. An affirmative response is a manually designed start of a harmful answer to an inappropriate request. While it is often easy to craft prompts that yield a substantial likelihood for the affirm…

    Submitted 24 February, 2025; originally announced February 2025.

    Comments: 30 pages, 6 figures, 15 tables

  3. arXiv:2402.09154

    cs.LG

    Attacking Large Language Models with Projected Gradient Descent

    Authors: Simon Geisler, Tom Wollschläger, M. H. I. Abdalla, Johannes Gasteiger, Stephan Günnemann

    Abstract: Current LLM alignment methods are readily broken through specifically crafted adversarial prompts. While crafting adversarial prompts using discrete optimization is highly effective, such attacks typically use more than 100,000 LLM calls. This high computational cost makes them unsuitable for, e.g., quantitative analyses and adversarial training. To remedy this, we revisit Projected Gradient Desce…

    Submitted 3 March, 2025; v1 submitted 14 February, 2024; originally announced February 2024.

  4. arXiv:2312.10029

    cs.LG cs.AI

    Challenges with unsupervised LLM knowledge discovery

    Authors: Sebastian Farquhar, Vikrant Varma, Zachary Kenton, Johannes Gasteiger, Vladimir Mikulik, Rohin Shah

    Abstract: We show that existing unsupervised methods on large language model (LLM) activations do not discover knowledge -- instead they seem to discover whatever feature of the activations is most prominent. The idea behind unsupervised knowledge elicitation is that knowledge satisfies a consistency structure, which can be used to discover knowledge. We first prove theoretically that arbitrary features (no…

    Submitted 18 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: 12 pages (38 including references and appendices). First three authors equal contribution, randomised order

  5. arXiv:2306.14818

    cs.LG physics.chem-ph

    Accelerating Molecular Graph Neural Networks via Knowledge Distillation

    Authors: Filip Ekström Kelvinius, Dimitar Georgiev, Artur Petrov Toshev, Johannes Gasteiger

    Abstract: Recent advances in graph neural networks (GNNs) have enabled more comprehensive modeling of molecules and molecular systems, thereby enhancing the precision of molecular property prediction and molecular simulations. Nonetheless, as the field has been progressing to bigger and more complex architectures, state-of-the-art GNNs have become largely prohibitive for many large-scale applications. In th…

    Submitted 28 October, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: Accepted as a conference paper at NeurIPS 2023

  6. arXiv:2303.04791

    cs.LG cond-mat.mtrl-sci physics.chem-ph physics.comp-ph

    Ewald-based Long-Range Message Passing for Molecular Graphs

    Authors: Arthur Kosmala, Johannes Gasteiger, Nicholas Gao, Stephan Günnemann

    Abstract: Neural architectures that learn potential energy surfaces from molecular data have undergone fast improvement in recent years. A key driver of this success is the Message Passing Neural Network (MPNN) paradigm. Its favorable scaling with system size partly relies upon a spatial distance limit on messages. While this focus on locality is a useful inductive bias, it also impedes the learning of long…

    Submitted 6 June, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: Published at the 40th International Conference on Machine Learning (ICML 2023)

  7. arXiv:2302.02829

    cs.LG cs.CR

    Collective Robustness Certificates: Exploiting Interdependence in Graph Neural Networks

    Authors: Jan Schuchardt, Aleksandar Bojchevski, Johannes Gasteiger, Stephan Günnemann

    Abstract: In tasks like node classification, image segmentation, and named-entity recognition we have a classifier that simultaneously outputs multiple predictions (a vector of labels) based on a single input, i.e. a single graph, image, or document respectively. Existing adversarial robustness certificates consider each prediction independently and are thus overly pessimistic for such tasks. They implicitl…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: Accepted at ICLR 2021 (https://openreview.net/forum?id=ULQdiUTHe3y). Uploaded to arxiv to fix Google Scholar indexing

  8. arXiv:2212.09083

    cs.LG cs.AI cs.SI stat.ML

    Influence-Based Mini-Batching for Graph Neural Networks

    Authors: Johannes Gasteiger, Chendi Qian, Stephan Günnemann

    Abstract: Using graph neural networks for large graphs is challenging since there is no clear way of constructing mini-batches. To solve this, previous methods have relied on sampling or graph clustering. While these approaches often lead to good training convergence, they introduce significant overhead due to expensive random data accesses and perform poorly during inference. In this work we instead focus…

    Submitted 18 December, 2022; originally announced December 2022.

    Comments: Published as a proceedings paper at LoG 2022

  9. arXiv:2204.02782

    cs.LG cond-mat.mtrl-sci physics.chem-ph physics.comp-ph

    GemNet-OC: Developing Graph Neural Networks for Large and Diverse Molecular Simulation Datasets

    Authors: Johannes Gasteiger, Muhammed Shuaibi, Anuroop Sriram, Stephan Günnemann, Zachary Ulissi, C. Lawrence Zitnick, Abhishek Das

    Abstract: Recent years have seen the advent of molecular simulation datasets that are orders of magnitude larger and more diverse. These new datasets differ substantially in four aspects of complexity: 1. Chemical diversity (number of different elements), 2. system size (number of atoms per sample), 3. dataset size (number of data samples), and 4. domain shift (similarity of the training and test set). Desp…

    Submitted 30 September, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

  10. arXiv:2111.04718

    cs.LG physics.chem-ph physics.comp-ph q-bio.QM

    Directional Message Passing on Molecular Graphs via Synthetic Coordinates

    Authors: Johannes Gasteiger, Chandan Yeshwanth, Stephan Günnemann

    Abstract: Graph neural networks that leverage coordinates via directional message passing have recently set the state of the art on multiple molecular property prediction tasks. However, they rely on atom position information that is often unavailable, and obtaining it is usually prohibitively expensive or even impossible. In this paper we propose synthetic coordinates that enable the use of advanced GNNs w…

    Submitted 5 April, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

    Comments: Published as a conference paper at NeurIPS 2021. Author name changed from Johannes Klicpera to Johannes Gasteiger

  11. arXiv:2107.06876

    cs.LG cs.CL cs.DS cs.SI stat.ML

    Scalable Optimal Transport in High Dimensions for Graph Distances, Embedding Alignment, and More

    Authors: Johannes Gasteiger, Marten Lienen, Stephan Günnemann

    Abstract: The current best practice for computing optimal transport (OT) is via entropy regularization and Sinkhorn iterations. This algorithm runs in quadratic time as it requires the full pairwise cost matrix, which is prohibitively expensive for large sets of objects. In this work we propose two effective log-linear time approximations of the cost matrix: First, a sparse approximation based on locality-s…

    Submitted 5 April, 2022; v1 submitted 14 July, 2021; originally announced July 2021.

    Comments: Published as a conference paper at ICML 2021. Author name changed from Johannes Klicpera to Johannes Gasteiger

  12. arXiv:2106.08903

    physics.comp-ph cs.LG physics.chem-ph stat.ML

    GemNet: Universal Directional Graph Neural Networks for Molecules

    Authors: Johannes Gasteiger, Florian Becker, Stephan Günnemann

    Abstract: Effectively predicting molecular interactions has the potential to accelerate molecular dynamics by multiple orders of magnitude and thus revolutionize chemical simulations. Graph neural networks (GNNs) have recently shown great successes for this task, overtaking classical methods based on fixed molecular kernels. However, they still appear very limited from a theoretical perspective, since regul…

    Submitted 22 June, 2024; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: Published as a conference paper at NeurIPS 2021. Author name changed from Johannes Klicpera to Johannes Gasteiger

  13. arXiv:2011.14115

    cs.LG physics.chem-ph physics.comp-ph

    Fast and Uncertainty-Aware Directional Message Passing for Non-Equilibrium Molecules

    Authors: Johannes Gasteiger, Shankari Giri, Johannes T. Margraf, Stephan Günnemann

    Abstract: Many important tasks in chemistry revolve around molecules during reactions. This requires predictions far from the equilibrium, while most recent work in machine learning for molecules has been focused on equilibrium or near-equilibrium states. In this paper we aim to extend this scope in three ways. First, we propose the DimeNet++ model, which is 8x faster and 10% more accurate than the original…

    Submitted 5 April, 2022; v1 submitted 28 November, 2020; originally announced November 2020.

    Comments: Published at the Machine Learning for Molecules Workshop at NeurIPS 2020. Author name changed from Johannes Klicpera to Johannes Gasteiger

  14. arXiv:2008.12952

    cs.LG cs.CR cs.SI stat.ML

    Efficient Robustness Certificates for Discrete Data: Sparsity-Aware Randomized Smoothing for Graphs, Images and More

    Authors: Aleksandar Bojchevski, Johannes Gasteiger, Stephan Günnemann

    Abstract: Existing techniques for certifying the robustness of models for discrete data either work only for a small class of models or are general at the expense of efficiency or tightness. Moreover, they do not account for sparsity in the input which, as our findings show, is often essential for obtaining non-trivial guarantees. We propose a model-agnostic certificate based on the randomized smoothing fra…

    Submitted 27 February, 2023; v1 submitted 29 August, 2020; originally announced August 2020.

    Comments: Proceedings of the 37th International Conference on Machine Learning (ICML 2020)

  15. arXiv:2007.01570

    cs.LG cs.SI stat.ML

    Scaling Graph Neural Networks with Approximate PageRank

    Authors: Aleksandar Bojchevski, Johannes Gasteiger, Bryan Perozzi, Amol Kapoor, Martin Blais, Benedek Rózemberczki, Michal Lukasik, Stephan Günnemann

    Abstract: Graph neural networks (GNNs) have emerged as a powerful approach for solving many network mining tasks. However, learning on large graphs remains a challenge - many recently proposed scalable GNN approaches rely on an expensive message-passing procedure to propagate information through the graph. We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs…

    Submitted 5 April, 2022; v1 submitted 3 July, 2020; originally announced July 2020.

    Comments: Published as a Conference Paper at ACM SIGKDD 2020. Author name changed from Johannes Klicpera to Johannes Gasteiger

  16. arXiv:2003.03123

    cs.LG physics.comp-ph stat.ML

    Directional Message Passing for Molecular Graphs

    Authors: Johannes Gasteiger, Janek Groß, Stephan Günnemann

    Abstract: Graph neural networks have recently achieved great successes in predicting quantum mechanical properties of molecules. These models represent a molecule as a graph using only the distance between atoms (nodes). They do not, however, consider the spatial direction from one atom to another, despite directional information playing a central role in empirical potentials for molecules, e.g. in angular…

    Submitted 5 April, 2022; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: Published as a conference paper at ICLR 2020. Author name changed from Johannes Klicpera to Johannes Gasteiger

  17. arXiv:1911.05485

    cs.SI cs.AI cs.LG stat.ML

    Diffusion Improves Graph Learning

    Authors: Johannes Gasteiger, Stefan Weißenberger, Stephan Günnemann

    Abstract: Graph convolution is the core of most Graph Neural Networks (GNNs) and usually approximated by message passing between direct (one-hop) neighbors. In this work, we remove the restriction of using only the direct neighbors by introducing a powerful, yet spatially localized graph convolution: Graph diffusion convolution (GDC). GDC leverages generalized graph diffusion, examples of which are the heat…

    Submitted 5 April, 2022; v1 submitted 28 October, 2019; originally announced November 2019.

    Comments: Published as a conference paper at NeurIPS 2019. Author name changed from Johannes Klicpera to Johannes Gasteiger

    Journal ref: Thirty-third Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, 2019

  18. arXiv:1810.05997

    cs.LG stat.ML

    Predict then Propagate: Graph Neural Networks meet Personalized PageRank

    Authors: Johannes Gasteiger, Aleksandar Bojchevski, Stephan Günnemann

    Abstract: Neural message passing algorithms for semi-supervised classification on graphs have recently achieved great success. However, for classifying a node these methods only consider nodes that are a few propagation steps away and the size of this utilized neighborhood is hard to extend. In this paper, we use the relationship between graph convolutional networks (GCN) and PageRank to derive an improved…

    Submitted 5 April, 2022; v1 submitted 14 October, 2018; originally announced October 2018.

    Comments: Published as a conference paper at ICLR 2019. Author name changed from Johannes Klicpera to Johannes Gasteiger

    Journal ref: International Conference on Learning Representations (ICLR), New Orleans, LA, USA, 2019