Search | arXiv e-print repository

Informed, but Not Always Improved: Challenging the Benefit of Background Knowledge in GNNs

Authors: Kutalmış Coşkun, Ivo Kavisanczki, Amin Mirzaei, Tom Siegl, Bjarne C. Hiller, Stefan Lüdtke, Martin Becker

Abstract: In complex and low-data domains such as biomedical research, incorporating background knowledge (BK) graphs, such as protein-protein interaction (PPI) networks, into graph-based machine learning pipelines is a promising research direction. However, while BK is often assumed to improve model performance, its actual contribution and the impact of imperfect knowledge remain poorly understood. In this… ▽ More In complex and low-data domains such as biomedical research, incorporating background knowledge (BK) graphs, such as protein-protein interaction (PPI) networks, into graph-based machine learning pipelines is a promising research direction. However, while BK is often assumed to improve model performance, its actual contribution and the impact of imperfect knowledge remain poorly understood. In this work, we investigate the role of BK in an important real-world task: cancer subtype classification. Surprisingly, we find that (i) state-of-the-art GNNs using BK perform no better than uninformed models like linear regression, and (ii) their performance remains largely unchanged even when the BK graph is heavily perturbed. To understand these unexpected results, we introduce an evaluation framework, which employs (i) a synthetic setting where the BK is clearly informative and (ii) a set of perturbations that simulate various imperfections in BK graphs. With this, we test the robustness of BK-aware models in both synthetic and real-world biomedical settings. Our findings reveal that careful alignment of GNN architectures and BK characteristics is necessary but holds the potential for significant performance improvements. △ Less

Submitted 16 May, 2025; originally announced May 2025.

Comments: 10 pages, 7 figures

arXiv:2503.09159 [pdf, other]

Unreflected Use of Tabular Data Repositories Can Undermine Research Quality

Authors: Andrej Tschalzev, Lennart Purucker, Stefan Lüdtke, Frank Hutter, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Data repositories have accumulated a large number of tabular datasets from various domains. Machine Learning researchers are actively using these datasets to evaluate novel approaches. Consequently, data repositories have an important standing in tabular data research. They not only host datasets but also provide information on how to use them in supervised learning tasks. In this paper, we argue… ▽ More Data repositories have accumulated a large number of tabular datasets from various domains. Machine Learning researchers are actively using these datasets to evaluate novel approaches. Consequently, data repositories have an important standing in tabular data research. They not only host datasets but also provide information on how to use them in supervised learning tasks. In this paper, we argue that, despite great achievements in usability, the unreflected usage of datasets from data repositories may have led to reduced research quality and scientific rigor. We present examples from prominent recent studies that illustrate the problematic use of datasets from OpenML, a large data repository for tabular data. Our illustrations help users of data repositories avoid falling into the traps of (1) using suboptimal model selection strategies, (2) overlooking strong baselines, and (3) inappropriate preprocessing. In response, we discuss possible solutions for how data repositories can prevent the inappropriate use of datasets and become the cornerstones for improved overall quality of empirical research studies. △ Less

Submitted 12 March, 2025; originally announced March 2025.

arXiv:2408.08761 [pdf, other]

Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization

Authors: Sascha Marton, Tim Grams, Florian Vogt, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Reinforcement learning (RL) has seen significant success across various domains, but its adoption is often limited by the black-box nature of neural network policies, making them difficult to interpret. In contrast, symbolic policies allow representing decision-making strategies in a compact and interpretable way. However, learning symbolic policies directly within on-policy methods remains challe… ▽ More Reinforcement learning (RL) has seen significant success across various domains, but its adoption is often limited by the black-box nature of neural network policies, making them difficult to interpret. In contrast, symbolic policies allow representing decision-making strategies in a compact and interpretable way. However, learning symbolic policies directly within on-policy methods remains challenging. In this paper, we introduce SYMPOL, a novel method for SYMbolic tree-based on-POLicy RL. SYMPOL employs a tree-based model integrated with a policy gradient method, enabling the agent to learn and adapt its actions while maintaining a high level of interpretability. We evaluate SYMPOL on a set of benchmark RL tasks, demonstrating its superiority over alternative tree-based RL approaches in terms of performance and interpretability. Unlike existing methods, it enables gradient-based, end-to-end learning of interpretable, axis-aligned decision trees within standard on-policy RL algorithms. Therefore, SYMPOL can become the foundation for a new class of interpretable RL based on decision trees. Our implementation is available under: https://github.com/s-marton/sympol △ Less

Submitted 11 March, 2025; v1 submitted 16 August, 2024; originally announced August 2024.

arXiv:2407.02112 [pdf, other]

A Data-Centric Perspective on Evaluating Machine Learning Models for Tabular Data

Authors: Andrej Tschalzev, Sascha Marton, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Tabular data is prevalent in real-world machine learning applications, and new models for supervised learning of tabular data are frequently proposed. Comparative studies assessing the performance of models typically consist of model-centric evaluation setups with overly standardized data preprocessing. This paper demonstrates that such model-centric evaluations are biased, as real-world modeling… ▽ More Tabular data is prevalent in real-world machine learning applications, and new models for supervised learning of tabular data are frequently proposed. Comparative studies assessing the performance of models typically consist of model-centric evaluation setups with overly standardized data preprocessing. This paper demonstrates that such model-centric evaluations are biased, as real-world modeling pipelines often require dataset-specific preprocessing and feature engineering. Therefore, we propose a data-centric evaluation framework. We select 10 relevant datasets from Kaggle competitions and implement expert-level preprocessing pipelines for each dataset. We conduct experiments with different preprocessing pipelines and hyperparameter optimization (HPO) regimes to quantify the impact of model selection, HPO, feature engineering, and test-time adaptation. Our main findings are: 1. After dataset-specific feature engineering, model rankings change considerably, performance differences decrease, and the importance of model selection reduces. 2. Recent models, despite their measurable progress, still significantly benefit from manual feature engineering. This holds true for both tree-based models and neural networks. 3. While tabular data is typically considered static, samples are often collected over time, and adapting to distribution shifts can be important even in supposedly static data. These insights suggest that research efforts should be directed toward a data-centric perspective, acknowledging that tabular data requires feature engineering and often exhibits temporal characteristics. Our framework is available under: https://github.com/atschalz/dc_tabeval. △ Less

Submitted 18 December, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01115 [pdf, other]

Enabling Mixed Effects Neural Networks for Diverse, Clustered Data Using Monte Carlo Methods

Authors: Andrej Tschalzev, Paul Nitschke, Lukas Kirchdorfer, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Neural networks often assume independence among input data samples, disregarding correlations arising from inherent clustering patterns in real-world datasets (e.g., due to different sites or repeated measurements). Recently, mixed effects neural networks (MENNs) which separate cluster-specific 'random effects' from cluster-invariant 'fixed effects' have been proposed to improve generalization and… ▽ More Neural networks often assume independence among input data samples, disregarding correlations arising from inherent clustering patterns in real-world datasets (e.g., due to different sites or repeated measurements). Recently, mixed effects neural networks (MENNs) which separate cluster-specific 'random effects' from cluster-invariant 'fixed effects' have been proposed to improve generalization and interpretability for clustered data. However, existing methods only allow for approximate quantification of cluster effects and are limited to regression and binary targets with only one clustering feature. We present MC-GMENN, a novel approach employing Monte Carlo methods to train Generalized Mixed Effects Neural Networks. We empirically demonstrate that MC-GMENN outperforms existing mixed effects deep learning models in terms of generalization performance, time complexity, and quantification of inter-cluster variance. Additionally, MC-GMENN is applicable to a wide range of datasets, including multi-class classification tasks with multiple high-cardinality categorical features. For these datasets, we show that MC-GMENN outperforms conventional encoding and embedding methods, simultaneously offering a principled methodology for interpreting the effects of clustering patterns. △ Less

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2309.17130 [pdf, other]

GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data

Authors: Sascha Marton, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Despite the success of deep learning for text and image data, tree-based ensemble models are still state-of-the-art for machine learning with heterogeneous tabular data. However, there is a significant need for tabular-specific gradient-based methods due to their high flexibility. In this paper, we propose $\text{GRANDE}$, $\text{GRA}$die$\text{N}$t-Based $\text{D}$ecision Tree $\text{E}$nsembles,… ▽ More Despite the success of deep learning for text and image data, tree-based ensemble models are still state-of-the-art for machine learning with heterogeneous tabular data. However, there is a significant need for tabular-specific gradient-based methods due to their high flexibility. In this paper, we propose $\text{GRANDE}$, $\text{GRA}$die$\text{N}$t-Based $\text{D}$ecision Tree $\text{E}$nsembles, a novel approach for learning hard, axis-aligned decision tree ensembles using end-to-end gradient descent. GRANDE is based on a dense representation of tree ensembles, which affords to use backpropagation with a straight-through operator to jointly optimize all model parameters. Our method combines axis-aligned splits, which is a useful inductive bias for tabular data, with the flexibility of gradient-based optimization. Furthermore, we introduce an advanced instance-wise weighting that facilitates learning representations for both, simple and complex relations, within a single model. We conducted an extensive evaluation on a predefined benchmark with 19 classification datasets and demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets. The method is available under: https://github.com/s-marton/GRANDE △ Less

Submitted 12 March, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

arXiv:2309.00306 [pdf, ps, other]

On the Aggregation of Rules for Knowledge Graph Completion

Authors: Patrick Betz, Stefan Lüdtke, Christian Meilicke, Heiner Stuckenschmidt

Abstract: Rule learning approaches for knowledge graph completion are efficient, interpretable and competitive to purely neural models. The rule aggregation problem is concerned with finding one plausibility score for a candidate fact which was simultaneously predicted by multiple rules. Although the problem is ubiquitous, as data-driven rule learning can result in noisy and large rulesets, it is underrepre… ▽ More Rule learning approaches for knowledge graph completion are efficient, interpretable and competitive to purely neural models. The rule aggregation problem is concerned with finding one plausibility score for a candidate fact which was simultaneously predicted by multiple rules. Although the problem is ubiquitous, as data-driven rule learning can result in noisy and large rulesets, it is underrepresented in the literature and its theoretical foundations have not been studied before in this context. In this work, we demonstrate that existing aggregation approaches can be expressed as marginal inference operations over the predicting rules. In particular, we show that the common Max-aggregation strategy, which scores candidates based on the rule with the highest confidence, has a probabilistic interpretation. Finally, we propose an efficient and overlooked baseline which combines the previous strategies and is competitive to computationally more expensive approaches. △ Less

Submitted 1 September, 2023; originally announced September 2023.

Comments: KLR Workshop@ICML2023

arXiv:2308.03403 [pdf, other]

Towards Machine Learning-based Fish Stock Assessment

Authors: Stefan Lüdtke, Maria E. Pierce

Abstract: The accurate assessment of fish stocks is crucial for sustainable fisheries management. However, existing statistical stock assessment models can have low forecast performance of relevant stock parameters like recruitment or spawning stock biomass, especially in ecosystems that are changing due to global warming and other anthropogenic stressors. In this paper, we investigate the use of machine le… ▽ More The accurate assessment of fish stocks is crucial for sustainable fisheries management. However, existing statistical stock assessment models can have low forecast performance of relevant stock parameters like recruitment or spawning stock biomass, especially in ecosystems that are changing due to global warming and other anthropogenic stressors. In this paper, we investigate the use of machine learning models to improve the estimation and forecast of such stock parameters. We propose a hybrid model that combines classical statistical stock assessment models with supervised ML, specifically gradient boosted trees. Our hybrid model leverages the initial estimate provided by the classical model and uses the ML model to make a post-hoc correction to improve accuracy. We experiment with five different stocks and find that the forecast accuracy of recruitment and spawning stock biomass improves considerably in most cases. △ Less

Submitted 7 August, 2023; originally announced August 2023.

Comments: Accepted at Fragile Earth Workshop 2023

arXiv:2305.03515 [pdf, other]

GradTree: Learning Axis-Aligned Decision Trees with Gradient Descent

Authors: Sascha Marton, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Decision Trees (DTs) are commonly used for many machine learning tasks due to their high degree of interpretability. However, learning a DT from data is a difficult optimization problem, as it is non-convex and non-differentiable. Therefore, common approaches learn DTs using a greedy growth algorithm that minimizes the impurity locally at each internal node. Unfortunately, this greedy procedure ca… ▽ More Decision Trees (DTs) are commonly used for many machine learning tasks due to their high degree of interpretability. However, learning a DT from data is a difficult optimization problem, as it is non-convex and non-differentiable. Therefore, common approaches learn DTs using a greedy growth algorithm that minimizes the impurity locally at each internal node. Unfortunately, this greedy procedure can lead to inaccurate trees. In this paper, we present a novel approach for learning hard, axis-aligned DTs with gradient descent. The proposed method uses backpropagation with a straight-through operator on a dense DT representation, to jointly optimize all tree parameters. Our approach outperforms existing methods on binary classification benchmarks and achieves competitive results for multi-class tasks. The method is available under: https://github.com/s-marton/GradTree △ Less

Submitted 19 August, 2024; v1 submitted 5 May, 2023; originally announced May 2023.

arXiv:2301.10571 [pdf, other]

Leveraging Planning Landmarks for Hybrid Online Goal Recognition

Authors: Nils Wilken, Lea Cohausz, Johannes Schaum, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Goal recognition is an important problem in many application domains (e.g., pervasive computing, intrusion detection, computer games, etc.). In many application scenarios it is important that goal recognition algorithms can recognize goals of an observed agent as fast as possible and with minimal domain knowledge. Hence, in this paper, we propose a hybrid method for online goal recognition that co… ▽ More Goal recognition is an important problem in many application domains (e.g., pervasive computing, intrusion detection, computer games, etc.). In many application scenarios it is important that goal recognition algorithms can recognize goals of an observed agent as fast as possible and with minimal domain knowledge. Hence, in this paper, we propose a hybrid method for online goal recognition that combines a symbolic planning landmark based approach and a data-driven goal recognition approach and evaluate it in a real-world cooking scenario. The empirical results show that the proposed method is not only significantly more efficient in terms of computation time than the state-of-the-art but also improves goal recognition performance. Furthermore, we show that the utilized planning landmark based approach, which was so far only evaluated on artificial benchmark domains, achieves also good recognition performance when applied to a real-world cooking scenario. △ Less

Submitted 25 January, 2023; originally announced January 2023.

Comments: 9 pages. Presented at SPARK 2022 (https://icaps22.icaps-conference.org/workshops/SPARK/)

arXiv:2301.05608 [pdf, other]

Investigating the Combination of Planning-Based and Data-Driven Methods for Goal Recognition

Authors: Nils Wilken, Lea Cohausz, Johannes Schaum, Stefan Lüdtke, Heiner Stuckenschmidt

Abstract: An important feature of pervasive, intelligent assistance systems is the ability to dynamically adapt to the current needs of their users. Hence, it is critical for such systems to be able to recognize those goals and needs based on observations of the user's actions and state of the environment. In this work, we investigate the application of two state-of-the-art, planning-based plan recognition… ▽ More An important feature of pervasive, intelligent assistance systems is the ability to dynamically adapt to the current needs of their users. Hence, it is critical for such systems to be able to recognize those goals and needs based on observations of the user's actions and state of the environment. In this work, we investigate the application of two state-of-the-art, planning-based plan recognition approaches in a real-world setting. So far, these approaches were only evaluated in artificial settings in combination with agents that act perfectly rational. We show that such approaches have difficulties when used to recognize the goals of human subjects, because human behaviour is typically not perfectly rational. To overcome this issue, we propose an extension to the existing approaches through a classification-based method trained on observed behaviour data. We empirically show that the proposed extension not only outperforms the purely planning-based- and purely data-driven goal recognition methods but is also able to recognize the correct goal more reliably, especially when only a small number of observations were seen. This substantially improves the usefulness of hybrid goal recognition approaches for intelligent assistance systems, as recognizing a goal early opens much more possibilities for supportive reactions of the system. △ Less

Submitted 13 January, 2023; originally announced January 2023.

arXiv:2207.08816 [pdf, other]

doi 10.1145/3558884.3558892

Discovering Behavioral Predispositions in Data to Improve Human Activity Recognition

Authors: Maximilian Popko, Sebastian Bader, Stefan Lüdtke, Thomas Kirste

Abstract: The automatic, sensor-based assessment of challenging behavior of persons with dementia is an important task to support the selection of interventions. However, predicting behaviors like apathy and agitation is challenging due to the large inter- and intra-patient variability. Goal of this paper is to improve the recognition performance by making use of the observation that patients tend to show s… ▽ More The automatic, sensor-based assessment of challenging behavior of persons with dementia is an important task to support the selection of interventions. However, predicting behaviors like apathy and agitation is challenging due to the large inter- and intra-patient variability. Goal of this paper is to improve the recognition performance by making use of the observation that patients tend to show specific behaviors at certain times of the day or week. We propose to identify such segments of similar behavior via clustering the distributions of annotations of the time segments. All time segments within a cluster then consist of similar behaviors and thus indicate a behavioral predisposition (BPD). We utilize BPDs by training a classifier for each BPD. Empirically, we demonstrate that when the BPD per time segment is known, activity recognition performance can be substantially improved. △ Less

Submitted 18 July, 2022; originally announced July 2022.

Comments: Submitted to iWOAR 2022 - 7th international Workshop on Sensor-Based Activity Recognition and Artificial Intelligence

Journal ref: 2022. Proceedings of the 7th International Workshop on Sensor-based Activity Recognition and Artificial Intelligence. Association for Computing Machinery, New York, NY, USA

arXiv:2207.08414 [pdf, other]

Outlier Explanation via Sum-Product Networks

Authors: Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Outlier explanation is the task of identifying a set of features that distinguish a sample from normal data, which is important for downstream (human) decision-making. Existing methods are based on beam search in the space of feature subsets. They quickly becomes computationally expensive, as they require to run an outlier detection algorithm from scratch for each feature subset. To alleviate this… ▽ More Outlier explanation is the task of identifying a set of features that distinguish a sample from normal data, which is important for downstream (human) decision-making. Existing methods are based on beam search in the space of feature subsets. They quickly becomes computationally expensive, as they require to run an outlier detection algorithm from scratch for each feature subset. To alleviate this problem, we propose a novel outlier explanation algorithm based on Sum-Product Networks (SPNs), a class of probabilistic circuits. Our approach leverages the tractability of marginal inference in SPNs to compute outlier scores in feature subsets. By using SPNs, it becomes feasible to perform backwards elimination instead of the usual forward beam search, which is less susceptible to missing relevant features in an explanation, especially when the number of features is large. We empirically show that our approach achieves state-of-the-art results for outlier explanation, outperforming recent search-based as well as deep learning-based explanation methods △ Less

Submitted 18 July, 2022; originally announced July 2022.

arXiv:2206.04891 [pdf, other]

doi 10.1007/s10994-023-06428-4

Explaining Neural Networks without Access to Training Data

Authors: Sascha Marton, Stefan Lüdtke, Christian Bartelt, Andrej Tschalzev, Heiner Stuckenschmidt

Abstract: We consider generating explanations for neural networks in cases where the network's training data is not accessible, for instance due to privacy or safety issues. Recently, $\mathcal{I}$-Nets have been proposed as a sample-free approach to post-hoc, global model interpretability that does not require access to training data. They formulate interpretation as a machine learning task that maps netwo… ▽ More We consider generating explanations for neural networks in cases where the network's training data is not accessible, for instance due to privacy or safety issues. Recently, $\mathcal{I}$-Nets have been proposed as a sample-free approach to post-hoc, global model interpretability that does not require access to training data. They formulate interpretation as a machine learning task that maps network representations (parameters) to a representation of an interpretable function. In this paper, we extend the $\mathcal{I}$-Net framework to the cases of standard and soft decision trees as surrogate models. We propose a suitable decision tree representation and design of the corresponding $\mathcal{I}$-Net output layers. Furthermore, we make $\mathcal{I}$-Nets applicable to real-world tasks by considering more realistic distributions when generating the $\mathcal{I}$-Net's training data. We empirically evaluate our approach against traditional global, post-hoc interpretability approaches and show that it achieves superior results when the training data is not accessible. △ Less

Submitted 10 June, 2022; originally announced June 2022.

Journal ref: Machine Learning (2024)

arXiv:2202.00332 [pdf, other]

Activity Recognition in Assembly Tasks by Bayesian Filtering in Multi-Hypergraphs

Authors: Timon Felske, Stefan Lüdtke, Sebastian Bader, Thomas Kirste

Abstract: We study sensor-based human activity recognition in manual work processes like assembly tasks. In such processes, the system states often have a rich structure, involving object properties and relations. Thus, estimating the hidden system state from sensor observations by recursive Bayesian filtering can be very challenging, due to the combinatorial explosion in the number of system states. To all… ▽ More We study sensor-based human activity recognition in manual work processes like assembly tasks. In such processes, the system states often have a rich structure, involving object properties and relations. Thus, estimating the hidden system state from sensor observations by recursive Bayesian filtering can be very challenging, due to the combinatorial explosion in the number of system states. To alleviate this problem, we propose an efficient Bayesian filtering model for such processes. In our approach, system states are represented by multi-hypergraphs, and the system dynamics is modeled by graph rewriting rules. We show a preliminary concept that allows to represent distributions over multi-hypergraphs more compactly than by full enumeration, and present an inference algorithm that works directly on this compact representation. We demonstrate the applicability of the algorithm on a real dataset. △ Less

Submitted 1 February, 2022; originally announced February 2022.

Comments: Accepted for presentation at the 2nd GCLR workshop in conjunction with AAAI 2022

arXiv:2111.04564 [pdf, other]

Human Activity Recognition using Attribute-Based Neural Networks and Context Information

Authors: Stefan Lüdtke, Fernando Moya Rueda, Waqas Ahmed, Gernot A. Fink, Thomas Kirste

Abstract: We consider human activity recognition (HAR) from wearable sensor data in manual-work processes, like warehouse order-picking. Such structured domains can often be partitioned into distinct process steps, e.g., packaging or transporting. Each process step can have a different prior distribution over activity classes, e.g., standing or walking, and different system dynamics. Here, we show how such… ▽ More We consider human activity recognition (HAR) from wearable sensor data in manual-work processes, like warehouse order-picking. Such structured domains can often be partitioned into distinct process steps, e.g., packaging or transporting. Each process step can have a different prior distribution over activity classes, e.g., standing or walking, and different system dynamics. Here, we show how such context information can be integrated systematically into a deep neural network-based HAR system. Specifically, we propose a hybrid architecture that combines a deep neural network-that estimates high-level movement descriptors, attributes, from the raw-sensor data-and a shallow classifier, which predicts activity classes from the estimated attributes and (optional) context information, like the currently executed process step. We empirically show that our proposed architecture increases HAR performance, compared to state-of-the-art methods. Additionally, we show that HAR performance can be further increased when information about process steps is incorporated, even when that information is only partially correct. △ Less

Submitted 28 October, 2021; originally announced November 2021.

Comments: 3rd International Workshop on Deep Learning for Human Activity Recognition

arXiv:2110.05165 [pdf, other]

Exchangeability-Aware Sum-Product Networks

Authors: Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Sum-Product Networks (SPNs) are expressive probabilistic models that provide exact, tractable inference. They achieve this efficiency by making use of local independence. On the other hand, mixtures of exchangeable variable models (MEVMs) are a class of tractable probabilistic models that make use of exchangeability of discrete random variables to render inference tractable. Exchangeability, which… ▽ More Sum-Product Networks (SPNs) are expressive probabilistic models that provide exact, tractable inference. They achieve this efficiency by making use of local independence. On the other hand, mixtures of exchangeable variable models (MEVMs) are a class of tractable probabilistic models that make use of exchangeability of discrete random variables to render inference tractable. Exchangeability, which arises naturally in relational domains, has not been considered for efficient representation and inference in SPNs yet. The contribution of this paper is a novel probabilistic model which we call Exchangeability-Aware Sum-Product Networks (XSPNs). It contains both SPNs and MEVMs as special cases, and combines the ability of SPNs to efficiently learn deep probabilistic models with the ability of MEVMs to efficiently handle exchangeable random variables. We introduce a structure learning algorithm for XSPNs and empirically show that they can be more accurate than conventional SPNs when the data contains repeated, interchangeable parts. △ Less

Submitted 28 April, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

Comments: accepted at IJCAI 2022

arXiv:2102.13133 [pdf, other]

doi 10.1109/TPDS.2021.3084795

VPIC 2.0: Next Generation Particle-in-Cell Simulations

Authors: Robert Bird, Nigel Tan, Scott V. Luedtke, Stephen Lien Harrell, Michela Taufer, Brian Albright

Abstract: VPIC is a general purpose Particle-in-Cell simulation code for modeling plasma phenomena such as magnetic reconnection, fusion, solar weather, and laser-plasma interaction in three dimensions using large numbers of particles. VPIC's capacity in both fidelity and scale makes it particularly well-suited for plasma research on pre-exascale and exascale platforms. In this paper we demonstrate the uniq… ▽ More VPIC is a general purpose Particle-in-Cell simulation code for modeling plasma phenomena such as magnetic reconnection, fusion, solar weather, and laser-plasma interaction in three dimensions using large numbers of particles. VPIC's capacity in both fidelity and scale makes it particularly well-suited for plasma research on pre-exascale and exascale platforms. In this paper we demonstrate the unique challenges involved in preparing the VPIC code for operation at exascale, outlining important optimizations to make VPIC efficient on accelerators. Specifically, we show the work undertaken in adapting VPIC to exploit the portability-enabling framework Kokkos and highlight the enhancements to VPIC's modeling capabilities to achieve performance at exascale. We assess the achieved performance-portability trade-off through a suite of studies on nine different varieties of modern pre-exascale hardware. Our performance-portability study includes weak-scaling runs on three of the top ten TOP500 supercomputers, as well as a comparison of low-level system performance of hardware from four different vendors. △ Less

Submitted 25 February, 2021; originally announced February 2021.

arXiv:2101.10356 [pdf]

doi 10.1038/s41592-021-01220-5

Deep learning based mixed-dimensional GMM for characterizing variability in CryoEM

Authors: Muyuan Chen, Steven Ludtke

Abstract: Structural flexibility and/or dynamic interactions with other molecules is a critical aspect of protein function. CryoEM provides direct visualization of individual macromolecules sampling different conformational and compositional states. While numerous methods are available for computational classification of discrete states, characterization of continuous conformational changes or large numbers… ▽ More Structural flexibility and/or dynamic interactions with other molecules is a critical aspect of protein function. CryoEM provides direct visualization of individual macromolecules sampling different conformational and compositional states. While numerous methods are available for computational classification of discrete states, characterization of continuous conformational changes or large numbers of discrete state without human supervision remains challenging. Here we present e2gmm, a machine learning algorithm to determine a conformational landscape for proteins or complexes using a 3-D Gaussian mixture model mapped onto 2-D particle images in known orientations. Using a deep neural network architecture, e2gmm can automatically resolve the structural heterogeneity within the protein complex and map particles onto a small latent space describing conformational and compositional changes. This system presents a more intuitive and flexible representation than other manifold methods currently in use. We demonstrate this method on both simulated data as well as three biological systems, to explore compositional and conformational changes at a range of scales. The software is distributed as part of EMAN2. △ Less

Submitted 23 May, 2021; v1 submitted 25 January, 2021; originally announced January 2021.

Comments: 31 pages, 5 main figures and 8 supplementary figures

Journal ref: Nature Methods 18, 930-936 (2021)

arXiv:1804.06748 [pdf, other]

doi 10.1613/jair.1.11261

State-Space Abstractions for Probabilistic Inference: A Systematic Review

Authors: Stefan Lüdtke, Max Schröder, Frank Krüger, Sebastian Bader, Thomas Kirste

Abstract: Tasks such as social network analysis, human behavior recognition, or modeling biochemical reactions, can be solved elegantly by using the probabilistic inference framework. However, standard probabilistic inference algorithms work at a propositional level, and thus cannot capture the symmetries and redundancies that are present in these tasks. Algorithms that exploit those symmetries have been de… ▽ More Tasks such as social network analysis, human behavior recognition, or modeling biochemical reactions, can be solved elegantly by using the probabilistic inference framework. However, standard probabilistic inference algorithms work at a propositional level, and thus cannot capture the symmetries and redundancies that are present in these tasks. Algorithms that exploit those symmetries have been devised in different research fields, for example by the lifted inference-, multiple object tracking-, and modeling and simulation-communities. The common idea, that we call state space abstraction, is to perform inference over compact representations of sets of symmetric states. Although they are concerned with a similar topic, the relationship between these approaches has not been investigated systematically. This survey provides the following contributions. We perform a systematic literature review to outline the state of the art in probabilistic inference methods exploiting symmetries. From an initial set of more than 4,000 papers, we identify 116 relevant papers. Furthermore, we provide new high-level categories that classify the approaches, based on common properties of the approaches. The research areas underlying each of the categories are introduced concisely. Researchers from different fields that are confronted with a state space explosion problem in a probabilistic system can use this classification to identify possible solutions. Finally, based on this conceptualization, we identify potentials for future research, as some relevant application domains are not addressed by current approaches. △ Less

Submitted 4 December, 2018; v1 submitted 18 April, 2018; originally announced April 2018.

arXiv:1801.10495 [pdf, other]

Lifted Filtering via Exchangeable Decomposition

Authors: Stefan Lüdtke, Max Schröder, Sebastian Bader, Kristian Kersting, Thomas Kirste

Abstract: We present a model for exact recursive Bayesian filtering based on lifted multiset states. Combining multisets with lifting makes it possible to simultaneously exploit multiple strategies for reducing inference complexity when compared to list-based grounded state representations. The core idea is to borrow the concept of Maximally Parallel Multiset Rewriting Systems and to enhance it by concepts… ▽ More We present a model for exact recursive Bayesian filtering based on lifted multiset states. Combining multisets with lifting makes it possible to simultaneously exploit multiple strategies for reducing inference complexity when compared to list-based grounded state representations. The core idea is to borrow the concept of Maximally Parallel Multiset Rewriting Systems and to enhance it by concepts from Rao-Blackwellization and Lifted Inference, giving a representation of state distributions that enables efficient inference. In worlds where the random variables that define the system state are exchangeable -- where the identity of entities does not matter -- it automatically uses a representation that abstracts from ordering (achieving an exponential reduction in complexity) -- and it automatically adapts when observations or system dynamics destroy exchangeability by breaking symmetry. △ Less

Submitted 7 May, 2018; v1 submitted 31 January, 2018; originally announced January 2018.

arXiv:1707.06446 [pdf, other]

Sequential Lifted Bayesian Filtering in Multiset Rewriting Systems

Authors: Max Schröder, Stefan Lüdtke, Sebastian Bader, Frank Krüger, Thomas Kirste

Abstract: Bayesian Filtering for plan and activity recognition is challenging for scenarios that contain many observation equivalent entities (i.e. entities that produce the same observations). This is due to the combinatorial explosion in the number of hypotheses that need to be tracked. However, this class of problems exhibits a certain symmetry that can be exploited for state space representation and inf… ▽ More Bayesian Filtering for plan and activity recognition is challenging for scenarios that contain many observation equivalent entities (i.e. entities that produce the same observations). This is due to the combinatorial explosion in the number of hypotheses that need to be tracked. However, this class of problems exhibits a certain symmetry that can be exploited for state space representation and inference. We analyze current state of the art methods and find that none of them completely fits the requirements arising in this problem class. We sketch a novel inference algorithm that provides a solution by incorporating concepts from Lifted Inference algorithms, Probabilistic Multiset Rewriting Systems, and Computational State Space Models. Two experiments confirm that this novel algorithm has the potential to perform efficient probabilistic inference on this problem class. △ Less

Submitted 14 August, 2017; v1 submitted 20 July, 2017; originally announced July 2017.

Comments: 7 pages, 3 figures, accepted at UAI-17 Statistical Relational AI (StarAI) workshop

Showing 1–22 of 22 results for author: Lüdtke, S