-
The Unit-Zero Divisor Graph of a Commutative Ring
Authors:
Vika Yugi Kurniawan,
Yeni Susanti,
Budi Surodjo
Abstract:
This paper introduces a new approach to associating a graph with a commutative ring. Let $R$ be a commutative ring with identity. The unit-zero divisor graph of a commutative ring $R$, denoted by $G_{UZ}(R)$, offers a novel framework for exploring the interaction between ring and graph structures. The vertex set of $G_{UZ}(R)$ consists of all elements of the ring $R$. Two distinct vertices $x$ and $y$ in $G_{UZ}(R)$ are adjacent if and only if $x + y$ is a unit and $xy$ is a zero divisor in $R$. This dual adjacency condition gives rise to a graph that reflects both the additive and multiplicative behavior of the ring. This study investigates key structural properties of $G_{UZ}(R)$, including regularity, bipartiteness, planarity, and Hamiltonicity. In addition, it examines how these graph features are influenced by the algebraic structure of the ring, particularly the group of units, the set of zero divisors, ideals, and the Jacobson radical.
Submitted 13 June, 2025;
originally announced June 2025.
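
The adjacency rule stated in the abstract above can be tried out on the ring $\mathbb{Z}_n$. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the choice of $\mathbb{Z}_n$ is ours, and the `include_zero` flag reflects an assumption, since the abstract does not say whether $0$ counts as a zero divisor.

```python
from itertools import combinations
from math import gcd


def is_unit(a, n):
    """a is a unit in Z_n exactly when gcd(a, n) == 1."""
    return gcd(a % n, n) == 1


def is_zero_divisor(a, n, include_zero=True):
    """Nonzero a is a zero divisor in Z_n when gcd(a, n) > 1.

    Whether 0 itself counts is a convention; include_zero is an assumption.
    """
    a %= n
    if a == 0:
        return include_zero and n > 1
    return gcd(a, n) > 1


def unit_zero_divisor_graph(n, include_zero=True):
    """Edge set of G_UZ(Z_n): x ~ y iff x + y is a unit and x * y is a zero divisor."""
    return {
        (x, y)
        for x, y in combinations(range(n), 2)
        if is_unit(x + y, n) and is_zero_divisor(x * y, n, include_zero)
    }


edges = unit_zero_divisor_graph(4)
print(sorted(edges))
```

Under the convention that $0$ is a zero divisor, $G_{UZ}(\mathbb{Z}_4)$ comes out as the 4-cycle $0 - 1 - 2 - 3 - 0$, which is 2-regular and bipartite.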
-
Paths to Causality: Finding Informative Subgraphs Within Knowledge Graphs for Knowledge-Based Causal Discovery
Authors:
Yuni Susanti,
Michael Färber
Abstract:
Inferring causal relationships between variable pairs is crucial for understanding multivariate interactions in complex systems. Knowledge-based causal discovery -- which involves inferring causal relationships by reasoning over the metadata of variables (e.g., names or textual context) -- offers a compelling alternative to traditional methods that rely on observational data. However, existing methods using Large Language Models (LLMs) often produce unstable and inconsistent results, compromising their reliability for causal inference. To address this, we introduce a novel approach that integrates Knowledge Graphs (KGs) with LLMs to enhance knowledge-based causal discovery. Our approach identifies informative metapath-based subgraphs within KGs and further refines the selection of these subgraphs using Learning-to-Rank-based models. The top-ranked subgraphs are then incorporated into zero-shot prompts, improving the effectiveness of LLMs in inferring the causal relationship. Extensive experiments on biomedical and open-domain datasets demonstrate that our method outperforms most baselines by up to 44.4 points in F1 scores, evaluated across diverse LLMs and KGs. Our code and datasets are available on GitHub: https://github.com/susantiyuni/path-to-causality
Submitted 10 June, 2025;
originally announced June 2025.
-
Bridging RDF Knowledge Graphs with Graph Neural Networks for Semantically-Rich Recommender Systems
Authors:
Michael Färber,
David Lamprecht,
Yuni Susanti
Abstract:
Graph Neural Networks (GNNs) have substantially advanced the field of recommender systems. However, despite the creation of more than a thousand knowledge graphs (KGs) under the W3C standard RDF, their rich semantic information has not yet been fully leveraged in GNN-based recommender systems. To address this gap, we propose a comprehensive integration of RDF KGs with GNNs that utilizes both the topological information from RDF object properties and the content information from RDF datatype properties. Our main focus is an in-depth evaluation of various GNNs, analyzing how different semantic feature initializations and types of graph structure heterogeneity influence their performance in recommendation tasks. Through experiments across multiple recommendation scenarios involving multi-million-node RDF graphs, we demonstrate that harnessing the semantic richness of RDF KGs significantly improves recommender systems and lays the groundwork for GNN-based recommender systems for the Linked Open Data cloud. The code and data are available on our GitHub repository: https://github.com/davidlamprecht/rdf-gnn-recommendation
Submitted 10 June, 2025;
originally announced June 2025.
-
Clean Graphs and Idempotent Graphs over Finite Rings: An Approach Based on Z_n
Authors:
Felicia Servina Djuang,
Indah Emilia Wijayanti,
Yeni Susanti
Abstract:
Let $R$ be a finite ring with identity. The idempotent graph $I(R)$ is the graph whose vertex set consists of the non-trivial idempotent elements of $R$, where two distinct vertices $x$ and $y$ are adjacent if and only if $xy = yx = 0$. The clean graph $Cl(R)$ is a graph whose vertices are of the form $(e, u)$, where $e$ is an idempotent element and $u$ is a unit of $R$. Two distinct vertices $(e,u)$ and $(f, v)$ are adjacent if and only if $ef = fe = 0$ or $uv = vu = 1$. The graph $Cl_2(R)$ is the subgraph of $Cl(R)$ induced by the set $\{(e, u) : e \text{ is a nonzero idempotent element of } R\}$. In this study, we examine the structure of clean graphs over $\mathbb{Z}_{n}$ derived from their $Cl_2$ graphs and investigate their relationship with the structure of their idempotent graphs.
Submitted 20 May, 2025;
originally announced May 2025.
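
The definitions of $I(R)$, $Cl(R)$, and $Cl_2(R)$ in the abstract above can be made concrete for $R = \mathbb{Z}_n$ with $n \ge 2$. This is a minimal sketch with hypothetical function names, not the authors' implementation; since $\mathbb{Z}_n$ is commutative, the conditions $xy = yx$ and $uv = vu$ collapse to single products.

```python
from itertools import combinations
from math import gcd


def idempotents(n):
    """Elements e of Z_n with e * e == e."""
    return [e for e in range(n) if (e * e) % n == e]


def units(n):
    """Elements of Z_n that are coprime to n."""
    return [u for u in range(n) if gcd(u, n) == 1]


def idempotent_graph(n):
    """I(Z_n): non-trivial idempotents, with x ~ y iff x * y == 0."""
    verts = [e for e in idempotents(n) if e not in (0, 1)]
    edges = {(x, y) for x, y in combinations(verts, 2) if (x * y) % n == 0}
    return verts, edges


def clean_graph(n, nonzero_only=False):
    """Cl(Z_n), or Cl_2(Z_n) when nonzero_only is True.

    Vertices are pairs (e, u); (e, u) ~ (f, v) iff e * f == 0 or u * v == 1.
    """
    verts = [
        (e, u)
        for e in idempotents(n)
        for u in units(n)
        if not (nonzero_only and e == 0)
    ]
    edges = {
        (a, b)
        for a, b in combinations(verts, 2)
        if (a[0] * b[0]) % n == 0 or (a[1] * b[1]) % n == 1
    }
    return verts, edges


verts, edges = clean_graph(6, nonzero_only=True)
print(len(verts), len(edges))
```

For $n = 6$ the non-trivial idempotents are $3$ and $4$, and they are adjacent in $I(\mathbb{Z}_6)$ because $3 \cdot 4 \equiv 0 \pmod 6$.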
-
Can LLMs Leverage Observational Data? Towards Data-Driven Causal Discovery with LLMs
Authors:
Yuni Susanti,
Michael Färber
Abstract:
Causal discovery traditionally relies on statistical methods applied to observational data, often requiring large datasets and assumptions about underlying causal structures. Recent advancements in Large Language Models (LLMs) have introduced new possibilities for causal discovery by providing domain expert knowledge. However, it remains unclear whether LLMs can effectively process observational data for causal discovery. In this work, we explore the potential of LLMs for data-driven causal discovery by integrating observational data for LLM-based reasoning. Specifically, we examine whether LLMs can effectively utilize observational data through two prompting strategies: pairwise prompting and breadth-first search (BFS)-based prompting. In both approaches, we incorporate the observational data directly into the prompt to assess LLMs' ability to infer causal relationships from such data. Experiments on benchmark datasets show that incorporating observational data enhances causal discovery, boosting F1 scores by up to 0.11 points using both pairwise and BFS LLM-based prompting, while outperforming traditional statistical causal discovery baselines by up to 0.52 points. Our findings highlight the potential and limitations of LLMs for data-driven causal discovery, demonstrating their ability to move beyond textual metadata and effectively interpret and utilize observational data for more informed causal reasoning. Our study lays the groundwork for future advancements toward fully LLM-driven causal discovery.
Submitted 15 April, 2025;
originally announced April 2025.
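
The idea in the abstract above of placing observational data directly into a pairwise prompt might look roughly like the sketch below. Everything here is an assumption made for illustration: the function names, the prompt wording, the table layout, and the choice to ask about ordered pairs are ours, and the paper's actual templates are not given in the abstract.

```python
from itertools import permutations


def format_rows(columns, rows):
    """Render the observations as a small comma-separated table."""
    header = ", ".join(columns)
    lines = [", ".join(str(v) for v in row) for row in rows]
    return "\n".join([header] + lines)


def pairwise_prompts(columns, rows):
    """Build one prompt per ordered variable pair, each embedding the data.

    The wording is a hypothetical template, not the one used in the paper.
    """
    table = format_rows(columns, rows)
    prompts = {}
    for cause, effect in permutations(columns, 2):
        prompts[(cause, effect)] = (
            f"Observational data:\n{table}\n\n"
            f"Question: does changing {cause} cause a change in {effect}? "
            "Answer yes or no."
        )
    return prompts


# Toy data invented for this example.
prompts = pairwise_prompts(["rain", "wet_ground"], [(1, 1), (0, 0), (1, 1)])
print(prompts[("rain", "wet_ground")])
```

Each prompt would then be sent to a model of choice; that call is left out because it depends on the provider.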
-
Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery
Authors:
Yuni Susanti,
Michael Färber
Abstract:
Causal discovery aims to estimate causal structures among variables based on observational data. Large Language Models (LLMs) offer a fresh perspective to tackle the causal discovery problem by reasoning on the metadata associated with variables rather than their actual data values, an approach referred to as knowledge-based causal discovery. In this paper, we investigate the capabilities of Small Language Models (SLMs, defined as LLMs with fewer than 1 billion parameters) with prompt-based learning for knowledge-based causal discovery. Specifically, we present KG Structure as Prompt, a novel approach for integrating structural information from a knowledge graph, such as common neighbor nodes and metapaths, into prompt-based learning to enhance the capabilities of SLMs. Experimental results on three types of biomedical and open-domain datasets under few-shot settings demonstrate the effectiveness of our approach, surpassing most baselines and even conventional fine-tuning approaches trained on full datasets. Our findings further highlight the strong capabilities of SLMs: in combination with knowledge graphs and prompt-based learning, SLMs demonstrate the potential to surpass LLMs with a larger number of parameters. Our code and datasets are available on GitHub.
Submitted 30 July, 2024; v1 submitted 26 July, 2024;
originally announced July 2024.
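
One piece of structural information named in the abstract above, common neighbor nodes, can be turned into prompt text along the following lines. This is a rough sketch under our own assumptions: the graph is a plain adjacency dictionary, the function names and prompt wording are hypothetical, and the toy nodes are invented rather than taken from the paper's datasets.

```python
def common_neighbors(graph, a, b):
    """Nodes adjacent to both a and b in an undirected adjacency dict."""
    return sorted(graph.get(a, set()) & graph.get(b, set()))


def structure_prompt(graph, a, b):
    """Embed the shared neighborhood of a and b into a question.

    The template is illustrative only, not the one used in the paper.
    """
    shared = common_neighbors(graph, a, b)
    context = ", ".join(shared) if shared else "none"
    return (
        f"Knowledge graph nodes connected to both {a} and {b}: {context}. "
        f"Is there a causal relation between {a} and {b}?"
    )


# Toy knowledge graph invented for this example.
kg = {
    "smoking": {"tar", "nicotine"},
    "lung cancer": {"tar", "radon"},
}
print(structure_prompt(kg, "smoking", "lung cancer"))
```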
-
AutoRDF2GML: Facilitating RDF Integration in Graph Machine Learning
Authors:
Michael Färber,
David Lamprecht,
Yuni Susanti
Abstract:
In this paper, we introduce AutoRDF2GML, a framework designed to convert RDF data into data representations tailored for graph machine learning tasks. AutoRDF2GML enables, for the first time, the creation of both content-based features -- i.e., features based on RDF datatype properties -- and topology-based features -- i.e., features based on RDF object properties. Characterized by automated feature extraction, AutoRDF2GML makes it possible even for users less familiar with RDF and SPARQL to generate data representations ready for graph machine learning tasks, such as link prediction, node classification, and graph classification. Furthermore, we present four new benchmark datasets for graph machine learning, created from large RDF knowledge graphs using our framework. These datasets serve as valuable resources for evaluating graph machine learning approaches, such as graph neural networks. Overall, our framework effectively bridges the gap between the Graph Machine Learning and Semantic Web communities, paving the way for RDF-based machine learning applications.
Submitted 26 July, 2024;
originally announced July 2024.
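
The distinction drawn in the abstract above, between content-based features from RDF datatype properties and topology-based features from RDF object properties, can be illustrated without any RDF library. The sketch below is not the AutoRDF2GML API: the `Literal` class, the `split_triples` function, and the toy triples are hypothetical stand-ins, with literal-valued objects marking datatype properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Stand-in for an RDF literal (the object of a datatype property)."""
    value: str


def split_triples(triples):
    """Separate triples into node content and graph edges.

    Literal objects become per-node attributes (content-based features);
    all other objects are treated as resources and become edges
    (the basis for topology-based features).
    """
    content = {}
    edges = []
    for subject, predicate, obj in triples:
        if isinstance(obj, Literal):
            content.setdefault(subject, {})[predicate] = obj.value
        else:
            edges.append((subject, predicate, obj))
    return content, edges


# Toy triples invented for this example.
triples = [
    ("paper1", "title", Literal("Graphs over rings")),
    ("paper1", "cites", "paper2"),
    ("paper2", "title", Literal("Causal discovery")),
]
content, edges = split_triples(triples)
print(content)
print(edges)
```

In a real pipeline the attribute values would still need to be encoded numerically (for example with a text embedding) before being used as node features.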
-
Prompting or Fine-tuning? Exploring Large Language Models for Causal Graph Validation
Authors:
Yuni Susanti,
Nina Holsmoelle
Abstract:
This study explores the capability of Large Language Models (LLMs) to evaluate causality in causal graphs generated by conventional statistical causal discovery methods, a task traditionally reliant on manual assessment by human subject matter experts. To bridge this gap in causality assessment, LLMs are employed to evaluate the causal relationships by determining whether a causal connection between variable pairs can be inferred from textual context. Our study compares two approaches: (1) a prompting-based method for zero-shot and few-shot causal inference and (2) fine-tuning language models for the causal relation prediction task. While prompt-based LLMs have demonstrated versatility across various NLP tasks, our experiments on biomedical and general-domain datasets show that fine-tuned models consistently outperform them, achieving up to a 20.5-point improvement in F1 score, even when using smaller-parameter language models. These findings provide valuable insights into the strengths and limitations of both approaches for causal graph evaluation.
Submitted 9 April, 2025; v1 submitted 29 May, 2024;
originally announced June 2024.
-
Data Augmentation Techniques for Process Extraction from Scientific Publications
Authors:
Yuni Susanti
Abstract:
We present data augmentation techniques for process extraction tasks in scientific publications. We cast the process extraction task as a sequence labeling task where we identify all the entities in a sentence and label them according to their process-specific roles. The proposed method attempts to create meaningful augmented sentences by utilizing (1) process-specific information from the original sentence, (2) role label similarity, and (3) sentence similarity. We demonstrate that the proposed methods substantially improve the performance of the process extraction model trained on chemistry domain datasets, with up to a 12.3-point improvement in F-score. The proposed methods could potentially reduce overfitting as well, especially when training on small datasets or in low-resource settings such as chemistry and other scientific domains.
Submitted 23 May, 2024;
originally announced May 2024.