Search | arXiv e-print repository

Predicting from a Different Perspective: A Re-ranking Model for Inductive Knowledge Graph Completion

Abstract: Rule-induction models have demonstrated great power in the inductive setting of knowledge graph completion. In this setting, the models are tested on a knowledge graph entirely composed of unseen entities. These models learn relation patterns as rules by utilizing subgraphs. Providing the same inputs with different rules leads to differences in the model's predictions. In this paper, we focus on the behavior of such models. We propose a re-ranking-based model called ReDistLP (Re-ranking with a Distinct Model for Link Prediction). This model enhances the effectiveness of re-ranking by leveraging the difference in the predictions between the initial retriever and the re-ranker. ReDistLP outperforms the state-of-the-art methods in 2 out of 3 benchmarks.

Submitted 19 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

Comments: 12 pages, 2 figures

arXiv:2402.16278 [pdf, other]

A Self-matching Training Method with Annotation Embedding Models for Ontology Subsumption Prediction

Authors: Yukihiro Shiraishi, Ken Kaneiwa

Abstract: Recently, ontology embeddings representing entities in a low-dimensional space have been proposed for ontology completion. However, the ontology embeddings for concept subsumption prediction do not address the difficulties of similar and isolated entities and fail to extract the global information of annotation axioms from an ontology. In this paper, we propose a self-matching training method for the two ontology embedding models: Inverted-index Matrix Embedding (InME) and Co-occurrence Matrix Embedding (CoME). The two embeddings capture the global and local information in annotation axioms by means of the occurring locations of each word in a set of axioms and the co-occurrences of words in each axiom. The self-matching training method increases the robustness of the concept subsumption prediction when predicted superclasses are similar to subclasses and are isolated from other entities in an ontology. Our evaluation experiments show that the self-matching training method with InME outperforms the existing ontology embeddings for the GO and FoodOn ontologies and that the method with the concatenation of CoME and OWL2Vec* outperforms them for the HeLiS ontology.

Submitted 10 March, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

Comments: 21 pages, 6 figures

arXiv:2212.12691 [pdf, ps, other]

Multi-duplicated Characterization of Graph Structures using Information Gain Ratio for Graph Neural Networks

Authors: Yuga Oishi, Ken Kaneiwa

Abstract: Various graph neural networks (GNNs) have been proposed to solve node classification tasks in machine learning for graph data. GNNs use the structural information of graph data by aggregating the features of neighboring nodes. However, they fail to directly characterize and leverage the structural information. In this paper, we propose multi-duplicated characterization of graph structures using information gain ratio (IGR) for GNNs (MSI-GNN), which enhances the performance of node classification by using an i-hop adjacency matrix as the structural information of the graph data. In MSI-GNN, the i-hop adjacency matrix is adaptively adjusted by two methods: (i) structural features in the matrix are selected based on the IGR, and (ii) the selected features in (i) for each node are duplicated and combined flexibly. In an experiment, we show that our MSI-GNN outperforms GCN, H2GCN, and GCNII in terms of average accuracies in benchmark graph datasets.

Submitted 26 December, 2022; v1 submitted 24 December, 2022; originally announced December 2022.

Comments: 20 pages, 8 figures

arXiv:2212.00898 [pdf, ps, other]

Hierarchical Model Selection for Graph Neural Networks

Authors: Yuga Oishi, Ken Kaneiwa

Abstract: Node classification on graph data is a major problem, and various graph neural networks (GNNs) have been proposed. Variants of GNNs such as H2GCN and CPF outperform graph convolutional networks (GCNs) by improving on the weaknesses of the traditional GNN. However, on some graph data these GNN variants perform worse than other GNNs in the node classification task. This is because H2GCN suffers from feature thinning on graph data with a high average degree, and CPF raises a problem of label-propagation suitability. Accordingly, we propose a hierarchical model selection framework (HMSF) that selects an appropriate GNN model by analyzing the indicators of each graph data. In the experiment, we show that the model selected by our HMSF achieves high performance on node classification for various types of graph data.

Submitted 1 December, 2022; originally announced December 2022.

Comments: 14 pages, 5 figures

arXiv:2208.05279 [pdf, ps, other]

The Completeness of Reasoning Algorithms for Clause Sets in Description Logic ALC

Authors: Daiki Takahashi, Ken Kaneiwa

Abstract: On the Semantic Web, metadata and ontologies are used to enable computers to read data. The Web Ontology Language (OWL) has been proposed as a standard ontological language, and various inference systems for this language have been studied. Description logics are regarded as the theoretical foundations of OWL; they provide the syntax and semantics of a formal language for describing ontologies and knowledge bases. In addition, tableau algorithms for description logics have been developed as the standard reasoning algorithms for decidable problems. However, tableau algorithms generate inefficient reasoning steps owing to their nondeterministic branching for disjunction as well as the increase in the size of models occasioned by existential quantification. In this study, we propose conjunctive normal form (CNF) concepts, which utilize a flat concept form for description logic ALC in order to develop algorithms for reasoning about sets of clauses. We present an efficient reasoning algorithm for clause sets where any ALC concept is transformed into an equivalent CNF concept. Theoretically, we prove the soundness, completeness, and termination of the reasoning algorithms for the satisfiability of CNF concepts.

Submitted 10 August, 2022; originally announced August 2022.

arXiv:2204.05096 [pdf, other]

Block-Segmentation Vectors for Arousal Prediction using Semi-supervised Learning

Authors: Yuki Odaka, Ken Kaneiwa

Abstract: To handle emotional expressions in computer applications, Russell's circumplex model has been useful for representing emotions according to valence and arousal. In SentiWordNet, the level of valence is automatically assigned to a large number of synsets (groups of synonyms in WordNet) using semi-supervised learning. However, when assigning the level of arousal, the existing method proposed for SentiWordNet reduces the accuracy of sentiment prediction. In this paper, we propose a block-segmentation vector for predicting the arousal levels of many synsets from a small number of labeled words using semi-supervised learning. We analyze the distribution of arousal and non-arousal words in a corpus of sentences by comparing it with the distribution of valence words. We address the problem that arousal level prediction fails when arousal and non-arousal words are mixed together in some sentences. To capture the features of such arousal and non-arousal words, we generate word vectors based on inverted indexes by block IDs, where the corpus is divided into blocks in the flow of sentences. In the evaluation experiment, we show that the results of arousal prediction with the block-segmentation vectors outperform the results of the previous method in SentiWordNet.

Submitted 11 April, 2022; originally announced April 2022.
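
Illustrative sketch (not taken from the paper): the abstract above describes word vectors built from inverted indexes keyed by block IDs, with the corpus divided into blocks along the flow of sentences. One possible reading is sketched below in Python. The function name `block_vectors`, the fixed `block_size`, the whitespace tokenization, and the binary encoding are all assumptions made for illustration; the abstract does not state how block boundaries or vector values are chosen.

```python
from collections import defaultdict


def block_vectors(sentences, block_size):
    """Map each word to a binary vector indexed by block ID.

    Assumption: a block is a run of `block_size` consecutive sentences.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Number of blocks, rounding up so a short final block is still counted.
    n_blocks = (len(sentences) + block_size - 1) // block_size
    # Inverted index: word -> set of block IDs in which the word occurs.
    index = defaultdict(set)
    for position, sentence in enumerate(sentences):
        for word in sentence.lower().split():
            index[word].add(position // block_size)
    # Expand each posting set into a fixed-length 0/1 vector.
    return {
        word: [1 if block in blocks else 0 for block in range(n_blocks)]
        for word, blocks in index.items()
    }


corpus = [
    "the crowd was thrilled",
    "everyone was excited",
    "the room was calm",
    "a quiet calm evening",
]
vectors = block_vectors(corpus, block_size=2)
print(vectors["thrilled"])  # [1, 0]
print(vectors["calm"])      # [0, 1]
print(vectors["was"])       # [1, 1]
```

In this toy corpus, words that occur only in the first two sentences and words that occur only in the last two get disjoint vectors, while a word spread across both blocks does not.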

arXiv:2201.01996 [pdf, ps, other]

Skip Vectors for RDF Data: Extraction Based on the Complexity of Feature Patterns

Authors: Yota Minami, Ken Kaneiwa

Abstract: The Resource Description Framework (RDF) is a framework for describing metadata, such as attributes and relationships of resources on the Web. Machine learning tasks for RDF graphs adopt three methods: (i) support vector machines (SVMs) with RDF graph kernels, (ii) RDF graph embeddings, and (iii) relational graph convolutional networks. In this paper, we propose a novel feature vector (called a Skip vector) that represents some features of each resource in an RDF graph by extracting various combinations of neighboring edges and nodes. In order to make the Skip vector low-dimensional, we select important features for classification tasks based on the information gain ratio of each feature. The classification tasks can be performed by applying the low-dimensional Skip vector of each resource to conventional machine learning algorithms, such as SVMs, the k-nearest neighbors method, neural networks, random forests, and AdaBoost. In our evaluation experiments with RDF data, such as Wikidata, DBpedia, and YAGO, we compare our method with RDF graph kernels in an SVM. We also compare our method with the two approaches: RDF graph embeddings such as RDF2vec and relational graph convolutional networks on the AIFB, MUTAG, BGS, and AM benchmarks.

Submitted 9 March, 2022; v1 submitted 6 January, 2022; originally announced January 2022.
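
Illustrative sketch (not taken from the paper): the abstract above selects features by their information gain ratio. The Python below computes the textbook form of that quantity, information gain divided by split information, for one discrete feature. The helper names `entropy` and `information_gain_ratio` are hypothetical, and whether the authors use exactly this formulation is an assumption.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(values).values()
    )


def information_gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by split information."""
    total = len(labels)
    # Group the labels by feature value to get the conditional entropy H(Y | X).
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    # Split information is the entropy of the feature itself.
    split_info = entropy(feature)
    # A constant feature has zero split information; score it as 0.
    return gain / split_info if split_info > 0 else 0.0


labels = ["a", "a", "b", "b"]
print(information_gain_ratio([1, 1, 0, 0], labels))  # 1.0 (feature determines the label)
print(information_gain_ratio([1, 0, 1, 0], labels))  # 0.0 (feature is uninformative)
```

Ranking features by this score and keeping the top few is one simple way to obtain a lower-dimensional vector; the cutoff is a design choice the abstract does not specify.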

Showing 1–7 of 7 results for author: Kaneiwa, K