Showing 1–13 of 13 results for author: Van Ossenbruggen, J

  1. arXiv:2508.10467  [pdf, ps, other]

    cs.AI cs.DL

    FIRESPARQL: A LLM-based Framework for SPARQL Query Generation over Scholarly Knowledge Graphs

    Authors: Xueli Pan, Victor de Boer, Jacco van Ossenbruggen

    Abstract: Question answering over Scholarly Knowledge Graphs (SKGs) remains a challenging task due to the complexity of scholarly content and the intricate structure of these graphs. Large Language Model (LLM) approaches could be used to translate natural language questions (NLQs) into SPARQL queries; however, these LLM-based approaches struggle with SPARQL query generation due to limited exposure to SKG-sp…

    Submitted 14 August, 2025; originally announced August 2025.

    Comments: Accepted at 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)

  2. arXiv:2502.20945  [pdf, other]

    cs.DB

    Metadata-driven Table Union Search: Leveraging Semantics for Restricted Access Data Integration

    Authors: Margherita Martorana, Tobias Kuhn, Jacco van Ossenbruggen

    Abstract: Over the past decade, the Table Union Search (TUS) task has aimed to identify unionable tables within data lakes to improve data integration and discovery. While numerous solutions and approaches have been introduced, they primarily rely on open data, making them not applicable to restricted access data, such as medical records or government statistics, due to privacy concerns. Restricted data can…

    Submitted 28 February, 2025; originally announced February 2025.

  3. arXiv:2409.13709  [pdf, other]

    cs.CL cs.AI

    Column Vocabulary Association (CVA): semantic interpretation of dataless tables

    Authors: Margherita Martorana, Xueli Pan, Benno Kruit, Tobias Kuhn, Jacco van Ossenbruggen

    Abstract: Traditional Semantic Table Interpretation (STI) methods rely primarily on the underlying table data to create semantic annotations. This year's SemTab challenge introduced the "Metadata to KG" track, which focuses on performing STI by using only metadata information, without access to the underlying data. In response to this new challenge, we introduce a new term: Column Vocabulary Association (…

    Submitted 6 September, 2024; originally announced September 2024.

  4. arXiv:2409.08820  [pdf, other]

    cs.AI

    A RAG Approach for Generating Competency Questions in Ontology Engineering

    Authors: Xueli Pan, Jacco van Ossenbruggen, Victor de Boer, Zhisheng Huang

    Abstract: Competency question (CQ) formulation is central to several ontology development and evaluation methodologies. Traditionally, the task of crafting these competency questions heavily relies on the effort of domain experts and knowledge engineers which is often time-consuming and labor-intensive. With the emergence of Large Language Models (LLMs), there arises the possibility to automate and enhance…

    Submitted 11 February, 2025; v1 submitted 13 September, 2024; originally announced September 2024.

    Journal ref: 18th International Conference on Metadata and Semantics Research (MTSR2024)

  5. arXiv:2409.08046  [pdf, other]

    cs.IR

    On the challenges of studying bias in Recommender Systems: A UserKNN case study

    Authors: Savvina Daniil, Manel Slokom, Mirjam Cuper, Cynthia C. S. Liem, Jacco van Ossenbruggen, Laura Hollink

    Abstract: Statements on the propagation of bias by recommender systems are often hard to verify or falsify. Research on bias tends to draw from a small pool of publicly available datasets and is therefore bound by their specific properties. Additionally, implementation choices are often not explicitly described or motivated in research, while they may have an effect on bias propagation. In this paper, we ex…

    Submitted 12 September, 2024; originally announced September 2024.

    Comments: Accepted at FAccTRec@RecSys 2024, 11 pages

  6. Analysing and Organising Human Communications for AI Fairness-Related Decisions: Use Cases from the Public Sector

    Authors: Mirthe Dankloff, Vanja Skoric, Giovanni Sileno, Sennay Ghebreab, Jacco Van Ossenbruggen, Emma Beauxis-Aussalet

    Abstract: AI algorithms used in the public sector, e.g., for allocating social benefits or predicting fraud, often involve multiple public and private stakeholders at various phases of the algorithm's life-cycle. Communication issues between these diverse stakeholders can lead to misinterpretation and misuse of algorithms. We investigate the communication processes for AI fairness-related decisions by condu…

    Submitted 20 March, 2024; originally announced April 2024.

  7. arXiv:2403.00884  [pdf, other]

    cs.DB cs.AI cs.IR

    Zero-Shot Topic Classification of Column Headers: Leveraging LLMs for Metadata Enrichment

    Authors: Margherita Martorana, Tobias Kuhn, Lise Stork, Jacco van Ossenbruggen

    Abstract: Traditional dataset retrieval systems rely on metadata for indexing, rather than on the underlying data values. However, high-quality metadata creation and enrichment often require manual annotations, which is a labour-intensive and challenging process to automate. In this study, we propose a method to support metadata enrichment using topic annotations generated by three Large Language Models (LL…

    Submitted 6 September, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

  8. arXiv:2311.10757  [pdf, other]

    cs.CL cs.AI

    How Contentious Terms About People and Cultures are Used in Linked Open Data

    Authors: Andrei Nesterov, Laura Hollink, Jacco van Ossenbruggen

    Abstract: Web resources in linked open data (LOD) are comprehensible to humans through literal textual values attached to them, such as labels, notes, or comments. Word choices in literals may not always be neutral. When outdated and culturally stereotyping terminology is used in literals, they may appear as offensive to users in interfaces and propagate stereotypes to algorithms trained on them. We study h…

    Submitted 13 November, 2023; originally announced November 2023.

    MSC Class: I.2.1

  9. arXiv:2209.00371  [pdf, other]

    cs.IR cs.AI

    Hidden Author Bias in Book Recommendation

    Authors: Savvina Daniil, Mirjam Cuper, Cynthia C. S. Liem, Jacco van Ossenbruggen, Laura Hollink

    Abstract: Collaborative filtering algorithms have the advantage of not requiring sensitive user or item information to provide recommendations. However, they still suffer from fairness related issues, like popularity bias. In this work, we argue that popularity bias often leads to other biases that are not obvious when additional user or item information is not provided to the researcher. We examine our hyp…

    Submitted 8 September, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: Accepted at FAccTRec@RecSys 2022

  10. arXiv:2203.01608  [pdf, other]

    cs.DL cs.AI

    Nanopublication-Based Semantic Publishing and Reviewing: A Field Study with Formalization Papers

    Authors: Cristina-Iulia Bucur, Tobias Kuhn, Davide Ceolin, Jacco van Ossenbruggen

    Abstract: With the rapidly increasing amount of scientific literature, it is getting continuously more difficult for researchers in different disciplines to be updated with the recent findings in their field of study. Processing scientific articles in an automated fashion has been proposed as a solution to this problem, but the accuracy of such processing remains very poor for extraction tasks beyond the basic…

    Submitted 3 March, 2022; originally announced March 2022.

  11. Expressing High-Level Scientific Claims with Formal Semantics

    Authors: Cristina-Iulia Bucur, Tobias Kuhn, Davide Ceolin, Jacco van Ossenbruggen

    Abstract: The use of semantic technologies is gaining significant traction in science communication with a wide array of applications in disciplines including the Life Sciences, Computer Science, and the Social Sciences. Languages like RDF, OWL, and other formalisms based on formal logic are applied to make scientific knowledge accessible not only to human readers but also to automated systems. These approa…

    Submitted 29 October, 2021; v1 submitted 27 September, 2021; originally announced September 2021.

    Comments: 8 pages

    ACM Class: I.2.4

    Journal ref: Proceedings of the 11th Knowledge Capture Conference (K-CAP '21), December 2-3, 2021, Virtual Event, USA

  12. arXiv:1910.12619  [pdf, other]

    cs.CL cs.AI

    Is it a Fruit, an Apple or a Granny Smith? Predicting the Basic Level in a Concept Hierarchy

    Authors: Laura Hollink, Aysenur Bilgin, Jacco van Ossenbruggen

    Abstract: The "basic level", according to experiments in cognitive psychology, is the level of abstraction in a hierarchy of concepts at which humans perform tasks quicker and with greater accuracy than at other levels. We argue that applications that use concept hierarchies - such as knowledge graphs, ontologies or taxonomies - could significantly improve their user interfaces if they 'knew' which concepts…

    Submitted 25 October, 2019; originally announced October 2019.

  13. arXiv:1810.00968  [pdf, other]

    cs.CL cs.LG

    Utilizing a Transparency-driven Environment toward Trusted Automatic Genre Classification: A Case Study in Journalism History

    Authors: Aysenur Bilgin, Laura Hollink, Jacco van Ossenbruggen, Erik Tjong Kim Sang, Kim Smeenk, Frank Harbers, Marcel Broersma

    Abstract: With the growing abundance of unlabeled data in real-world tasks, researchers have to rely on the predictions given by black-boxed computational models. However, it is an often neglected fact that these models may be scoring high on accuracy for the wrong reasons. In this paper, we present a practical impact analysis of enabling model transparency by various presentation forms. For this purpose, w…

    Submitted 1 October, 2018; originally announced October 2018.

    Comments: 11 pages, 8 figures, IEEE eScience Conference 2018