
Showing 1–50 of 61 results for author: De Bie, T

Searching in archive cs.
  1. arXiv:2506.01208

    cs.LG

    Multiresolution Analysis and Statistical Thresholding on Dynamic Networks

    Authors: Raphaël Romero, Tijl De Bie, Nick Heard, Alexander Modell

    Abstract: Detecting structural change in dynamic network data has wide-ranging applications. Existing approaches typically divide the data into time bins, extract network features within each bin, and then compare these features over time. This introduces an inherent tradeoff between temporal resolution and the statistical stability of the extracted features. Despite this tradeoff, reminiscent of time-frequ…

    Submitted 1 July, 2025; v1 submitted 1 June, 2025; originally announced June 2025.

  2. arXiv:2505.22114

    cs.LG

    BiMi Sheets: Infosheets for bias mitigation methods

    Authors: MaryBeth Defrance, Guillaume Bied, Maarten Buyl, Jefrey Lijffijt, Tijl De Bie

    Abstract: Over the past 15 years, hundreds of bias mitigation methods have been proposed in the pursuit of fairness in machine learning (ML). However, algorithmic biases are domain-, task-, and model-specific, leading to a 'portability trap': bias mitigation solutions in one context may not be appropriate in another. Thus, a myriad of design choices have to be made when creating a bias mitigation method, su…

    Submitted 28 May, 2025; originally announced May 2025.

  3. arXiv:2505.07653

    cs.CL

    JobHop: A Large-Scale Dataset of Career Trajectories

    Authors: Iman Johary, Raphael Romero, Alexandru C. Mara, Tijl De Bie

    Abstract: Understanding labor market dynamics is essential for policymakers, employers, and job seekers. However, comprehensive datasets that capture real-world career trajectories are scarce. In this paper, we introduce JobHop, a large-scale public dataset derived from anonymized resumes provided by VDAB, the public employment service in Flanders, Belgium. Utilizing Large Language Models (LLMs), we process…

    Submitted 12 May, 2025; originally announced May 2025.

  4. arXiv:2504.03803

    cs.CL cs.CY cs.LG

    What Large Language Models Do Not Talk About: An Empirical Study of Moderation and Censorship Practices

    Authors: Sander Noels, Guillaume Bied, Maarten Buyl, Alexander Rogiers, Yousra Fettach, Jefrey Lijffijt, Tijl De Bie

    Abstract: Large Language Models (LLMs) are increasingly deployed as gateways to information, yet their content moderation practices remain underexplored. This work investigates the extent to which LLMs refuse to answer or omit information when prompted on political topics. To do so, we distinguish between hard censorship (i.e., generated refusals, error messages, or canned denial responses) and soft censors…

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: 17 pages, 38 pages in total including appendix; 5 figures, 22 figures in appendix

  5. arXiv:2503.03446

    cs.CV cs.CY

    Biased Heritage: How Datasets Shape Models in Facial Expression Recognition

    Authors: Iris Dominguez-Catena, Daniel Paternain, Mikel Galar, MaryBeth Defrance, Maarten Buyl, Tijl De Bie

    Abstract: In recent years, the rapid development of artificial intelligence (AI) systems has raised concerns about our ability to ensure their fairness, that is, how to avoid discrimination based on protected characteristics such as gender, race, or age. While algorithmic fairness is well-studied in simple binary classification tasks on tabular data, its application to complex, real-world scenarios-such as…

    Submitted 5 March, 2025; originally announced March 2025.

    Comments: 17 pages, 7 figures

    ACM Class: I.2.10

  6. arXiv:2412.12744

    cs.CL cs.AI cs.LG

    Your Next State-of-the-Art Could Come from Another Domain: A Cross-Domain Analysis of Hierarchical Text Classification

    Authors: Nan Li, Bo Kang, Tijl De Bie

    Abstract: Text classification with hierarchical labels is a prevalent and challenging task in natural language processing. Examples include assigning ICD codes to patient records, tagging patents into IPC classes, assigning EUROVOC descriptors to European legal texts, and more. Despite its widespread applications, a comprehensive understanding of state-of-the-art methods across different domains has been la…

    Submitted 17 December, 2024; originally announced December 2024.

  7. arXiv:2411.06837

    cs.CL

    Persuasion with Large Language Models: a Survey

    Authors: Alexander Rogiers, Sander Noels, Maarten Buyl, Tijl De Bie

    Abstract: The rapid rise of Large Language Models (LLMs) has created new disruptive possibilities for persuasive communication, by enabling fully-automated personalized and interactive content generation at an unprecedented scale. In this paper, we survey the research field of LLM-based persuasion that has emerged as a result. We begin by exploring the different modes in which LLM Systems are used to influe…

    Submitted 11 November, 2024; originally announced November 2024.

  8. arXiv:2410.18417

    cs.CL cs.LG

    Large Language Models Reflect the Ideology of their Creators

    Authors: Maarten Buyl, Alexander Rogiers, Sander Noels, Guillaume Bied, Iris Dominguez-Catena, Edith Heiter, Iman Johary, Alexandru-Cristian Mara, Raphaël Romero, Jefrey Lijffijt, Tijl De Bie

    Abstract: Large language models (LLMs) are trained on vast amounts of data to generate natural language, enabling them to perform tasks like text summarization and question answering. These models have become popular in artificial intelligence (AI) assistants like ChatGPT and already play an influential role in how humans access information. However, the behavior of LLMs varies depending on their design, tr…

    Submitted 30 January, 2025; v1 submitted 24 October, 2024; originally announced October 2024.

  9. A Dutch Financial Large Language Model

    Authors: Sander Noels, Jorne De Blaere, Tijl De Bie

    Abstract: This paper presents FinGEITje, the first Dutch financial Large Language Model (LLM) specifically designed and optimized for various financial tasks. Together with the model, we release a specialized Dutch financial instruction tuning dataset with over 140,000 samples, constructed employing an automated translation and data processing method. The open-source data construction method is provided, fa…

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: 9 pages, 1 figure, accepted at ACM ICAIF'24

  10. arXiv:2409.16965

    cs.LG cs.CY

    ABCFair: an Adaptable Benchmark approach for Comparing Fairness Methods

    Authors: MaryBeth Defrance, Maarten Buyl, Tijl De Bie

    Abstract: Numerous methods have been implemented that pursue fairness with respect to sensitive features by mitigating biases in machine learning. Yet, the problem settings that each method tackles vary significantly, including the stage of intervention, the composition of sensitive features, the fairness notion, and the distribution of the output. Even in binary classification, these subtle differences mak…

    Submitted 21 October, 2024; v1 submitted 25 September, 2024; originally announced September 2024.

    Comments: Accepted at NeurIPS 2024 Datasets and Benchmarks Track

  11. arXiv:2407.05175

    cs.CE cs.CL

    TopoLedgerBERT: Topological Learning of Ledger Description Embeddings using Siamese BERT-Networks

    Authors: Sander Noels, Sébastien Viaene, Tijl De Bie

    Abstract: This paper addresses a long-standing problem in the field of accounting: mapping company-specific ledger accounts to a standardized chart of accounts. We propose a novel solution, TopoLedgerBERT, a unique sentence embedding method devised specifically for ledger account mapping. This model integrates hierarchical information from the charts of accounts into the sentence embedding process, aiming t…

    Submitted 19 April, 2024; originally announced July 2024.

    Comments: 8 pages, 4 figures

  12. arXiv:2406.12953

    cs.GR cs.HC cs.LG

    Pattern or Artifact? Interactively Exploring Embedding Quality with TRACE

    Authors: Edith Heiter, Liesbet Martens, Ruth Seurinck, Martin Guilliams, Tijl De Bie, Yvan Saeys, Jefrey Lijffijt

    Abstract: This paper presents TRACE, a tool to analyze the quality of 2D embeddings generated through dimensionality reduction techniques. Dimensionality reduction methods often prioritize preserving either local neighborhoods or global distances, but insights from visual structures can be misleading if the objective has not been achieved uniformly. TRACE addresses this challenge by providing a scalable and…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 4 pages, 3 figures, Accepted at ECML-PKDD 2024. For a demo video, see https://youtu.be/mtyFzXt51Jw. Code is available at https://github.com/aida-ugent/TRACE

  13. arXiv:2405.18941

    cs.IR cs.LG

    Content-Agnostic Moderation for Stance-Neutral Recommendation

    Authors: Nan Li, Bo Kang, Tijl De Bie

    Abstract: Personalized recommendation systems often drive users towards more extreme content, exacerbating opinion polarization. While (content-aware) moderation has been proposed to mitigate these effects, such approaches risk curtailing the freedom of speech and of information. To address this concern, we propose and explore the feasibility of content-agnostic moderation as an alternative approach…

    Submitted 29 May, 2024; originally announced May 2024.

  14. Gaussian Embedding of Temporal Networks

    Authors: Raphaël Romero, Jefrey Lijffijt, Riccardo Rastelli, Marco Corneli, Tijl De Bie

    Abstract: Representing the nodes of continuous-time temporal graphs in a low-dimensional latent space has wide-ranging applications, from prediction to visualization. Yet, analyzing continuous-time relational data with timestamped interactions introduces unique challenges due to its sparsity. Merely embedding nodes as trajectories in the latent space overlooks this sparsity, emphasizing the need to quantify…

    Submitted 27 May, 2024; originally announced May 2024.

    Journal ref: IEEE Access ( Volume: 11, 2023) Page(s): 117971 - 117983

  15. Exploring the Performance of Continuous-Time Dynamic Link Prediction Algorithms

    Authors: Raphaël Romero, Maarten Buyl, Tijl De Bie, Jefrey Lijffijt

    Abstract: Dynamic Link Prediction (DLP) addresses the prediction of future links in evolving networks. However, accurately portraying the performance of DLP algorithms poses challenges that might impede progress in the field. Importantly, common evaluation pipelines usually calculate ranking or binary classification metrics, where the scores of observed interactions (positives) are compared with those of ra…

    Submitted 27 May, 2024; originally announced May 2024.

    Journal ref: Appl. Sci. 2024, 14(8), 3516

  16. arXiv:2404.17597

    cs.IR

    KamerRaad: Enhancing Information Retrieval in Belgian National Politics through Hierarchical Summarization and Conversational Interfaces

    Authors: Alexander Rogiers, Maarten Buyl, Bo Kang, Tijl De Bie

    Abstract: KamerRaad is an AI tool that leverages large language models to help citizens interactively engage with Belgian political information. The tool extracts and concisely summarizes key excerpts from parliamentary proceedings, followed by the potential for interaction based on generative AI that allows users to steadily build up their understanding. KamerRaad's front-end, built with Streamlit, facilit…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 4 pages, 2 figures, submitted to 2024 ECML-PKDD demo track

    ACM Class: H.3.3

  17. arXiv:2311.18486

    cs.SI cs.AI

    New Perspectives on the Evaluation of Link Prediction Algorithms for Dynamic Graphs

    Authors: Raphaël Romero, Tijl De Bie, Jefrey Lijffijt

    Abstract: There is a fast-growing body of research on predicting future links in dynamic networks, with many new algorithms. Some benchmark data exists, and performance evaluations commonly rely on comparing the scores of observed network events (positives) with those of randomly generated ones (negatives). These evaluation measures depend on both the predictive ability of the model and, crucially, the type…

    Submitted 30 November, 2023; originally announced November 2023.

  18. arXiv:2311.04542

    cs.IR cs.LG

    FEIR: Quantifying and Reducing Envy and Inferiority for Fair Recommendation of Limited Resources

    Authors: Nan Li, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: In settings such as e-recruitment and online dating, recommendation involves distributing limited opportunities, calling for novel approaches to quantify and enforce fairness. We introduce inferiority, a novel (un)fairness measure quantifying a user's competitive disadvantage for their recommended items. Inferiority complements envy, a fairness notion measuring preference for others'…

    Submitted 8 November, 2023; originally announced November 2023.

  19. arXiv:2310.17256

    cs.LG

    fairret: a Framework for Differentiable Fairness Regularization Terms

    Authors: Maarten Buyl, MaryBeth Defrance, Tijl De Bie

    Abstract: Current fairness toolkits in machine learning only admit a limited range of fairness definitions and have seen little integration with automatic differentiation libraries, despite the central role these libraries play in modern machine learning pipelines. We introduce a framework of fairness regularization terms (fairrets) which quantify bias as modular, flexible objectives that are easily integ…

    Submitted 10 April, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: Presented at ICLR 2024

  20. arXiv:2309.09708

    cs.CL cs.AI

    LLM4Jobs: Unsupervised occupation extraction and standardization leveraging Large Language Models

    Authors: Nan Li, Bo Kang, Tijl De Bie

    Abstract: Automated occupation extraction and standardization from free-text job postings and resumes are crucial for applications like job recommendation and labor market policy formation. This paper introduces LLM4Jobs, a novel unsupervised methodology that taps into the capabilities of large language models (LLMs) for occupation coding. LLM4Jobs uniquely harnesses both the natural language understanding…

    Submitted 19 September, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

  21. arXiv:2308.09516

    cs.IR

    ReCon: Reducing Congestion in Job Recommendation using Optimal Transport

    Authors: Yoosof Mashayekhi, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: Recommender systems may suffer from congestion, meaning that there is an unequal distribution of the items in how often they are recommended. Some items may be recommended much more than others. Recommenders are increasingly used in domains where items have limited availability, such as the job market, where congestion is especially problematic: Recommending a vacancy -- for which typically only o…

    Submitted 18 August, 2023; originally announced August 2023.

  22. arXiv:2304.11060

    cs.CL cs.AI

    SkillGPT: a RESTful API service for skill extraction and standardization using a Large Language Model

    Authors: Nan Li, Bo Kang, Tijl De Bie

    Abstract: We present SkillGPT, a tool for skill extraction and standardization (SES) from free-style job descriptions and user profiles with an open-source Large Language Model (LLM) as backbone. Most previous methods for similar tasks either need supervision or rely on heavy data-preprocessing and feature engineering. Directly prompting the latest conversational LLM for standard skills, however, is slow, c…

    Submitted 18 October, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

  23. arXiv:2304.06057

    cs.CY cs.LG

    Maximal Fairness

    Authors: MaryBeth Defrance, Tijl De Bie

    Abstract: Fairness in AI has garnered quite some attention in research, and increasingly also in society. The so-called "Impossibility Theorem" has been one of the more striking research results with both theoretical and practical consequences, as it states that satisfying a certain combination of fairness measures is impossible. To date, this negative result has not yet been complemented with a positive on…

    Submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted at FAccT 2023

  24. arXiv:2301.03338

    cs.LG

    Topologically Regularized Data Embeddings

    Authors: Edith Heiter, Robin Vandaele, Tijl De Bie, Yvan Saeys, Jefrey Lijffijt

    Abstract: Unsupervised representation learning methods are widely used for gaining insight into high-dimensional, unstructured, or structured data. In some cases, users may have prior topological knowledge about the data, such as a known cluster structure or the fact that the data is known to lie along a tree- or graph-structured topology. However, generic methods to ensure such structure is salient in the…

    Submitted 7 November, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

    Comments: 52 pages, preprint, under review

  25. Inherent Limitations of AI Fairness

    Authors: Maarten Buyl, Tijl De Bie

    Abstract: As the real-world impact of Artificial Intelligence (AI) systems has been steadily growing, so too have these systems come under increasing scrutiny. In response, the study of AI fairness has rapidly developed into a rich field of research with links to computer science, social science, law, and philosophy. Many technical solutions for measuring and achieving AI fairness have been proposed, yet th…

    Submitted 9 June, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: Accepted for publication at the Communications of the ACM

  26. arXiv:2209.08064

    cs.LG cs.SI

    A Systematic Evaluation of Node Embedding Robustness

    Authors: Alexandru Mara, Jefrey Lijffijt, Stephan Günnemann, Tijl De Bie

    Abstract: Node embedding methods map network nodes to low dimensional vectors that can be subsequently used in a variety of downstream prediction tasks. The popularity of these methods has grown significantly in recent years, yet, their robustness to perturbations of the input data is still poorly understood. In this paper, we assess the empirical robustness of node embedding models to random and adversaria…

    Submitted 30 November, 2022; v1 submitted 16 September, 2022; originally announced September 2022.

  27. arXiv:2209.05112

    cs.IR

    A challenge-based survey of e-recruitment recommendation systems

    Authors: Yoosof Mashayekhi, Nan Li, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: E-recruitment recommendation systems recommend jobs to job seekers and job seekers to recruiters. The recommendations are generated based on the suitability of the job seekers for the positions as well as the job seekers' and the recruiters' preferences. Therefore, e-recruitment recommendation systems could greatly impact job seekers' careers. Moreover, by affecting the hiring processes of the com…

    Submitted 20 October, 2023; v1 submitted 12 September, 2022; originally announced September 2022.

  28. SimHawNet: A Modified Hawkes Process for Temporal Network Simulation

    Authors: Mathilde Perez, Raphaël Romero, Bo Kang, Tijl De Bie, Jefrey Lijffijt, Charlotte Laclau

    Abstract: Temporal networks allow representing connections between objects while incorporating the temporal dimension. While static network models can capture unchanging topological regularities, they often fail to model the effects associated with the causal generative process of the network that occurs in time. Hence, exploiting the temporal aspect of networks has been the focus of many recent studies. In…

    Submitted 16 January, 2025; v1 submitted 14 March, 2022; originally announced March 2022.

  29. arXiv:2202.12270

    cs.CV cs.LG

    Evaluating Feature Attribution Methods in the Image Domain

    Authors: Arne Gevaert, Axel-Jan Rousseau, Thijs Becker, Dirk Valkenborg, Tijl De Bie, Yvan Saeys

    Abstract: Feature attribution maps are a popular approach to highlight the most important pixels in an image for a given prediction of a model. Despite a recent growth in popularity and available methods, little attention is given to the objective evaluation of such attribution maps. Building on previous work in this domain, we investigate existing metrics and propose new variants of metrics for the evaluat…

    Submitted 9 August, 2024; v1 submitted 22 February, 2022; originally announced February 2022.

    Comments: Updated based on reviewer comments: added discussion on sanity checks, application to tabular datasets, and minor changes

  30. arXiv:2202.03814

    cs.LG stat.ML

    Optimal Transport of Classifiers to Fairness

    Authors: Maarten Buyl, Tijl De Bie

    Abstract: In past work on fairness in machine learning, the focus has been on forcing the prediction of classifiers to have similar statistical properties for people of different demographics. To reduce the violation of these properties, fairness methods usually simply rescale the classifier scores, ignoring similarities and dissimilarities between members of different groups. Yet, we hypothesize that such…

    Submitted 29 November, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

  31. An Earth Mover's Distance Based Graph Distance Metric For Financial Statements

    Authors: Sander Noels, Benjamin Vandermarliere, Ken Bastiaensen, Tijl De Bie

    Abstract: Quantifying the similarity between a group of companies has proven to be useful for several purposes, including company benchmarking, fraud detection, and searching for investment opportunities. This exercise can be done using a variety of data sources, such as company activity data and financial data. However, ledger account data is widely available and is standardized to a large extent. Such led…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: 8 pages, 5 figures

    Journal ref: 2022 IEEE Symposium on Computational Intelligence for Financial Engineering and Economics (CIFEr)

  32. arXiv:2110.09193

    cs.LG stat.ML

    Topologically Regularized Data Embeddings

    Authors: Robin Vandaele, Bo Kang, Jefrey Lijffijt, Tijl De Bie, Yvan Saeys

    Abstract: Unsupervised feature learning often finds low-dimensional embeddings that capture the structure of complex data. For tasks for which prior expert topological knowledge is available, incorporating this into the learned representation may lead to higher quality embeddings. For example, this may help one to embed the data into a given number of clusters, or to accommodate for noise that prevents one…

    Submitted 7 March, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

  33. arXiv:2109.10569

    cs.LG stat.ML

    The Curse Revisited: When are Distances Informative for the Ground Truth in Noisy High-Dimensional Data?

    Authors: Robin Vandaele, Bo Kang, Tijl De Bie, Yvan Saeys

    Abstract: Distances between data points are widely used in machine learning applications. Yet, when corrupted by noise, these distances -- and thus the models based upon them -- may lose their usefulness in high dimensions. Indeed, the small marginal effects of the noise may then accumulate quickly, shifting empirical closest and furthest neighbors away from the ground truth. In this paper, we exactly chara…

    Submitted 7 March, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

  34. arXiv:2107.01936

    cs.SI cs.LG

    Adversarial Robustness of Probabilistic Network Embedding for Link Prediction

    Authors: Xi Chen, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: In today's networked society, many real-world problems can be formalized as predicting links in networks, such as Facebook friendship suggestions, e-commerce recommendations, and the prediction of scientific collaborations in citation networks. Increasingly often, the link prediction problem is tackled by means of network embedding methods, owing to their state-of-the-art performance. However, these m…

    Submitted 5 July, 2021; originally announced July 2021.

  35. arXiv:2105.05699

    cs.DB cs.LG

    Automating Data Science: Prospects and Challenges

    Authors: Tijl De Bie, Luc De Raedt, José Hernández-Orallo, Holger H. Hoos, Padhraic Smyth, Christopher K. I. Williams

    Abstract: Given the complexity of typical data science projects and the associated demand for human expertise, automation has the potential to transform the data science process. Key insights: * Automation in data science aims to facilitate and transform the work of data scientists, not to replace them. * Important parts of data science are already being automated, especially in the modeling stages, w…

    Submitted 28 February, 2022; v1 submitted 12 May, 2021; originally announced May 2021.

    Comments: 19 pages, 3 figures. v1 accepted for publication (April 2021) in Communications of the ACM

    Journal ref: Communications of the ACM 65(3) 76-87 (2022)

  36. arXiv:2103.01846

    cs.LG

    The KL-Divergence between a Graph Model and its Fair I-Projection as a Fairness Regularizer

    Authors: Maarten Buyl, Tijl De Bie

    Abstract: Learning and reasoning over graphs is increasingly done by means of probabilistic models, e.g. exponential random graph models, graph embedding models, and graph neural networks. When graphs are modeling relations between people, however, they will inevitably reflect biases, prejudices, and other forms of inequity and inequality. An important challenge is thus to design accurate graph modeling app…

    Submitted 27 June, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

  37. arXiv:2005.10701

    cs.SI cs.LG stat.ML

    CSNE: Conditional Signed Network Embedding

    Authors: Alexandru Mara, Yoosof Mashayekhi, Jefrey Lijffijt, Tijl De Bie

    Abstract: Signed networks are mathematical structures that encode positive and negative relations between entities such as friend/foe or trust/distrust. Recently, several papers studied the construction of useful low-dimensional representations (embeddings) of these networks for the prediction of missing relations or signs. Existing embedding methods for sign prediction generally enforce different notions o…

    Submitted 25 May, 2020; v1 submitted 19 May, 2020; originally announced May 2020.

  38. Benchmarking Network Embedding Models for Link Prediction: Are We Making Progress?

    Authors: Alexandru Mara, Jefrey Lijffijt, Tijl De Bie

    Abstract: Network embedding methods map a network's nodes to vectors in an embedding space, in such a way that these representations are useful for estimating some notion of similarity or proximity between pairs of nodes in the network. The quality of these node representations is then showcased through results of downstream prediction tasks. Commonly used benchmark tasks such as link prediction, however, p…

    Submitted 3 September, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

  39. arXiv:2002.11442

    cs.LG stat.ML

    DeBayes: a Bayesian Method for Debiasing Network Embeddings

    Authors: Maarten Buyl, Tijl De Bie

    Abstract: As machine learning algorithms are increasingly deployed for high-impact automated decision making, ethical and increasingly also legal standards demand that they treat all individuals fairly, without discrimination based on their age, gender, race or other sensitive traits. In recent years much progress has been made on ensuring fairness and reducing bias in standard machine learning settings. Ye…

    Submitted 30 April, 2021; v1 submitted 26 February, 2020; originally announced February 2020.

  40. arXiv:2002.10127

    cs.LG cs.CL stat.ML

    FONDUE: A Framework for Node Disambiguation Using Network Embeddings

    Authors: Ahmad Mel, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: Real-world data often presents itself in the form of a network. Examples include social networks, citation networks, biological networks, and knowledge graphs. In their simplest form, networks represent real-life entities (e.g. people, papers, proteins, concepts) as nodes, and describe them in terms of their relations with other entities by means of edges between these nodes. This can be valuable…

    Submitted 24 February, 2020; originally announced February 2020.

    Comments: 11 pages, 3 figures

  41. Block-Approximated Exponential Random Graphs

    Authors: Florian Adriaens, Alexandru Mara, Jefrey Lijffijt, Tijl De Bie

    Abstract: An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs. By utilizing fast matrix block-approximation techniques, we propose an approximative framework to such non-trivial ERGs that result in dyadic independence (i.e., edge independent) distributions, while being able to meaningfully model both local information of the graph (e.g.,…

    Submitted 26 August, 2020; v1 submitted 14 February, 2020; originally announced February 2020.

    Comments: Accepted for DSAA 2020 conference

  42. arXiv:2002.01227

    cs.LG cs.IT stat.ML

    ALPINE: Active Link Prediction using Network Embedding

    Authors: Xi Chen, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: Many real-world problems can be formalized as predicting links in a partially observed network. Examples include Facebook friendship suggestions, consumer-product recommendations, and the identification of hidden interactions between actors in a crime network. Several link prediction algorithms, notably those recently introduced using network embedding, are capable of doing this by just relying on…

    Submitted 4 February, 2020; originally announced February 2020.

  43. arXiv:2002.00793

    cs.SI cs.LG stat.ML

    Explainable Subgraphs with Surprising Densities: A Subgroup Discovery Approach

    Authors: Junning Deng, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: The connectivity structure of graphs is typically related to the attributes of the nodes. In social networks for example, the probability of a friendship between two people depends on their attributes, such as their age, address, and hobbies. The connectivity of a graph can thus possibly be understood in terms of patterns of the form 'the subgroup of individuals with properties X are often (or rar…

    Submitted 10 January, 2020; originally announced February 2020.

  44. FACE: Feasible and Actionable Counterfactual Explanations

    Authors: Rafael Poyiadzi, Kacper Sokol, Raul Santos-Rodriguez, Tijl De Bie, Peter Flach

    Abstract: Work in Counterfactual Explanations tends to focus on the principle of "the closest possible world" that identifies small changes leading to the desired outcome. In this paper we argue that while this approach might initially seem intuitively appealing it exhibits shortcomings not addressed in the current literature. First, a counterfactual example generated by the state-of-the-art systems is not…

    Submitted 24 February, 2020; v1 submitted 20 September, 2019; originally announced September 2019.

    Comments: Presented at AAAI/ACM Conference on AI, Ethics, and Society 2020

  45. Discovering Interesting Cycles in Directed Graphs

    Authors: Florian Adriaens, Cigdem Aslay, Tijl De Bie, Aristides Gionis, Jefrey Lijffijt

    Abstract: Cycles in graphs often signify interesting processes. For example, cyclic trading patterns can indicate inefficiencies or economic dependencies in trade networks, cycles in food webs can identify fragile dependencies in ecosystems, and cycles in financial transaction networks can be an indication of money laundering. Identifying such interesting cycles, which can also be constrained to contain a g…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: Accepted for CIKM'19

  46. arXiv:1905.10086  [pdf, other]

    cs.LG stat.ML

    Conditional t-SNE: Complementary t-SNE embeddings through factoring out prior information

    Authors: Bo Kang, Darío García García, Jefrey Lijffijt, Raúl Santos-Rodríguez, Tijl De Bie

    Abstract: Dimensionality reduction and manifold learning methods such as t-Distributed Stochastic Neighbor Embedding (t-SNE) are routinely used to map high-dimensional data into a 2-dimensional space to visualize and explore the data. However, two dimensions are typically insufficient to capture all structure in the data, the salient structure is often already known, and it is not obvious how to extract the…

    Submitted 24 May, 2019; originally announced May 2019.

  47. arXiv:1905.03040  [pdf, other]

    cs.SI

    Mining Subjectively Interesting Attributed Subgraphs

    Authors: Anes Bendimerad, Ahmad Mel, Jefrey Lijffijt, Marc Plantevit, Céline Robardet, Tijl De Bie

    Abstract: Community detection in graphs, data clustering, and local pattern mining are three mature fields of data mining and machine learning. In recent years, attributed subgraph mining is emerging as a new powerful data mining task in the intersection of these areas. Given a graph and a set of attributes for each vertex, attributed subgraph mining aims to find cohesive subgraphs for which (a subset of) t…

    Submitted 19 April, 2019; originally announced May 2019.

    Comments: International Workshop On Mining And Learning With Graphs, held with SIGKDD 2018

  48. arXiv:1904.12694  [pdf, other]

    cs.LG stat.ML

    ExplaiNE: An Approach for Explaining Network Embedding-based Link Predictions

    Authors: Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: Networks are powerful data structures, but are challenging to work with for conventional machine learning methods. Network Embedding (NE) methods attempt to resolve this by learning vector representations for the nodes, for subsequent use in downstream machine learning tasks. Link Prediction (LP) is one such downstream machine learning task that is an important use case and popular benchmark for…

    Submitted 22 April, 2019; originally announced April 2019.

  49. arXiv:1903.11535  [pdf, other]

    cs.SI

    Opinion Dynamics with Backfire Effect and Biased Assimilation

    Authors: Xi Chen, Panayiotis Tsaparas, Jefrey Lijffijt, Tijl De Bie

    Abstract: The democratization of AI tools for content generation, combined with unrestricted access to mass media for all (e.g. through microblogging and social media), makes it increasingly hard for people to distinguish fact from fiction. This raises the question of how individual opinions evolve in such a networked environment without grounding in a known reality. The dominant approach to studying this p…

    Submitted 27 March, 2019; originally announced March 2019.

  50. EvalNE: A Framework for Evaluating Network Embeddings on Link Prediction

    Authors: Alexandru Mara, Jefrey Lijffijt, Tijl De Bie

    Abstract: In this paper we present EvalNE, a Python toolbox for evaluating network embedding methods on link prediction tasks. Link prediction is one of the most popular choices for evaluating the quality of network embeddings. However, the complexity of this task requires a carefully designed evaluation pipeline in order to provide consistent, reproducible and comparable results. EvalNE simplifies this pro…

    Submitted 22 January, 2019; originally announced January 2019.