
Showing 1–26 of 26 results for author: Baesens, B

Searching in archive cs.
  1. arXiv:2506.11812  [pdf, ps, other]

    cs.AI cs.CL cs.LG

    On the Performance of LLMs for Real Estate Appraisal

    Authors: Margot Geerts, Manon Reusens, Bart Baesens, Seppe vanden Broucke, Jochen De Weerdt

    Abstract: The real estate market is vital to global economies but suffers from significant information asymmetry. This study examines how Large Language Models (LLMs) can democratize access to real estate insights by generating competitive and interpretable house price estimates through optimized In-Context Learning (ICL) strategies. We systematically evaluate leading LLMs on diverse international housing d…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: Accepted at ECML-PKDD 2025

  2. arXiv:2506.04292  [pdf, ps, other]

    cs.SI cs.LG stat.AP

    GARG-AML against Smurfing: A Scalable and Interpretable Graph-Based Framework for Anti-Money Laundering

    Authors: Bruno Deprez, Bart Baesens, Tim Verdonck, Wouter Verbeke

    Abstract: Money laundering poses a significant challenge as it is estimated to account for 2%-5% of the global GDP. This has compelled regulators to impose stringent controls on financial institutions. One prominent laundering method for evading these controls, called smurfing, involves breaking up large transactions into smaller amounts. Given the complexity of smurfing schemes, which involve multiple tran…

    Submitted 4 June, 2025; originally announced June 2025.

  3. arXiv:2506.02659  [pdf, other]

    cs.CL

    Are Economists Always More Introverted? Analyzing Consistency in Persona-Assigned LLMs

    Authors: Manon Reusens, Bart Baesens, David Jurgens

    Abstract: Personalized Large Language Models (LLMs) are increasingly used in diverse applications, where they are assigned a specific persona - such as a happy high school teacher - to guide their responses. While prior research has examined how well LLMs adhere to predefined personas in writing style, a comprehensive analysis of consistency across different personas and task types is lacking. In this paper…

    Submitted 3 June, 2025; originally announced June 2025.

  4. arXiv:2503.24259  [pdf, other]

    cs.LG

    Advances in Continual Graph Learning for Anti-Money Laundering Systems: A Comprehensive Review

    Authors: Bruno Deprez, Wei Wei, Wouter Verbeke, Bart Baesens, Kevin Mets, Tim Verdonck

    Abstract: Financial institutions are required by regulation to report suspicious financial transactions related to money laundering. Therefore, they need to constantly monitor vast amounts of incoming and outgoing transactions. A particular challenge in detecting money laundering is that money launderers continuously adapt their tactics to evade detection. Hence, detection methods need constant fine-tuning.…

    Submitted 31 March, 2025; originally announced March 2025.

  5. arXiv:2406.17385  [pdf, other]

    cs.CL

    Native Design Bias: Studying the Impact of English Nativeness on Language Model Performance

    Authors: Manon Reusens, Philipp Borchert, Jochen De Weerdt, Bart Baesens

    Abstract: Large Language Models (LLMs) excel at providing information acquired during pretraining on large-scale corpora and following instructions through user prompts. This study investigates whether the quality of LLM responses varies depending on the demographic profile of users. Considering English as the global lingua franca, along with the diversity of its dialects among speakers of different native…

    Submitted 7 October, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  6. arXiv:2405.19383  [pdf, other]

    cs.SI cs.LG

    Network Analytics for Anti-Money Laundering -- A Systematic Literature Review and Experimental Evaluation

    Authors: Bruno Deprez, Toon Vanderschueren, Bart Baesens, Tim Verdonck, Wouter Verbeke

    Abstract: Money laundering presents a pervasive challenge, burdening society by financing illegal activities. The use of network information is increasingly being explored to more effectively combat money laundering, given it involves connected parties. This led to a surge in research on network analytics (NA) for anti-money laundering (AML). The literature on NA for AML is, however, fragmented and a compre…

    Submitted 19 March, 2025; v1 submitted 29 May, 2024; originally announced May 2024.

  7. arXiv:2405.18913  [pdf, other]

    cs.LG

    Time-Series Foundation Models for Forecasting Soil Moisture Levels in Smart Agriculture

    Authors: Boje Deforce, Bart Baesens, Estefanía Serral Asensio

    Abstract: The recent surge in foundation models for natural language processing and computer vision has fueled innovation across various domains. Inspired by this progress, we explore the potential of foundation models for time-series forecasting in smart agriculture, a field often plagued by limited data availability. Specifically, this work presents a novel application of $\texttt{TimeGPT}$, a state-of-th…

    Submitted 9 August, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: 7 pages, accepted at KDD '24 - Fragile Earth Workshop https://openreview.net/forum?id=GZBGhi4JfE

  8. End-To-End Self-Tuning Self-Supervised Time Series Anomaly Detection

    Authors: Boje Deforce, Meng-Chieh Lee, Bart Baesens, Estefanía Serral Asensio, Jaemin Yoo, Leman Akoglu

    Abstract: Time series anomaly detection (TSAD) finds many applications such as monitoring environmental sensors, industry KPIs, patient biomarkers, etc. A two-fold challenge for TSAD is a versatile and unsupervised model that can detect various different types of time series anomalies (spikes, discontinuities, trend shifts, etc.) without any labeled data. Modern neural networks have outstanding ability in m…

    Submitted 3 April, 2025; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted at SDM 2025

  9. arXiv:2310.10310  [pdf, other]

    cs.CL

    Investigating Bias in Multilingual Language Models: Cross-Lingual Transfer of Debiasing Techniques

    Authors: Manon Reusens, Philipp Borchert, Margot Mieskes, Jochen De Weerdt, Bart Baesens

    Abstract: This paper investigates the transferability of debiasing techniques across different languages within multilingual models. We examine the applicability of these techniques in English, French, German, and Dutch. Using multilingual BERT (mBERT), we demonstrate that cross-lingual transfer of debiasing techniques is not only feasible but also yields promising results. Surprisingly, our findings reveal…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 main conference

  10. arXiv:2310.06675  [pdf, other]

    cs.CL

    SEER : A Knapsack approach to Exemplar Selection for In-Context HybridQA

    Authors: Jonathan Tonglet, Manon Reusens, Philipp Borchert, Bart Baesens

    Abstract: Question answering over hybrid contexts is a complex task, which requires the combination of information extracted from unstructured texts and structured tables in various ways. Recently, In-Context Learning demonstrated significant performance advances for reasoning tasks. In this paradigm, a large language model performs predictions based on a small set of supporting exemplars. The performance o…

    Submitted 20 October, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Camera ready revision for EMNLP 2023 main conference. Code available at https://github.com/jtonglet/SEER

  11. INFLECT-DGNN: Influencer Prediction with Dynamic Graph Neural Networks

    Authors: Elena Tiukhova, Emiliano Penaloza, María Óskarsdóttir, Bart Baesens, Monique Snoeck, Cristián Bravo

    Abstract: Leveraging network information for predictive modeling has become widespread in many domains. Within the realm of referral and targeted marketing, influencer detection stands out as an area that could greatly benefit from the incorporation of dynamic network representation due to the continuous evolution of customer-brand relationships. In this paper, we present INFLECT-DGNN, a new method for prof…

    Submitted 10 September, 2024; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: 27 pages, 7 figures

    Journal ref: IEEE Access, 12, 115026-115041 (2024)

  12. arXiv:2305.05495  [pdf, other]

    cs.LG

    Self-Supervised Anomaly Detection of Rogue Soil Moisture Sensors

    Authors: Boje Deforce, Bart Baesens, Jan Diels, Estefanía Serral Asensio

    Abstract: IoT data is a central element in the successful digital transformation of agriculture. However, IoT data comes with its own set of challenges. E.g., the risk of data contamination due to rogue sensors. A sensor is considered rogue when it provides incorrect measurements over time. To ensure correct analytical results, an essential preprocessing step when working with IoT data is the detection of s…

    Submitted 9 May, 2023; originally announced May 2023.

  13. arXiv:2211.09664  [pdf, other]

    cs.SI cs.AI cs.LG

    Influencer Detection with Dynamic Graph Neural Networks

    Authors: Elena Tiukhova, Emiliano Penaloza, María Óskarsdóttir, Hernan Garcia, Alejandro Correa Bahnsen, Bart Baesens, Monique Snoeck, Cristián Bravo

    Abstract: Leveraging network information for prediction tasks has become a common practice in many domains. Being an important part of targeted marketing, influencer detection can potentially benefit from incorporating dynamic network representation. In this work, we investigate different dynamic Graph Neural Networks (GNNs) configurations for influencer detection and evaluate their prediction performance u…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: Conference workshop camera-ready paper - accepted at NeurIPS TGL 2022. 8 pages, 4 figures

  14. arXiv:2206.01562  [pdf, other]

    econ.GN cs.LG stat.ML

    Prescriptive maintenance with causal machine learning

    Authors: Toon Vanderschueren, Robert Boute, Tim Verdonck, Bart Baesens, Wouter Verbeke

    Abstract: Machine maintenance is a challenging operational problem, where the goal is to plan sufficient preventive maintenance to avoid machine failures and overhauls. Maintenance is often imperfect in reality and does not make the asset as good as new. Although a variety of imperfect maintenance policies have been proposed in the literature, these rely on strong assumptions regarding the effect of mainten…

    Submitted 3 June, 2022; originally announced June 2022.

  15. arXiv:2202.04369  [pdf, other]

    cs.LG stat.ML

    A new perspective on classification: optimally allocating limited resources to uncertain tasks

    Authors: Toon Vanderschueren, Bart Baesens, Tim Verdonck, Wouter Verbeke

    Abstract: A central problem in business concerns the optimal allocation of limited resources to a set of available tasks, where the payoff of these tasks is inherently uncertain. In credit card fraud detection, for instance, a bank can only assign a small subset of transactions to their fraud investigations team. Typically, such problems are solved using a classification framework, where the focus is on pre…

    Submitted 9 February, 2022; originally announced February 2022.

  16. Expert-driven Trace Clustering with Instance-level Constraints

    Authors: Pieter De Koninck, Klaas Nelissen, Seppe vanden Broucke, Bart Baesens, Monique Snoeck, Jochen De Weerdt

    Abstract: Within the field of process mining, several different trace clustering approaches exist for partitioning traces or process instances into similar groups. Typically, this partitioning is based on certain patterns or similarity between the traces, or driven by the discovery of a process model for each cluster. The main drawback of these techniques, however, is that their solutions are usually hard t…

    Submitted 13 October, 2021; originally announced October 2021.

    Journal ref: Knowl Inf Syst 63, 1197-1220 (2021)

  17. arXiv:2009.08313  [pdf, other]

    cs.SI cs.CR stat.ML

    Social network analytics for supervised fraud detection in insurance

    Authors: María Óskarsdóttir, Waqas Ahmed, Katrien Antonio, Bart Baesens, Rémi Dendievel, Tom Donas, Tom Reynkens

    Abstract: Insurance fraud occurs when policyholders file claims that are exaggerated or based on intentional damages. This contribution develops a fraud detection strategy by extracting insightful information from the social network of a claim. First, we construct a network by linking claims with all their involved parties, including the policyholders, brokers, experts, and garages. Next, we establish fraud…

    Submitted 15 September, 2020; originally announced September 2020.

    Comments: 37 pages, 8 figures

  18. arXiv:2005.01075  [pdf, other]

    cs.LG cs.AI

    Autoencoders for strategic decision support

    Authors: Sam Verboven, Jeroen Berrevoets, Chris Wuytens, Bart Baesens, Wouter Verbeke

    Abstract: In the majority of executive domains, a notion of normality is involved in most strategic decisions. However, few data-driven tools that support strategic decision-making are available. We introduce and extend the use of autoencoders to provide strategically relevant granular feedback. A first experiment indicates that experts are inconsistent in their decision making, highlighting the need for st…

    Submitted 3 May, 2020; originally announced May 2020.

  19. arXiv:2003.11915  [pdf, other]

    cs.LG cs.CR stat.AP stat.ML

    robROSE: A robust approach for dealing with imbalanced data in fraud detection

    Authors: Bart Baesens, Sebastiaan Höppner, Irene Ortner, Tim Verdonck

    Abstract: A major challenge when trying to detect fraud is that the fraudulent activities form a minority class which make up a very small proportion of the data set. In most data sets, fraud occurs in typically less than 0.5% of the cases. Detecting fraud in such a highly imbalanced data set typically leads to predictions that favor the majority group, causing fraud to remain undetected. We discuss some po…

    Submitted 22 March, 2020; originally announced March 2020.

  20. arXiv:2002.09931  [pdf, other]

    cs.SI cs.CY cs.LG stat.ML

    The Value of Big Data for Credit Scoring: Enhancing Financial Inclusion using Mobile Phone Data and Social Network Analytics

    Authors: María Óskarsdóttir, Cristián Bravo, Carlos Sarraute, Jan Vanthienen, Bart Baesens

    Abstract: Credit scoring is without a doubt one of the oldest applications of analytics. In recent years, a multitude of sophisticated classification techniques have been developed to improve the statistical performance of credit scoring models. Instead of focusing on the techniques themselves, this paper leverages alternative data sources to enhance both statistical and economic model performance. The stud…

    Submitted 23 February, 2020; originally announced February 2020.

    Journal ref: Applied Soft Computing, Volume 74, January 2019, Pages 26-39

  21. arXiv:2002.00949  [pdf, other]

    econ.EM cs.LG stat.ML

    Profit-oriented sales forecasting: a comparison of forecasting techniques from a business perspective

    Authors: Tine Van Calster, Filip Van den Bossche, Bart Baesens, Wilfried Lemahieu

    Abstract: Choosing the technique that is the best at forecasting your data, is a problem that arises in any forecasting application. Decades of research have resulted into an enormous amount of forecasting methods that stem from statistics, econometrics and machine learning (ML), which leads to a very difficult and elaborate choice to make in any forecasting exercise. This paper aims to facilitate this proc…

    Submitted 3 February, 2020; originally announced February 2020.

  22. arXiv:2001.10994  [pdf, other]

    cs.SI cs.CY

    Credit Scoring for Good: Enhancing Financial Inclusion with Smartphone-Based Microlending

    Authors: María Óskarsdóttir, Cristián Bravo, Carlos Sarraute, Bart Baesens, Jan Vanthienen

    Abstract: Globally, two billion people and more than half of the poorest adults do not use formal financial services. Consequently, there is increased emphasis on developing financial technology that can facilitate access to financial products for the unbanked. In this regard, smartphone-based microlending has emerged as a potential solution to enhance financial inclusion. We propose a methodology to impr…

    Submitted 29 January, 2020; originally announced January 2020.

    Comments: Thirty Ninth International Conference on Information Systems (ICIS), December 14, 2018, San Francisco, USA

  23. Social Network Analytics for Churn Prediction in Telco: Model Building, Evaluation and Network Architecture

    Authors: María Óskarsdóttir, Cristián Bravo, Wouter Verbeke, Carlos Sarraute, Bart Baesens, Jan Vanthienen

    Abstract: Social network analytics methods are being used in the telecommunication industry to predict customer churn with great success. In particular it has been shown that relational learners adapted to this specific problem enhance the performance of predictive models. In the current study we benchmark different strategies for constructing a relational learner by applying them to a total of eight dist…

    Submitted 18 January, 2020; originally announced January 2020.

    Journal ref: Expert Systems with Applications, Volume 85, 1 November 2017, Pages 204-220

  24. A Comparative Study of Social Network Classifiers for Predicting Churn in the Telecommunication Industry

    Authors: Maria Óskarsdóttir, Cristián Bravo, Wouter Verbeke, Carlos Sarraute, Bart Baesens, Jan Vanthienen

    Abstract: Relational learning in networked data has been shown to be effective in a number of studies. Relational learners, composed of relational classifiers and collective inference methods, enable the inference of nodes in a network given the existence and strength of links to other nodes. These methods have been adapted to predict customer churn in telecommunication companies showing that incorporating…

    Submitted 18 January, 2020; originally announced January 2020.

    Comments: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)

  25. arXiv:1901.01726  [pdf]

    cs.SE

    Evaluating software defect prediction performance: an updated benchmarking study

    Authors: Libo Li, Stefan Lessmann, Bart Baesens

    Abstract: Accurately predicting faulty software units helps practitioners target faulty units and prioritize their efforts to maintain software quality. Prior studies use machine-learning models to detect faulty software code. We revisit past studies and point out potential improvements. Our new study proposes a revised benchmarking configuration. The configuration considers many new dimensions, such as cla…

    Submitted 7 January, 2019; originally announced January 2019.

  26. arXiv:1712.08101  [pdf, other]

    stat.ML cs.LG stat.AP

    Profit Driven Decision Trees for Churn Prediction

    Authors: Sebastiaan Höppner, Eugen Stripling, Bart Baesens, Seppe vanden Broucke, Tim Verdonck

    Abstract: Customer retention campaigns increasingly rely on predictive models to detect potential churners in a vast customer base. From the perspective of machine learning, the task of predicting customer churn can be presented as a binary classification problem. Using data on historic behavior, classification algorithms are built with the purpose of accurately predicting the probability of a customer defe…

    Submitted 21 December, 2017; originally announced December 2017.