-
Predicting and Explaining Customer Data Sharing in the Open Banking
Authors:
João B. G. de Brito,
Rodrigo Heldt,
Cleo S. Silveira,
Matthias Bogaert,
Guilherme B. Bucco,
Fernando B. Luce,
João L. Becker,
Filipe J. Zabala,
Michel J. Anzanello
Abstract:
The emergence of Open Banking represents a significant shift in financial data management, influencing financial institutions' market dynamics and marketing strategies. This increased competition creates opportunities and challenges, as institutions manage data inflow to improve products and services while mitigating data outflow that could aid competitors. This study introduces a framework to predict customers' propensity to share data via Open Banking and interprets this behavior through Explanatory Model Analysis (EMA). Using data from a large Brazilian financial institution with approximately 3.2 million customers, a hybrid data balancing strategy incorporating ADASYN and NEARMISS techniques was employed to address the infrequency of data sharing and enhance the training of XGBoost models. These models accurately predicted customer data sharing, achieving 91.39% accuracy for inflow and 91.53% for outflow. The EMA phase combined the Shapley Additive Explanations (SHAP) method with the Classification and Regression Tree (CART) technique, revealing the most influential features on customer decisions. Key features included the number of transactions and purchases in mobile channels, interactions within these channels, and credit-related features, particularly credit card usage across the national banking system. These results highlight the critical role of mobile engagement and credit in driving customer data-sharing behaviors, providing financial institutions with strategic insights to enhance competitiveness and innovation in the Open Banking environment.
Submitted 27 June, 2025;
originally announced July 2025.
-
Geospatial Road Cycling Race Results Data Set
Authors:
Bram Janssens,
Luca Pappalardo,
Jelle De Bock,
Matthias Bogaert,
Steven Verstockt
Abstract:
The field of cycling analytics has only recently started to develop due to limited access to open data sources. Accordingly, research and data sources are very divergent, with large differences in information used across studies. To improve this, and facilitate further research in the field, we propose the publication of a data set which links thousands of professional race results from the period 2017-2023 to detailed geographic information about the courses, an essential aspect in road cycling analytics. Initial use cases are proposed, showcasing the usefulness in linking these two data sources.
Submitted 26 September, 2024;
originally announced October 2024.
-
Bike2Vec: Vector Embedding Representations of Road Cycling Riders and Races
Authors:
Ethan Baron,
Bram Janssens,
Matthias Bogaert
Abstract:
Vector embeddings have been successfully applied in several domains to obtain effective representations of non-numeric data which can then be used in various downstream tasks. We present a novel application of vector embeddings in professional road cycling by demonstrating a method to learn representations for riders and races based on historical results. We use unsupervised learning techniques to validate that the resultant embeddings capture interesting features of riders and races. These embeddings could be used for downstream prediction tasks such as early talent identification and race outcome prediction.
Submitted 17 May, 2023;
originally announced May 2023.