Reproducibility in Machine Learning-based Research: Overview, Barriers and Drivers

Authors: Harald Semmelrock, Tony Ross-Hellauer, Simone Kopeinik, Dieter Theiler, Armin Haberl, Stefan Thalmann, Dominik Kowald

Abstract: Many research fields are currently reckoning with issues of poor levels of reproducibility. Some label it a "crisis", and research employing or building Machine Learning (ML) models is no exception. Issues including lack of transparency, data or code, poor adherence to standards, and the sensitivity of ML training conditions mean that many papers are not even reproducible in principle. Where they are, though, reproducibility experiments have found worryingly low degrees of similarity with original results. Despite previous appeals from ML researchers on this topic and various initiatives from conference reproducibility tracks to the ACM's new Emerging Interest Group on Reproducibility and Replicability, we contend that the general community continues to take this issue too lightly. Poor reproducibility threatens trust in and integrity of research results. Therefore, in this article, we lay out a new perspective on the key barriers and drivers (both procedural and technical) to increased reproducibility at various levels (methods, code, data, and experiments). We then map the drivers to the barriers to give concrete advice for strategies for researchers to mitigate reproducibility issues in their own work, to lay out key areas where further research is needed in specific areas, and to further ignite discussion on the threat presented by these urgent issues.

Submitted 26 February, 2025; v1 submitted 20 June, 2024; originally announced June 2024.

Comments: Accepted for publication in the AI Magazine

arXiv:2307.10320

Reproducibility in Machine Learning-Driven Research

Authors: Harald Semmelrock, Simone Kopeinik, Dieter Theiler, Tony Ross-Hellauer, Dominik Kowald

Abstract: Research is facing a reproducibility crisis, in which the results and findings of many studies are difficult or even impossible to reproduce. This is also the case in machine learning (ML) and artificial intelligence (AI) research. Often, this is the case due to unpublished data and/or source-code, and due to sensitivity to ML training conditions. Although different solutions to address this issue are discussed in the research community such as using ML platforms, the level of reproducibility in ML-driven research is not increasing substantially. Therefore, in this mini survey, we review the literature on reproducibility in ML-driven research with three main aims: (i) reflect on the current situation of ML reproducibility in various research fields, (ii) identify reproducibility issues and barriers that exist in these research fields applying ML, and (iii) identify potential drivers such as tools, practices, and interventions that support ML reproducibility. With this, we hope to contribute to decisions on the viability of different solutions for supporting ML reproducibility.

Submitted 19 July, 2023; originally announced July 2023.

Comments: This research is supported by the Horizon Europe project TIER2 under grant agreement No 101094817

arXiv:2301.01037

Uptrendz: API-Centric Real-time Recommendations in Multi-Domain Settings

Authors: Emanuel Lacic, Tomislav Duricic, Leon Fadljevic, Dieter Theiler, Dominik Kowald

Abstract: In this work, we tackle the problem of adapting a real-time recommender system to multiple application domains, and their underlying data models and customization requirements. To do that, we present Uptrendz, a multi-domain recommendation platform that can be customized to provide real-time recommendations in an API-centric way. We demonstrate (i) how to set up a real-time movie recommender using the popular MovieLens-100k dataset, and (ii) how to simultaneously support multiple application domains based on the use-case of recommendations in entrepreneurial start-up founding. For that, we differentiate between domains on the item- and system-level. We believe that our demonstration shows a convenient way to adapt, deploy and evaluate a recommender system in an API-centric way. The source-code and documentation that demonstrates how to utilize the configured Uptrendz API is available on GitHub.

Submitted 3 January, 2023; originally announced January 2023.

Comments: ECIR 2023 demo paper

arXiv:2210.11828

Towards Employing Recommender Systems for Supporting Data and Algorithm Sharing

Authors: Peter Müllner, Stefan Schmerda, Dieter Theiler, Stefanie Lindstaedt, Dominik Kowald

Abstract: Data and algorithm sharing is an imperative part of data and AI-driven economies. The efficient sharing of data and algorithms relies on the active interplay between users, data providers, and algorithm providers. Although recommender systems are known to effectively interconnect users and items in e-commerce settings, there is a lack of research on the applicability of recommender systems for data and algorithm sharing. To fill this gap, we identify six recommendation scenarios for supporting data and algorithm sharing, where four of these scenarios substantially differ from the traditional recommendation scenarios in e-commerce applications. We evaluate these recommendation scenarios using a novel dataset based on interaction data of the OpenML data and algorithm sharing platform, which we also provide for the scientific community. Specifically, we investigate three types of recommendation approaches, namely popularity-, collaboration-, and content-based recommendations. We find that collaboration-based recommendations provide the most accurate recommendations in all scenarios. Plus, the recommendation accuracy strongly depends on the specific scenario, e.g., algorithm recommendations for users are a more difficult problem than algorithm recommendations for datasets. Finally, the content-based approach generates the least popularity-biased recommendations that cover the most datasets and algorithms.

Submitted 26 October, 2022; v1 submitted 21 October, 2022; originally announced October 2022.

Comments: Accepted to the DataEconomy Workshop at CoNEXT'22

arXiv:1908.04042

Evaluating Tag Recommendations for E-Book Annotation Using a Semantic Similarity Metric

Authors: Emanuel Lacic, Dominik Kowald, Dieter Theiler, Matthias Traub, Lucky Kuffer, Stefanie Lindstaedt, Elisabeth Lex

Abstract: In this paper, we present our work to support publishers and editors in finding descriptive tags for e-books through tag recommendations. We propose a hybrid tag recommendation system for e-books, which leverages search query terms from Amazon users and e-book metadata, which is assigned by publishers and editors. Our idea is to mimic the vocabulary of users in Amazon, who search for and review e-books, and to combine these search terms with editor tags in a hybrid tag recommendation approach. In total, we evaluate 19 tag recommendation algorithms on the review content of Amazon users, which reflects the readers' vocabulary. Our results show that we can improve the performance of tag recommender systems for e-books both concerning tag recommendation accuracy, diversity as well as a novel semantic similarity metric, which we also propose in this paper.

Submitted 12 August, 2019; originally announced August 2019.

Comments: REVEAL Workshop @ RecSys'2019, Copenhagen, Denmark

arXiv:1908.04017

Using the Open Meta Kaggle Dataset to Evaluate Tripartite Recommendations in Data Markets

Authors: Dominik Kowald, Matthias Traub, Dieter Theiler, Heimo Gursch, Emanuel Lacic, Stefanie Lindstaedt, Roman Kern, Elisabeth Lex

Abstract: This work addresses the problem of providing and evaluating recommendations in data markets. Since most of the research in recommender systems is focused on the bipartite relationship between users and items (e.g., movies), we extend this view to the tripartite relationship between users, datasets and services, which is present in data markets. Between these entities, we identify four use cases for recommendations: (i) recommendation of datasets for users, (ii) recommendation of services for users, (iii) recommendation of services for datasets, and (iv) recommendation of datasets for services. Using the open Meta Kaggle dataset, we evaluate the recommendation accuracy of a popularity-based as well as a collaborative filtering-based algorithm for these four use cases and find that the recommendation accuracy strongly depends on the given use case. The presented work contributes to the tripartite recommendation problem in general and to the under-researched portfolio of evaluating recommender systems for data markets in particular.

Submitted 27 August, 2019; v1 submitted 12 August, 2019; originally announced August 2019.

Comments: REVEAL Workshop @ RecSys'2019, Copenhagen, Denmark

arXiv:1808.04603

AFEL-REC: A Recommender System for Providing Learning Resource Recommendations in Social Learning Environments

Authors: Dominik Kowald, Emanuel Lacic, Dieter Theiler, Elisabeth Lex

Abstract: In this paper, we present preliminary results of AFEL-REC, a recommender system for social learning environments. AFEL-REC is built upon a scalable software architecture to provide recommendations of learning resources in near real-time. Furthermore, AFEL-REC can cope with any kind of data that is present in social learning environments such as resource metadata, user interactions or social tags. We provide a preliminary evaluation of three recommendation use cases implemented in AFEL-REC and we find that utilizing social data in the form of tags is helpful for not only improving recommendation accuracy but also coverage. This paper should be valuable for both researchers and practitioners interested in providing resource recommendations in social learning environments.

Submitted 14 August, 2018; originally announced August 2018.

Journal ref: Social Recommender Systems Workshop @ ACM CIKM 2018 Conference

Showing 1–7 of 7 results for author: Theiler, D