-
Exploring Scientometrics with the OpenAIRE Graph: Introducing the OpenAIRE Beginner's Kit
Authors:
Andrea Mannocci,
Miriam Baglioni
Abstract:
The OpenAIRE Graph is an extensive resource housing diverse information on research products, including literature, datasets, and software, alongside research projects and other scholarly outputs and context. It stands as a cornerstone among contemporary research information databases, offering invaluable insights for scientometric investigations. Despite its wealth of data, its sheer size may initially appear daunting, potentially hindering its widespread adoption. To address this challenge, this paper introduces the OpenAIRE Beginner's Kit, a user-friendly solution providing access to a subset of the OpenAIRE Graph within a sandboxed environment coupled with a Jupyter notebook for analysis. The OpenAIRE Beginner's Kit is meticulously designed to democratise research and data exploration, offering accessibility from standard desktop and laptop setups. Within this paper, we provide a brief overview of the included dataset and offer guidance on leveraging the kit through a selection of illustrative queries tailored to address common scientometric inquiries.
Submitted 19 September, 2024;
originally announced September 2024.
-
A Survey on Knowledge Organization Systems of Research Fields: Resources and Challenges
Authors:
Angelo Salatino,
Tanay Aggarwal,
Andrea Mannocci,
Francesco Osborne,
Enrico Motta
Abstract:
Knowledge Organization Systems (KOSs), such as term lists, thesauri, taxonomies, and ontologies, play a fundamental role in categorising, managing, and retrieving information. In the academic domain, KOSs are often adopted for representing research areas and their relationships, primarily aiming to classify research articles, academic courses, patents, books, scientific venues, domain experts, grants, software, experiment materials, and several other relevant products and agents. These structured representations of research areas, widely embraced by many academic fields, have proven effective in empowering AI-based systems to i) enhance retrievability of relevant documents, ii) enable advanced analytic solutions to quantify the impact of academic research, and iii) analyse and forecast research dynamics. This paper aims to present a comprehensive survey of current KOSs for academic disciplines. We analysed and compared 45 KOSs according to five main dimensions: scope, structure, curation, usage, and links to other KOSs. Our results reveal a very heterogeneous scenario in terms of scope, scale, quality, and usage, highlighting the need for more integrated solutions for representing research knowledge across academic fields. We conclude by discussing the main challenges and the most promising future directions.
Submitted 11 June, 2025; v1 submitted 6 September, 2024;
originally announced September 2024.
-
(Semi)automated disambiguation of scholarly repositories
Authors:
Miriam Baglioni,
Andrea Mannocci,
Gina Pavone,
Michele De Bonis,
Paolo Manghi
Abstract:
The full exploitation of scholarly repositories is pivotal in modern Open Science, and scholarly repository registries are kingpins in enabling researchers and research infrastructures to list and search for suitable repositories. However, since multiple registries exist, repository managers are keen on registering the repositories they manage multiple times to maximise their traction and visibility across different research communities, disciplines, and applications. These multiple registrations ultimately lead to information fragmentation and redundancy on the one hand and, on the other, force registries' users to juggle multiple registries, profiles and identifiers describing the same repository. Such problems are known to registries, which claim equivalence between repository profiles whenever possible by cross-referencing their identifiers across different registries. However, as we will see, this "claim set" is far from complete and, therefore, many replicas slip under the radar, possibly creating problems downstream. In this work, we combine such claims to create duplicate sets and extend them with the results of an automated clustering algorithm run over repository metadata descriptions. Then we manually validate our results to produce an "as accurate as possible" de-duplicated dataset of scholarly repositories.
Submitted 5 July, 2023;
originally announced July 2023.
-
"Knock knock! Who's there?" A study on scholarly repositories' availability
Authors:
Andrea Mannocci,
Miriam Baglioni,
Paolo Manghi
Abstract:
Scholarly repositories are the cornerstone of modern open science, and their availability is vital for enacting its practices. To this end, scholarly registries such as FAIRsharing, re3data, OpenDOAR and ROAR give them presence and visibility across different research communities, disciplines, and applications by assigning an identifier and persisting their profiles with summary metadata. Alas, like any other resource available on the Web, scholarly repositories, be they tailored for literature, software or data, are quite dynamic and can be frequently changed, moved, merged or discontinued. Therefore, their references are prone to link rot over time, and their availability often boils down to whether the homepage URLs indicated in authoritative repository profiles within scholarly registries respond or not. For this study, we harvested the content of four prominent scholarly registries and resolved over 13 thousand unique repository URLs. By performing a quantitative analysis on such an extensive collection of repositories, this paper aims to provide a global snapshot of their availability, which, bewilderingly, is far from guaranteed.
Submitted 26 July, 2022;
originally announced July 2022.
-
Will open science change authorship for good? Towards a quantitative analysis
Authors:
Andrea Mannocci,
Ornella Irrera,
Paolo Manghi
Abstract:
Authorship of scientific articles has profoundly changed from early science until now. If once upon a time a paper was authored by a handful of authors, scientific collaborations are much more prominent on average nowadays. As authorship (and citation) is essentially the primary reward mechanism according to the traditional research evaluation frameworks, it has turned out to be a rather hot-button topic from which a significant portion of academic disputes stems. However, the novel Open Science practices could be an opportunity to disrupt such dynamics and diversify the credit of the different scientific contributors involved in the diverse phases of the lifecycle of the same research effort. In fact, a paper and research data (or software) contextually published could exhibit different authorship to give credit to the various contributors right where it feels most appropriate. We argue that this can be computationally analysed by taking advantage of the wealth of information in model Open Science Graphs. Such a study can pave the way to a better understanding of the dynamics and patterns of authorship in linked literature, research data and software, and how they evolved over the years.
Submitted 7 July, 2022;
originally announced July 2022.
-
Open Science and Authorship of Supplementary Material. Evidence from a Research Community
Authors:
Andrea Mannocci,
Ornella Irrera,
Paolo Manghi
Abstract:
Authorship of scientific articles has profoundly changed from early science until now. While once upon a time a paper was authored by a handful of authors, scientific collaborations are much more prominent on average nowadays. As authorship (and citation) is essentially the primary reward mechanism according to the traditional research evaluation frameworks, it turned out to be a rather hot-button topic from which a significant portion of academic disputes stems. However, the novel Open Science practices could be an opportunity to disrupt such dynamics and diversify the credit of the different scientific contributors involved in the diverse phases of the lifecycle of the same research effort. In fact, a paper and research data (or software) contextually published could exhibit different authorship to give credit to the various contributors right where it feels most appropriate. As a preliminary study, in this paper, we leverage the wealth of information contained in Open Science Graphs, such as OpenAIRE, and conduct a focused analysis on a subset of publications with supplementary material drawn from the European Marine Science (MES) research community. The results are promising and suggest our hypothesis is worth exploring further, as we registered substantial variations in 22% of the cases between the authors participating in the publication and the authors participating in the supplementary dataset (or software), thus laying the groundwork for a longitudinal, large-scale analysis of the phenomenon.
Submitted 7 July, 2022; v1 submitted 6 July, 2022;
originally announced July 2022.
-
BIP! Scholar: A Service to Facilitate Fair Researcher Assessment
Authors:
Thanasis Vergoulis,
Serafeim Chatzopoulos,
Kleanthis Vichos,
Ilias Kanellos,
Andrea Mannocci,
Natalia Manola,
Paolo Manghi
Abstract:
In recent years, assessing the performance of researchers has become a burden due to the extensive volume of the existing research output. As a result, evaluators often end up relying heavily on a selection of performance indicators like the h-index. However, over-reliance on such indicators may result in reinforcing dubious research practices, while overlooking important aspects of a researcher's career, such as their exact role in the production of particular research works or their contribution to other important types of academic or research activities (e.g., production of datasets, peer reviewing). In response, a number of initiatives that attempt to provide guidelines towards fairer research assessment frameworks have been established. In this work, we present BIP! Scholar, a Web-based service that offers researchers the opportunity to set up profiles that summarise their research careers taking into consideration well-established guidelines for fair research assessment, facilitating the work of evaluators who want to be more compliant with the respective practices.
Submitted 6 May, 2022;
originally announced May 2022.
-
Detection, Analysis, and Prediction of Research Topics with Scientific Knowledge Graphs
Authors:
Angelo Salatino,
Andrea Mannocci,
Francesco Osborne
Abstract:
Analysing research trends and predicting their impact on academia and industry is crucial to gain a deeper understanding of the advances in a research field and to inform critical decisions about research funding and technology adoption. In recent years, we have seen the emergence of several publicly available, large-scale Scientific Knowledge Graphs fostering the development of many data-driven approaches for performing quantitative analyses of research trends. This chapter presents an innovative framework for detecting, analysing, and forecasting research topics based on a large-scale knowledge graph characterising research articles according to the research topics from the Computer Science Ontology. We discuss the advantages of a solution based on a formal representation of topics and describe how it was applied to produce bibliometric studies and innovative tools for analysing and predicting research dynamics.
Submitted 24 June, 2021;
originally announced June 2021.
-
BIP! DB: A Dataset of Impact Measures for Scientific Publications
Authors:
Thanasis Vergoulis,
Ilias Kanellos,
Claudio Atzori,
Andrea Mannocci,
Serafeim Chatzopoulos,
Sandro La Bruzzo,
Natalia Manola,
Paolo Manghi
Abstract:
The growth rate of the number of scientific publications is constantly increasing, creating important challenges in the identification of valuable research and in various scholarly data management applications, in general. In this context, measures which can effectively quantify the scientific impact could be invaluable. In this work, we present BIP! DB, an open dataset that contains a variety of impact measures calculated for a large collection of more than 100 million scientific publications from various disciplines.
Submitted 6 May, 2022; v1 submitted 28 January, 2021;
originally announced January 2021.
-
The Evolution of IJHCS and CHI: A Quantitative Analysis
Authors:
Andrea Mannocci,
Francesco Osborne,
Enrico Motta
Abstract:
In this paper we focus on the International Journal of Human-Computer Studies (IJHCS) as a domain of analysis, to gain insights about its evolution in the past 50 years and what this evolution tells us about the research landscape associated with the journal. To this end, we use techniques from the field of Science of Science and analyse the relevant scholarly data to identify a variety of phenomena, including significant geopolitical patterns, the key trends that emerge from a topic-centric analysis, and the insights that can be drawn from an analysis of citation data. Because the area of Human-Computer Interaction (HCI) has always been a central focus for IJHCS, we also include in the analysis the CHI conference, which is the premier scientific venue in HCI. Analysing both venues provides more data points for our study and allows us to consider two alternative viewpoints on the evolution of HCI research.
Submitted 12 August, 2019;
originally announced August 2019.
-
Control Theoretic Optimization of 802.11 WLANs: Implementation and Experimental Evaluation
Authors:
Pablo Serrano,
Paul Patras,
Andrea Mannocci,
Vincenzo Mancuso,
Albert Banchs
Abstract:
In 802.11 WLANs, adapting the contention parameters to network conditions results in substantial performance improvements. Even though the ability to change these parameters has been available in standard devices for years, so far no adaptive mechanism using this functionality has been validated in a realistic deployment. In this paper we report our experiences with implementing and evaluating two adaptive algorithms based on control theory, one centralized and one distributed, in a large-scale testbed consisting of 18 commercial off-the-shelf devices. We conduct extensive measurements, considering different network conditions in terms of number of active nodes, link qualities and traffic generated. We show that both algorithms significantly outperform the standard configuration in terms of total throughput. We also identify the limitations inherent in distributed schemes, and demonstrate that the centralized approach substantially improves performance under a large variety of scenarios, which confirms its suitability for real deployments.
Submitted 18 February, 2014; v1 submitted 13 March, 2012;
originally announced March 2012.