-
Spanish philosophers' perceptions of pay to publish and open access: books versus journals, more than a financial dilemma
Authors:
Ramon A. Feenstra,
Emilio Delgado Lopez-Cozar
Abstract:
This study examines habits and perceptions related to pay to publish and open access practices in fields that have attracted little research to date: philosophy and ethics. The study is undertaken in the Spanish context, where the culture of publication and the book and journal publishing industry have some specific characteristics with regard to paying to publish, such as not offering open access distribution of books published for a fee. The study draws on data from a survey of 201 researchers, a public debate with 26 researchers, and 14 in-depth interviews. The results reveal some interesting insights on the criteria researchers apply when selecting publishers and journals for their work, the extent of paying to publish (widespread in the case of books and modest for journals) and the debates that arise over the effects it has on manuscript review and unequal access to resources to cover publication fees. Data on the extent of open access and the researchers' views on dissemination of publicly funded research are also presented.
Submitted 20 May, 2021; v1 submitted 17 May, 2021;
originally announced May 2021.
-
The footprint of a metrics-based research evaluation system on Spanish philosophical scholarship: an analysis of researchers' perceptions
Authors:
Ramon A. Feenstra,
Emilio Delgado Lopez-Cozar
Abstract:
The use of bibliometric indicators in research evaluation has a series of complex impacts on academic inquiry. These systems have gradually spread into a wide range of locations and disciplines, including the humanities. The aim of the present study is to examine their effects as perceived by philosophy researchers in Spain, a country where bibliometric indicators have long been used to evaluate research. The study combines data from a self-administered questionnaire completed by 201 researchers and from 14 in-depth interviews with researchers selected according to their affiliation, professional category, gender and area of knowledge. Results show that the evaluation system is widely perceived to affect research behaviour in significant ways, particularly related to publication practices (document type and publication language), the transformation of research agendas and the neglect of teaching work, as well as increasing research misconduct and negatively affecting mental health. Although to a lesser extent, other consequences included increased research productivity and enhanced transparency and impartiality in academic selection processes.
Submitted 22 March, 2021;
originally announced March 2021.
-
Large coverage fluctuations in Google Scholar: a case study
Authors:
Alberto Martín-Martín,
Emilio Delgado López-Cózar
Abstract:
Unlike other academic bibliographic databases, Google Scholar intentionally operates in a way that does not maintain coverage stability: documents that stop being available to Google Scholar's crawlers are removed from the system. This can also affect Google Scholar's citation graph (citation counts can decrease). Furthermore, because Google Scholar is not transparent about its coverage, the only way to directly observe coverage loss is through regular monitoring of Google Scholar data. Because of this, few studies have empirically documented this phenomenon. This study analyses a large decrease in coverage of documents in the field of Astronomy and Astrophysics that took place in 2019 and its subsequent recovery, using longitudinal data from previous analyses and a new dataset extracted in 2020. Documents from most of the larger publishers in the field disappeared from Google Scholar despite continuing to be available on the Web, which suggests an error on Google Scholar's side. Disappeared documents did not reappear until the following index-wide update, many months after the problem was discovered. The slowness with which Google Scholar is currently able to resolve indexing errors is a clear limitation of the platform both for literature search and bibliometric use cases.
Submitted 15 February, 2021;
originally announced February 2021.
-
Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations' COCI: a multidisciplinary comparison of coverage via citations
Authors:
Alberto Martín-Martín,
Mike Thelwall,
Enrique Orduna-Malea,
Emilio Delgado López-Cózar
Abstract:
New sources of citation data have recently become available, such as Microsoft Academic, Dimensions, and the OpenCitations Index of CrossRef open DOI-to-DOI citations (COCI). Although these have been compared to the Web of Science (WoS), Scopus, or Google Scholar, there is no systematic evidence of their differences across subject categories. In response, this paper investigates 3,073,351 citations found by these six data sources to 2,515 English-language highly-cited documents published in 2006 from 252 subject categories, expanding and updating the largest previous study. Google Scholar found 88% of all citations, many of which were not found by the other sources, and nearly all citations found by the remaining sources (89%-94%). A similar pattern held within most subject categories. Microsoft Academic is the second largest overall (60% of all citations), including 82% of Scopus citations and 86% of Web of Science citations. In most categories, Microsoft Academic found more citations than Scopus and WoS (182 and 223 subject categories, respectively), but had coverage gaps in some areas, such as Physics and some Humanities categories. After Scopus, Dimensions is fourth largest (54% of all citations), including 84% of Scopus citations and 88% of WoS citations. It found more citations than Scopus in 36 categories, more than WoS in 185, and displays some coverage gaps, especially in the Humanities. Following WoS, COCI is the smallest, with 28% of all citations. Google Scholar is still the most comprehensive source. In many subject categories Microsoft Academic and Dimensions are good alternatives to Scopus and WoS in terms of coverage.
Submitted 30 January, 2021; v1 submitted 29 April, 2020;
originally announced April 2020.
-
Google Scholar, Web of Science, and Scopus: a systematic comparison of citations in 252 subject categories
Authors:
Alberto Martín-Martín,
Enrique Orduna-Malea,
Mike Thelwall,
Emilio Delgado López-Cózar
Abstract:
Despite citation counts from Google Scholar (GS), Web of Science (WoS), and Scopus being widely consulted by researchers and sometimes used in research evaluations, there is no recent or systematic evidence about the differences between them. In response, this paper investigates 2,448,055 citations to 2,299 English-language highly-cited documents from 252 GS subject categories published in 2006, comparing GS, the WoS Core Collection, and Scopus. GS consistently found the largest percentage of citations across all areas (93%-96%), far ahead of Scopus (35%-77%) and WoS (27%-73%). GS found nearly all the WoS (95%) and Scopus (92%) citations. Most citations found only by GS were from non-journal sources (48%-65%), including theses, books, conference papers, and unpublished materials. Many were non-English (19%-38%), and they tended to be much less cited than citing sources that were also in Scopus or WoS. Despite the many unique GS citing sources, Spearman correlations between citation counts in GS and WoS or Scopus are high (0.78-0.99). They are lower in the Humanities, and lower between GS and WoS than between GS and Scopus. The results suggest that in all areas GS citation data is essentially a superset of WoS and Scopus, with substantial extra coverage.
Submitted 12 March, 2019; v1 submitted 15 August, 2018;
originally announced August 2018.
-
Google Scholar: the 'big data' bibliographic tool
Authors:
Emilio Delgado Lopez-Cozar,
Enrique Orduna-Malea,
Alberto Martin-Martin,
Juan M. Ayllon
Abstract:
The launch of Google Scholar back in 2004 meant a revolution not only in the scientific information search market but also in research evaluation processes. Its dynamism, unparalleled coverage, and uncontrolled indexing make Google Scholar an unusual product, especially when compared to traditional bibliographic databases. Conceived primarily as a discovery tool for academic information, it presents a number of limitations as a bibliometric tool. The main objective of this chapter is to show how Google Scholar operates and how its core database may be used for bibliometric purposes. To do this, the general features of the search engine (in terms of document typologies, disciplines, and coverage) are analysed. Lastly, several bibliometric tools based on Google Scholar data, both official (Google Scholar Metrics, Google Scholar Citations) and developed by third parties (H Index Scholar, Publishers Scholar Metrics, Proceedings Scholar Metrics, Journal Scholar Metrics, Scholar Mirrors), as well as software to collect and process data from this source (Publish or Perish, Scholarometer), are introduced, aiming to illustrate the potential bibliometric uses of this source.
Submitted 17 June, 2018;
originally announced June 2018.
-
Unbundling Open Access dimensions: a conceptual discussion to reduce terminology inconsistencies
Authors:
Alberto Martín-Martín,
Rodrigo Costas,
Thed N. van Leeuwen,
Emilio Delgado López-Cózar
Abstract:
The current ways in which documents are made freely accessible on the Web no longer adhere to the models established by the Budapest/Bethesda/Berlin (BBB) definitions of Open Access (OA). Since those definitions were established, OA-related terminology has expanded, trying to keep up with all the variants of OA publishing that are out there. However, the inconsistent and arbitrary terminology that is being used to refer to these variants is complicating communication about OA-related issues. This study intends to initiate a discussion on this issue by proposing a conceptual model of OA. Our model features six different dimensions (prestige, user rights, stability, immediacy, peer-review, and cost). Each dimension allows for a range of different options. We believe that by combining the options in these six dimensions, we can arrive at all the current variants of OA, while avoiding ambiguous and/or arbitrary terminology. This model can be a useful tool for funders and policy makers who need to decide exactly which aspects of OA are necessary for each specific scenario.
Submitted 21 August, 2018; v1 submitted 13 June, 2018;
originally announced June 2018.
-
Google Scholar as a data source for research assessment
Authors:
Emilio Delgado López-Cózar,
Enrique Orduna-Malea,
Alberto Martín-Martín
Abstract:
The launch of Google Scholar (GS) marked the beginning of a revolution in the scientific information market. This search engine, unlike traditional databases, automatically indexes information from the academic web. Its ease of use, together with its wide coverage and fast indexing speed, has made it the first tool most scientists currently turn to when they need to carry out a literature search. Additionally, the fact that its search results were accompanied from the beginning by citation counts, as well as the later development of secondary products which leverage this citation data (such as Google Scholar Metrics and Google Scholar Citations), made many scientists wonder about its potential as a source of data for bibliometric analyses. The goal of this chapter is to lay the foundations for the use of GS as a supplementary source (and in some disciplines, arguably the best alternative) for scientific evaluation. First, we present a general overview of how GS works. Second, we present empirical evidence about its main characteristics (size, coverage, and growth rate). Third, we carry out a systematic analysis of the main limitations this search engine presents as a tool for the evaluation of scientific performance. Lastly, we discuss the main differences between GS and other more traditional bibliographic databases in light of the correlations found between their citation data. We conclude that Google Scholar presents a broader view of the academic world because it has brought to light a great number of sources that were not previously visible.
Submitted 18 June, 2018; v1 submitted 12 June, 2018;
originally announced June 2018.
-
A novel method for depicting academic disciplines through Google Scholar Citations: The case of Bibliometrics
Authors:
Alberto Martín-Martín,
Enrique Orduna-Malea,
Emilio Delgado López-Cózar
Abstract:
This article describes a procedure to generate a snapshot of the structure of a specific scientific community and their outputs based on the information available in Google Scholar Citations (GSC). We call this method MADAP (Multifaceted Analysis of Disciplines through Academic Profiles). The international community of researchers working in Bibliometrics, Scientometrics, Informetrics, Webometrics, and Altmetrics was selected as a case study. The records of the top 1,000 most cited documents by these authors according to GSC were manually processed to fill any missing information and deduplicate fields like the journal titles and book publishers. The results suggest that it is feasible to use GSC and the MADAP method to produce an accurate depiction of the community of researchers working in Bibliometrics (both specialists and occasional researchers) and their publication habits (main publication venues such as journals and book publishers). Additionally, the wide document coverage of Google Scholar (especially books and book chapters) enables more comprehensive analyses of the documents published in a specific discipline than were previously possible with other citation indexes, finally shedding light on what until now had been a blind spot in most citation analyses.
Submitted 27 April, 2018;
originally announced April 2018.
-
Can we use Google Scholar to identify highly-cited documents?
Authors:
Alberto Martín-Martín,
Enrique Orduna-Malea,
Anne-Wil Harzing,
Emilio Delgado López-Cózar
Abstract:
The main objective of this paper is to empirically test whether the identification of highly-cited documents through Google Scholar is feasible and reliable. To this end, we carried out a longitudinal analysis (1950 to 2013), running a generic query (filtered only by year of publication) to minimise the effects of academic search engine optimisation. This gave us a final sample of 64,000 documents (1,000 per year). The strong correlation between a document's citations and its position in the search results (r= -0.67) led us to conclude that Google Scholar is able to identify highly-cited papers effectively. This, combined with Google Scholar's unique coverage (no restrictions on document type and source), makes the academic search engine an invaluable tool for bibliometric research relating to the identification of the most influential scientific documents. We find evidence, however, that Google Scholar ranks those documents whose language (or geographical web domain) matches the user's interface language higher than could be expected based on citations. Nonetheless, this language effect and other factors related to Google Scholar's operation, i.e. the proper identification of versions and the date of publication, only have an incidental impact. They do not compromise the ability of Google Scholar to identify the highly-cited papers.
Submitted 27 April, 2018;
originally announced April 2018.
-
Coverage of highly-cited documents in Google Scholar, Web of Science, and Scopus: a multidisciplinary comparison
Authors:
Alberto Martín-Martín,
Enrique Orduna-Malea,
Emilio Delgado López-Cózar
Abstract:
This study explores the extent to which bibliometric indicators based on counts of highly-cited documents could be affected by the choice of data source. The initial hypothesis is that databases that rely on journal selection criteria for their document coverage may not necessarily provide an accurate representation of highly-cited documents across all subject areas, while inclusive databases, which give each document the chance to stand on its own merits, might be better suited to identify highly-cited documents. To test this hypothesis, an analysis of 2,515 highly-cited documents published in 2006 that Google Scholar displays in its Classic Papers product is carried out at the level of broad subject categories, checking whether these documents are also covered in Web of Science and Scopus, and whether the citation counts offered by the different sources are similar. The results show that a large fraction of highly-cited documents in the Social Sciences and Humanities (8.6%-28.2%) are invisible to Web of Science and Scopus. In the Natural, Life, and Health Sciences the proportion of missing highly-cited documents in Web of Science and Scopus is much lower. Furthermore, in all areas, Spearman correlation coefficients of citation counts in Google Scholar, as compared to Web of Science and Scopus citation counts, are remarkably strong (.83-.99). The main conclusion is that the data about highly-cited documents available in the inclusive database Google Scholar does indeed reveal significant coverage deficiencies in Web of Science and Scopus in several areas of research. Therefore, using these selective databases to compute bibliometric indicators based on counts of highly-cited documents might produce biased assessments in poorly covered areas.
Submitted 26 June, 2018; v1 submitted 25 April, 2018;
originally announced April 2018.
-
The lost academic home: institutional affiliation links in Google Scholar Citations
Authors:
Enrique Orduña-Malea,
Juan M. Ayllón,
Alberto Martín-Martín,
Emilio Delgado López-Cózar
Abstract:
This paper analyzes the new affiliation feature available in Google Scholar Citations, revealing that although the affiliation tool works well for most institutions, it is unable to detect all existing institutions in the database, and it is not always able to create a unique standardized entry for each institution.
Submitted 19 April, 2018;
originally announced April 2018.
-
Author-level metrics in the new academic profile platforms: The online behaviour of the Bibliometrics community
Authors:
Alberto Martín-Martín,
Enrique Orduna-Malea,
Emilio Delgado López-Cózar
Abstract:
The new web-based academic communication platforms not only enable researchers to better advertise their academic outputs, making them more visible than ever before, but also provide a wide supply of metrics to help authors better understand the impact their work is making. This study has three objectives: a) to analyse the uptake of some of the most popular platforms (Google Scholar Citations, ResearcherID, ResearchGate, Mendeley and Twitter) by a specific scientific community (bibliometrics, scientometrics, informetrics, webometrics, and altmetrics); b) to compare the metrics available from each platform; and c) to determine the meaning of all these new metrics. To do this, the data available in these platforms about a sample of 811 authors (researchers in bibliometrics for whom a public Google Scholar Citations profile was found) were extracted. A total of 31 metrics were analysed. The results show that a high number of the analysed researchers only had a profile in Google Scholar Citations (159), or only in Google Scholar Citations and ResearchGate (142). Lastly, we find two kinds of metrics of online impact: first, metrics related to connectivity (followers), and second, all metrics associated with academic impact. This second group can further be divided into usage metrics (reads, views) and citation metrics. The results suggest that Google Scholar Citations is the source that provides the most comprehensive citation-related data, whereas Twitter stands out in connectivity-related metrics.
Submitted 16 April, 2018;
originally announced April 2018.
-
Dimensions: re-discovering the ecosystem of scientific information
Authors:
Enrique Orduna-Malea,
Emilio Delgado Lopez-Cozar
Abstract:
The overarching aim of this work is to provide a detailed description of the free version of Dimensions (a new bibliographic database produced by Digital Science and launched in January 2018). To do this, the work is divided into two differentiated blocks. First, its characteristics, operation and features are described, focusing on its main strengths and weaknesses. Secondly, an analysis of its coverage is carried out (comparing it with Scopus and Google Scholar) in order to determine whether the bibliometric indicators offered by Dimensions have an order of magnitude significant enough to be used. To this end, an analysis is carried out at three levels: journals (sample of 20 publications in 'Library & Information Science'), documents (276 articles published by the Journal of Informetrics between 2013 and 2015) and authors (28 people awarded the Derek de Solla Price prize). Preliminary results indicate that Dimensions has coverage of the recent literature superior to Scopus although inferior to Google Scholar. With regard to the number of citations received, Dimensions offers slightly lower figures than Scopus. Despite this, the number of citations in Dimensions exhibits a strong correlation with Scopus and a somewhat weaker (although still significant) correlation with Google Scholar. For this reason, it is concluded that Dimensions is an alternative for carrying out citation studies, being able to rival Scopus (greater coverage and free of charge) and Google Scholar (greater functionalities for data treatment and export).
Submitted 15 April, 2018;
originally announced April 2018.
-
Evidence of Open Access of scientific publications in Google Scholar: a large-scale analysis
Authors:
Alberto Martín-Martín,
Rodrigo Costas,
Thed van Leeuwen,
Emilio Delgado López-Cózar
Abstract:
This article uses Google Scholar (GS) as a source of data to analyse Open Access (OA) levels across all countries and fields of research. All articles and reviews with a DOI and published in 2009 or 2014 and covered by the three main citation indexes in the Web of Science (2,269,022 documents) were selected for study. The links to freely available versions of these documents displayed in GS were collected. To differentiate between more reliable (sustainable and legal) forms of access and less reliable ones, the data extracted from GS was combined with information available in DOAJ, CrossRef, OpenDOAR, and ROAR. This allowed us to distinguish the percentage of documents in our sample that are made OA by the publisher (23.1%, including Gold, Hybrid, Delayed, and Bronze OA) from those available as Green OA (17.6%), and those available from other sources (40.6%, mainly due to ResearchGate). The data shows an overall free availability of 54.6%, with important differences at the country and subject category levels. The data extracted from GS yielded very similar results to those found by other studies that analysed similar samples of documents, but employed different methods to find evidence of OA, thus suggesting a relative consistency among methods.
Submitted 24 July, 2018; v1 submitted 16 March, 2018;
originally announced March 2018.
-
Classic papers: déjà vu, a step further in the bibliometric exploitation of Google Scholar
Authors:
Emilio Delgado Lopez-Cozar,
Alberto Martin-Martin,
Enrique Orduna-Malea
Abstract:
After giving a brief overview of Eugene Garfield's contributions to the issue of identifying and studying the most cited scientific articles, manifested in the creation of his Citation Classics, the main characteristics and features of Google Scholar's new service Classic Papers, as well as its main strengths and weaknesses, are addressed. This product currently displays the most cited English-language original research articles, by field, published in 2006.
Submitted 28 June, 2017;
originally announced June 2017.
-
Do ResearchGate Scores create ghost academic reputations?
Authors:
Enrique Orduna-Malea,
Alberto Martin-Martin,
Mike Thelwall,
Emilio Delgado Lopez-Cozar
Abstract:
The academic social network site ResearchGate (RG) has its own indicator, RG Score, for its members. The high-profile nature of the site means that the RG Score may be used for recruitment, promotion and other tasks for which researchers are evaluated. In response, this study investigates whether it is reasonable to employ the RG Score as evidence of scholarly reputation. For this, three different author samples were investigated. An outlier sample includes 104 authors with high values. A Nobel sample comprises 73 Nobel winners from Medicine & Physiology, Chemistry, Physics and Economics (from 1975 to 2015). A longitudinal sample includes weekly data on 4 authors with different RG Scores. The results suggest that high RG Scores are built primarily from activity related to asking and answering questions on the site. In particular, it seems impossible to get a high RG Score solely through publications. Within RG it is possible to distinguish between (passive) academics who interact little on the site and active platform users, who can get high RG Scores through engaging with others inside the site (questions, answers, social networks with influential researchers). Thus, RG Scores should not be mistaken for academic reputation indicators.
Submitted 9 May, 2017;
originally announced May 2017.
-
Google Scholar and the gray literature: A reply to Bonato's review
Authors:
Enrique Orduna-Malea,
Alberto Martin-Martin,
Emilio Delgado Lopez-Cozar
Abstract:
Recently, a review concluded that Google Scholar (GS) is not a suitable source of information "for identifying recent conference papers or other gray literature publications". The goal of this letter is to demonstrate that GS can be an effective tool to search and find gray literature, as long as appropriate search strategies are used. To do this, we took as examples the same two case studies used by the original review, describing first how GS processes the original search strategies, then proposing alternative search strategies, and finally generalizing each case study to compose a general search procedure aimed at finding gray literature in Google Scholar for two broad case studies: a) all contributions belonging to a congress (the ASCO Annual Meeting); and b) indexed guidelines as well as gray literature within medical institutions (National Institutes of Health) and governmental agencies (U.S. Department of Health & Human Services). The results confirm that the original search strategies were poorly formulated, producing misleading results and erroneous conclusions. Google Scholar lacks many of the advanced search features available in other bibliographic databases (such as PubMed); however, it is one thing to have a friendly search experience, and quite another to find gray literature. We finally conclude that Google Scholar is a powerful tool for searching gray literature, as long as the users are familiar with all the possibilities it offers as a search engine. Poorly formulated searches will undoubtedly return misleading results.
Submitted 13 February, 2017;
originally announced February 2017.
-
2016 Google Scholar Metrics released: a matter of languages... and something else
Authors:
Alberto Martín-Martín,
Juan Manuel Ayllón,
Enrique Orduña-Malea,
Emilio Delgado López-Cózar
Abstract:
The 2016 edition of Google Scholar Metrics was released on July 15th, 2016. There haven't been any structural changes with respect to previous versions, which means that most of its limitations still persist. The biggest changes are the addition of five new language rankings (Russian, Korean, Polish, Ukrainian, and Indonesian) and the elimination of two other language rankings (Italian and Dutch). In addition, for reasons still unknown, this new edition doesn't include as many working paper and discussion paper series as previous editions.
Submitted 21 July, 2016;
originally announced July 2016.
-
A two-sided academic landscape: portrait of highly-cited documents in Google Scholar (1950-2013)
Authors:
Alberto Martin-Martin,
Enrique Orduna-Malea,
Juan M. Ayllon,
Emilio Delgado Lopez-Cozar
Abstract:
The main objective of this paper is to identify the set of highly-cited documents in Google Scholar and to define their core characteristics (document types, language, free availability, source providers, and number of versions), under the hypothesis that the wide coverage of this search engine may provide a different portrait of this document set from that offered by the traditional bibliographic databases. To do this, a query per year was carried out from 1950 to 2013 identifying the top 1,000 documents retrieved from Google Scholar and obtaining a final sample of 64,000 documents, of which 40% provided a free full-text link. The results obtained show that the average highly-cited document is a journal article or a book (62% of the top 1% most cited documents of the sample), written in English (92.5% of all documents) and available online in PDF format (86.0% of all documents). Yet the existence of errors, especially when detecting duplicates and linking citations properly, must be pointed out. Working with highly cited papers, however, minimizes the effects of these limitations. Given the high presence of books, and to a lesser extent of other document types (such as proceedings or reports), the research concludes that Google Scholar data offer an original and different vision of the most influential academic documents (measured from the perspective of their citation count), a set composed not only of strictly scientific material (journal articles) but of academic material in its broad sense.
Submitted 11 July, 2016;
originally announced July 2016.
-
Proceedings Scholar Metrics: H Index of proceedings on Computer Science, Electrical & Electronic Engineering, and Communications according to Google Scholar Metrics (2010-2014)
Authors:
Alberto Martín-Martín,
Juan Manuel Ayllón,
Enrique Orduña-Malea,
Emilio Delgado López-Cózar
Abstract:
The objective of this report is to present a list of proceedings (conferences, workshops, symposia, meetings) in the areas of Computer Science, Electrical & Electronic Engineering, and Communications covered by Google Scholar Metrics and ranked according to their h-index. Google Scholar Metrics only displays publications that have published at least 100 papers and have received at least one citation in the last five years (2010-2014). The searches were conducted between the 8th and 10th of December, 2015. A total of 1501 proceedings were identified.
Submitted 17 June, 2016;
originally announced June 2016.
-
Back to the past: on the shoulders of an academic search engine giant
Authors:
Alberto Martin-Martin,
Enrique Orduna-Malea,
Juan M. Ayllon,
Emilio Delgado Lopez-Cozar
Abstract:
A study released by the Google Scholar team found an apparently increasing fraction of citations to old articles from studies published in the last 24 years (1990-2013). To verify this finding we conducted a complementary study using a different data source (Journal Citation Reports), metric (aggregate cited half-life), time span (2003-2013), and set of categories (53 Social Science subject categories and 167 Science subject categories). Although the results obtained confirm and reinforce the previous findings, the possible causes of this phenomenon remain unclear. Finally, we hypothesize that the "first page results syndrome", together with the fact that Google Scholar favours the most cited documents, suggests that the growing trend of citing old documents is partly caused by Google Scholar.
Submitted 30 March, 2016;
originally announced March 2016.
-
The counting house: measuring those who count. Presence of Bibliometrics, Scientometrics, Informetrics, Webometrics and Altmetrics in the Google Scholar Citations, ResearcherID, ResearchGate, Mendeley & Twitter
Authors:
Alberto Martin-Martin,
Enrique Orduna-Malea,
Juan M. Ayllon,
Emilio Delgado Lopez-Cozar
Abstract:
Following in the footsteps of the model of scientific communication, which has recently gone through a metamorphosis (from the Gutenberg galaxy to the Web galaxy), a change in the model and methods of scientific evaluation is also taking place. A set of new scientific tools are now providing a variety of indicators which measure all actions and interactions among scientists in the digital space, making new aspects of scientific communication emerge. In this work we present a method for capturing the structure of an entire scientific community (the Bibliometrics, Scientometrics, Informetrics, Webometrics, and Altmetrics community) and the main agents that are part of it (scientists, documents, and sources) through the lens of Google Scholar Citations.
Additionally, we compare these author portraits to the ones offered by other profile or social platforms currently used by academics (ResearcherID, ResearchGate, Mendeley, and Twitter), in order to test their degree of use, completeness, reliability, and the validity of the information they provide. A sample of 814 authors (researchers in Bibliometrics with a public profile created in Google Scholar Citations) was subsequently searched in the other platforms, collecting the main indicators computed by each of them. The data collection was carried out in September 2015. The Spearman correlation was applied to these indicators (a total of 31), and a Principal Component Analysis was carried out in order to reveal the relationships among metrics and platforms as well as the possible existence of metric clusters.
Submitted 7 February, 2016;
originally announced February 2016.
-
Improvements in Google Scholar Citations are for the summer: creating an institutional affiliation link feature
Authors:
Enrique Orduna-Malea,
Juan Manuel Ayllón,
Alberto Martín-Martín,
Emilio Delgado López-Cózar
Abstract:
This report describes the feature introduced by Google to provide standardized access to institutional affiliations within Google Scholar Citations. First, this new tool is described, pointing out its main characteristics and functioning. Next, the coverage and precision of the tool are evaluated. Two special cases (Google Inc. and Spanish Universities) are briefly treated with the purpose of illustrating some aspects of the accuracy of the tool for the task of gathering authors within their appropriate institution. Finally, some inconsistencies, errors and malfunctions are identified, categorized and described. The report finishes by providing some suggestions to improve the feature. The general conclusion is that the standardized institutional affiliation link provided by Google Scholar Citations, despite working pretty well for a large number of institutions (especially Anglo-Saxon universities), still has a number of shortcomings and pitfalls which need to be addressed in order to make this authority control tool fully useful worldwide, both for searching purposes and for metric tasks.
Submitted 15 September, 2015;
originally announced September 2015.
-
Disclosing the network structure of private companies on the web: the case of Spanish IBEX 35 share index
Authors:
Enrique Orduna-Malea,
Emilio Delgado Lopez-Cozar,
Jorge Serrano-Cobos,
Nuria Lloret-Romero
Abstract:
It is common for an international company to have different brands, products or services, information for investors, a corporate blog, affiliates, branches in different countries, etc. If all these contents appear as independent additional web domains (AWD), the company should be represented on the web by all these web domains, since many of these AWDs may acquire remarkable performance that could mask or distort the real web performance of the company, thereby affecting the understanding of web metrics. The main objective of this study is to determine the amount, type, web impact and topology of the additional web domains in commercial companies in order to get a better understanding of their complete web impact and structure. The set of companies belonging to the Spanish IBEX 35 stock index has been analyzed as a test bench. We proceeded to identify and categorize all AWDs belonging to these companies, and to apply both web impact (web presence and visibility) and network metrics. The results show that AWDs get a high web presence but relatively low web visibility, due to a certain opacity or lower dissemination of some AWDs, favoring their isolation. This is verified by the low network density values obtained, which occur because AWDs are strongly connected with the corporate domain (although asymmetrically), but very weakly linked to each other. Although the processes of AWD creation and categorization are complex (web policy seems not to be driven by a defined or conscious plan), their influence on the web performance of IBEX 35 companies is meaningful. This research measures the influence of AWDs on companies in webometric terms for the first time.
Submitted 9 June, 2015;
originally announced June 2015.
-
Methods for estimating the size of Google Scholar
Authors:
Enrique Orduna-Malea,
Juan M. Ayllon,
Alberto Martin-Martin,
Emilio Delgado Lopez-Cozar
Abstract:
The emergence of academic search engines (mainly Google Scholar and Microsoft Academic Search) that aspire to index the entirety of current academic knowledge has revived and increased interest in the size of the academic web. The main objective of this paper is to propose various methods to estimate the current size (number of indexed documents) of Google Scholar (May 2014) and to determine its validity, precision and reliability. To do this, we present, apply and discuss three empirical methods: an external estimate based on empirical studies of Google Scholar coverage, and two internal estimate methods based on direct, empty and absurd queries, respectively. The results, despite providing disparate values, place the estimated size of Google Scholar at around 160 to 165 million documents. However, all the methods show considerable limitations and uncertainties due to inconsistencies in the Google Scholar search functionalities.
Submitted 9 June, 2015;
originally announced June 2015.
-
Hyperlinks embedded in Twitter as a proxy for total external inlinks to international university websites
Authors:
Enrique Orduna-Malea,
Daniel Torres-Salinas,
Emilio Delgado Lopez-Cozar
Abstract:
This article analyzes Twitter as a potential alternative source of external links for use in webometric analysis because of its capacity to embed hyperlinks in different tweets. Given the limitations on searching Twitter's public API, we decided to use the Topsy search engine as a source for compiling tweets. To this end, we took a global sample of 200 universities and compiled all the tweets with hyperlinks to any of these institutions. Further link data was obtained from alternative sources (MajesticSEO and OpenSiteExplorer) in order to compare the results. Thereafter, various statistical tests were performed to determine the correlation between the indicators and the ability to predict external links from the collected tweets. The results indicate a high volume of tweets, although they are skewed by the presence and performance of specific universities and countries. The data provided by Topsy correlated significantly with all link indicators, particularly with OpenSiteExplorer (r=0.769). Finally, prediction models do not provide optimum results because of high error rates, which fall slightly in nonlinear models applied to specific environments. We conclude that the use of Twitter (via Topsy) as a source of hyperlinks to universities produces promising results due to its high correlation with link indicators, though limited by policies and culture regarding use and presence in social networks.
Submitted 14 February, 2015;
originally announced February 2015.
-
Reviving the past: the growth of citations to old documents
Authors:
Alberto Martín-Martín,
Enrique Orduña-Malea,
Juan Manuel Ayllón,
Emilio Delgado López-Cózar
Abstract:
In this Digest we review a recent study released by the Google Scholar team on the apparently increasing fraction of citations to old articles from studies published in the last 24 years (1990-2013). First, we describe the main findings of their article. Secondly, we conduct an analogous study, using a different data source as well as different measures, which yields very similar results, thus confirming the phenomenon. Lastly, we discuss the possible causes of this phenomenon.
Submitted 9 January, 2015;
originally announced January 2015.
-
Proceedings Scholar Metrics: H Index of proceedings on Computer Science, Electrical & Electronic Engineering, and Communications according to Google Scholar Metrics (2009-2013)
Authors:
Alberto Martin-Martin,
Enrique Ordunna-Malea,
Juan Manuel Ayllon,
Emilio Delgado Lopez-Cozar
Abstract:
The objective of this report is to present a list of proceedings (conferences, workshops, symposia, meetings) in the areas of Computer Science, Electrical & Electronic Engineering, and Communications covered by Google Scholar Metrics and ranked according to their h-index. Google Scholar Metrics only displays publications that have published at least 100 papers and have received at least one citation in the last five years (2009-2013). The searches were conducted between the 15th and 22nd of December, 2014. A total of 1208 proceedings were identified.
Submitted 7 January, 2015; v1 submitted 24 December, 2014;
originally announced December 2014.
-
Does Google Scholar contain all highly cited documents (1950-2013)?
Authors:
Alberto Martín-Martín,
Enrique Orduña-Malea,
Juan Manuel Ayllón,
Emilio Delgado López-Cózar
Abstract:
The study of highly cited documents on Google Scholar (GS) has never been addressed to date in a comprehensive manner. The objective of this work is to identify the set of highly cited documents in Google Scholar and define their core characteristics: their languages, their file format, or how many of them can be accessed free of charge. We will also try to answer some additional questions that hopefully shed some light on the use of GS as a tool for assessing scientific impact through citations. The decalogue of research questions is shown below:
1. Which are the most cited documents in GS?
2. Which are the most cited document types in GS?
3. In what languages are the most cited documents in GS written?
4. How many highly cited documents are freely accessible?
4.1 What file types are the most commonly used to store these highly cited documents?
4.2 Which are the main providers of these documents?
5. How many of the highly cited documents indexed by GS are also indexed by WoS?
6. Is there a correlation between the number of citations that these highly cited documents have received in GS and the number of citations they have received in WoS?
7. How many versions of these highly cited documents has GS detected?
8. Is there a correlation between the number of versions GS has detected for these documents, and the number of citations they have received?
9. Is there a correlation between the number of versions GS has detected for these documents, and their position in the search engine result pages?
10. Is there some relation between the positions these documents occupy in the search engine result pages, and the number of citations they have received?
Submitted 25 March, 2015; v1 submitted 30 October, 2014;
originally announced October 2014.
-
About the size of Google Scholar: playing the numbers
Authors:
Enrique Orduña-Malea,
Juan Manuel Ayllón,
Alberto Martín-Martín,
Emilio Delgado López-Cózar
Abstract:
The emergence of academic search engines (Google Scholar and Microsoft Academic Search essentially) has revived and increased the interest in the size of the academic web, since their aspiration is to index the entirety of current academic knowledge. The search engine functionality and human search patterns lead us to believe, sometimes, that what you see in the search engine's results page is all that really exists. And, even when this is not true, we wonder which information is missing and why. The main objective of this working paper is to calculate the size of Google Scholar at present (May 2014). To do this, we present, apply and discuss up to 4 empirical methods: Khabsa & Giles's method, an estimate based on empirical data, and estimates based on direct queries and absurd queries. The results, despite providing disparate values, place the estimated size of Google Scholar at about 160 million documents. However, the fact that all methods show great inconsistencies, limitations and uncertainties makes us wonder why Google does not simply provide this information to the scientific community if the company really knows this figure.
Submitted 5 September, 2014; v1 submitted 23 July, 2014;
originally announced July 2014.
-
Google Scholar Metrics 2014: a low cost bibliometric tool
Authors:
Alberto Martín-Martín,
Juan Manuel Ayllón,
Enrique Orduña-Malea,
Emilio Delgado López-Cózar
Abstract:
We analyse the main features of the third edition of Google Scholar Metrics (GSM), released in June 2014, focusing on its more important changes, strengths, and weaknesses. Additionally, we present some figures that outline the dimensions of this new edition, and we compare them to those of previous editions. Principal among these figures are the number of visualized publications, publication types, languages, and the maximum and minimum h5-index and h5-median values by language, subject area, and subcategory. This new edition is marked by continuity. There is nothing new other than the updating of the time frame (2009-2013) and the removal of some redundant subcategories (from 268 to 261) for English-language publications. Google has merely updated the data, which means that some of the errors discussed in previous studies still persist. To sum up, GSM is a minimalist information product with few features, closed (it cannot be customized by the user), and simple (navigating it only takes a few clicks). For these reasons, we consider it a 'low cost' bibliometric tool, and propose a list of features it should incorporate in order to stop being labeled as such. Notwithstanding the above, this product presents a stability in its bibliometric indicators that supports its ability to measure and track the impact of scientific publications.
Submitted 10 July, 2014;
originally announced July 2014.
-
The dark side of Open Access in Google and Google Scholar: the case of Latin-American repositories
Authors:
Enrique Orduña-Malea,
Emilio Delgado Lopez-Cozar
Abstract:
Since repositories are a key tool in making scholarly knowledge open access, determining their presence and impact on the Web is essential, particularly in Google (search engine par excellence) and Google Scholar (a tool increasingly used by researchers to search for academic information). The few studies conducted so far have been limited to very specific geographic areas (USA), which makes it necessary to find out what is happening in other regions that are not part of mainstream academia, and where repositories play a decisive role in the visibility of scholarly production. The main objective of this study is to ascertain the presence and visibility of Latin American repositories in Google and Google Scholar through the application of page count and visibility indicators. For a sample of 137 repositories, the results indicate that the indexing ratio is low in Google, and virtually nonexistent in Google Scholar; they also indicate a complete lack of correspondence between the repository records and the data produced by these two search tools. These results are mainly attributable to limitations arising from the use of description schemas that are incompatible with Google Scholar (repository design) and the reliability of web indicators (search engines). We conclude that neither Google nor Google Scholar accurately represent the actual size of open access content published by Latin American repositories; this may indicate a non-indexed, hidden side to open access, which could be limiting the dissemination and consumption of open access scholarly literature.
Submitted 17 June, 2014;
originally announced June 2014.
-
Empirical Evidences in Citation-Based Search Engines: Is Microsoft Academic Search dead?
Authors:
Enrique Orduna-Malea,
Juan Manuel Ayllon,
Alberto Martin-Martin,
Emilio Delgado Lopez-Cozar
Abstract:
The goal of this working paper is to summarize the main empirical evidence provided by the scientific community regarding the comparison between the two main citation-based academic search engines, Google Scholar and Microsoft Academic Search, paying special attention to the following issues: coverage, correlations between journal rankings, and usage of these academic search engines. Additionally, self-elaborated data are offered, which are intended to provide current evidence about the popularity of these tools on the Web, by measuring the number of rich files (PDF, PPT and DOC) in which these tools are mentioned, the number of external links that both products receive, and the search query frequency from Google Trends. The poor results obtained by MAS led us to an unexpected and unnoticed discovery: Microsoft Academic Search has not been updated since 2013. Therefore, the second part of the working paper aims at advancing some data demonstrating this lack of update. For this purpose we gathered the number of total records indexed by Microsoft Academic Search since 2000. The data show an abrupt drop in the number of documents indexed, from 2,346,228 in 2010 to 8,147 in 2013 and 802 in 2014. This decrease is also broken down by 15 thematic areas. In view of these problems it seems logical that Microsoft Academic Search was not only little used to search for articles by academics and students, who mostly use Google or Google Scholar, but also virtually ignored by bibliometricians.
Submitted 23 May, 2014; v1 submitted 28 April, 2014;
originally announced April 2014.
-
Coverage, field specialization and impact of scientific publishers indexed in the 'Book Citation Index'
Authors:
Daniel Torres-Salinas,
Nicolás Robinson-García,
J. M. Campanario,
Emilio Delgado López-Cózar
Abstract:
Purpose: The aim of this study is to analyze the disciplinary coverage of the Thomson Reuters Book Citation Index database, focusing on publisher presence, impact and specialization. Design/Methodology/approach: We conduct a descriptive study in which we examine coverage by discipline, publisher distribution by field and country of publication, and publisher impact. For this, the Thomson Reuters Subject Categories were aggregated into 15 disciplines. Findings: 30% of the total share of this database belongs to the fields of Humanities and Social Sciences. Most of the disciplines are covered by very few publishers, mainly from the UK and USA (75.05% of the books); in fact, 33 publishers account for 90% of the whole share. Regarding publisher impact, 80.5% of the books and chapters remained uncited. Two serious errors were found in this database. Firstly, the Book Citation Index does not retrieve all citations for books and chapters. Secondly, book citations do not include citations to their chapters. Research limitations/implications: The Book Citation Index is still underdeveloped and has serious limitations that call for caution when using it for bibliometric purposes. Practical implications: The results obtained from this study warn against the use of this database for bibliometric purposes, but the database opens a new window of opportunities for covering long-neglected areas such as Humanities and Social Sciences. The target audience of this study is librarians, bibliometricians, researchers, scientific publishers, prospective authors and evaluation agencies. Originality/Value: There are currently no studies analyzing in depth the coverage of this novel database, which covers monographs.
Submitted 10 December, 2013;
originally announced December 2013.
-
H Index Communication Journals according to Google Scholar Metrics (2008-2012)
Authors:
Rafael Repiso,
Emilio Delgado Lopez-Cozar
Abstract:
The aim of this report is to present a ranking of Communication journals covered in Google Scholar Metrics for the period 2008-2012. It corresponds to the H Index update made last year for the period 2007-2011 (Delgado López-Cózar and Repiso 2013). Google Scholar Metrics does not currently allow grouping and sorting all journals belonging to a scientific discipline. In the case of Communication, in the ten listings displayed by GSM we can only locate 46 journals. Therefore, in an attempt to overcome this limitation, we have used the diversity of search procedures allowed by GSM to identify the greatest number of scientific Communication journals with an H Index calculated by this bibliometric tool. The result is a ranking of 354 Communication journals sorted by H Index, with the mean as a discriminating value. Journals are also grouped by quartiles.
Submitted 28 October, 2013;
originally announced October 2013.
-
Google Scholar Metrics evolution: an analysis according to languages
Authors:
Enrique Orduna-Malea,
Emilio Delgado Lopez-Cozar
Abstract:
In November 2012 the Google Scholar Metrics (GSM) journal rankings were updated, making it possible to compare bibliometric indicators in the 10 languages indexed and their stability with the April 2012 version. The h-index and h5-median of 1000 journals were analysed, comparing their averages, maximum and minimum values and the correlation coefficient within rankings. The bibliometric figures grew significantly. In just seven and a half months the h-index of the journals increased by 15% and the median h-index by 17%. This growth was observed for all the bibliometric indicators analysed and for practically every journal. However, we found significant differences in growth rates depending on the language in which the journal is published. Moreover, the journal rankings seem to be stable between April and November, reinforcing the credibility of the data held by Google Scholar and the reliability of the GSM journal rankings, despite the uncontrolled growth of Google Scholar. Based on the findings of this study we suggest, firstly, that Google should update its rankings at least semiannually and, secondly, that the results should be displayed in each ranking proportionally to the number of journals indexed by language.
Submitted 23 October, 2013;
originally announced October 2013.
-
The Google Scholar Experiment: how to index false papers and manipulate bibliometric indicators
Authors:
Emilio Delgado López-Cózar,
Nicolás Robinson-Garcia,
Daniel Torres-Salinas
Abstract:
Google Scholar has been well received by the research community. Its promises of free, universal and easy access to scientific literature, as well as the perception that it covers the Social Sciences and the Humanities better than other traditional multidisciplinary databases, have contributed to the quick expansion of Google Scholar Citations and Google Scholar Metrics: two new bibliometric products that offer citation data at the individual level and at the journal level. In this paper we show the results of an experiment undertaken to analyze Google Scholar's capacity to detect manipulation of citation counts. For this, six documents authored by a false researcher and referencing all the publications of the members of the EC3 research group at the University of Granada were uploaded to an institutional web domain. Google Scholar's detection of these papers caused an outburst in the citations included in the Google Scholar Citations profiles of the authors. We discuss the effects of this outburst and how it could affect the future development of such products, not only at the individual level but also at the journal level, especially if Google Scholar persists with its lack of transparency.
Submitted 10 September, 2013;
originally announced September 2013.
-
Letter to the editor: Against the Resilience of Rejected Manuscripts
Authors:
Nicolas Robinson-Garcia,
Daniel Torres-Salinas,
J. M. Campanario,
Emilio Delgado López-Cózar
Abstract:
In this letter we propose that the main editors' associations develop guidelines, and that online journal management systems adopt protocols, for keeping track of rejected manuscripts that are resubmitted and for exchanging referees' reports between journals.
Submitted 10 September, 2013;
originally announced September 2013.
-
Google Scholar Metrics 2013: nothing new under the sun
Authors:
Alvaro Cabezas-Clavijo,
Emilio Delgado Lopez-Cozar
Abstract:
The main characteristics of the new version of Google Scholar Metrics (July 2013) are presented. We outline the novelties and the weaknesses detected after a first analysis. As the main conclusion, we note the lack of new functionalities with respect to previous editions, as the only modification is the update of the timeframe (2008-2012). Hence, the problems pointed out in our previous reviews persist. Finally, it seems that Google Scholar Metrics will be updated on a yearly basis.
Submitted 26 July, 2013;
originally announced July 2013.
-
H Index of scientific Nursing journals according to Google Scholar Metrics (2007-2011)
Authors:
Liliana Marcela Reina Leal,
Rafael Repiso,
Emilio Delgado Lopez-Cozar
Abstract:
The aim of this report is to present a ranking of Nursing journals covered in Google Scholar Metrics (GSM), a Google product launched in 2012 to assess the impact of scientific journals from the citation counts they receive on Google Scholar. Google has chosen to include only those journals that have published at least 100 papers and have at least one citation in a period of five years (2007-2011). Journal rankings are sorted by language (showing the 100 papers with the greatest impact). This tool allows sorting by subject areas and disciplines, but only in the case of journals in English. In this case, it only shows the 20 journals with the highest h index. This option is not available for journals in the other nine languages present in Google (Chinese, Portuguese, German, Spanish, French, Korean, Japanese, Dutch and Italian).
Google Scholar Metrics does not currently allow grouping and sorting all journals belonging to a scientific discipline. In the case of Nursing, in the ten listings displayed by GSM we can only locate 34 journals. Therefore, in an attempt to overcome this limitation, we have used the diversity of search procedures allowed by GSM to identify the greatest number of scientific Nursing journals with an h index calculated by this bibliometric tool. Bibliographic searches were conducted between 10th and 30th May 2013.
The result is a ranking of 337 Nursing journals sorted by h index, with the mean as a discriminating value. Journals are also grouped by quartiles.
Submitted 16 July, 2013;
originally announced July 2013.
-
An insight into the importance of national university rankings in an international context: The case of the I-UGR Rankings of Spanish universities
Authors:
Nicolás Robinson-García,
Daniel Torres-Salinas,
Emilio Delgado López-Cózar,
Francisco Herrera
Abstract:
The great importance international rankings have achieved in the research policy arena calls attention to the many threats that result from the flaws and shortcomings of these tools. One of them has to do with their inability to accurately represent national university systems, as their original purpose is only to rank world-class universities. Another has to do with the lack of representativeness of universities' disciplinary profiles, as they usually provide a single table. Although some rankings offer broad coverage and others offer league tables by field, no international ranking does both. In order to overcome this limitation from a research policy viewpoint, this paper analyzes the possibility of using national rankings to complement international rankings. For this, we analyze the Spanish university system as a case study, presenting the I-UGR Rankings for Spanish universities by fields and subfields. Then, we compare their results with those obtained by the Shanghai Ranking, the QS Ranking, the Leiden Ranking and the NTU Ranking, as they all share basic common ground that allows such a comparison. We conclude that it is advisable to use national rankings to complement international rankings; however, we observe that this must be done with some caution, as they differ in the methodology employed as well as in the construction of the fields.
Submitted 3 March, 2014; v1 submitted 6 May, 2013;
originally announced May 2013.
-
Google Scholar and the h-index in biomedicine: the popularization of bibliometric assessment
Authors:
Alvaro Cabezas-Clavijo,
Emilio Delgado Lopez-Cozar
Abstract:
The aim of this paper is to review the features, benefits and limitations of the new scientific evaluation products derived from Google Scholar, namely Google Scholar Metrics and Google Scholar Citations, as well as the h-index, which is the standard bibliometric indicator adopted by these services. It also outlines the potential of this new database as a source for studies in Biomedicine and compares the h-index obtained by the most relevant journals and researchers in the field of Intensive Care Medicine, by means of data extracted from Web of Science, Scopus and Google Scholar. Results show that, although average h-index values in Google Scholar are almost 30% higher than those obtained in Web of Science and about 15% higher than those collected by Scopus, there are no substantive changes in the rankings generated from any of these data sources. Despite some technical problems, it is concluded that Google Scholar is a valid tool for researchers in Health Sciences, both for purposes of information retrieval and for the computation of bibliometric indicators.
Submitted 7 April, 2013;
originally announced April 2013.
-
Ranking journals: Could Google Scholar Metrics be an alternative to Journal Citation Reports and Scimago Journal Rank?
Authors:
Emilio Delgado Lopez-Cozar,
Alvaro Cabezas-Clavijo
Abstract:
The launch of Google Scholar Metrics as a tool for assessing scientific journals may be serious competition for Thomson Reuters' Journal Citation Reports and for the Scopus-powered Scimago Journal Rank. A review of these bibliometric journal evaluation products is performed. We compare their main characteristics from different approaches: coverage, indexing policies, search and visualization, bibliometric indicators, results analysis options, economic cost and differences in their ranking of journals. Despite its shortcomings, Google Scholar Metrics is a helpful tool for authors and editors in identifying core journals. As an increasingly useful tool for ranking scientific journals, it may also challenge established journal products.
Submitted 23 March, 2013;
originally announced March 2013.
-
H Index of History journals published in Spain according to Google Scholar Metrics (2007-2011)
Authors:
Emilio Delgado Lopez-Cozar,
Manuel Ramirez Sanchez
Abstract:
Google Scholar Metrics (GSM), launched in April 2012, features new bibliometric systems for gauging scientific journals by counting the number of citations obtained in Google Scholar. In this way, it opens new possibilities for measuring journal impact in the field of Humanities. The present article intends to evaluate the scope of this tool by analysing GSM searches, conducted on 5 and 6 December 2012, of History journals published in Spain. In total, 69 journals were identified, accounting for only 24% of the History journals published in Spain. The ranges of H index values for this field are so small that the ranking can no longer be said to show discriminating potential. In light of this, we propose a change in the way Google Scholar Metrics is designed so that it could also accommodate production and citation patterns in the particular field of History and, more broadly, in the area of Humanities as well.
Submitted 20 February, 2013; v1 submitted 7 February, 2013;
originally announced February 2013.
-
On the use of Biplot analysis for multivariate bibliometric and scientific indicators
Authors:
Daniel Torres-Salinas,
Nicolas Robinson-Garcia,
Evaristo Jiménez-Contreras,
Francisco Herrera,
Emilio Delgado López-Cózar
Abstract:
Bibliometric mapping and visualization techniques represent one of the main pillars in the field of scientometrics. Traditionally, the main methodologies employed for representing data are Multi-Dimensional Scaling, Principal Component Analysis or Correspondence Analysis. In this paper we present a visualization methodology known as Biplot analysis for representing bibliometric and science and technology indicators. A Biplot is a graphical representation of multivariate data, where the elements of a data matrix are represented by dots and vectors associated with the rows and columns of the matrix. We explore the possibilities of applying Biplot analysis in the research policy area. More specifically, we first describe and introduce the reader to this methodology and, secondly, we analyze its strengths and weaknesses through three different case studies: countries, universities and scientific fields. For this, we use a Biplot analysis known as JK-Biplot. Finally, we compare the Biplot representation with other multivariate analysis techniques. We conclude that Biplot analysis could be a useful technique in scientometrics when studying multivariate data and an easy-to-read tool for research decision makers.
Submitted 4 February, 2013;
originally announced February 2013.
-
Manipulating Google Scholar Citations and Google Scholar Metrics: simple, easy and tempting
Authors:
Emilio Delgado Lopez-Cozar,
Nicolas Robinson-Garcia,
Daniel Torres-Salinas
Abstract:
The launch of Google Scholar Citations and Google Scholar Metrics may provoke a revolution in the research evaluation field, as it places within every researcher's reach tools that allow bibliometric measuring. In order to alert the research community to how easily one can manipulate the data and bibliometric indicators offered by Google's products, we present an experiment in which we manipulate the Google Citations profiles of a research group through the creation of false documents that cite their documents and, consequently, the journals in which they have published, modifying their H index. For this purpose we created six documents authored by a fake author and uploaded them to a researcher's personal website under the University of Granada's domain. The experiment resulted in an increase of 774 citations across 129 papers (six citations per paper), increasing the authors' and journals' H index. We analyse the harmful effects this type of practice can have on Google Scholar Citations and Google Scholar Metrics. Finally, we conclude with several reflections on the effects these malpractices may have and on the lack of control mechanisms these tools offer.
Submitted 21 February, 2013; v1 submitted 4 December, 2012;
originally announced December 2012.
-
Towards a Book Publishers Citation Reports. First approach using the Book Citation Index
Authors:
Daniel Torres-Salinas,
Nicolas Robinson-Garcia,
Emilio Delgado Lopez-Cozar
Abstract:
The absence of books and book chapters in the Web of Science Citation Indexes (SCI, SSCI and A&HCI) has always been considered an important flaw, but the Thomson Reuters 'Book Citation Index' database finally became available in October 2010, indexing 29,618 books and 379,082 book chapters. The Book Citation Index opens a new window of opportunities for analyzing these fields from a bibliometric point of view. The main objective of this article is to analyze different impact indicators for the scientific publishers included in the Book Citation Index for the Social Sciences and Humanities fields during 2006-2011. In this way we construct what we have called the 'Book Publishers Citation Reports'. For this, we present a total of 19 rankings according to the different disciplines in Humanities & Arts and Social Sciences & Law, with six indicators for scientific publishers.
Submitted 29 July, 2012;
originally announced July 2012.