Tractable Ridge Regression for Paired Comparisons

Abstract: Paired comparison models, such as Bradley-Terry and Thurstone-Mosteller, are commonly used to estimate relative strengths of pairwise compared items in tournament-style data. We discuss estimation of paired comparison models with a ridge penalty. A new approach is derived which combines empirical Bayes and composite likelihoods without any need to re-fit the model, as a convenient alternative to cross-validation of the ridge tuning parameter. Simulation studies demonstrate much better predictive accuracy of the new approach relative to ordinary maximum likelihood. A widely used alternative, the application of a standard bias-reducing penalty, is also found to improve appreciably the performance of maximum likelihood; but the ridge penalty, with tuning as developed here, yields greater accuracy still. The methodology is illustrated through application to 28 seasons of English Premier League football.

Submitted 2 October, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

arXiv:2312.13619 [pdf, ps, other]

The many routes to the ubiquitous Bradley-Terry model

Authors: Ian Hamilton, Nick Tawn, David Firth

Abstract: The rating of items based on pairwise comparisons has been a topic of statistical investigation for many decades. Numerous approaches have been proposed. One of the best known is the Bradley-Terry model. This paper seeks to assemble and explain a variety of motivations for its use. Some are based on principles or on maximising an objective function; others are derived from well-known statistical models, or stylised game scenarios. They include both examples well-known in the literature as well as what are believed to be novel presentations.

Submitted 21 December, 2023; originally announced December 2023.

arXiv:2312.10548 [pdf, other]

Analysis of composition on the original scale of measurement

Authors: David Firth, Fiona Sammut

Abstract: In current applied research the most-used route to an analysis of composition is through log-ratios -- that is, contrasts among log-transformed measurements. Here we argue instead for a more direct approach, using a statistical model for the arithmetic mean on the original scale of measurement. Central to the approach is a general variance-covariance function, derived by assuming multiplicative measurement error. Quasi-likelihood analysis of logit models for composition is then a general alternative to the use of multivariate linear models for log-ratio transformed measurements, and it has important advantages. These include robustness to secondary aspects of model specification, stability when there are zero-valued or near-zero measurements in the data, and more direct interpretation. The usual efficiency property of quasi-likelihood estimation applies even when the error covariance matrix is unspecified. We also indicate how the derived variance-covariance function can be used, instead of the variance-covariance matrix of log-ratios, with more general multivariate methods for the analysis of composition. A specific feature is that the notion of `null correlation' -- for compositional measurements on their original scale -- emerges naturally.

Submitted 16 December, 2023; originally announced December 2023.

Comments: This is a preliminary version, made available prior to journal submission. Comments that could improve the paper would be very much welcomed

arXiv:2306.13367 [pdf, other]

Judging a book by its cover: how much of REF `research quality' is really `journal prestige'?

Authors: David Antony Selby, David Firth

Abstract: The Research Excellence Framework (REF) is a periodic UK-wide assessment of the quality of published research in universities. The most recent REF was in 2014, and the next will be in 2021. The published results of REF2014 include a categorical `quality profile' for each unit of assessment (typically a university department), reporting what percentage of the unit's REF-submitted research outputs were assessed as being at each of four quality levels (labelled 4*, 3*, 2* and 1*). Also in the public domain are the original submissions made to REF2014, which include -- for each unit of assessment -- publication details of the REF-submitted research outputs. In this work, we address the question: to what extent can a REF quality profile for research outputs be attributed to the journals in which (most of) those outputs were published? The data are the published submissions and results from REF2014. The main statistical challenge comes from the fact that REF quality profiles are available only at the aggregated level of whole units of assessment: the REF panel's assessment of each individual research output is not made public. Our research question is thus an `ecological inference' problem, which demands special care in model formulation and methodology. The analysis is based on logit models in which journal-specific parameters are regularized via prior `pseudo-data'. We develop a lack-of-fit measure for the extent to which REF scores appear to depend on publication venues rather than research quality or institution-level differences. Results are presented for several research fields.

Submitted 23 June, 2023; originally announced June 2023.

Comments: 50 pages, 19 figures

arXiv:2112.11262 [pdf, other]

Retrodictive Modelling of Modern Rugby Union: Extension of Bradley-Terry to Multiple Outcomes

Authors: Ian Hamilton, David Firth

Abstract: Frequently in sporting competitions it is desirable to compare teams based on records of varying schedule strength. Methods have been developed for sports where the result outcomes are win, draw, or loss. In this paper those ideas are extended to account for any finite multiple outcome result set. A principle-based motivation is supplied and an implementation presented for modern rugby union, where bonus points are awarded for losing within a certain score margin and for scoring a certain number of tries. A number of variants are discussed including the constraining assumptions that are implied by each. The model is applied to assess the current rules of the Daily Mail Trophy, a national schools tournament in England and Wales.

Submitted 21 December, 2021; originally announced December 2021.

arXiv:2005.01873 [pdf]

Protocol for a Study of the Effect of Surface Mining in Central Appalachia on Adverse Birth Outcomes

Authors: Dylan S. Small, Dan Firth, Luke Keele, Matthew Huber, Molly Passarella, Scott Lorch, Heather Burris

Abstract: Surface mining has become a major method of coal mining in Central Appalachia alongside the traditional underground mining. Concerns have been raised about the health effects of this surface mining, particularly mountaintop removal mining where coal is mined upon steep mountaintops by removing the mountaintop through clearcutting forests and explosives. We have designed a matched observational study to assess the effects of surface mining in Central Appalachia on adverse birth outcomes. This protocol describes for the study the background and motivation, the sample selection and the analysis plan.

Submitted 4 May, 2020; originally announced May 2020.

arXiv:1909.07123 [pdf, other]

Davidson-Luce model for multi-item choice with ties

Authors: David Firth, Ioannis Kosmidis, Heather Turner

Abstract: This paper introduces a natural extension of the pair-comparison-with-ties model of Davidson (1970, J. Amer. Statist. Assoc.), to allow for ties when more than two items are compared. Properties of the new model are discussed. It is found that this "Davidson-Luce" model retains the many appealing features of Davidson's solution, while extending the scope of application substantially beyond the domain of pair-comparison data. The model introduced here already underpins the handling of tied rankings in the "PlackettLuce" R package.

Submitted 16 September, 2019; originally announced September 2019.

Comments: 11 pages, including Appendix with example R code

MSC Class: 62J15 (Primary) 62J12 (Secondary)

arXiv:1812.01938 [pdf, other]

Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models

Authors: Ioannis Kosmidis, David Firth

Abstract: Penalization of the likelihood by Jeffreys' invariant prior, or by a positive power thereof, is shown to produce finite-valued maximum penalized likelihood estimates in a broad class of binomial generalized linear models. The class of models includes logistic regression, where the Jeffreys-prior penalty is known additionally to reduce the asymptotic bias of the maximum likelihood estimator; and also models with other commonly used link functions such as probit and log-log. Shrinkage towards equiprobability across observations, relative to the maximum likelihood estimator, is established theoretically and is studied through illustrative examples. Some implications of finiteness and shrinkage for inference are discussed, particularly when inference is based on Wald-type procedures. A widely applicable procedure is developed for computation of maximum penalized likelihood estimates, by using repeated maximum likelihood fits with iteratively adjusted binomial responses and totals. These theoretical results and methods underpin the increasingly widespread use of reduced-bias and similarly penalized binomial regression models in many applied fields.

Submitted 23 March, 2020; v1 submitted 5 December, 2018; originally announced December 2018.

MSC Class: 62J12; 62F10; 62F12; 62F03

arXiv:1810.12068 [pdf, other]

Modelling rankings in R: the PlackettLuce package

Authors: Heather L. Turner, Jacob van Etten, David Firth, Ioannis Kosmidis

Abstract: This paper presents the R package PlackettLuce, which implements a generalization of the Plackett-Luce model for rankings data. The generalization accommodates both ties (of arbitrary order) and partial rankings (complete rankings of subsets of items). By default, the implementation adds a set of pseudo-comparisons with a hypothetical item, ensuring that the underlying network of wins and losses between items is always strongly connected. In this way, the worth of each item always has a finite maximum likelihood estimate, with finite standard error. The use of pseudo-comparisons also has a regularization effect, shrinking the estimated parameters towards equal item worth. In addition to standard methods for model summary, PlackettLuce provides a method to compute quasi standard errors for the item parameters. This provides the basis for comparison intervals that do not change with the choice of identifiability constraint placed on the item parameters. Finally, the package provides a method for model-based partitioning using covariates whose values vary between rankings, enabling the identification of subgroups of judges or settings that have different item worths. The features of the package are demonstrated through application to classic and novel data sets.

Submitted 14 December, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

Comments: In v2: review of software implementing alternative models to Plackett-Luce; comparison of algorithms provided by the PlackettLuce package; further examples of rankings where the underlying win-loss network is not strongly connected. In addition, general editing to improve organisation and clarity. In v3: corrected headings Table 4, minor edits

arXiv:1312.1794 [pdf, other]

doi 10.1111/rssa.12124

Statistical Modelling of Citation Exchange Between Statistics Journals

Authors: Cristiano Varin, Manuela Cattelan, David Firth

Abstract: Rankings of scholarly journals based on citation data are often met with skepticism by the scientific community. Part of the skepticism is due to disparity between the common perception of journals' prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of authors. This paper focuses on analysis of the table of cross-citations among a selection of Statistics journals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by considerable heterogeneity, which needs to be properly summarized. Inferential conclusions require care in order to avoid potential over-interpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK's Research Assessment Exercise shows strong correlation at aggregate level between assessed research quality and journal citation `export scores' within the discipline of Statistics.

Submitted 3 April, 2015; v1 submitted 6 December, 2013; originally announced December 2013.

Comments: To be published with discussion on Journal of the Royal Statistical Society Series A

Journal ref: Journal of the Royal Statistical Society Series A Volume 179, Issue 1 Pages 1 - 318, January 2016

Showing 1–10 of 10 results for author: Firth, D