Bye-Bye, Bye Advantage: Estimating the competitive impact of rest differential in the National Football League

Authors: Michael J Lopez, Thompson J Bliss

Abstract: The National Football League (NFL) sets its regular season schedule to optimize viewership and minimize competitive inequities. One inequity assumed to impact team performance is rest differential, defined as the relative number of days between games. Using Bayesian state space models on both game outcomes and betting market data, we estimate the competitive effect of rest differential in American football. We find that the most commonly referred to inequities -- both the bye week rest advantage and the mini-bye week rest advantage -- currently show no significant evidence of providing the rested team a competitive edge. Further, we trace a decline in the advantage of a bye week to a 2011 change to the NFL's Collective Bargaining Agreement, which represents a natural experiment to test the relevance of rest and preparation in football. Prior to the agreement, NFL teams off a bye week received a significant advantage (+2.2 points per game), but since 2011, that benefit has been mitigated.

Submitted 19 August, 2024; originally announced August 2024.

Comments: 10 figures, 4 tables

arXiv:2401.16392 [pdf, other]

A comprehensive survey of the home advantage in American football

Authors: Luke S. Benz, Thompson J. Bliss, Michael J. Lopez

Abstract: The existence of and justification for the home advantage -- the benefit a sports team receives when playing at home -- have been studied across sport. The majority of research on this topic is limited to individual leagues in short time frames, which hinders extrapolation and a deeper understanding of possible causes. Using nearly two decades of data from the National Football League (NFL), the National Collegiate Athletic Association (NCAA), and high schools from across the United States, we provide a uniform approach to understanding the home advantage in American football. Our findings suggest home advantage is declining in the NFL and the highest levels of collegiate football, but not in amateur football. This increases the possibility that characteristics of the NCAA and NFL, such as travel improvements and instant replay, have helped level the playing field.

Submitted 27 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

arXiv:2012.14949 [pdf, other]

Estimating the change in soccer's home advantage during the Covid-19 pandemic using bivariate Poisson regression

Authors: Luke S. Benz, Michael J. Lopez

Abstract: In the wake of the Covid-19 pandemic, 2019-2020 soccer seasons across the world were postponed and eventually made up during the summer months of 2020. Researchers from a variety of disciplines jumped at the opportunity to compare the rescheduled games, played in front of empty stadia, to previous games, played in front of fans. To date, most of this post-Covid soccer research has used linear regression models, or versions thereof, to estimate potential changes to the home advantage. But because soccer outcomes are non-linear, we argue that leveraging the Poisson distribution would be more appropriate. We begin by using simulations to show that bivariate Poisson regression reduces absolute bias when estimating the home advantage benefit in a single season of soccer games, relative to linear regression, by almost 85 percent. Next, with data from 17 professional soccer leagues, we extend bivariate Poisson models to estimate the change in home advantage due to games being played without fans. In contrast to current research that overwhelmingly suggests a drop in the home advantage, our findings are mixed; in some leagues, evidence points to a decrease, while in others, the home advantage may have risen. Altogether, this suggests a more complex causal mechanism for the impact of fans on sporting events.

Submitted 28 May, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

arXiv:1909.10631 [pdf, other]

Bigger data, better questions, and a return to fourth down behavior: an introduction to a special issue on tracking data in the National football League

Authors: Michael J. Lopez

Abstract: Most historical National Football League (NFL) analysis, both mainstream and academic, has relied on public, play-level data to generate team and player comparisons. Given the number of oft omitted variables that impact on-field results, such as play call, game situation, and opponent strength, findings tend to be more anecdotal than actionable. With the release of player tracking data, however, analysts can better ask and answer questions to isolate skill and strategy. In this article, we highlight the limitations of traditional analyses, and use a decades-old punching bag for analysts, fourth-down strategy, as a microcosm for why tracking data is needed. Specifically, we assert that, in the absence of using the precise yardage needed for a first down, past findings supporting an aggressive fourth down strategy may have been overstated. Next, we synthesize recent work that comprises this special Journal of Quantitative Analysis in Sports issue into player tracking data in football. Finally, we conclude with some best practices and limitations regarding usage of this data. The release of player tracking data marks a transition for the league and its analysts, and we hope this issue helps guide innovation in football analytics for years to come.

Submitted 12 May, 2020; v1 submitted 23 September, 2019; originally announced September 2019.

arXiv:1901.04312

The Estimation of Causal Effects of Multiple Treatments in Observational Studies Using Bayesian Additive Regression Trees

Authors: Chenyang Gu, Michael J. Lopez, Liangyuan Hu

Abstract: There is currently a dearth of appropriate methods to estimate the causal effects of multiple treatments when the outcome is binary. For such settings, we propose the use of nonparametric Bayesian modeling, Bayesian Additive Regression Trees (BART). We conduct an extensive simulation study to compare BART to several existing, propensity score-based methods and to identify its operating characteristics when estimating average treatment effects on the treated. BART consistently demonstrates low bias and mean-squared errors. We illustrate the use of BART through a comparative effectiveness analysis of a large dataset, drawn from the latest SEER-Medicare linkage, on patients who were operated on via robotic-assisted surgery, video-assisted thoracic surgery or open thoracotomy.

Submitted 27 February, 2020; v1 submitted 11 January, 2019; originally announced January 2019.

Comments: This article has been replaced by "Estimation of Causal Effects of Multiple Treatments in Observational Studies with a Binary Outcome" (arXiv:2001.06483 [stat.ME])

arXiv:1701.05976 [pdf, other]

How often does the best team win? A unified approach to understanding randomness in North American sport

Authors: Michael J. Lopez, Gregory J. Matthews, Benjamin S. Baumer

Abstract: Statistical applications in sports have long centered on how to best separate signal (e.g. team talent) from random noise. However, most of this work has concentrated on a single sport, and the development of meaningful cross-sport comparisons has been impeded by the difficulty of translating luck from one sport to another. In this manuscript, we develop Bayesian state-space models using betting market data that can be uniformly applied across sporting organizations to better understand the role of randomness in game outcomes. These models can be used to extract estimates of team strength, the between-season, within-season, and game-to-game variability of team strengths, as well as each team's home advantage. We implement our approach across a decade of play in each of the National Football League (NFL), National Hockey League (NHL), National Basketball Association (NBA), and Major League Baseball (MLB), finding that the NBA demonstrates both the largest dispersion in talent and the largest home advantage, while the NHL and MLB stand out for their relative randomness in game outcomes. We conclude by proposing new metrics for judging competitiveness across sports leagues, both within the regular season and using traditional postseason tournament formats. Although we focus on sports, we discuss a number of other situations in which our generalizable models might be usefully applied.

Submitted 22 November, 2017; v1 submitted 20 January, 2017; originally announced January 2017.

Comments: 40 pages, 20 figures, 5 tables, code available at https://github.com/bigfour/competitiveness

arXiv:1701.05132 [pdf, other]

doi 10.1214/17-STS612

Estimation of causal effects with multiple treatments: a review and new ideas

Authors: Michael J Lopez, Roee Gutman

Abstract: The propensity score is a common tool for estimating the causal effect of a binary treatment in observational data. In this setting, matching, subclassification, imputation, or inverse probability weighting on the propensity score can reduce the initial covariate bias between the treatment and control groups. With more than two treatment options, however, estimation of causal effects requires additional assumptions and techniques, the implementations of which have varied across disciplines. This paper reviews current methods, and it identifies and contrasts the treatment effects that each one estimates. Additionally, we propose possible matching techniques for use with multiple, nominal categorical treatments, and use simulations to show how such algorithms can yield improved covariate similarity between those in the matched sets, relative to the pre-matched cohort. In sum, this manuscript provides a synopsis of how to notate and use causal methods for categorical treatments.

Submitted 19 January, 2017; v1 submitted 18 January, 2017; originally announced January 2017.

Report number: Volume 32, Number 3

Journal ref: Statistical Science, 2017

arXiv:1506.07939 [pdf, other]

Labor Disputes and Worker Productivity

Authors: Qi Ge, Michael J. Lopez

Abstract: We implement a propensity score matching technique to present the first evidence on the impact of labor supply decisions during labor disputes on worker productivity in the context of professional sports. In particular, we utilize a unique natural experiment from the 2012-13 National Hockey League (NHL) lockout, during which approximately 200 players decided to play overseas while the rest stayed in North America. We separate the players based on their nationality and investigate the effect of playing abroad on post-lockout player performance. We find limited evidence of enhanced productivity among European players, and no evidence of a benefit or drawback for North American players. The lack of consistent productivity impact is in line with literature in industries with large labor rents, and we propose several additional explanations within the context of professional hockey. Our study contributes to the general understanding of the impact of employer-initiated work stoppage on labor productivity.

Submitted 25 June, 2015; originally announced June 2015.

Comments: 4 figures

arXiv:1412.0248 [pdf, ps, other]

Building an NCAA mens basketball predictive model and quantifying its success

Authors: Michael J. Lopez, Gregory Matthews

Abstract: The old adage says that it is better to be lucky than to be good, but when it comes to winning NCAA tournament pools, do you need to be both? This paper attempts to answer this question using data from the 2014 men's basketball tournament and more than 400 predictions of game outcomes submitted to a contest hosted by the website Kaggle. We begin by describing how we built a prediction model for men's basketball tournament outcomes under the binomial log-likelihood loss function. Next, under different sets of true underlying game probabilities, we simulate tournament outcomes and imputed pool standings, in an effort to determine how much of an entry's success can be attributed to luck. While one of our two submissions finished first in the Kaggle contest, we estimate that this winning entry had no more than about a 12% chance of doing so, even under the most optimistic of game probability scenarios.

Submitted 30 November, 2014; originally announced December 2014.

Showing 1–9 of 9 results for author: Lopez, M J