-
Bye-Bye, Bye Advantage: Estimating the competitive impact of rest differential in the National Football League
Authors:
Michael J Lopez,
Thompson J Bliss
Abstract:
The National Football League (NFL) sets its regular season schedule to optimize viewership and minimize competitive inequities. One inequity assumed to impact team performance is rest differential, defined as the relative number of days between games. Using Bayesian state space models on both game outcomes and betting market data, we estimate the competitive effect of rest differential in American football. We find that the most commonly referred to inequities -- both the bye week rest advantage and the mini-bye week rest advantage -- currently show no significant evidence of providing the rested team a competitive edge. Further, we trace a decline in the advantage of a bye week to a 2011 change to the NFL's Collective Bargaining Agreement, which represents a natural experiment to test the relevance of rest and preparation in football. Prior to the agreement, NFL teams off a bye week received a significant advantage (+2.2 points per game), but since 2011, that benefit has been mitigated.
Submitted 19 August, 2024;
originally announced August 2024.
-
A comprehensive survey of the home advantage in American football
Authors:
Luke S. Benz,
Thompson J. Bliss,
Michael J. Lopez
Abstract:
The existence of and justification for the home advantage -- the benefit a sports team receives when playing at home -- have been studied across sports. The majority of research on this topic is limited to individual leagues in short time frames, which hinders extrapolation and a deeper understanding of possible causes. Using nearly two decades of data from the National Football League (NFL), the National Collegiate Athletic Association (NCAA), and high schools from across the United States, we provide a uniform approach to understanding the home advantage in American football. Our findings suggest home advantage is declining in the NFL and the highest levels of collegiate football, but not in amateur football. This increases the possibility that characteristics of the NCAA and NFL, such as travel improvements and instant replay, have helped level the playing field.
Submitted 27 June, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Estimating the change in soccer's home advantage during the Covid-19 pandemic using bivariate Poisson regression
Authors:
Luke S. Benz,
Michael J. Lopez
Abstract:
In the wake of the Covid-19 pandemic, 2019-2020 soccer seasons across the world were postponed and eventually made up during the summer months of 2020. Researchers from a variety of disciplines jumped at the opportunity to compare the rescheduled games, played in front of empty stadia, to previous games, played in front of fans. To date, most of this post-Covid soccer research has used linear regression models, or versions thereof, to estimate potential changes to the home advantage. But because soccer outcomes are non-linear, we argue that leveraging the Poisson distribution would be more appropriate. We begin by using simulations to show that bivariate Poisson regression reduces absolute bias when estimating the home advantage benefit in a single season of soccer games, relative to linear regression, by almost 85 percent. Next, with data from 17 professional soccer leagues, we extend bivariate Poisson models to estimate the change in home advantage due to games being played without fans. In contrast to current research that overwhelmingly suggests a drop in the home advantage, our findings are mixed; in some leagues, evidence points to a decrease, while in others, the home advantage may have risen. Altogether, this suggests a more complex causal mechanism for the impact of fans on sporting events.
Submitted 28 May, 2021; v1 submitted 29 December, 2020;
originally announced December 2020.
-
Bigger data, better questions, and a return to fourth down behavior: an introduction to a special issue on tracking data in the National Football League
Authors:
Michael J. Lopez
Abstract:
Most historical National Football League (NFL) analysis, both mainstream and academic, has relied on public, play-level data to generate team and player comparisons. Given the number of oft-omitted variables that impact on-field results, such as play call, game situation, and opponent strength, findings tend to be more anecdotal than actionable. With the release of player tracking data, however, analysts can better ask and answer questions to isolate skill and strategy. In this article, we highlight the limitations of traditional analyses, and use a decades-old punching bag for analysts, fourth-down strategy, as a microcosm for why tracking data is needed. Specifically, we assert that, in the absence of using the precise yardage needed for a first down, past findings supporting an aggressive fourth down strategy may have been overstated. Next, we synthesize recent work that comprises this special Journal of Quantitative Analysis in Sports issue on player tracking data in football. Finally, we conclude with some best practices and limitations regarding usage of this data. The release of player tracking data marks a transition for the league and its analysts, and we hope this issue helps guide innovation in football analytics for years to come.
Submitted 12 May, 2020; v1 submitted 23 September, 2019;
originally announced September 2019.
-
The Estimation of Causal Effects of Multiple Treatments in Observational Studies Using Bayesian Additive Regression Trees
Authors:
Chenyang Gu,
Michael J. Lopez,
Liangyuan Hu
Abstract:
There is currently a dearth of appropriate methods to estimate the causal effects of multiple treatments when the outcome is binary. For such settings, we propose the use of nonparametric Bayesian modeling, Bayesian Additive Regression Trees (BART). We conduct an extensive simulation study to compare BART to several existing, propensity score-based methods and to identify its operating characteristics when estimating average treatment effects on the treated. BART consistently demonstrates low bias and mean-squared errors. We illustrate the use of BART through a comparative effectiveness analysis of a large dataset, drawn from the latest SEER-Medicare linkage, on patients who were operated on via robotic-assisted surgery, video-assisted thoracic surgery or open thoracotomy.
Submitted 27 February, 2020; v1 submitted 11 January, 2019;
originally announced January 2019.
-
How often does the best team win? A unified approach to understanding randomness in North American sport
Authors:
Michael J. Lopez,
Gregory J. Matthews,
Benjamin S. Baumer
Abstract:
Statistical applications in sports have long centered on how to best separate signal (e.g. team talent) from random noise. However, most of this work has concentrated on a single sport, and the development of meaningful cross-sport comparisons has been impeded by the difficulty of translating luck from one sport to another. In this manuscript, we develop Bayesian state-space models using betting market data that can be uniformly applied across sporting organizations to better understand the role of randomness in game outcomes. These models can be used to extract estimates of team strength, the between-season, within-season, and game-to-game variability of team strengths, as well as each team's home advantage. We implement our approach across a decade of play in each of the National Football League (NFL), National Hockey League (NHL), National Basketball Association (NBA), and Major League Baseball (MLB), finding that the NBA demonstrates both the largest dispersion in talent and the largest home advantage, while the NHL and MLB stand out for their relative randomness in game outcomes. We conclude by proposing new metrics for judging competitiveness across sports leagues, both within the regular season and using traditional postseason tournament formats. Although we focus on sports, we discuss a number of other situations in which our generalizable models might be usefully applied.
Submitted 22 November, 2017; v1 submitted 20 January, 2017;
originally announced January 2017.
-
Estimation of causal effects with multiple treatments: a review and new ideas
Authors:
Michael J Lopez,
Roee Gutman
Abstract:
The propensity score is a common tool for estimating the causal effect of a binary treatment in observational data. In this setting, matching, subclassification, imputation, or inverse probability weighting on the propensity score can reduce the initial covariate bias between the treatment and control groups. With more than two treatment options, however, estimation of causal effects requires additional assumptions and techniques, the implementations of which have varied across disciplines. This paper reviews current methods, and it identifies and contrasts the treatment effects that each one estimates. Additionally, we propose possible matching techniques for use with multiple, nominal categorical treatments, and use simulations to show how such algorithms can yield improved covariate similarity between those in the matched sets, relative to the pre-matched cohort. In sum, this manuscript provides a synopsis of how to notate and use causal methods for categorical treatments.
Submitted 19 January, 2017; v1 submitted 18 January, 2017;
originally announced January 2017.
-
Labor Disputes and Worker Productivity
Authors:
Qi Ge,
Michael J. Lopez
Abstract:
We implement a propensity score matching technique to present the first evidence on the impact of labor supply decisions during labor disputes on worker productivity in the context of professional sports. In particular, we utilize a unique natural experiment from the 2012-13 National Hockey League (NHL) lockout, during which approximately 200 players decided to play overseas while the rest stayed in North America. We separate the players based on their nationality and investigate the effect of playing abroad on post-lockout player performance. We find limited evidence of enhanced productivity among European players, and no evidence of a benefit or drawback for North American players. The lack of consistent productivity impact is in line with literature in industries with large labor rents, and we propose several additional explanations within the context of professional hockey. Our study contributes to the general understanding of the impact of employer-initiated work stoppage on labor productivity.
Submitted 25 June, 2015;
originally announced June 2015.
-
Building an NCAA men's basketball predictive model and quantifying its success
Authors:
Michael J. Lopez,
Gregory Matthews
Abstract:
The old adage says that it is better to be lucky than to be good, but when it comes to winning NCAA tournament pools, do you need to be both? This paper attempts to answer this question using data from the 2014 men's basketball tournament and more than 400 predictions of game outcomes submitted to a contest hosted by the website Kaggle. We begin by describing how we built a prediction model for men's basketball tournament outcomes under the binomial log-likelihood loss function. Next, under different sets of true underlying game probabilities, we simulate tournament outcomes and imputed pool standings, in an effort to determine how much of an entry's success can be attributed to luck. While one of our two submissions finished first in the Kaggle contest, we estimate that this winning entry had no more than about a 12% chance of doing so, even under the most optimistic of game probability scenarios.
Submitted 30 November, 2014;
originally announced December 2014.