
Showing 1–14 of 14 results for author: Wyner, A J

  1. arXiv:2506.21822  [pdf, ps, other]

    stat.AP

    Putting Skill as Nearly Indistinguishable from Noise: An Empirical Bayes Analysis of PGA Tour Performance

    Authors: Ryan S. Brill, Abraham J. Wyner

    Abstract: We revisit a foundational question in golf analytics: how important are the core components of performance--driving, approach play, and putting--in explaining success on the PGA Tour? Building on Mark Broadie's strokes gained analyses, we use an empirical Bayes approach to estimate latent golfer skill and assess statistical significance using a multiple testing procedure that controls the false di…

    Submitted 26 June, 2025; originally announced June 2025.

  2. arXiv:2411.10400  [pdf, other]

    stat.AP

    The Loser's Curse and the Critical Role of the Utility Function

    Authors: Ryan S. Brill, Abraham J. Wyner

    Abstract: A longstanding question in the judgment and decision making literature is whether experts, even in high-stakes environments, exhibit the same cognitive biases observed in controlled experiments with inexperienced participants. Massey and Thaler (2013) claim to have found an example of bias and irrationality in expert decision making: general managers' behavior in the National Football League draft…

    Submitted 23 April, 2025; v1 submitted 15 November, 2024; originally announced November 2024.

  3. arXiv:2409.04889  [pdf, other]

    stat.AP

    Moving from Machine Learning to Statistics: the case of Expected Points in American football

    Authors: Ryan S. Brill, Ryan Yee, Sameer K. Deshpande, Abraham J. Wyner

    Abstract: Expected points is a value function fundamental to player evaluation and strategic in-game decision-making across sports analytics, particularly in American football. To estimate expected points, football analysts use machine learning tools, which are not equipped to handle certain challenges. They suffer from selection bias, display counter-intuitive artifacts of overfitting, do not quantify unce…

    Submitted 7 September, 2024; originally announced September 2024.

    Comments: version 0; still have editing to do in the body

  4. arXiv:2406.16171  [pdf, other]

    stat.ME stat.AP

    Exploring the difficulty of estimating win probability: a simulation study

    Authors: Ryan S. Brill, Ronald Yurko, Abraham J. Wyner

    Abstract: Estimating win probability is one of the classic modeling tasks of sports analytics. Many widely used win probability estimators use machine learning to fit the relationship between a binary win/loss outcome variable and certain game-state variables. To illustrate just how difficult it is to accurately fit such a model from noisy and highly correlated observational data, in this paper we conduct a…

    Submitted 2 March, 2025; v1 submitted 23 June, 2024; originally announced June 2024.

  5. arXiv:2311.03490  [pdf, other]

    stat.AP

    Analytics, have some humility: a statistical view of fourth-down decision making

    Authors: Ryan S. Brill, Ronald Yurko, Abraham J. Wyner

    Abstract: The standard mathematical approach to fourth-down decision making in American football is to make the decision that maximizes estimated win probability. Win probability estimates arise from machine learning models fit from historical data. These models attempt to capture a nuanced relationship between a noisy binary outcome variable and game-state variables replete with interactions and non-linear…

    Submitted 31 January, 2025; v1 submitted 6 November, 2023; originally announced November 2023.

  6. A Bayesian analysis of the time through the order penalty in baseball

    Authors: Ryan S. Brill, Sameer K. Deshpande, Abraham J. Wyner

    Abstract: As a baseball game progresses, batters appear to perform better the more times they face a particular pitcher. The apparent drop-off in pitcher performance from one time through the order to the next, known as the Time Through the Order Penalty (TTOP), is often attributed to within-game batter learning. Although the TTOP has largely been accepted within baseball and influences many managers' in-ga…

    Submitted 31 May, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted to JQAS

  7. arXiv:1812.05792  [pdf, other]

    stat.ML cs.LG

    Making Sense of Random Forest Probabilities: a Kernel Perspective

    Authors: Matthew A. Olson, Abraham J. Wyner

    Abstract: A random forest is a popular tool for estimating probabilities in machine learning classification tasks. However, the means by which this is accomplished is unprincipled: one simply counts the fraction of trees in a forest that vote for a certain class. In this paper, we forge a connection between random forests and kernel regression. This places random forest probability estimation on more sound…

    Submitted 14 December, 2018; originally announced December 2018.

  8. A Hierarchical Bayesian Model of Pitch Framing

    Authors: Sameer K. Deshpande, Abraham J. Wyner

    Abstract: Since the advent of high-resolution pitch tracking data (PITCHf/x), many in the sabermetrics community have attempted to quantify a Major League Baseball catcher's ability to "frame" a pitch (i.e. increase the chance that a pitch is called as a strike). Especially in the last three years, there has been an explosion of interest in the "art of pitch framing" in the popular press as well as signs th…

    Submitted 9 September, 2017; v1 submitted 3 April, 2017; originally announced April 2017.

    Journal ref: Journal of Quantitative Analysis in Sports 2017, Vol. 13, No. 3, 95-112

  9. arXiv:1504.07676  [pdf, other]

    stat.ML cs.LG stat.ME

    Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers

    Authors: Abraham J. Wyner, Matthew Olson, Justin Bleich, David Mease

    Abstract: There is a large literature explaining why AdaBoost is a successful classifier. The literature on AdaBoost focuses on classifier margins and boosting's interpretation as the optimization of an exponential likelihood function. These existing explanations, however, have been pointed out to be incomplete. A random forest is another popular ensemble method for which there is substantially less explana…

    Submitted 29 April, 2017; v1 submitted 28 April, 2015; originally announced April 2015.

    Comments: 40 pages, 11 figures, 2 algorithms

  10. Rejoinder: A statistical analysis of multiple temperature proxies: Are reconstructions of surface temperatures over the last 1000 years reliable?

    Authors: Blakeley B. McShane, Abraham J. Wyner

    Abstract: Rejoinder to "A statistical analysis of multiple temperature proxies: Are reconstructions of surface temperatures over the last 1000 years reliable?" by B.B. McShane and A.J. Wyner [arXiv:1104.4002]

    Submitted 12 May, 2011; originally announced May 2011.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/10-AOAS398REJ

    Report number: IMS-AOAS-AOAS398REJ

    Journal ref: Annals of Applied Statistics 2011, Vol. 5, No. 1, 99-123

  11. arXiv:1104.4002  [pdf, ps, other]

    stat.AP physics.ao-ph

    A statistical analysis of multiple temperature proxies: Are reconstructions of surface temperatures over the last 1000 years reliable?

    Authors: Blakeley B. McShane, Abraham J. Wyner

    Abstract: Predicting historic temperatures based on tree rings, ice cores, and other natural proxies is a difficult endeavor. The relationship between proxies and temperature is weak and the number of proxies is far larger than the number of target data points. Furthermore, the data contain complex spatial and temporal dependence structures which are not easily captured with simple models. In this paper, we…

    Submitted 20 April, 2011; originally announced April 2011.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/10-AOAS398

    Report number: IMS-AOAS-AOAS398

    Journal ref: Annals of Applied Statistics 2011, Vol. 5, No. 1, 5-44

  12. arXiv:0902.1360  [pdf, ps, other]

    stat.AP stat.ME

    Hierarchical Bayesian Modeling of Hitting Performance in Baseball

    Authors: Shane T. Jensen, Blake McShane, Abraham J. Wyner

    Abstract: We have developed a sophisticated statistical model for predicting the hitting performance of Major League baseball players. The Bayesian paradigm provides a principled method for balancing past performance with crucial covariates, such as player age and position. We share information across time and across players by using mixture distributions to control shrinkage for improved accuracy. We com…

    Submitted 8 February, 2009; originally announced February 2009.

    Journal ref: Bayesian Analysis 2009, Vol. 4, No. 4, 631-652

  13. Comment: Boosting Algorithms: Regularization, Prediction and Model Fitting

    Authors: Andreas Buja, David Mease, Abraham J. Wyner

    Abstract: The authors are doing the readers of Statistical Science a true service with a well-written and up-to-date overview of boosting that originated with the seminal algorithms of Freund and Schapire. Equally, we are grateful for high-level software that will permit a larger readership to experiment with, or simply apply, boosting-inspired model fitting. The authors show us a world of methodology tha…

    Submitted 17 April, 2008; originally announced April 2008.

    Comments: Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/07-STS242B

    Report number: IMS-STS-STS242B

    Journal ref: Statistical Science 2007, Vol. 22, No. 4, 506-512

  14. Bayesball: A Bayesian hierarchical model for evaluating fielding in major league baseball

    Authors: Shane T. Jensen, Kenneth E. Shirley, Abraham J. Wyner

    Abstract: The use of statistical modeling in baseball has received substantial attention recently in both the media and academic community. We focus on a relatively under-explored topic: the use of statistical models for the analysis of fielding based on high-resolution data consisting of on-field location of batted balls. We combine spatial modeling with a hierarchical Bayesian structure in order to eval…

    Submitted 14 August, 2009; v1 submitted 28 February, 2008; originally announced February 2008.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/08-AOAS228

    Report number: IMS-AOAS-AOAS228

    Journal ref: Annals of Applied Statistics 2009, Vol. 3, No. 2, 491-520