Showing 1–49 of 49 results for author: Cattaneo, M D

  1. arXiv:2509.11381  [pdf, ps, other]

    math.ST econ.EM stat.ME stat.ML

    The Honest Truth About Causal Trees: Accuracy Limits for Heterogeneous Treatment Effect Estimation

    Authors: Matias D. Cattaneo, Jason M. Klusowski, Ruiqi Rae Yu

    Abstract: Recursive decision trees have emerged as a leading methodology for heterogeneous causal treatment effect estimation and inference in experimental and observational settings. These procedures are fitted using the celebrated CART (Classification And Regression Tree) algorithm [Breiman et al., 1984], or custom variants thereof, and hence are believed to be "adaptive" to high-dimensional data, sparsit…

    Submitted 14 September, 2025; originally announced September 2025.

  2. arXiv:2509.08483  [pdf, ps, other]

    cs.LG math.NA math.OC stat.CO stat.ML

    Modified Loss of Momentum Gradient Descent: Fine-Grained Analysis

    Authors: Matias D. Cattaneo, Boris Shigida

    Abstract: We analyze gradient descent with Polyak heavy-ball momentum (HB) whose fixed momentum parameter $\beta \in (0, 1)$ provides exponential decay of memory. Building on Kovachki and Stuart (2021), we prove that on an exponentially attractive invariant manifold the algorithm is exactly plain gradient descent with a modified loss, provided that the step size $h$ is small enough. Although the modified loss do…

    Submitted 10 September, 2025; originally announced September 2025.

  3. arXiv:2508.03878  [pdf, ps, other]

    stat.ME econ.EM stat.AP

    The Regression Discontinuity Design in Medical Science

    Authors: Matias D. Cattaneo, Rocio Titiunik

    Abstract: This article provides an introduction to the Regression Discontinuity (RD) design, and its application to empirical research in the medical sciences. While the main focus of this article is on causal interpretation, key concepts of estimation and inference are also briefly mentioned. A running medical empirical example is provided.

    Submitted 5 August, 2025; originally announced August 2025.

  4. arXiv:2507.14311  [pdf, ps, other]

    econ.EM stat.AP

    Leveraging Covariates in Regression Discontinuity Designs

    Authors: Matias D. Cattaneo, Filippo Palomba

    Abstract: It is common practice to incorporate additional covariates in empirical economics. In the context of Regression Discontinuity (RD) designs, covariate adjustment plays multiple roles, making it essential to understand its impact on analysis and conclusions. Typically implemented via local least squares regressions, covariate adjustment can serve three main distinct purposes: (i) improving the effic…

    Submitted 18 July, 2025; originally announced July 2025.

  5. arXiv:2505.07989  [pdf, ps, other]

    stat.ME econ.EM stat.CO

    rd2d: Causal Inference in Boundary Discontinuity Designs

    Authors: Matias D. Cattaneo, Rocio Titiunik, Ruiqi Rae Yu

    Abstract: Boundary discontinuity designs -- also known as Multi-Score Regression Discontinuity (RD) designs, with Geographic RD designs as a prominent example -- are often used in empirical research to learn about causal treatment effects along a continuous assignment boundary defined by a bivariate score. This article introduces the R package rd2d, which implements and extends the methodological results de…

    Submitted 10 June, 2025; v1 submitted 12 May, 2025; originally announced May 2025.

  6. arXiv:2505.05670  [pdf, ps, other]

    econ.EM math.ST stat.AP stat.ME

    Estimation and Inference in Boundary Discontinuity Designs

    Authors: Matias D. Cattaneo, Rocio Titiunik, Ruiqi Rae Yu

    Abstract: Boundary Discontinuity Designs are used to learn about treatment effects along a continuous boundary that splits units into control and treatment groups according to a bivariate score variable. These research designs are also called Multi-Score Regression Discontinuity Designs, a leading special case being Geographic Regression Discontinuity Designs. We study the statistical properties of commonly…

    Submitted 8 May, 2025; originally announced May 2025.

  7. arXiv:2503.13696  [pdf, ps, other]

    econ.EM stat.ME

    Treatment Effect Heterogeneity in Regression Discontinuity Designs

    Authors: Sebastian Calonico, Matias D. Cattaneo, Max H. Farrell, Filippo Palomba, Rocio Titiunik

    Abstract: Empirical studies using Regression Discontinuity (RD) designs often explore heterogeneous treatment effects based on pretreatment covariates, even though no formal statistical methods exist for such analyses. This has led to the widespread use of ad hoc approaches in applications. Motivated by common empirical practice, we develop a unified, theoretically grounded framework for RD heterogeneity an…

    Submitted 3 July, 2025; v1 submitted 17 March, 2025; originally announced March 2025.

  8. arXiv:2502.13238  [pdf, ps, other]

    stat.ME econ.EM

    Robust Inference for the Direct Average Treatment Effect with Treatment Assignment Interference

    Authors: Matias D. Cattaneo, Yihan He, Ruiqi Rae Yu

    Abstract: This paper develops methods for uncertainty quantification in causal inference settings with random network interference. We study the large-sample distributional properties of the classical difference-in-means Hajek treatment effect estimator, and propose a robust inference procedure for the (conditional) direct average treatment effect. Our framework allows for cross-unit interference in both th…

    Submitted 26 June, 2025; v1 submitted 18 February, 2025; originally announced February 2025.

  9. arXiv:2502.02132  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    How Memory in Optimization Algorithms Implicitly Modifies the Loss

    Authors: Matias D. Cattaneo, Boris Shigida

    Abstract: In modern optimization methods used in deep learning, each update depends on the history of previous iterations, often referred to as memory, and this dependence decays fast as the iterates go further into the past. For example, gradient descent with momentum has exponentially decaying memory through exponentially averaged past gradients. We introduce a general technique for identifying a memoryle…

    Submitted 4 February, 2025; originally announced February 2025.

  10. arXiv:2410.15477  [pdf, ps, other]

    stat.ME stat.AP

    Randomization Inference for Before-and-After Studies with Multiple Units: An Application to a Criminal Procedure Reform in Uruguay

    Authors: Matias D. Cattaneo, Carlos Diaz, Rocio Titiunik

    Abstract: Learning about the immediate causal effects of large-scale policy interventions poses a significant challenge for quasi-experimental methods that rely on long-term trends or parametric modeling assumptions. As an alternative, we develop a randomization inference framework for before-and-after studies with multiple units, designed specifically for short-term causal inference and allowing for genera…

    Submitted 8 August, 2025; v1 submitted 20 October, 2024; originally announced October 2024.

  11. arXiv:2407.15276  [pdf, other]

    stat.ME econ.EM math.ST

    Nonlinear Binscatter Methods

    Authors: Matias D. Cattaneo, Richard K. Crump, Max H. Farrell, Yingjie Feng

    Abstract: Binned scatter plots are a powerful statistical tool for empirical work in the social, behavioral, and biomedical sciences. Available methods rely on a quantile-based partitioning estimator of the conditional mean regression function to primarily construct flexible yet interpretable visualization methods, but they can also be used to estimate treatment effects, assess uncertainty, and test substan…

    Submitted 21 July, 2024; originally announced July 2024.

  12. arXiv:2406.04191  [pdf, other]

    math.ST econ.EM math.PR stat.ME

    Strong Approximations for Empirical Processes Indexed by Lipschitz Functions

    Authors: Matias D. Cattaneo, Ruiqi Rae Yu

    Abstract: This paper presents new uniform Gaussian strong approximations for empirical processes indexed by classes of functions based on $d$-variate random vectors ($d\geq1$). First, a uniform Gaussian strong approximation is established for general empirical processes indexed by possibly Lipschitz functions, improving on previous results in the literature. In the setting considered by Rio (1994), and if t…

    Submitted 12 November, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  13. arXiv:2402.11640  [pdf, ps, other]

    stat.ME

    Protocols for Observational Studies: An Application to Regression Discontinuity Designs

    Authors: Matias D. Cattaneo, Rocio Titiunik

    Abstract: In his 2022 IMS Medallion Lecture delivered at the Joint Statistical Meetings, Prof. Dylan S. Small eloquently advocated for the use of protocols in observational studies. We discuss his proposal and, inspired by his ideas, we develop a protocol for the regression discontinuity design.

    Submitted 18 February, 2024; originally announced February 2024.

  14. arXiv:2310.09702  [pdf, other]

    math.ST stat.ME stat.ML

    Inference with Mondrian Random Forests

    Authors: Matias D. Cattaneo, Jason M. Klusowski, William G. Underwood

    Abstract: Random forests are popular methods for regression and classification analysis, and many different variants have been proposed in recent years. One interesting example is the Mondrian random forest, in which the underlying constituent trees are constructed via a Mondrian process. We give precise bias and variance characterizations, along with a Berry-Esseen-type central limit theorem, for the Mondr…

    Submitted 8 April, 2025; v1 submitted 14 October, 2023; originally announced October 2023.

    Comments: 64 pages, 1 figure, 6 tables

    MSC Class: 62G08 (Primary); 62G05; 62G20 (Secondary)

  15. arXiv:2309.00079  [pdf, other]

    cs.LG cs.AI math.OC stat.CO stat.ML

    On the Implicit Bias of Adam

    Authors: Matias D. Cattaneo, Jason M. Klusowski, Boris Shigida

    Abstract: In previous literature, backward error analysis was used to find ordinary differential equations (ODEs) approximating the gradient descent trajectory. It was found that finite step sizes implicitly regularize solutions because terms appearing in the ODEs penalize the two-norm of the loss gradients. We prove that the existence of similar implicit regularization in RMSProp and Adam depends on their…

    Submitted 16 June, 2024; v1 submitted 31 August, 2023; originally announced September 2023.

  16. arXiv:2305.10934  [pdf, ps, other]

    econ.TH econ.EM stat.ME

    Context-Dependent Heterogeneous Preferences: A Comment on Barseghyan and Molinari (2023)

    Authors: Matias D. Cattaneo, Xinwei Ma, Yusufcan Masatlioglu

    Abstract: Barseghyan and Molinari (2023) give sufficient conditions for semi-nonparametric point identification of parameters of interest in a mixture model of decision-making under risk, allowing for unobserved heterogeneity in utility functions and limited consideration. A key assumption in the model is that the heterogeneity of risk preferences is unobservable but context-independent. In this comment, we…

    Submitted 18 May, 2023; originally announced May 2023.

  17. arXiv:2303.13598  [pdf, ps, other]

    math.ST econ.EM stat.ME

    Bootstrap-Assisted Inference for Generalized Grenander-type Estimators

    Authors: Matias D. Cattaneo, Michael Jansson, Kenichi Nagasawa

    Abstract: Westling and Carone (2020) proposed a framework for studying the large sample distributional properties of generalized Grenander-type estimators, a versatile class of nonparametric estimators of monotone functions. The limiting distribution of those estimators is representable as the left derivative of the greatest convex minorant of a Gaussian process whose monomial mean can be of unknown order (…

    Submitted 4 July, 2024; v1 submitted 23 March, 2023; originally announced March 2023.

  18. arXiv:2302.07413  [pdf, other]

    stat.ME econ.EM stat.AP

    A Guide to Regression Discontinuity Designs in Medical Applications

    Authors: Matias D. Cattaneo, Luke Keele, Rocio Titiunik

    Abstract: We present a practical guide for the analysis of regression discontinuity (RD) designs in biomedical contexts. We begin by introducing key concepts, assumptions, and estimands within both the continuity-based framework and the local randomization framework. We then discuss modern estimation and inference methods within both frameworks, including approaches for bandwidth or local neighborhood selec…

    Submitted 16 May, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

  19. arXiv:2301.08958  [pdf, other]

    stat.ME econ.EM stat.AP stat.CO

    A Practical Introduction to Regression Discontinuity Designs: Extensions

    Authors: Matias D. Cattaneo, Nicolas Idrobo, Rocio Titiunik

    Abstract: This monograph, together with its accompanying first part Cattaneo, Idrobo and Titiunik (2020), collects and expands the instructional materials we prepared for more than $50$ short courses and workshops on Regression Discontinuity (RD) methodology that we taught between 2014 and 2023. In this second monograph, we discuss several topics in RD methodology that build on and extend the analysis of RD…

    Submitted 25 March, 2024; v1 submitted 21 January, 2023; originally announced January 2023.

  20. arXiv:2301.00277  [pdf, ps, other]

    econ.EM math.ST stat.ME

    Higher-order Refinements of Small Bandwidth Asymptotics for Density-Weighted Average Derivative Estimators

    Authors: Matias D. Cattaneo, Max H. Farrell, Michael Jansson, Ricardo Masini

    Abstract: The density weighted average derivative (DWAD) of a regression function is a canonical parameter of interest in economics. Classical first-order large sample distribution theory for kernel-based DWAD estimators relies on tuning parameter restrictions and model assumptions that imply an asymptotic linear representation of the point estimator. These conditions can be restrictive, and the resulting d…

    Submitted 15 February, 2024; v1 submitted 31 December, 2022; originally announced January 2023.

  21. arXiv:2211.10805  [pdf, other]

    stat.ML cs.LG math.ST

    On the Pointwise Behavior of Recursive Partitioning and Its Implications for Heterogeneous Causal Effect Estimation

    Authors: Matias D. Cattaneo, Jason M. Klusowski, Peter M. Tian

    Abstract: Decision tree learning is increasingly being used for pointwise inference. Important applications include causal heterogeneous treatment effects and dynamic policy decisions, as well as conditional quantile regression and design of experiments, where tree estimation and inference is conducted at specific values of the covariates. In this paper, we call into question the use of decision trees (train…

    Submitted 6 February, 2024; v1 submitted 19 November, 2022; originally announced November 2022.

  22. arXiv:2210.14429  [pdf, other]

    math.ST stat.ME

    Convergence Rates of Oblique Regression Trees for Flexible Function Libraries

    Authors: Matias D. Cattaneo, Rajita Chandak, Jason M. Klusowski

    Abstract: We develop a theoretical framework for the analysis of oblique decision trees, where the splits at each decision node occur at linear combinations of the covariates (as opposed to conventional tree constructions that force axis-aligned splits involving only a single covariate). While this methodology has garnered significant attention from the computer science and optimization communities since th…

    Submitted 30 August, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

  23. arXiv:2210.05026  [pdf, ps, other]

    econ.EM stat.AP stat.CO stat.ME

    Uncertainty Quantification in Synthetic Controls with Staggered Treatment Adoption

    Authors: Matias D. Cattaneo, Yingjie Feng, Filippo Palomba, Rocio Titiunik

    Abstract: We propose principled prediction intervals to quantify the uncertainty of a large class of synthetic control predictions (or estimators) in settings with staggered treatment adoption, offering precise non-asymptotic coverage probability guarantees. From a methodological perspective, we provide a detailed discussion of different causal quantities to be predicted, which we call causal predictands, a…

    Submitted 1 February, 2025; v1 submitted 10 October, 2022; originally announced October 2022.

  24. arXiv:2210.00362  [pdf, ps, other]

    math.ST econ.EM stat.ME

    Yurinskii's Coupling for Martingales

    Authors: Matias D. Cattaneo, Ricardo P. Masini, William G. Underwood

    Abstract: Yurinskii's coupling is a popular theoretical tool for non-asymptotic distributional analysis in mathematical statistics and applied probability, offering a Gaussian strong approximation with an explicit error bound under easily verifiable conditions. Originally stated in $\ell_2$-norm for sums of independent random vectors, it has recently been extended both to the $\ell_p$-norm, for…

    Submitted 4 August, 2025; v1 submitted 1 October, 2022; originally announced October 2022.

    Comments: 56 pages, 1 figure

    MSC Class: 62E20; 62G20; 60G42

  25. arXiv:2204.10375  [pdf, other]

    stat.CO stat.AP stat.ME

    lpcde: Estimation and Inference for Local Polynomial Conditional Density Estimators

    Authors: Matias D. Cattaneo, Rajita Chandak, Michael Jansson, Xinwei Ma

    Abstract: This paper discusses the R package lpcde, which stands for local polynomial conditional density estimation. It implements the kernel-based local polynomial smoothing methods introduced in Cattaneo, Chandak, Jansson, Ma (2024) for statistical estimation and inference of conditional distributions, densities, and derivatives thereof. The package offers mean square error optimal bandwidth selection an…

    Submitted 7 March, 2025; v1 submitted 21 April, 2022; originally announced April 2022.

  26. arXiv:2204.10359  [pdf, other]

    math.ST econ.EM stat.ME

    Boundary Adaptive Local Polynomial Conditional Density Estimators

    Authors: Matias D. Cattaneo, Rajita Chandak, Michael Jansson, Xinwei Ma

    Abstract: We begin by introducing a class of conditional density estimators based on local polynomial techniques. The estimators are boundary adaptive and easy to implement. We then study the (pointwise and) uniform statistical properties of the estimators, offering characterizations of both probability concentration and distributional approximation. In particular, we establish uniform convergence rates in…

    Submitted 17 December, 2023; v1 submitted 21 April, 2022; originally announced April 2022.

  27. arXiv:2202.05984  [pdf, other]

    stat.ME econ.EM stat.AP stat.CO

    scpi: Uncertainty Quantification for Synthetic Control Methods

    Authors: Matias D. Cattaneo, Yingjie Feng, Filippo Palomba, Rocio Titiunik

    Abstract: The synthetic control method offers a way to quantify the effect of an intervention using weighted averages of untreated units to approximate the counterfactual outcome that the treated unit(s) would have experienced in the absence of the intervention. This method is useful for program evaluation and causal inference in observational studies. We introduce the software package scpi for prediction a…

    Submitted 11 October, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

  28. arXiv:2201.05967  [pdf, other]

    math.ST stat.ME

    Uniform Inference for Kernel Density Estimators with Dyadic Data

    Authors: Matias D. Cattaneo, Yingjie Feng, William G. Underwood

    Abstract: Dyadic data is often encountered when quantities of interest are associated with the edges of a network. As such it plays an important role in statistics, econometrics and many other data science disciplines. We consider the problem of uniformly estimating a dyadic Lebesgue density function, focusing on nonparametric kernel-based estimators taking the form of dyadic empirical processes. Our main c…

    Submitted 13 October, 2023; v1 submitted 15 January, 2022; originally announced January 2022.

    Comments: Article: 23 pages, 3 figures. Supplemental appendix: 72 pages, 3 figures

    MSC Class: 62G05; 62G07; 62M99 (Primary) 91D30; 90B15 (Secondary)

  29. arXiv:2110.08410  [pdf, ps, other]

    stat.ME econ.EM

    Covariate Adjustment in Regression Discontinuity Designs

    Authors: Matias D. Cattaneo, Luke Keele, Rocio Titiunik

    Abstract: The Regression Discontinuity (RD) design is a widely used non-experimental method for causal inference and program evaluation. While its canonical formulation only requires a score and an outcome variable, it is common in empirical work to encounter RD analyses where additional variables are used for adjustment. This practice has led to misconceptions about the role of covariate adjustment in RD a…

    Submitted 24 August, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

  30. arXiv:2108.09400  [pdf, other]

    econ.EM stat.AP stat.ME

    Regression Discontinuity Designs

    Authors: Matias D. Cattaneo, Rocio Titiunik

    Abstract: The Regression Discontinuity (RD) design is one of the most widely used non-experimental methods for causal inference and program evaluation. Over the last two decades, statistical and econometric methods for RD analysis have expanded and matured, and there is now a large number of methodological results for RD identification, estimation, inference, and validation. We offer a curated review of thi…

    Submitted 24 February, 2022; v1 submitted 20 August, 2021; originally announced August 2021.

  31. arXiv:2009.14367  [pdf, other]

    econ.EM math.ST stat.ME

    Local Regression Distribution Estimators

    Authors: Matias D. Cattaneo, Michael Jansson, Xinwei Ma

    Abstract: This paper investigates the large sample properties of local regression distribution estimators, which include a class of boundary adaptive density estimators as a prime example. First, we establish a pointwise Gaussian large sample distributional approximation in a unified way, allowing for both boundary and interior evaluation points simultaneously. Using this result, we study the asymptotic eff…

    Submitted 28 January, 2021; v1 submitted 29 September, 2020; originally announced September 2020.

  32. arXiv:1912.07346  [pdf, other]

    stat.CO econ.EM

    Analysis of Regression Discontinuity Designs with Multiple Cutoffs or Multiple Scores

    Authors: Matias D. Cattaneo, Rocio Titiunik, Gonzalo Vazquez-Bare

    Abstract: We introduce the \texttt{Stata} (and \texttt{R}) package \texttt{rdmulti}, which includes three commands (\texttt{rdmc}, \texttt{rdmcplot}, \texttt{rdms}) for analyzing Regression Discontinuity (RD) designs with multiple cutoffs or multiple scores. The command \texttt{rdmc} applies to non-cumulative and cumulative multi-cutoff RD settings. It calculates pooled and cutoff-specific RD treatment effe…

    Submitted 25 April, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

  33. arXiv:1912.07120  [pdf, other]

    stat.ME econ.EM

    Prediction Intervals for Synthetic Control Methods

    Authors: Matias D. Cattaneo, Yingjie Feng, Rocio Titiunik

    Abstract: Uncertainty quantification is a fundamental problem in the analysis and interpretation of synthetic control (SC) methods. We develop conditional prediction intervals in the SC framework, and provide conditions under which these intervals offer finite-sample probability guarantees. Our method allows for covariate adjustment and non-stationary data. The construction begins by noting that the statist…

    Submitted 7 September, 2021; v1 submitted 15 December, 2019; originally announced December 2019.

  34. arXiv:1911.09511  [pdf, other]

    stat.ME econ.EM stat.AP stat.CO

    A Practical Introduction to Regression Discontinuity Designs: Foundations

    Authors: Matias D. Cattaneo, Nicolas Idrobo, Rocio Titiunik

    Abstract: In this Element and its accompanying Element, Matias D. Cattaneo, Nicolas Idrobo, and Rocio Titiunik provide an accessible and practical guide for the analysis and interpretation of Regression Discontinuity (RD) designs that encourages the use of a common set of practices and facilitates the accumulation of RD-based empirical evidence. In this Element, the authors discuss the foundations of the ca…

    Submitted 21 November, 2019; originally announced November 2019.

  35. arXiv:1906.06529  [pdf, other]

    stat.CO econ.EM stat.AP

    lpdensity: Local Polynomial Density Estimation and Inference

    Authors: Matias D. Cattaneo, Michael Jansson, Xinwei Ma

    Abstract: Density estimation and inference methods are widely used in empirical work. When the underlying distribution has compact support, conventional kernel-based density estimators are no longer consistent near or at the boundary because of their well-known boundary bias. Alternative smoothing methods are available to handle boundary points in density estimation, but they all require additional tuning p…

    Submitted 22 February, 2021; v1 submitted 15 June, 2019; originally announced June 2019.

  36. arXiv:1906.04242  [pdf, other]

    econ.EM stat.AP stat.ME

    The Regression Discontinuity Design

    Authors: Matias D. Cattaneo, Rocio Titiunik, Gonzalo Vazquez-Bare

    Abstract: This handbook chapter gives an introduction to the sharp regression discontinuity design, covering identification, estimation, inference, and falsification methods.

    Submitted 1 June, 2020; v1 submitted 10 June, 2019; originally announced June 2019.

  37. arXiv:1906.00202  [pdf, other]

    stat.CO econ.EM stat.ME

    lspartition: Partitioning-Based Least Squares Regression

    Authors: Matias D. Cattaneo, Max H. Farrell, Yingjie Feng

    Abstract: Nonparametric partitioning-based least squares regression is an important tool in empirical work. Common examples include regressions based on splines, wavelets, and piecewise polynomials. This article discusses the main methodological and numerical features of the R software package lspartition, which implements modern estimation and inference results for partitioning-based least squares (series)…

    Submitted 8 August, 2019; v1 submitted 1 June, 2019; originally announced June 2019.

    Journal ref: R Journal 12(1): 172-187, 2020

  38. arXiv:1906.00198  [pdf, other]

    stat.CO econ.EM stat.ME

    nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference

    Authors: Sebastian Calonico, Matias D. Cattaneo, Max H. Farrell

    Abstract: Nonparametric kernel density and local polynomial regression estimators are very popular in Statistics, Economics, and many other disciplines. They are routinely employed in applied work, either as part of the main empirical analysis or as a preliminary ingredient entering some other estimation or inference procedure. This article describes the main methodological and numerical features of the sof…

    Submitted 1 June, 2019; originally announced June 2019.

    Journal ref: Journal of Statistical Software, 91(8): 1-33, 2019

  39. arXiv:1902.09615  [pdf, other]

    econ.EM stat.CO

    Binscatter Regressions

    Authors: Matias D. Cattaneo, Richard K. Crump, Max H. Farrell, Yingjie Feng

    Abstract: We introduce the package Binsreg, which implements the binscatter methods developed by Cattaneo, Crump, Farrell, and Feng (2024b,a). The package includes seven commands: binsreg, binslogit, binsprobit, binsqreg, binstest, binspwc, and binsregselect. The first four commands implement binscatter plotting, point estimation, and uncertainty quantification (confidence intervals and confidence bands) fo…

    Submitted 24 July, 2024; v1 submitted 25 February, 2019; originally announced February 2019.

  40. arXiv:1902.09608  [pdf, other]

    econ.EM stat.ME stat.ML

    On Binscatter

    Authors: Matias D. Cattaneo, Richard K. Crump, Max H. Farrell, Yingjie Feng

    Abstract: Binscatter is a popular method for visualizing bivariate relationships and conducting informal specification testing. We study the properties of this method formally and develop enhanced visualization and econometric binscatter tools. These include estimating conditional means with optimal binning and quantifying uncertainty. We also highlight a methodological problem related to covariate adjustme…

    Submitted 30 April, 2024; v1 submitted 25 February, 2019; originally announced February 2019.

    Journal ref: American Economic Review, 114(5) 1488-1514, 2024

  41. arXiv:1811.11512  [pdf, other]

    econ.EM stat.ME

    Simple Local Polynomial Density Estimators

    Authors: Matias D. Cattaneo, Michael Jansson, Xinwei Ma

    Abstract: This paper introduces an intuitive and easy-to-implement nonparametric density estimator based on local polynomial techniques. The estimator is fully boundary adaptive and automatic, but does not require pre-binning or any other transformation of the data. We study the main asymptotic properties of the estimator, and use these results to provide principled estimation, inference, and bandwidth sele…

    Submitted 7 June, 2019; v1 submitted 28 November, 2018; originally announced November 2018.

  42. arXiv:1809.03904  [pdf, ps, other]

    econ.EM stat.ME

    Regression Discontinuity Designs Using Covariates

    Authors: Sebastian Calonico, Matias D. Cattaneo, Max H. Farrell, Rocio Titiunik

    Abstract: We study regression discontinuity designs when covariates are included in the estimation. We examine local polynomial estimators that include discrete or continuous covariates in an additive separable way, but without imposing any parametric restrictions on the underlying population regression functions. We recommend a covariate-adjustment approach that retains consistency under intuitive conditio…

    Submitted 11 September, 2018; originally announced September 2018.

    Journal ref: Review of Economics and Statistics, 101(3), 442--451, 2019

  43. arXiv:1809.03584  [pdf, other]

    econ.EM econ.GN stat.ME

    Characteristic-Sorted Portfolios: Estimation and Inference

    Authors: Matias D. Cattaneo, Richard K. Crump, Max H. Farrell, Ernst Schaumburg

    Abstract: Portfolio sorting is ubiquitous in the empirical finance literature, where it has been widely used to identify pricing anomalies. Despite its popularity, little attention has been paid to the statistical properties of the procedure. We develop a general framework for portfolio sorting by casting it as a nonparametric estimator. We present valid asymptotic inference methods and a valid mean square…

    Submitted 5 October, 2019; v1 submitted 10 September, 2018; originally announced September 2018.

    Journal ref: Review of Economics and Statistics, 102(3), 531--551, 2020

  44. arXiv:1809.00236  [pdf, other]

    econ.EM stat.ME

    Optimal Bandwidth Choice for Robust Bias Corrected Inference in Regression Discontinuity Designs

    Authors: Sebastian Calonico, Matias D. Cattaneo, Max H. Farrell

    Abstract: Modern empirical work in Regression Discontinuity (RD) designs often employs local polynomial estimation and inference with a mean square error (MSE) optimal bandwidth choice. This bandwidth yields an MSE-optimal RD treatment effect estimator, but is by construction invalid for inference. Robust bias corrected (RBC) inference methods are valid when using the MSE-optimal bandwidth, but we show they…

    Submitted 2 January, 2020; v1 submitted 1 September, 2018; originally announced September 2018.

    Journal ref: Econometrics Journal, 23(2), 192--210, 2020

  45. arXiv:1808.04416  [pdf, other]

    econ.EM stat.AP stat.ME

    Extrapolating Treatment Effects in Multi-Cutoff Regression Discontinuity Designs

    Authors: Matias D. Cattaneo, Luke Keele, Rocio Titiunik, Gonzalo Vazquez-Bare

    Abstract: In non-experimental settings, the Regression Discontinuity (RD) design is one of the most credible identification strategies for program evaluation and causal inference. However, RD treatment effect estimands are necessarily local, making statistical methods for the extrapolation of these effects a key area for development. We introduce a new method for extrapolation of RD effects that relies on t…

    Submitted 1 April, 2020; v1 submitted 13 August, 2018; originally announced August 2018.

  46. arXiv:1807.10100  [pdf, other]

    econ.EM math.ST stat.ME

    Two-Step Estimation and Inference with Possibly Many Included Covariates

    Authors: Matias D. Cattaneo, Michael Jansson, Xinwei Ma

    Abstract: We study the implications of including many covariates in a first-step estimate entering a two-step estimation procedure. We find that a first order bias emerges when the number of \textit{included} covariates is "large" relative to the square-root of sample size, rendering standard inference procedures invalid. We show that the jackknife is able to estimate this "many covariates" bias consistentl…

    Submitted 26 July, 2018; originally announced July 2018.

  47. arXiv:1712.03448  [pdf, other]

    econ.EM econ.TH stat.ME

    A Random Attention Model

    Authors: Matias D. Cattaneo, Xinwei Ma, Yusufcan Masatlioglu, Elchin Suleymanov

    Abstract: This paper illustrates how one can deduce preference from observed choices when attention is not only limited but also random. In contrast to earlier approaches, we introduce a Random Attention Model (RAM) where we abstain from any particular attention formation, and instead consider a large class of nonparametric random attention rules. Our model imposes one intuitive condition, termed Monotonic…

    Submitted 29 August, 2019; v1 submitted 9 December, 2017; originally announced December 2017.

  48. arXiv:1704.08066  [pdf, ps, other]

    math.ST econ.EM stat.ME

    Bootstrap-Based Inference for Cube Root Asymptotics

    Authors: Matias D. Cattaneo, Michael Jansson, Kenichi Nagasawa

    Abstract: This paper proposes a valid bootstrap-based distributional approximation for M-estimators exhibiting a Chernoff (1964)-type limiting distribution. For estimators of this kind, the standard nonparametric bootstrap is inconsistent. The method proposed herein is based on the nonparametric bootstrap, but restores consistency by altering the shape of the criterion function defining the estimator whose…

    Submitted 29 May, 2020; v1 submitted 26 April, 2017; originally announced April 2017.

  49. arXiv:1507.02493  [pdf, ps, other]

    math.ST econ.EM stat.ME

    Inference in Linear Regression Models with Many Covariates and Heteroskedasticity

    Authors: Matias D. Cattaneo, Michael Jansson, Whitney K. Newey

    Abstract: The linear regression model is widely used in empirical work in Economics, Statistics, and many other disciplines. Researchers often include many covariates in their linear model specification in an attempt to control for confounders. We give inference methods that allow for many covariates and heteroskedasticity. Our results are obtained using high-dimensional approximations, where the number of…

    Submitted 16 January, 2017; v1 submitted 9 July, 2015; originally announced July 2015.