Showing 1–7 of 7 results for author: Blei, D

Searching in archive econ.
  1. arXiv:2409.09894

    cs.LG econ.EM stat.ME stat.ML

    Estimating Wage Disparities Using Foundation Models

    Authors: Keyon Vafa, Susan Athey, David M. Blei

    Abstract: The rise of foundation models marks a paradigm shift in machine learning: instead of training specialized models from scratch, foundation models are first trained on massive datasets before being adapted or fine-tuned to make predictions on smaller datasets. Initially developed for text, foundation models have also excelled at making predictions about social science data. However, while many estim…

    Submitted 29 April, 2025; v1 submitted 15 September, 2024; originally announced September 2024.

  2. arXiv:2302.12777

    stat.ME econ.EM

    On the Misspecification of Linear Assumptions in Synthetic Control

    Authors: Achille Nazaret, Claudia Shi, David M. Blei

    Abstract: The synthetic control (SC) method is a popular approach for estimating treatment effects from observational panel data. It rests on a crucial assumption that we can write the treated unit as a linear combination of the untreated units. This linearity assumption, however, can be unlikely to hold in practice and, when violated, the resulting SC estimates are incorrect. In this paper we examine two q…

    Submitted 24 February, 2023; originally announced February 2023.

  3. arXiv:2202.08370

    cs.LG econ.EM

    CAREER: A Foundation Model for Labor Sequence Data

    Authors: Keyon Vafa, Emil Palikot, Tianyu Du, Ayush Kanodia, Susan Athey, David M. Blei

    Abstract: Labor economists regularly analyze employment data by fitting predictive models to small, carefully constructed longitudinal survey datasets. Although machine learning methods offer promise for such problems, these survey datasets are too small to take advantage of them. In recent years large datasets of online resumes have also become available, providing data about the career trajectories of mil…

    Submitted 29 February, 2024; v1 submitted 16 February, 2022; originally announced February 2022.

  4. arXiv:2112.05671

    stat.ME econ.EM

    On the Assumptions of Synthetic Control Methods

    Authors: Claudia Shi, Dhanya Sridhar, Vishal Misra, David M. Blei

    Abstract: Synthetic control (SC) methods have been widely applied to estimate the causal effect of large-scale interventions, e.g., the state-wide effect of a change in policy. The idea of synthetic controls is to approximate one unit's counterfactual outcomes using a weighted combination of some other units' observed outcomes. The motivating question of this paper is: how does the SC strategy lead to valid…

    Submitted 14 December, 2021; v1 submitted 10 December, 2021; originally announced December 2021.

  5. arXiv:1906.02635

    cs.LG econ.EM stat.ML

    Counterfactual Inference for Consumer Choice Across Many Product Categories

    Authors: Rob Donnelly, Francisco R. Ruiz, David Blei, Susan Athey

    Abstract: This paper proposes a method for estimating consumer preferences among discrete choices, where the consumer chooses at most one product in a category, but selects from multiple categories in parallel. The consumer's utility is additive in the different categories. Her preferences about product attributes as well as her price sensitivity vary across products and are in general correlated across pro…

    Submitted 6 August, 2023; v1 submitted 6 June, 2019; originally announced June 2019.

    Journal ref: Quantitative Marketing and Economics, volume 19, pages 369-407 (2021)

  6. arXiv:1801.07826

    econ.EM cs.AI stat.AP stat.ML

    Estimating Heterogeneous Consumer Preferences for Restaurants and Travel Time Using Mobile Location Data

    Authors: Susan Athey, David Blei, Robert Donnelly, Francisco Ruiz, Tobias Schmidt

    Abstract: This paper analyzes consumer choices over lunchtime restaurants using data from a sample of several thousand anonymous mobile phone users in the San Francisco Bay Area. The data is used to identify users' approximate typical morning location, as well as their choices of lunchtime restaurants. We build a model where restaurants have latent characteristics (whose distribution may depend on restauran…

    Submitted 22 January, 2018; originally announced January 2018.

  7. arXiv:1711.03560

    stat.ML cs.LG econ.EM

    SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements

    Authors: Francisco J. R. Ruiz, Susan Athey, David M. Blei

    Abstract: We develop SHOPPER, a sequential probabilistic model of shopping data. SHOPPER uses interpretable components to model the forces that drive how a customer chooses products; in particular, we designed SHOPPER to capture how items interact with other items. We develop an efficient posterior inference algorithm to estimate these forces from large-scale data, and we analyze a large dataset from a majo…

    Submitted 9 June, 2019; v1 submitted 9 November, 2017; originally announced November 2017.

    Comments: Published at Annals of Applied Statistics. 27 pages, 4 figures