
Showing 1–11 of 11 results for author: Austern, M

  1. arXiv:2503.17809  [pdf, other]

    stat.ML cs.LG math.ST

    Poisson-Process Topic Model for Integrating Knowledge from Pre-trained Language Models

    Authors: Morgane Austern, Yuanchuan Guo, Zheng Tracy Ke, Tianle Liu

    Abstract: Topic modeling is traditionally applied to word counts without accounting for the context in which words appear. Recent advancements in large language models (LLMs) offer contextualized word embeddings, which capture deeper meaning and relationships between words. We aim to leverage such embeddings to improve topic modeling. We use a pre-trained LLM to convert each document into a sequence of wo…

    Submitted 22 March, 2025; originally announced March 2025.

    Comments: 35 pages, 9 figures, 3 tables

    MSC Class: 62G07

  2. arXiv:2402.07340  [pdf, other]

    cs.LG cs.IT cs.SI math.PR math.ST stat.ML

    Perfect Recovery for Random Geometric Graph Matching with Shallow Graph Neural Networks

    Authors: Suqi Liu, Morgane Austern

    Abstract: We study the graph matching problem in the presence of vertex feature information using shallow graph neural networks. Specifically, given two graphs that are independent perturbations of a single random geometric graph with sparse binary features, the task is to recover an unknown one-to-one mapping between the vertices of the two graphs. We show under certain conditions on the sparsity and noise…

    Submitted 11 March, 2025; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: 27 pages, 5 figures, 3 tables; to appear in the Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS) 2025

  3. arXiv:2402.02692  [pdf, other]

    cs.LG cs.SI math.ST stat.ML

    Statistical Guarantees for Link Prediction using Graph Neural Networks

    Authors: Alan Chung, Amin Saberi, Morgane Austern

    Abstract: This paper derives statistical guarantees for the performance of Graph Neural Networks (GNNs) in link prediction tasks on graphs generated by a graphon. We propose a linear GNN architecture (LG-GNN) that produces consistent estimators for the underlying edge probabilities. We establish a bound on the mean squared error and give guarantees on the ability of LG-GNN to detect high-probability edges.…

    Submitted 7 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  4. arXiv:2303.04416  [pdf, other]

    econ.EM cs.LG math.ST stat.ME

    Inference on Optimal Dynamic Policies via Softmax Approximation

    Authors: Qizhao Chen, Morgane Austern, Vasilis Syrgkanis

    Abstract: Estimating optimal dynamic policies from offline data is a fundamental problem in dynamic decision making. In the context of causal inference, the problem is known as estimating the optimal dynamic treatment regime. Even though there exists a plethora of methods for estimation, constructing confidence intervals for the value of the optimal regime and structural parameters associated with it is inh…

    Submitted 13 December, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

  5. arXiv:2206.01825  [pdf, other]

    econ.EM cs.LG math.ST stat.ME

    Debiased Machine Learning without Sample-Splitting for Stable Estimators

    Authors: Qizhao Chen, Vasilis Syrgkanis, Morgane Austern

    Abstract: Estimation and inference on causal parameters is typically reduced to a generalized method of moments problem, which involves auxiliary functions that correspond to solutions to a regression or classification problem. A recent line of work on debiased machine learning shows how one can use generic machine learning estimators for these auxiliary problems, while maintaining asymptotic normality and ro…

    Submitted 14 November, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

  6. arXiv:2202.09134  [pdf, other]

    cs.LG math.ST stat.ML

    Gaussian and Non-Gaussian Universality of Data Augmentation

    Authors: Kevin Han Huang, Peter Orbanz, Morgane Austern

    Abstract: We provide universality results that quantify how data augmentation affects the variance and limiting distribution of estimates through simple surrogates, and analyze several specific models in detail. The results confirm some observations made in machine learning practice, but also lead to unexpected findings: Data augmentation may increase rather than decrease the uncertainty of estimates, such…

    Submitted 15 March, 2025; v1 submitted 18 February, 2022; originally announced February 2022.

  7. arXiv:2107.02363  [pdf, other]

    stat.ML cs.LG math.ST

    Asymptotics of Network Embeddings Learned via Subsampling

    Authors: Andrew Davison, Morgane Austern

    Abstract: Network data are ubiquitous in modern machine learning, with tasks of interest including node classification, node clustering and link prediction. A frequent approach begins by learning a Euclidean embedding of the network, to which algorithms developed for vector-valued data are applied. For large networks, embeddings are learned using stochastic gradient methods where the sub-sampling scheme ca…

    Submitted 17 May, 2023; v1 submitted 5 July, 2021; originally announced July 2021.

    Comments: Accepted at Journal of Machine Learning Research (JMLR). 120 pages, 3 figures, 1 table

    Journal ref: Journal of Machine Learning Research 24 (2023) 1-120. Published 5/23

  8. arXiv:2011.11248  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Asymptotics of the Empirical Bootstrap Method Beyond Asymptotic Normality

    Authors: Morgane Austern, Vasilis Syrgkanis

    Abstract: One of the most commonly used methods for forming confidence intervals for statistical inference is the empirical bootstrap, which is especially expedient when the limiting distribution of the estimator is unknown. However, despite its ubiquitous role, its theoretical properties are still not well understood for non-asymptotically normal estimators. In this paper, under stability conditions, we es…

    Submitted 23 November, 2020; originally announced November 2020.

  9. arXiv:1806.10701  [pdf, other]

    stat.ML cs.LG cs.SI

    Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data

    Authors: Victor Veitch, Morgane Austern, Wenda Zhou, David M. Blei, Peter Orbanz

    Abstract: Empirical risk minimization is the main tool for prediction problems, but its extension to relational data remains unsolved. We solve this problem using recent ideas from graph sampling theory to (i) define an empirical risk for relational data and (ii) obtain stochastic gradients for this empirical risk that are automatically unbiased. This is achieved by considering the method by which data is s…

    Submitted 22 February, 2019; v1 submitted 27 June, 2018; originally announced June 2018.

    Comments: Accepted as AISTATS 2019 Oral

  10. arXiv:1804.05862  [pdf, other]

    stat.ML cs.LG

    Non-Vacuous Generalization Bounds at the ImageNet Scale: A PAC-Bayesian Compression Approach

    Authors: Wenda Zhou, Victor Veitch, Morgane Austern, Ryan P. Adams, Peter Orbanz

    Abstract: Modern neural networks are highly overparameterized, with capacity to substantially overfit to training data. Nevertheless, these networks often generalize well in practice. It has also been observed that trained networks can often be "compressed" to much smaller representations. The purpose of this paper is to connect these two empirical observations. Our main technical result is a generalization…

    Submitted 24 February, 2019; v1 submitted 16 April, 2018; originally announced April 2018.

    Comments: 16 pages, 1 figure. Accepted at ICLR 2019

  11. arXiv:1702.01317  [pdf, ps, other]

    cs.IT

    On the Gaussianity of Kolmogorov Complexity of Mixing Sequences

    Authors: Morgane Austern, Arian Maleki

    Abstract: Let $K(X_1, \ldots, X_n)$ and $H(X_n | X_{n-1}, \ldots, X_1)$ denote the Kolmogorov complexity and Shannon's entropy rate of a stationary and ergodic process $\{X_i\}_{i=-\infty}^\infty$. It has been proved that \[ \frac{K(X_1, \ldots, X_n)}{n} - H(X_n | X_{n-1}, \ldots, X_1) \rightarrow 0, \] almost surely. This paper studies the convergence rate of this asymptotic result. In particular, we show…

    Submitted 4 February, 2017; originally announced February 2017.