Showing 51–100 of 160 results for author: Lipton, Z

  1. arXiv:2302.02551  [pdf, other]

    cs.CV cs.LG

    CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets

    Authors: Zachary Novack, Julian McAuley, Zachary C. Lipton, Saurabh Garg

    Abstract: Open vocabulary models (e.g. CLIP) have shown strong performance on zero-shot classification through their ability to generate embeddings for each class based on their (natural language) names. Prior work has focused on improving the accuracy of these models through prompt engineering or by incorporating a small amount of labeled downstream data (via finetuning). However, there has been little focus…

    Submitted 31 May, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023

  2. arXiv:2301.11724  [pdf, other]

    cs.LG

    Meta-Learning Mini-Batch Risk Functionals

    Authors: Jacob Tyo, Zachary C. Lipton

    Abstract: Supervised learning typically optimizes the expected value risk functional of the loss, but in many cases, we want to optimize for other risk functionals. In full-batch gradient descent, this is done by taking gradients of a risk functional of interest, such as the Conditional Value at Risk (CVaR) which ignores some quantile of extreme losses. However, deep learning must almost always use mini-bat…

    Submitted 27 January, 2023; originally announced January 2023.

  3. arXiv:2211.15853  [pdf, other]

    cs.LG

    Disentangling the Mechanisms Behind Implicit Regularization in SGD

    Authors: Zachary Novack, Simran Kaur, Tanya Marwah, Saurabh Garg, Zachary C. Lipton

    Abstract: A number of competing hypotheses have been proposed to explain why small-batch Stochastic Gradient Descent (SGD) leads to improved generalization over the full-batch regime, with recent work crediting the implicit regularization of various quantities throughout training. However, to date, empirical evidence assessing the explanatory power of these hypotheses is lacking. In this paper, we conduct an…

    Submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted as Spotlight at the NeurIPS 2022 Workshop for Higher Order Optimization in Machine Learning

  4. arXiv:2211.07165  [pdf, other]

    cs.LG stat.AP

    Model Evaluation in Medical Datasets Over Time

    Authors: Helen Zhou, Yuwen Chen, Zachary C. Lipton

    Abstract: Machine learning models deployed in healthcare systems face data drawn from continually evolving environments. However, researchers proposing such models typically evaluate them in a time-agnostic manner, with train and test splits sampling patients throughout the entire study period. We introduce the Evaluation on Medical Datasets Over Time (EMDOT) framework and Python package, which evaluates th…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 6 pages

  5. arXiv:2211.02093  [pdf, other]

    cs.LG stat.ML

    Domain Adaptation under Missingness Shift

    Authors: Helen Zhou, Sivaraman Balakrishnan, Zachary C. Lipton

    Abstract: Rates of missing data often depend on record-keeping policies and thus may change across times and locations, even when the underlying features are comparatively stable. In this paper, we introduce the problem of Domain Adaptation under Missingness Shift (DAMS). Here, (labeled) source data and (unlabeled) target data would be exchangeable but for different missing data mechanisms. We show that if…

    Submitted 3 May, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

  6. arXiv:2210.15031  [pdf, other]

    cs.LG

    Characterizing Datapoints via Second-Split Forgetting

    Authors: Pratyush Maini, Saurabh Garg, Zachary C. Lipton, J. Zico Kolter

    Abstract: Researchers investigating example hardness have increasingly focused on the dynamics by which neural networks learn and forget examples throughout training. Popular metrics derived from these dynamics include (i) the epoch at which examples are first correctly classified; (ii) the number of times their predictions flip during training; and (iii) whether their prediction flips if they are held out.…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted at NeurIPS 2022

  7. arXiv:2210.12101  [pdf, ps, other]

    cs.LG math.NA

    Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective

    Authors: Tanya Marwah, Zachary C. Lipton, Jianfeng Lu, Andrej Risteski

    Abstract: A burgeoning line of research leverages deep neural networks to approximate the solutions to high dimensional PDEs, opening lines of theoretical inquiry focused on explaining how it is that these models appear to evade the curse of dimensionality. However, most prior theoretical analyses have been limited to linear PDEs. In this work, we take a step towards studying the representational power of n…

    Submitted 27 March, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

  8. arXiv:2210.01422  [pdf, other]

    cs.LG

    Time-Varying Propensity Score to Bridge the Gap between the Past and Present

    Authors: Rasool Fakoor, Jonas Mueller, Zachary C. Lipton, Pratik Chaudhari, Alexander J. Smola

    Abstract: Real-world deployment of machine learning models is challenging because data evolves over time. While no model can work when data evolves in an arbitrary fashion, if there is some pattern to these changes, we might be able to design methods to address it. This paper addresses situations when data evolves gradually. We introduce a time-varying propensity score that can detect gradual shifts in the…

    Submitted 2 May, 2024; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: Published at ICLR 2024

  9. arXiv:2209.14389  [pdf, other]

    cs.CL cs.LG

    Downstream Datasets Make Surprisingly Good Pretraining Corpora

    Authors: Kundan Krishna, Saurabh Garg, Jeffrey P. Bigham, Zachary C. Lipton

    Abstract: For most natural language processing tasks, the dominant practice is to finetune large pretrained transformer models (e.g., BERT) using smaller downstream datasets. Despite the success of this approach, it remains unclear to what extent these gains are attributable to the massive background corpora employed for pretraining versus to the pretraining objectives themselves. This paper introduces a la…

    Submitted 26 May, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: ACL2023 Camera Ready

  10. arXiv:2209.10444  [pdf, other]

    cs.LG cs.AI stat.ML

    Off-Policy Risk Assessment in Markov Decision Processes

    Authors: Audrey Huang, Liu Leqi, Zachary Chase Lipton, Kamyar Azizzadenesheli

    Abstract: Addressing such diverse ends as safety, alignment with human preferences, and the efficiency of learning, a growing line of reinforcement learning research focuses on risk functionals that depend on the entire distribution of returns. Recent work on off-policy risk assessment (OPRA) for contextual bandits introduced consistent estimators for the target policy's CDF of returns along with fini…

    Submitted 21 September, 2022; originally announced September 2022.

  11. arXiv:2209.06869  [pdf, other]

    cs.CL cs.AI cs.LG

    On the State of the Art in Authorship Attribution and Authorship Verification

    Authors: Jacob Tyo, Bhuwan Dhingra, Zachary C. Lipton

    Abstract: Despite decades of research on authorship attribution (AA) and authorship verification (AV), inconsistent dataset splits/filtering and mismatched evaluation methods make it difficult to assess the state of the art. In this paper, we present a survey of the fields, resolve points of confusion, introduce Valla that standardizes and benchmarks AA/AV datasets and metrics, provide a large-scale empiric…

    Submitted 5 October, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

  12. arXiv:2208.13126  [pdf, other]

    cs.LG stat.ML

    Learning Clinical Concepts for Predicting Risk of Progression to Severe COVID-19

    Authors: Helen Zhou, Cheng Cheng, Kelly J. Shields, Gursimran Kochhar, Tariq Cheema, Zachary C. Lipton, Jeremy C. Weiss

    Abstract: With COVID-19 now pervasive, identification of high-risk individuals is crucial. Using data from a major healthcare provider in Southwestern Pennsylvania, we develop survival models predicting severe COVID-19 progression. In this endeavor, we face a tradeoff between more accurate models relying on many features and less accurate models relying on a few features aligned with clinician intuition. Co…

    Submitted 27 August, 2022; originally announced August 2022.

  13. arXiv:2207.13179  [pdf, other]

    cs.LG stat.ML

    Unsupervised Learning under Latent Label Shift

    Authors: Manley Roberts, Pranav Mani, Saurabh Garg, Zachary C. Lipton

    Abstract: What sorts of structure might enable a learner to discover classes from unlabeled data? Traditional approaches rely on feature-space similarity and heroic assumptions on the data. In this paper, we introduce unsupervised learning under Latent Label Shift (LLS), where we have access to unlabeled data from multiple domains such that the label marginals $p_d(y)$ can shift across domains but the class…

    Submitted 1 December, 2022; v1 submitted 26 July, 2022; originally announced July 2022.

    Comments: NeurIPS 2022. Manley Roberts and Pranav Mani contributed equally to this work

  14. arXiv:2207.13048  [pdf, other]

    cs.LG

    Domain Adaptation under Open Set Label Shift

    Authors: Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton

    Abstract: We introduce the problem of domain adaptation under Open Set Label Shift (OSLS) where the label distribution can change arbitrarily and a new class may arrive during deployment, but the class-conditional distributions p(x|y) are domain-invariant. OSLS subsumes domain adaptation under label shift and Positive-Unlabeled (PU) learning. The learner's goals here are two-fold: (a) estimate the target la…

    Submitted 16 October, 2022; v1 submitted 26 July, 2022; originally announced July 2022.

    Comments: Accepted at NeurIPS 2022

  15. arXiv:2206.13648  [pdf, other]

    stat.ML cs.LG

    Supervised Learning with General Risk Functionals

    Authors: Liu Leqi, Audrey Huang, Zachary C. Lipton, Kamyar Azizzadenesheli

    Abstract: Standard uniform convergence results bound the generalization gap of the expected loss over a hypothesis class. The emergence of risk-sensitive learning requires generalization guarantees for functionals of the loss distribution beyond the expectation. While prior works specialize in uniform convergence of particular functionals, our work provides uniform convergence for a general class of Hölder…

    Submitted 27 June, 2022; originally announced June 2022.

  16. arXiv:2206.10654  [pdf, other]

    cs.LG stat.ML

    On the Maximum Hessian Eigenvalue and Generalization

    Authors: Simran Kaur, Jeremy Cohen, Zachary C. Lipton

    Abstract: The mechanisms by which certain training interventions, such as increasing learning rates and applying batch normalization, improve the generalization of deep networks remain a mystery. Prior works have speculated that "flatter" solutions generalize better than "sharper" solutions to unseen data, motivating several metrics for measuring flatness (particularly $λ_{max}$, the largest eigenvalue of…

    Submitted 23 May, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: Proceedings on "I Can't Believe It's Not Better! - Understanding Deep Learning Through Empirical Falsification" at NeurIPS 2022 Workshops, PMLR 187:51-65, 2023

  17. arXiv:2206.04039  [pdf, ps, other]

    cs.CY cs.AI cs.CL cs.LG stat.ML

    Resolving the Human Subjects Status of Machine Learning's Crowdworkers

    Authors: Divyansh Kaushik, Zachary C. Lipton, Alex John London

    Abstract: In recent years, machine learning (ML) has relied heavily on crowdworkers both for building datasets and for addressing research questions requiring human interaction or judgment. The diverse tasks performed and uses of the data produced render it difficult to determine when crowdworkers are best thought of as workers (versus human subjects). These difficulties are compounded by conflicting polici…

    Submitted 15 June, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

  18. arXiv:2205.09701  [pdf, other]

    cs.HC cs.CY

    Homophily and Incentive Effects in Use of Algorithms

    Authors: Riccardo Fogliato, Sina Fazelpour, Shantanu Gupta, Zachary Lipton, David Danks

    Abstract: As algorithmic tools increasingly aid experts in making consequential decisions, the need to understand the precise factors that mediate their influence has grown commensurately. In this paper, we present a crowdsourcing vignette study designed to assess the impacts of two plausible factors on AI-informed decision-making. First, we examine homophily -- do people defer more to models that tend to a…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: Accepted at CogSci, 2022

  19. arXiv:2203.13423  [pdf, ps, other]

    cs.LG cs.IR stat.ML

    Modeling Attrition in Recommender Systems with Departing Bandits

    Authors: Omer Ben-Porat, Lee Cohen, Liu Leqi, Zachary C. Lipton, Yishay Mansour

    Abstract: Traditionally, when recommender systems are formalized as multi-armed bandits, the policy of the recommender system influences the rewards accrued, but not the length of interaction. However, in real-world systems, dissatisfied users may depart (and never come back). In this work, we propose a novel multi-armed bandit setup that captures such policy-dependent horizons. Our setup consists of a fini…

    Submitted 15 February, 2024; v1 submitted 24 March, 2022; originally announced March 2022.

    Comments: Accepted at AAAI 2022

  20. arXiv:2202.01336  [pdf, other]

    cs.LG

    Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation

    Authors: Yi-Fan Zhang, Hanlin Zhang, Zachary C. Lipton, Li Erran Li, Eric P. Xing

    Abstract: Previous works on Treatment Effect Estimation (TEE) are not in widespread use because they are predominantly theoretical, making strong parametric assumptions that are intractable for practical application. Recent work uses multilayer perceptrons (MLPs) for modeling causal relationships; however, MLPs lag far behind recent advances in ML methodology, which limits their applicability and generaliz…

    Submitted 17 October, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

  21. arXiv:2201.04234  [pdf, other]

    cs.LG stat.ML

    Leveraging Unlabeled Data to Predict Out-of-Distribution Performance

    Authors: Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton, Behnam Neyshabur, Hanie Sedghi

    Abstract: Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions that may cause performance drops. In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data. We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on…

    Submitted 14 October, 2022; v1 submitted 11 January, 2022; originally announced January 2022.

    Comments: Accepted at ICLR 2022

  22. arXiv:2112.09669  [pdf, other]

    cs.CL

    Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations

    Authors: Siddhant Arora, Danish Pruthi, Norman Sadeh, William W. Cohen, Zachary C. Lipton, Graham Neubig

    Abstract: In attempts to "explain" predictions of machine learning models, researchers have proposed hundreds of techniques for attributing predictions to features that are deemed important. While these attributions are often claimed to hold the potential to improve human "understanding" of the models, surprisingly little work explicitly evaluates progress towards this aspiration. In this paper, we conduct…

    Submitted 21 August, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

    Comments: AAAI 2022

  23. arXiv:2111.00980  [pdf, other]

    cs.LG stat.ML

    Mixture Proportion Estimation and PU Learning: A Modern Approach

    Authors: Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton

    Abstract: Given only positive examples and unlabeled examples (from both positive and negative classes), we might hope nevertheless to estimate an accurate positive-versus-negative classifier. Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE) -- determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning -- given such an estimate, lea…

    Submitted 1 November, 2021; originally announced November 2021.

    Comments: Spotlight at NeurIPS 2021

  24. arXiv:2110.07566  [pdf, other]

    cs.CL cs.AI cs.LG

    Practical Benefits of Feature Feedback Under Distribution Shift

    Authors: Anurag Katakkar, Clay H. Yoo, Weiqin Wang, Zachary C. Lipton, Divyansh Kaushik

    Abstract: In attempts to develop sample-efficient and interpretable algorithms, researchers have explored myriad mechanisms for collecting and exploiting feature feedback (or rationales), auxiliary annotations provided for training (but not test) instances that highlight salient evidence. Examples include bounding boxes around objects and salient spans in text. Despite its intuitive appeal, feature feedback h…

    Submitted 17 October, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

  25. arXiv:2109.04953  [pdf, other]

    cs.CL cs.LG

    Does Pretraining for Summarization Require Knowledge Transfer?

    Authors: Kundan Krishna, Jeffrey Bigham, Zachary C. Lipton

    Abstract: Pretraining techniques leveraging enormous datasets have driven recent advances in text summarization. While folk explanations suggest that knowledge transfer accounts for pretraining's benefits, little is known about why it works or what makes a pretraining task or dataset suitable. In this paper, we challenge the knowledge transfer story, showing that pretraining on documents consisting of chara…

    Submitted 10 September, 2021; originally announced September 2021.

    Comments: Camera-ready for Findings of EMNLP 2021

  26. arXiv:2109.01443  [pdf, other]

    cs.HC cs.AI

    The Impact of Algorithmic Risk Assessments on Human Predictions and its Analysis via Crowdsourcing Studies

    Authors: Riccardo Fogliato, Alexandra Chouldechova, Zachary Lipton

    Abstract: As algorithmic risk assessment instruments (RAIs) are increasingly adopted to assist decision makers, their predictive performance and potential to promote inequity have come under scrutiny. However, while most studies examine these tools in isolation, researchers have come to recognize that assessing their impact requires understanding the behavior of their human interactants. In this paper, buil…

    Submitted 3 September, 2021; originally announced September 2021.

    Comments: Proceedings of the ACM on Human-Computer Interaction 5, CSCW2, Article 428 (October 2021)

  27. arXiv:2108.09265  [pdf, other]

    cs.LG econ.EM stat.ML

    Efficient Online Estimation of Causal Effects by Deciding What to Observe

    Authors: Shantanu Gupta, Zachary C. Lipton, David Childers

    Abstract: Researchers often face data fusion problems, where multiple data sources are available, each capturing a distinct subset of variables. While problem formulations typically take the data as given, in practice, data acquisition can be an ongoing process. In this paper, we aim to estimate any functional of a probabilistic model (e.g., a causal effect) as efficiently as possible, by deciding, at each…

    Submitted 30 October, 2021; v1 submitted 20 August, 2021; originally announced August 2021.

    Comments: Accepted at NeurIPS 2021

  28. arXiv:2107.00441  [pdf, ps, other]

    cs.CY

    When Curation Becomes Creation: Algorithms, Microcontent, and the Vanishing Distinction between Platforms and Creators

    Authors: Liu Leqi, Dylan Hadfield-Menell, Zachary C. Lipton

    Abstract: Ever since social activity on the Internet began migrating from the wilds of the open web to the walled gardens erected by so-called platforms, debates have raged about the responsibilities that these platforms ought to bear. And yet, despite intense scrutiny from the news media and grassroots movements of outraged users, platforms continue to operate, from a legal standpoint, on the friendliest t…

    Submitted 1 July, 2021; originally announced July 2021.

  29. arXiv:2106.11342  [pdf]

    cs.LG cs.AI cs.CL cs.CV

    Dive into Deep Learning

    Authors: Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola

    Abstract: This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code. The entire book is drafted in Jupyter notebooks, seamlessly integrating exposition figures, math, and interactive examples with self-contained code. Our goal is to offer a resource that could (i) be freely available for everyone; (ii) offer sufficient technical…

    Submitted 22 August, 2023; v1 submitted 21 June, 2021; originally announced June 2021.

    Comments: (HTML) https://D2L.ai (GitHub) https://github.com/d2l-ai/d2l-en/

  30. arXiv:2106.07041  [pdf, other]

    cs.LG

    Correcting Exposure Bias for Link Recommendation

    Authors: Shantanu Gupta, Hao Wang, Zachary C. Lipton, Yuyang Wang

    Abstract: Link prediction methods are frequently applied in recommender systems, e.g., to suggest citations for academic papers or friends in social networks. However, exposure bias can arise when users are systematically underexposed to certain relevant items. For example, in citation networks, authors might be more likely to encounter papers from their own field and thus cite them preferentially. This bia…

    Submitted 13 June, 2021; originally announced June 2021.

  31. arXiv:2106.00872  [pdf, other]

    cs.CL cs.AI cs.LG

    On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study

    Authors: Divyansh Kaushik, Douwe Kiela, Zachary C. Lipton, Wen-tau Yih

    Abstract: In adversarial data collection (ADC), a human workforce interacts with a model in real time, attempting to produce examples that elicit incorrect predictions. Researchers hope that models trained on these more challenging datasets will rely less on superficial patterns, and thus be less brittle. However, despite ADC's intuitive appeal, it remains unclear when training on adversarial datasets produ…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: Accepted at ACL-IJCNLP 2021

  32. arXiv:2105.04953  [pdf, other]

    stat.AP

    On the Validity of Arrest as a Proxy for Offense: Race and the Likelihood of Arrest for Violent Crimes

    Authors: Riccardo Fogliato, Alice Xiang, Zachary Lipton, Daniel Nagin, Alexandra Chouldechova

    Abstract: The risk of re-offense is considered in decision-making at many stages of the criminal justice system, from pre-trial, to sentencing, to parole. To aid decision makers in their assessments, institutions increasingly rely on algorithmic risk assessment instruments (RAIs). These tools assess the likelihood that an individual will be arrested for a new criminal offense within some time window followi…

    Submitted 11 May, 2021; originally announced May 2021.

    Comments: Accepted at AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2021

  33. arXiv:2105.00303  [pdf, other]

    cs.LG stat.ML

    RATT: Leveraging Unlabeled Data to Guarantee Generalization

    Authors: Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary C. Lipton

    Abstract: To assess generalization, machine learning scientists typically either (i) bound the generalization gap and then (after training) plug in the empirical risk to obtain a bound on the true risk; or (ii) validate empirically on holdout data. However, (i) typically yields vacuous guarantees for overparameterized models. Furthermore, (ii) shrinks the training set and its guarantee erodes with each re-u…

    Submitted 6 November, 2021; v1 submitted 1 May, 2021; originally announced May 2021.

    Comments: ICML 2021 (Long Talk)

  34. arXiv:2104.08977  [pdf, other]

    cs.LG stat.ML

    Off-Policy Risk Assessment in Contextual Bandits

    Authors: Audrey Huang, Liu Leqi, Zachary C. Lipton, Kamyar Azizzadenesheli

    Abstract: Even when unable to run experiments, practitioners can evaluate prospective policies, using previously logged data. However, while the bandits literature has adopted a diverse set of objectives, most research on off-policy evaluation to date focuses on the expected reward. In this paper, we introduce Lipschitz risk functionals, a broad class of objectives that subsumes conditional value-at-risk (C…

    Submitted 29 June, 2021; v1 submitted 18 April, 2021; originally announced April 2021.

  35. arXiv:2103.02827  [pdf, other]

    cs.LG cs.AI stat.ML

    On the Convergence and Optimality of Policy Gradient for Markov Coherent Risk

    Authors: Audrey Huang, Liu Leqi, Zachary C. Lipton, Kamyar Azizzadenesheli

    Abstract: In order to model risk aversion in reinforcement learning, an emerging line of research adapts familiar algorithms to optimize coherent risk functionals, a class that includes conditional value-at-risk (CVaR). Because optimizing the coherent risk is difficult in Markov decision processes, recent work tends to focus on the Markov coherent risk (MCR), a time-consistent surrogate. While policy gradi…

    Submitted 5 March, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

  36. arXiv:2103.02138  [pdf, ps, other]

    cs.LG math.NA stat.ML

    Parametric Complexity Bounds for Approximating PDEs with Neural Networks

    Authors: Tanya Marwah, Zachary C. Lipton, Andrej Risteski

    Abstract: Recent experiments have shown that deep networks can approximate solutions to high-dimensional PDEs, seemingly escaping the curse of dimensionality. However, questions regarding the theoretical basis for such approximations, including the required network size, remain open. In this paper, we investigate the representational power of neural networks for approximating solutions to linear elliptic PD…

    Submitted 6 July, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

  37. arXiv:2102.10264  [pdf, other]

    cs.LG cs.RO stat.ML

    On Proximal Policy Optimization's Heavy-tailed Gradients

    Authors: Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar

    Abstract: Modern policy gradient algorithms such as Proximal Policy Optimization (PPO) rely on an arsenal of heuristics, including loss clipping and gradient clipping, to ensure successful learning. These heuristics are reminiscent of techniques from robust statistics, commonly used for estimation in outlier-rich ("heavy-tailed") regimes. In this paper, we present a detailed empirical study to characteriz…

    Submitted 12 July, 2021; v1 submitted 20 February, 2021; originally announced February 2021.

    Comments: ICML 2021

  38. arXiv:2012.04825  [pdf, other]

    stat.AP

    Unpacking the Drop in COVID-19 Case Fatality Rates: A Study of National and Florida Line-Level Data

    Authors: Cheng Cheng, Helen Zhou, Jeremy C. Weiss, Zachary C. Lipton

    Abstract: Since the COVID-19 pandemic first reached the United States, the case fatality rate has fallen precipitously. Several possible explanations have been floated, including greater detection of mild cases due to expanded testing, shifts in age distribution among the infected, lags between confirmed cases and reported deaths, improvements in treatment, mutations in the virus, and decreased viral load a…

    Submitted 11 December, 2020; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: 24 pages, 13 figures

  39. arXiv:2012.00893  [pdf, other]

    cs.CL cs.LG

    Evaluating Explanations: How much do explanations from the teacher aid students?

    Authors: Danish Pruthi, Rachit Bansal, Bhuwan Dhingra, Livio Baldini Soares, Michael Collins, Zachary C. Lipton, Graham Neubig, William W. Cohen

    Abstract: While many methods purport to explain predictions by highlighting salient features, what aims these explanations serve and how they ought to be evaluated often go unstated. In this work, we introduce a framework to quantify the value of explanations via the accuracy gains that they confer on a student model trained to simulate a teacher model. Crucially, the explanations are available to the stude…

    Submitted 16 December, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: TACL 2021 (pre-MIT Press publication version)

  40. arXiv:2011.13477  [pdf, other]

    cs.CL cs.LG

    Decoding and Diversity in Machine Translation

    Authors: Nicholas Roberts, Davis Liang, Graham Neubig, Zachary C. Lipton

    Abstract: Neural Machine Translation (NMT) systems are typically evaluated using automated metrics that assess the agreement between generated translations and ground truth candidates. To improve systems with respect to these metrics, NLP researchers employ a variety of heuristic techniques, including searching for the conditional mode (vs. sampling) and incorporating various training heuristics (e.g., labe…

    Submitted 26 November, 2020; originally announced November 2020.

    Comments: Presented at the Resistance AI Workshop, 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada

  41. arXiv:2011.06741  [pdf, other]

    cs.LG stat.ML

    Rebounding Bandits for Modeling Satiation Effects

    Authors: Liu Leqi, Fatma Kilinc-Karzan, Zachary C. Lipton, Alan L. Montgomery

    Abstract: Psychological research shows that enjoyment of many goods is subject to satiation, with short-term satisfaction declining after repeated exposures to the same item. Nevertheless, proposed algorithms for powering recommender systems seldom model these dynamics, instead proceeding as though user preferences were fixed in time. In this work, we introduce rebounding bandits, a multi-armed bandit setup…

    Submitted 27 October, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

  42. arXiv:2011.03654  [pdf, other]

    cs.CY cs.LG stat.ML

    Fair Machine Learning Under Partial Compliance

    Authors: Jessica Dai, Sina Fazelpour, Zachary C. Lipton

    Abstract: Typically, fair machine learning research focuses on a single decisionmaker and assumes that the underlying population is stationary. However, many of the critical domains motivating this work are characterized by competitive marketplaces with many decisionmakers. Realistically, we might expect only a subset of them to adopt any non-compulsory fairness-conscious policy, a situation that political…

    Submitted 26 September, 2022; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: Presented at AIES 2021; previously at the NeurIPS 2020 Workshop on Consequential Decision Making in Dynamic Environments and the NeurIPS 2020 Workshop on ML for Economic Policy. Minor correction uploaded Sept. 2022

  43. arXiv:2011.01459  [pdf, other]

    cs.CL cs.LG

    Weakly- and Semi-supervised Evidence Extraction

    Authors: Danish Pruthi, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton

    Abstract: For many prediction tasks, stakeholders desire not only predictions but also supporting evidence that a human can use to verify its correctness. However, in practice, additional annotations marking supporting evidence may only be available for a minority of training examples (if available at all). In this paper, we propose new methods to combine few evidence annotations (strong semi-supervision) w…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: Accepted to the Findings of EMNLP 2020, to be presented at BlackBoxNLP

  44. arXiv:2010.11966  [pdf, other]

    cs.CL cs.LG

    Unsupervised Data Augmentation with Naive Augmentation and without Unlabeled Data

    Authors: David Lowell, Brian E. Howard, Zachary C. Lipton, Byron C. Wallace

    Abstract: Unsupervised Data Augmentation (UDA) is a semi-supervised technique that applies a consistency loss to penalize differences between a model's predictions on (a) observed (unlabeled) examples; and (b) corresponding 'noised' examples produced via data augmentation. While UDA has gained popularity for text classification, open questions linger over which design decisions are necessary and over how to…

    Submitted 22 October, 2020; originally announced October 2020.

  45. arXiv:2010.03017  [pdf, other]

    cs.CL cs.LG

    On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment

    Authors: Zirui Wang, Zachary C. Lipton, Yulia Tsvetkov

    Abstract: Modern multilingual models are trained on concatenated text from multiple languages in hopes of conferring benefits to each (positive transfer), with the most pronounced benefits accruing to low-resource languages. However, recent work has shown that this approach can degrade performance on high-resource languages, a phenomenon known as negative interference. In this paper, we present the first sy…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: Published as a main conference paper at EMNLP 2020

  46. arXiv:2010.02114  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Explaining The Efficacy of Counterfactually Augmented Data

    Authors: Divyansh Kaushik, Amrith Setlur, Eduard Hovy, Zachary C. Lipton

    Abstract: In attempts to produce ML models less reliant on spurious patterns in NLP datasets, researchers have recently proposed curating counterfactually augmented data (CAD) via a human-in-the-loop process in which given some documents and their (initial) labels, humans must revise the text to make a counterfactual label applicable. Importantly, edits that are not necessary to flip the applicable label ar…

    Submitted 23 March, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Published at ICLR 2021

  47. arXiv:2007.07151  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Extracting Structured Data from Physician-Patient Conversations By Predicting Noteworthy Utterances

    Authors: Kundan Krishna, Amy Pavel, Benjamin Schloss, Jeffrey P. Bigham, Zachary C. Lipton

    Abstract: Despite diverse efforts to mine various modalities of medical data, the conversations between physicians and patients at the time of care remain an untapped source of insights. In this paper, we leverage this data to extract structured information that might assist physicians with post-visit documentation in electronic health records, potentially lightening the clerical burden. In this exploratory…

    Submitted 14 July, 2020; originally announced July 2020.

  48. arXiv:2007.04082  [pdf, other]

    q-fin.ST cs.LG cs.NE

    Uncertainty-Aware Lookahead Factor Models for Quantitative Investing

    Authors: Lakshay Chauhan, John Alberg, Zachary C. Lipton

    Abstract: On a periodic basis, publicly traded companies report fundamentals, financial data including revenue, earnings, debt, among others. Quantitative finance research has identified several factors, functions of the reported data that historically correlate with stock market performance. In this paper, we first show through simulation that if we could select stocks via factors calculated on future fund…

    Submitted 15 July, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

  49. arXiv:2006.01898  [pdf, other]

    stat.AP cs.LG stat.ML

    Predicting Mortality Risk in Viral and Unspecified Pneumonia to Assist Clinicians with COVID-19 ECMO Planning

    Authors: Helen Zhou, Cheng Cheng, Zachary C. Lipton, George H. Chen, Jeremy C. Weiss

    Abstract: Respiratory complications due to coronavirus disease COVID-19 have claimed tens of thousands of lives in 2020. Many cases of COVID-19 escalate from Severe Acute Respiratory Syndrome (SARS-CoV-2) to viral pneumonia to acute respiratory distress syndrome (ARDS) to death. Extracorporeal membranous oxygenation (ECMO) is a life-sustaining oxygenation and ventilation therapy that may be used for patient…

    Submitted 2 June, 2020; originally announced June 2020.

  50. arXiv:2005.01795  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Generating SOAP Notes from Doctor-Patient Conversations Using Modular Summarization Techniques

    Authors: Kundan Krishna, Sopan Khosla, Jeffrey P. Bigham, Zachary C. Lipton

    Abstract: Following each patient visit, physicians draft long semi-structured clinical summaries called SOAP notes. While invaluable to clinicians and researchers, creating digital SOAP notes is burdensome, contributing to physician burnout. In this paper, we introduce the first complete pipelines to leverage deep summarization models to generate these notes based on transcripts of conversations between phy…

    Submitted 2 June, 2021; v1 submitted 4 May, 2020; originally announced May 2020.

    Comments: Published at ACL 2021 Main Conference