-
Revisiting the Predictability of Performative, Social Events
Authors:
Juan C. Perdomo
Abstract:
Social predictions do not passively describe the future; they actively shape it. They inform actions and change individual expectations in ways that influence the likelihood of the predicted outcome. Given these dynamics, to what extent can social events be predicted? This question was discussed throughout the 20th century by authors like Merton, Morgenstern, Simon, and others who considered it a central issue in social science methodology. In this work, we provide a modern answer to this old problem. Using recent ideas from performative prediction and outcome indistinguishability, we establish that one can always efficiently predict social events accurately, regardless of how predictions influence data. While achievable, we also show that these predictions are often undesirable, highlighting the limitations of previous desiderata. We end with a discussion of various avenues forward.
Submitted 12 March, 2025;
originally announced March 2025.
-
Inducing Efficient and Equitable Professional Networks through Link Recommendations
Authors:
Cynthia Dwork,
Chris Hays,
Lunjia Hu,
Nicole Immorlica,
Juan Perdomo
Abstract:
Professional networks are a key determinant of individuals' labor market outcomes. They may also play a role in either exacerbating or ameliorating inequality of opportunity across demographic groups. In a theoretical model of professional network formation, we show that inequality can increase even without exogenous in-group preferences, confirming and complementing existing theoretical literature. Increased inequality emerges from the differential leverage privileged and unprivileged individuals have in forming connections due to their asymmetric ex ante prospects. This is a formalization of a source of inequality in the labor market which has not been previously explored.
We next show how inequality-aware platforms may reduce inequality by subsidizing connections, through link recommendations that reduce costs, between privileged and unprivileged individuals. Indeed, mixed-privilege connections turn out to be welfare improving, over all possible equilibria, compared to not recommending links or recommending some smaller fraction of cross-group links. Taken together, these two findings reveal a stark reality: professional networking platforms that fail to foster integration in the link formation process risk reducing the platform's utility to its users and exacerbating existing labor market inequality.
Submitted 6 March, 2025;
originally announced March 2025.
-
The Value of Prediction in Identifying the Worst-Off
Authors:
Unai Fischer-Abaigar,
Christoph Kern,
Juan Carlos Perdomo
Abstract:
Machine learning is increasingly used in government programs to identify and support the most vulnerable individuals, prioritizing assistance for those at greatest risk over optimizing aggregate outcomes. This paper examines the welfare impacts of prediction in equity-driven contexts, and how they compare to other policy levers, such as expanding bureaucratic capacity. Through mathematical models and a real-world case study on long-term unemployment amongst German residents, we develop a comprehensive understanding of the relative effectiveness of prediction in surfacing the worst-off. Our findings provide clear analytical frameworks and practical, data-driven tools that empower policymakers to make principled decisions when designing these systems.
Submitted 13 February, 2025; v1 submitted 31 January, 2025;
originally announced January 2025.
-
From Fairness to Infinity: Outcome-Indistinguishable (Omni)Prediction in Evolving Graphs
Authors:
Cynthia Dwork,
Chris Hays,
Nicole Immorlica,
Juan C. Perdomo,
Pranay Tankala
Abstract:
Professional networks provide invaluable entree to opportunity through referrals and introductions. A rich literature shows they also serve to entrench and even exacerbate a status quo of privilege and disadvantage. Hiring platforms, equipped with the ability to nudge link formation, provide a tantalizing opening for beneficial structural change. We anticipate that key to this prospect will be the ability to estimate the likelihood of edge formation in an evolving graph. Outcome-indistinguishable prediction algorithms ensure that the modeled world is indistinguishable from the real world by a family of statistical tests. Omnipredictors ensure that predictions can be post-processed to yield loss minimization competitive with respect to a benchmark class of predictors for many losses simultaneously. We begin by observing that, by combining a slightly modified form of the online K29 star algorithm of Vovk (2007) with basic facts from the theory of reproducing kernel Hilbert spaces, one can derive simple and efficient online algorithms satisfying outcome indistinguishability and omniprediction, with guarantees that improve upon, or are complementary to, those currently known. This is of independent interest. We apply these techniques to evolving graphs, obtaining online outcome-indistinguishable omnipredictors for rich -- possibly infinite -- sets of distinguishers that capture properties of pairs of nodes, and their neighborhoods. This yields, inter alia, multicalibrated predictions of edge formation with respect to pairs of demographic groups, and the ability to simultaneously optimize loss as measured by a variety of social welfare functions.
Submitted 26 November, 2024;
originally announced November 2024.
-
Large Scale Transfer Learning for Tabular Data via Language Modeling
Authors:
Josh Gardner,
Juan C. Perdomo,
Ludwig Schmidt
Abstract:
Tabular data -- structured, heterogeneous, spreadsheet-style data with rows and columns -- is widely used in practice across many domains. However, while recent foundation models have reduced the need for developing task-specific datasets and predictors in domains such as language modeling and computer vision, this transfer learning paradigm has not had similar impact in the tabular domain. In this work, we seek to narrow this gap and present TabuLa-8B, a language model for tabular prediction. We define a process for extracting a large, high-quality training dataset from the TabLib corpus, proposing methods for tabular data filtering and quality control. Using the resulting dataset, which comprises over 2.1B rows from over 4M unique tables, we fine-tune a Llama 3-8B large language model (LLM) for tabular data prediction (classification and binned regression) using a novel packing and attention scheme for tabular prediction. Through evaluation across a test suite of 329 datasets, we find that TabuLa-8B has zero-shot accuracy on unseen tables that is over 15 percentage points (pp) higher than random guessing, a feat that is not possible with existing state-of-the-art tabular prediction models (e.g. XGBoost, TabPFN). In the few-shot setting (1-32 shots), without any fine-tuning on the target datasets, TabuLa-8B is 5-15 pp more accurate than XGBoost and TabPFN models that are explicitly trained on equal, or even up to 16x more data. We release our model, code, and data along with the publication of this paper.
Submitted 20 November, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Insufficient Statistics Perturbation: Stable Estimators for Private Least Squares
Authors:
Gavin Brown,
Jonathan Hayase,
Samuel Hopkins,
Weihao Kong,
Xiyang Liu,
Sewoong Oh,
Juan C. Perdomo,
Adam Smith
Abstract:
We present a sample- and time-efficient differentially private algorithm for ordinary least squares, with error that depends linearly on the dimension and is independent of the condition number of $X^\top X$, where $X$ is the design matrix. All prior private algorithms for this task require either $d^{3/2}$ examples, error growing polynomially with the condition number, or exponential time. Our near-optimal accuracy guarantee holds for any dataset with bounded statistical leverage and bounded residuals. Technically, we build on the approach of Brown et al. (2023) for private mean estimation, adding scaled noise to a carefully designed stable nonprivate estimator of the empirical regression vector.
Submitted 23 April, 2024;
originally announced April 2024.
-
The Relative Value of Prediction in Algorithmic Decision Making
Authors:
Juan Carlos Perdomo
Abstract:
Algorithmic predictions are increasingly used to inform the allocation of goods and interventions in the public sphere. In these domains, predictions serve as a means to an end. They provide stakeholders with insights into the likelihood of future events in order to improve the quality of decision making and enhance social welfare. However, if maximizing welfare is the ultimate goal, prediction is only a small piece of the puzzle. There are various other policy levers a social planner might pursue in order to improve bottom-line outcomes, such as expanding access to available goods, or increasing the effect sizes of interventions.
Given this broad range of design decisions, a basic question to ask is: What is the relative value of prediction in algorithmic decision making? How do the improvements in welfare arising from better predictions compare to those of other policy levers? The goal of our work is to initiate the formal study of these questions. Our main results are theoretical in nature. We identify simple, sharp conditions determining the relative value of prediction vis-à-vis expanding access, within several statistical models that are popular amongst quantitative social scientists. Furthermore, we illustrate how these theoretical insights may be used to guide the design of algorithmic decision making systems in practice.
Submitted 29 May, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
Difficult Lessons on Social Prediction from Wisconsin Public Schools
Authors:
Juan C. Perdomo,
Tolani Britton,
Moritz Hardt,
Rediet Abebe
Abstract:
Early warning systems (EWS) are predictive tools at the center of recent efforts to improve graduation rates in public schools across the United States. These systems assist in targeting interventions to individual students by predicting which students are at risk of dropping out. Despite significant investments in their widespread adoption, there remain large gaps in our understanding of the efficacy of EWS, and the role of statistical risk scores in education.
In this work, we draw on nearly a decade's worth of data from a system used throughout Wisconsin to provide the first large-scale evaluation of the long-term impact of EWS on graduation outcomes. We present empirical evidence that the prediction system accurately sorts students by their dropout risk. We also find that it may have caused a single-digit percentage increase in graduation rates, though our empirical analyses cannot reliably rule out that there has been no positive treatment effect.
Going beyond a retrospective evaluation of DEWS, we draw attention to a central question at the heart of the use of EWS: Are individual risk scores necessary for effectively targeting interventions? We propose a simple mechanism that only uses information about students' environments -- such as their schools, and districts -- and argue that this mechanism can target interventions just as efficiently as the individual risk score-based mechanism. Our argument holds even if individual predictions are highly accurate and effective interventions exist. In addition to motivating this simple targeting mechanism, our work provides a novel empirical backbone for the robust qualitative understanding among education researchers that dropout is structurally determined. Combined, our insights call into question the marginal value of individual predictions in settings where outcomes are driven by high levels of inequality.
Submitted 18 September, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
Making Decisions under Outcome Performativity
Authors:
Michael P. Kim,
Juan C. Perdomo
Abstract:
Decision-makers often act in response to data-driven predictions, with the goal of achieving favorable outcomes. In such settings, predictions don't passively forecast the future; instead, predictions actively shape the distribution of outcomes they are meant to predict. This performative prediction setting raises new challenges for learning "optimal" decision rules. In particular, existing solution concepts do not address the apparent tension between the goals of forecasting outcomes accurately and steering individuals to achieve desirable outcomes.
To contend with this concern, we introduce a new optimality concept -- performative omniprediction -- adapted from the supervised (non-performative) learning setting. A performative omnipredictor is a single predictor that simultaneously encodes the optimal decision rule with respect to many possibly-competing objectives. Our main result demonstrates that efficient performative omnipredictors exist, under a natural restriction of performative prediction, which we call outcome performativity. On a technical level, our results follow by carefully generalizing the notion of outcome indistinguishability to the outcome performative setting. From an appropriate notion of Performative OI, we recover many consequences known to hold in the supervised setting, such as omniprediction and universal adaptability.
Submitted 6 January, 2023; v1 submitted 4 October, 2022;
originally announced October 2022.
-
Deep Semi-Supervised and Self-Supervised Learning for Diabetic Retinopathy Detection
Authors:
Jose Miguel Arrieta Ramos,
Oscar Perdómo,
Fabio A. González
Abstract:
Diabetic retinopathy (DR) is one of the leading causes of blindness in the working-age population of developed countries, caused by a side effect of diabetes that reduces the blood supply to the retina. Deep neural networks have been widely used in automated systems for DR classification on eye fundus images. However, these models need a large number of annotated images. In the medical domain, annotations from experts are costly, tedious, and time-consuming; as a result, a limited number of annotated images are available. This paper presents a semi-supervised method that leverages both unlabeled and labeled images to train a model that detects diabetic retinopathy. The proposed method uses unsupervised pretraining via self-supervised learning, followed by supervised fine-tuning with a small set of labeled images and knowledge distillation to improve classification performance. This method was evaluated on the EyePACS test set and the Messidor-2 dataset, achieving 0.94 and 0.89 AUC, respectively, using only 2% of the labeled EyePACS training images.
Submitted 3 August, 2022;
originally announced August 2022.
-
A Complete Characterization of Linear Estimators for Offline Policy Evaluation
Authors:
Juan C. Perdomo,
Akshay Krishnamurthy,
Peter Bartlett,
Sham Kakade
Abstract:
Offline policy evaluation is a fundamental statistical problem in reinforcement learning that involves estimating the value function of some decision-making policy given data collected by a potentially different policy. In order to tackle problems with complex, high-dimensional observations, there has been significant interest from theoreticians and practitioners alike in understanding the possibility of function approximation in reinforcement learning. Despite significant study, a sharp characterization of when we might expect offline policy evaluation to be tractable, even in the simplest setting of linear function approximation, has so far remained elusive, with a surprising number of strong negative results recently appearing in the literature.
In this work, we identify simple control-theoretic and linear-algebraic conditions that are necessary and sufficient for classical methods, in particular Fitted Q-iteration (FQI) and least squares temporal difference learning (LSTD), to succeed at offline policy evaluation. Using this characterization, we establish a precise hierarchy of regimes under which these estimators succeed. We prove that LSTD works under strictly weaker conditions than FQI. Furthermore, we establish that if a problem is not solvable via LSTD, then it cannot be solved by a broad class of linear estimators, even in the limit of infinite data. Taken together, our results provide a complete picture of the behavior of linear estimators for offline policy evaluation, unify previously disparate analyses of canonical algorithms, and provide significantly sharper notions of the underlying statistical complexity of offline policy evaluation.
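The gap between FQI and LSTD mentioned in the abstract above can be illustrated with a small sketch. The single-transition dataset, the scalar features, and the function names below are assumptions chosen for illustration (a standard style of two-state toy example); they are not the paper's construction or its conditions. All rewards are zero, so the true value function is zero and is exactly representable by theta = 0.

```python
# Hypothetical toy data: one observed transition with scalar features.
# phi(s) = 1, phi(s') = 2, reward 0, discount 0.9 (all assumed for illustration).
GAMMA = 0.9
DATA = [(1.0, 0.0, 2.0)]  # (phi, reward, phi_next)

def lstd(data, gamma):
    """Solve sum phi * (phi - gamma * phi_next) * theta = sum phi * r (scalar case)."""
    a = sum(phi * (phi - gamma * phi_next) for phi, _, phi_next in data)
    b = sum(phi * r for phi, r, _ in data)
    return b / a

def fqi(data, gamma, theta0, iterations):
    """Repeatedly regress the bootstrapped target r + gamma * phi_next * theta onto phi."""
    theta = theta0
    cov = sum(phi * phi for phi, _, _ in data)
    for _ in range(iterations):
        target = sum(phi * (r + gamma * phi_next * theta) for phi, r, phi_next in data)
        theta = target / cov
    return theta

theta_lstd = lstd(DATA, GAMMA)                           # solves the fixed-point equation directly
theta_fqi = fqi(DATA, GAMMA, theta0=1.0, iterations=50)  # each step multiplies theta by 1.8
```

In this toy instance each FQI step scales the parameter by gamma * phi(s') / phi(s) = 1.8, so the iterates grow without bound from any nonzero start, while LSTD only needs the scalar 1 - 1.8 to be nonzero and returns the correct parameter.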
Submitted 19 December, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
Globally Convergent Policy Search over Dynamic Filters for Output Estimation
Authors:
Jack Umenberger,
Max Simchowitz,
Juan C. Perdomo,
Kaiqing Zhang,
Russ Tedrake
Abstract:
We introduce the first direct policy search algorithm which provably converges to the globally optimal $\textit{dynamic}$ filter for the classical problem of predicting the outputs of a linear dynamical system, given noisy, partial observations. Despite the ubiquity of partial observability in practice, theoretical guarantees for direct policy search algorithms, one of the backbones of modern reinforcement learning, have proven difficult to achieve. This is primarily due to the degeneracies which arise when optimizing over filters that maintain internal state.
In this paper, we provide a new perspective on this challenging problem based on the notion of $\textit{informativity}$, which intuitively requires that all components of a filter's internal state are representative of the true state of the underlying dynamical system. We show that informativity overcomes the aforementioned degeneracy. Specifically, we propose a $\textit{regularizer}$ which explicitly enforces informativity, and establish that gradient descent on this regularized objective - combined with a ``reconditioning step'' - converges to the globally optimal cost at a $\mathcal{O}(1/T)$ rate. Our analysis relies on several new results which may be of independent interest, including a new framework for analyzing non-convex gradient descent via convex reformulation, and novel bounds on the solution to linear Lyapunov equations in terms of (our quantitative measure of) informativity.
Submitted 25 February, 2022; v1 submitted 23 February, 2022;
originally announced February 2022.
-
A deep learning model for classification of diabetic retinopathy in eye fundus images based on retinal lesion detection
Authors:
Melissa delaPava,
Hernán Ríos,
Francisco J. Rodríguez,
Oscar J. Perdomo,
Fabio A. González
Abstract:
Diabetic retinopathy (DR) is the result of a complication of diabetes affecting the retina. It can cause blindness if left undiagnosed and untreated. An ophthalmologist performs the diagnosis by screening each patient and analyzing the retinal lesions via ocular imaging. In practice, such analysis is time-consuming and cumbersome to perform. This paper presents a model for automatic DR classification on eye fundus images. The approach identifies the main ocular lesions related to DR and subsequently diagnoses the illness. The proposed method follows the same workflow as the clinicians, providing information that can be interpreted clinically to support the prediction. A subset of the Kaggle EyePACS and the Messidor-2 datasets, labeled with ocular lesions, is made publicly available. The Kaggle EyePACS subset is used as a training set and the Messidor-2 as a test set for lesions and DR classification models. For DR diagnosis, our model has an area-under-the-curve, sensitivity, and specificity of 0.948, 0.886, and 0.875, respectively, which competes with state-of-the-art approaches.
△ Less
Submitted 14 October, 2021;
originally announced October 2021.
-
Stabilizing Dynamical Systems via Policy Gradient Methods
Authors:
Juan C. Perdomo,
Jack Umenberger,
Max Simchowitz
Abstract:
Stabilizing an unknown control system is one of the most fundamental problems in control systems engineering. In this paper, we provide a simple, model-free algorithm for stabilizing fully observed dynamical systems. While model-free methods have become increasingly popular in practice due to their simplicity and flexibility, stabilization via direct policy search has received surprisingly little attention. Our algorithm proceeds by solving a series of discounted LQR problems, where the discount factor is gradually increased. We prove that this method efficiently recovers a stabilizing controller for linear systems, and for smooth, nonlinear systems within a neighborhood of their equilibria. Our approach overcomes a significant limitation of prior work, namely the need for a pre-given stabilizing control policy. We empirically evaluate the effectiveness of our approach on common control benchmarks.
Submitted 12 October, 2021;
originally announced October 2021.
-
QoS Prediction for 5G Connected and Automated Driving
Authors:
Apostolos Kousaridas,
Ramya Panthangi Manjunath,
Jose Mauricio Perdomo,
Chan Zhou,
Ernst Zielinski,
Steffen Schmitz,
Andreas Pfadler
Abstract:
5G communication systems can support the demanding quality-of-service (QoS) requirements of many advanced vehicle-to-everything (V2X) use cases. However, safe and efficient driving, especially of automated vehicles, may be affected by sudden changes in the provided QoS. For that reason, the prediction of QoS changes and the early notification of these predicted changes to the vehicles have recently been enabled by 5G communication systems. This solution enables the vehicles to avoid or mitigate the effect of sudden QoS changes at the application level. This article describes how QoS predictions could be generated by a 5G communication system and delivered to a V2X application. The tele-operated driving use case is used as an example to analyze the feasibility of a QoS prediction scheme. Useful recommendations for the development of a QoS prediction solution are provided, while open research topics are identified.
Submitted 11 July, 2021;
originally announced July 2021.
-
Towards a Dimension-Free Understanding of Adaptive Linear Control
Authors:
Juan C. Perdomo,
Max Simchowitz,
Alekh Agarwal,
Peter Bartlett
Abstract:
We study the problem of adaptive control of the linear quadratic regulator for systems in very high, or even infinite dimension. We demonstrate that while sublinear regret requires finite dimensional inputs, the ambient state dimension of the system need not be bounded in order to perform online control. We provide the first regret bounds for LQR which hold for infinite dimensional systems, replacing dependence on ambient dimension with more natural notions of problem complexity. Our guarantees arise from a novel perturbation bound for certainty equivalence which scales with the prediction error in estimating the system parameters, without requiring consistent parameter recovery in more stringent measures like the operator norm. When specialized to finite dimensional settings, our bounds recover near optimal dimension and time horizon dependence.
△ Less
Submitted 15 July, 2021; v1 submitted 18 March, 2021;
originally announced March 2021.
-
Outside the Echo Chamber: Optimizing the Performative Risk
Authors:
John Miller,
Juan C. Perdomo,
Tijana Zrnic
Abstract:
In performative prediction, predictions guide decision-making and hence can influence the distribution of future data. To date, work on performative prediction has focused on finding performatively stable models, which are the fixed points of repeated retraining. However, stable solutions can be far from optimal when evaluated in terms of the performative risk, the loss experienced by the decision maker when deploying a model. In this paper, we shift attention beyond performative stability and focus on optimizing the performative risk directly. We identify a natural set of properties of the loss function and model-induced distribution shift under which the performative risk is convex, a property which does not follow from convexity of the loss alone. Furthermore, we develop algorithms that leverage our structural assumptions to optimize the performative risk with better sample efficiency than generic methods for derivative-free convex optimization.
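As a minimal sketch of why a performatively stable model need not minimize the performative risk, consider a one-dimensional toy problem. The mean-shift model, the loss, and the constants `MU`, `EPS`, and `LAM` are assumptions chosen for illustration and are not taken from the paper.

```python
# Toy model (assumed for illustration): deploying theta shifts the outcome
# mean to MU + EPS * theta, and the loss is
#   l(z; theta) = -theta * z + (LAM / 2) * theta**2.
MU, EPS, LAM = 1.0, 0.25, 1.0

def shifted_mean(theta_deployed):
    """Mean of the outcome z after theta_deployed has been deployed."""
    return MU + EPS * theta_deployed

def performative_risk(theta):
    """Expected loss of theta on the distribution that theta itself induces."""
    return -theta * shifted_mean(theta) + 0.5 * LAM * theta ** 2

def retrain(theta_deployed):
    """Risk minimizer on the distribution induced by theta_deployed (held fixed)."""
    return shifted_mean(theta_deployed) / LAM

def repeated_retraining(theta0=0.0, rounds=200):
    """Iterate retraining; its fixed point is the performatively stable model."""
    theta = theta0
    for _ in range(rounds):
        theta = retrain(theta)
    return theta

theta_stable = repeated_retraining()   # fixed point MU / (LAM - EPS) = 4/3
theta_optimal = MU / (LAM - 2 * EPS)   # minimizer of performative_risk = 2
```

In this toy setting the performative risk is the convex quadratic -theta + 0.25 * theta**2, whose minimizer (theta = 2, risk -1) differs from the fixed point of retraining (theta = 4/3, risk about -0.889), because retraining ignores how the choice of theta moves the data.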
Submitted 15 June, 2021; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Hybrid Deep Learning Gaussian Process for Diabetic Retinopathy Diagnosis and Uncertainty Quantification
Authors:
Santiago Toledo-Cortés,
Melissa De La Pava,
Oscar Perdómo,
Fabio A. González
Abstract:
Diabetic Retinopathy (DR) is one of the microvascular complications of Diabetes Mellitus, which remains one of the leading causes of blindness worldwide. Computational models based on Convolutional Neural Networks represent the state of the art for the automatic detection of DR using eye fundus images. Most current work addresses this problem as a binary classification task. However, including grade estimation and quantification of prediction uncertainty can potentially increase the robustness of the model. In this paper, a hybrid Deep Learning-Gaussian process method for DR diagnosis and uncertainty quantification is presented. This method combines the representational power of deep learning with the ability of Gaussian process models to generalize from small datasets. The results show that uncertainty quantification in the predictions improves the interpretability of the method as a diagnostic support tool. The source code to replicate the experiments is publicly available at https://github.com/stoledoc/DLGP-DR-Diagnosis.
Submitted 29 July, 2020;
originally announced July 2020.
-
Stochastic Optimization for Performative Prediction
Authors:
Celestine Mendler-Dünner,
Juan C. Perdomo,
Tijana Zrnic,
Moritz Hardt
Abstract:
In performative prediction, the choice of a model influences the distribution of future data, typically through actions taken based on the model's predictions.
We initiate the study of stochastic optimization for performative prediction. What sets this setting apart from traditional stochastic optimization is the difference between merely updating model parameters and deploying the new model. The latter triggers a shift in the distribution that affects future data, while the former keeps the distribution as is.
Assuming smoothness and strong convexity, we prove rates of convergence for both greedily deploying models after each stochastic update (greedy deploy) as well as for taking several updates before redeploying (lazy deploy). In both cases, our bounds smoothly recover the optimal $O(1/k)$ rate as the strength of performativity decreases. Furthermore, they illustrate how depending on the strength of performative effects, there exists a regime where either approach outperforms the other. We experimentally explore the trade-off on both synthetic data and a strategic classification simulator.
Submitted 19 February, 2021; v1 submitted 11 June, 2020;
originally announced June 2020.
-
Performative Prediction
Authors:
Juan C. Perdomo,
Tijana Zrnic,
Celestine Mendler-Dünner,
Moritz Hardt
Abstract:
When predictions support decisions they may influence the outcome they aim to predict. We call such predictions performative; the prediction influences the target. Performativity is a well-studied phenomenon in policy-making that has so far been neglected in supervised learning. When ignored, performativity surfaces as undesirable distribution shift, routinely addressed with retraining.
We develop a risk minimization framework for performative prediction bringing together concepts from statistics, game theory, and causality. A conceptual novelty is an equilibrium notion we call performative stability. Performative stability implies that the predictions are calibrated not against past outcomes, but against the future outcomes that manifest from acting on the prediction. Our main results are necessary and sufficient conditions for the convergence of retraining to a performatively stable point of nearly minimal loss.
In full generality, performative prediction strictly subsumes the setting known as strategic classification. We thus also give the first sufficient conditions for retraining to overcome strategic feedback effects.
Submitted 26 February, 2021; v1 submitted 16 February, 2020;
originally announced February 2020.
-
Robust Attacks against Multiple Classifiers
Authors:
Juan C. Perdomo,
Yaron Singer
Abstract:
We address the challenge of designing optimal adversarial noise algorithms for settings where a learner has access to multiple classifiers. We demonstrate how this problem can be framed as finding strategies at equilibrium in a two-player, zero-sum game between a learner and an adversary. In doing so, we illustrate the need for randomization in adversarial attacks. To compute a Nash equilibrium, our main technical focus is on the design of best response oracles that can then be implemented within a Multiplicative Weights Update framework to boost deterministic perturbations against a set of models into optimal mixed strategies. We demonstrate the practical effectiveness of our approach on a series of image classification tasks using both linear classifiers and deep neural networks.
Submitted 6 June, 2019;
originally announced June 2019.
-
Self-adaptation of Genetic Operators Through Genetic Programming Techniques
Authors:
Andres Felipe Cruz Salinas,
Jonatan Gomez Perdomo
Abstract:
Here we propose an evolutionary algorithm that self-modifies its operators while candidate solutions are evolved. This tackles convergence and lack of diversity issues, leading to better solutions. Operators are represented as trees and are evolved using genetic programming (GP) techniques. The proposed approach is tested with real benchmark functions and an analysis of operator evolution is provided.
Submitted 17 December, 2017;
originally announced December 2017.