-
Identifying Exoplanets with Deep Learning. V. Improved Light Curve Classification for TESS Full Frame Image Observations
Authors:
Evan Tey,
Dan Moldovan,
Michelle Kunimoto,
Chelsea X. Huang,
Avi Shporer,
Tansu Daylan,
Daniel Muthukrishna,
Andrew Vanderburg,
Anne Dattilo,
George R. Ricker,
S. Seager
Abstract:
The TESS mission produces a large amount of time series data, only a small fraction of which contain detectable exoplanetary transit signals. Deep learning techniques such as neural networks have proved effective at differentiating promising astrophysical eclipsing candidates from other phenomena such as stellar variability and systematic instrumental effects in an efficient, unbiased and sustaina…
▽ More
The TESS mission produces a large amount of time series data, only a small fraction of which contain detectable exoplanetary transit signals. Deep learning techniques such as neural networks have proved effective at differentiating promising astrophysical eclipsing candidates from other phenomena such as stellar variability and systematic instrumental effects in an efficient, unbiased and sustainable manner. This paper presents a high quality dataset containing light curves from the Primary Mission and 1st Extended Mission full frame images and periodic signals detected via Box Least Squares (Kovács et al. 2002; Hartman 2012). The dataset was curated using a thorough manual review process then used to train a neural network called Astronet-Triage-v2. On our test set, for transiting/eclipsing events we achieve a 99.6% recall (true positives over all data with positive labels) at a precision of 75.7% (true positives over all predicted positives). Since 90% of our training data is from the Primary Mission, we also test our ability to generalize on held-out 1st Extended Mission data. Here, we find an area under the precision-recall curve of 0.965, a 4% improvement over Astronet-Triage (Yu et al. 2019). On the TESS Object of Interest (TOI) Catalog through April 2022, a shortlist of planets and planet candidates, Astronet-Triage-v2 is able to recover 3577 out of 4140 TOIs, while Astronet-Triage only recovers 3349 targets at an equal level of precision. In other words, upgrading to Astronet-Triage-v2 helps save at least 200 planet candidates from being lost. The new model is currently used for planet candidate triage in the Quick-Look Pipeline (Huang et al. 2020a,b; Kunimoto et al. 2021).
△ Less
Submitted 3 January, 2023;
originally announced January 2023.
-
Algorithmic decision making methods for fair credit scoring
Authors:
Darie Moldovan
Abstract:
The effectiveness of machine learning in evaluating the creditworthiness of loan applicants has been demonstrated for a long time. However, there is concern that the use of automated decision-making processes may result in unequal treatment of groups or individuals, potentially leading to discriminatory outcomes. This paper seeks to address this issue by evaluating the effectiveness of 12 leading…
▽ More
The effectiveness of machine learning in evaluating the creditworthiness of loan applicants has been demonstrated for a long time. However, there is concern that the use of automated decision-making processes may result in unequal treatment of groups or individuals, potentially leading to discriminatory outcomes. This paper seeks to address this issue by evaluating the effectiveness of 12 leading bias mitigation methods across 5 different fairness metrics, as well as assessing their accuracy and potential profitability for financial institutions. Through our analysis, we have identified the challenges associated with achieving fairness while maintaining accuracy and profitabiliy, and have highlighted both the most successful and least successful mitigation methods. Ultimately, our research serves to bridge the gap between experimental machine learning and its practical applications in the finance industry.
△ Less
Submitted 22 June, 2023; v1 submitted 16 September, 2022;
originally announced September 2022.
-
CEREC: A Corpus for Entity Resolution in Email Conversations
Authors:
Parag Pravin Dakle,
Dan I. Moldovan
Abstract:
We present the first large scale corpus for entity resolution in email conversations (CEREC). The corpus consists of 6001 email threads from the Enron Email Corpus containing 36,448 email messages and 60,383 entity coreference chains. The annotation is carried out as a two-step process with minimal manual effort. Experiments are carried out for evaluating different features and performance of four…
▽ More
We present the first large scale corpus for entity resolution in email conversations (CEREC). The corpus consists of 6001 email threads from the Enron Email Corpus containing 36,448 email messages and 60,383 entity coreference chains. The annotation is carried out as a two-step process with minimal manual effort. Experiments are carried out for evaluating different features and performance of four baselines on the created corpus. For the task of mention identification and coreference resolution, a best performance of 59.2 F1 is reported, highlighting the room for improvement. An in-depth qualitative and quantitative error analysis is presented to understand the limitations of the baselines considered.
△ Less
Submitted 1 June, 2021; v1 submitted 21 May, 2021;
originally announced May 2021.
-
Causally motivated Shortcut Removal Using Auxiliary Labels
Authors:
Maggie Makar,
Ben Packer,
Dan Moldovan,
Davis Blalock,
Yoni Halpern,
Alexander D'Amour
Abstract:
Shortcut learning, in which models make use of easy-to-represent but unstable associations, is a major failure mode for robust machine learning. We study a flexible, causally-motivated approach to training robust predictors by discouraging the use of specific shortcuts, focusing on a common setting where a robust predictor could achieve optimal \emph{iid} generalization in principle, but is oversh…
▽ More
Shortcut learning, in which models make use of easy-to-represent but unstable associations, is a major failure mode for robust machine learning. We study a flexible, causally-motivated approach to training robust predictors by discouraging the use of specific shortcuts, focusing on a common setting where a robust predictor could achieve optimal \emph{iid} generalization in principle, but is overshadowed by a shortcut predictor in practice. Our approach uses auxiliary labels, typically available at training time, to enforce conditional independences implied by the causal graph. We show both theoretically and empirically that causally-motivated regularization schemes (a) lead to more robust estimators that generalize well under distribution shift, and (b) have better finite sample efficiency compared to usual regularization schemes, even when no shortcut is present. Our analysis highlights important theoretical properties of training techniques commonly used in the causal inference, fairness, and disentanglement literatures. Our code is available at https://github.com/mymakar/causally_motivated_shortcut_removal
△ Less
Submitted 23 February, 2022; v1 submitted 13 May, 2021;
originally announced May 2021.
-
Underspecification Presents Challenges for Credibility in Modern Machine Learning
Authors:
Alexander D'Amour,
Katherine Heller,
Dan Moldovan,
Ben Adlam,
Babak Alipanahi,
Alex Beutel,
Christina Chen,
Jonathan Deaton,
Jacob Eisenstein,
Matthew D. Hoffman,
Farhad Hormozdiari,
Neil Houlsby,
Shaobo Hou,
Ghassen Jerfel,
Alan Karthikesalingam,
Mario Lucic,
Yian Ma,
Cory McLean,
Diana Mincu,
Akinori Mitani,
Andrea Montanari,
Zachary Nado,
Vivek Natarajan,
Christopher Nielson,
Thomas F. Osborne
, et al. (15 additional authors not shown)
Abstract:
ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predict…
▽ More
ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predictors returned by underspecified pipelines are often treated as equivalent based on their training domain performance, but we show here that such predictors can behave very differently in deployment domains. This ambiguity can lead to instability and poor model behavior in practice, and is a distinct failure mode from previously identified issues arising from structural mismatch between training and deployment domains. We show that this problem appears in a wide variety of practical ML pipelines, using examples from computer vision, medical imaging, natural language processing, clinical risk prediction based on electronic health records, and medical genomics. Our results show the need to explicitly account for underspecification in modeling pipelines that are intended for real-world deployment in any domain.
△ Less
Submitted 24 November, 2020; v1 submitted 6 November, 2020;
originally announced November 2020.
-
On Robustness and Transferability of Convolutional Neural Networks
Authors:
Josip Djolonga,
Jessica Yung,
Michael Tschannen,
Rob Romijnders,
Lucas Beyer,
Alexander Kolesnikov,
Joan Puigcerver,
Matthias Minderer,
Alexander D'Amour,
Dan Moldovan,
Sylvain Gelly,
Neil Houlsby,
Xiaohua Zhai,
Mario Lucic
Abstract:
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts. However, several recent breakthroughs in transfer learning suggest that these networks can cope with severe distribution shifts and successfully adapt to new tasks from a few training examples. In this work we study the interplay between out-of-distribution and transfer performance of m…
▽ More
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts. However, several recent breakthroughs in transfer learning suggest that these networks can cope with severe distribution shifts and successfully adapt to new tasks from a few training examples. In this work we study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time and investigate the impact of the pre-training data size, the model scale, and the data preprocessing pipeline. We find that increasing both the training set and model sizes significantly improve the distributional shift robustness. Furthermore, we show that, perhaps surprisingly, simple changes in the preprocessing such as modifying the image resolution can significantly mitigate robustness issues in some cases. Finally, we outline the shortcomings of existing robustness evaluation datasets and introduce a synthetic dataset SI-Score we use for a systematic analysis across factors of variation common in visual data such as object size and position.
△ Less
Submitted 23 March, 2021; v1 submitted 16 July, 2020;
originally announced July 2020.
-
AutoGraph: Imperative-style Coding with Graph-based Performance
Authors:
Dan Moldovan,
James M Decker,
Fei Wang,
Andrew A Johnson,
Brian K Lee,
Zachary Nado,
D Sculley,
Tiark Rompf,
Alexander B Wiltschko
Abstract:
There is a perceived trade-off between machine learning code that is easy to write, and machine learning code that is scalable or fast to execute. In machine learning, imperative style libraries like Autograd and PyTorch are easy to write, but suffer from high interpretive overhead and are not easily deployable in production or mobile settings. Graph-based libraries like TensorFlow and Theano bene…
▽ More
There is a perceived trade-off between machine learning code that is easy to write, and machine learning code that is scalable or fast to execute. In machine learning, imperative style libraries like Autograd and PyTorch are easy to write, but suffer from high interpretive overhead and are not easily deployable in production or mobile settings. Graph-based libraries like TensorFlow and Theano benefit from whole-program optimization and can be deployed broadly, but make expressing complex models more cumbersome. We describe how the use of staged programming in Python, via source code transformation, offers a midpoint between these two library design patterns, capturing the benefits of both. A key insight is to delay all type-dependent decisions until runtime, via dynamic dispatch. We instantiate these principles in AutoGraph, a software system that improves the programming experience of the TensorFlow library, and demonstrate usability improvements with no loss in performance compared to native TensorFlow graphs. We also show that our system is backend agnostic, and demonstrate targeting an alternate IR with characteristics not found in TensorFlow graphs.
△ Less
Submitted 26 March, 2019; v1 submitted 16 October, 2018;
originally announced October 2018.
-
Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming
Authors:
Bart van Merriënboer,
Dan Moldovan,
Alexander B Wiltschko
Abstract:
The need to efficiently calculate first- and higher-order derivatives of increasingly complex models expressed in Python has stressed or exceeded the capabilities of available tools. In this work, we explore techniques from the field of automatic differentiation (AD) that can give researchers expressive power, performance and strong usability. These include source-code transformation (SCT), flexib…
▽ More
The need to efficiently calculate first- and higher-order derivatives of increasingly complex models expressed in Python has stressed or exceeded the capabilities of available tools. In this work, we explore techniques from the field of automatic differentiation (AD) that can give researchers expressive power, performance and strong usability. These include source-code transformation (SCT), flexible gradient surgery, efficient in-place array operations, higher-order derivatives as well as mixing of forward and reverse mode AD. We implement and demonstrate these ideas in the Tangent software library for Python, the first AD framework for a dynamic language that uses SCT.
△ Less
Submitted 26 September, 2018; v1 submitted 25 September, 2018;
originally announced September 2018.
-
Tangent: Automatic Differentiation Using Source Code Transformation in Python
Authors:
Bart van Merriënboer,
Alexander B. Wiltschko,
Dan Moldovan
Abstract:
Automatic differentiation (AD) is an essential primitive for machine learning programming systems. Tangent is a new library that performs AD using source code transformation (SCT) in Python. It takes numeric functions written in a syntactic subset of Python and NumPy as input, and generates new Python functions which calculate a derivative. This approach to automatic differentiation is different f…
▽ More
Automatic differentiation (AD) is an essential primitive for machine learning programming systems. Tangent is a new library that performs AD using source code transformation (SCT) in Python. It takes numeric functions written in a syntactic subset of Python and NumPy as input, and generates new Python functions which calculate a derivative. This approach to automatic differentiation is different from existing packages popular in machine learning, such as TensorFlow and Autograd. Advantages are that Tangent generates gradient code in Python which is readable by the user, easy to understand and debug, and has no runtime overhead. Tangent also introduces abstractions for easily injecting logic into the generated gradient code, further improving usability.
△ Less
Submitted 7 November, 2017;
originally announced November 2017.
-
Sparse Positional Strategies for Safety Games
Authors:
Rüdiger Ehlers,
Daniela Moldovan
Abstract:
We consider the problem of obtaining sparse positional strategies for safety games. Such games are a commonly used model in many formal methods, as they make the interaction of a system with its environment explicit. Often, a winning strategy for one of the players is used as a certificate or as an artefact for further processing in the application. Small such certificates, i.e., strategies that c…
▽ More
We consider the problem of obtaining sparse positional strategies for safety games. Such games are a commonly used model in many formal methods, as they make the interaction of a system with its environment explicit. Often, a winning strategy for one of the players is used as a certificate or as an artefact for further processing in the application. Small such certificates, i.e., strategies that can be written down very compactly, are typically preferred. For safety games, we only need to consider positional strategies. These map game positions of a player onto a move that is to be taken by the player whenever the play enters that position. For representing positional strategies compactly, a common goal is to minimize the number of positions for which a winning player's move needs to be defined such that the game is still won by the same player, without visiting a position with an undefined next move. We call winning strategies in which the next move is defined for few of the player's positions sparse.
Unfortunately, even roughly approximating the density of the sparsest strategy for a safety game has been shown to be NP-hard. Thus, to obtain sparse strategies in practice, one either has to apply some heuristics, or use some exhaustive search technique, like ILP (integer linear programming) solving. In this paper, we perform a comparative study of currently available methods to obtain sparse winning strategies for the safety player in safety games. We consider techniques from common knowledge, such as using ILP or SAT (satisfiability) solving, and a novel technique based on iterative linear programming. The results of this paper tell us if current techniques are already scalable enough for practical use.
△ Less
Submitted 3 July, 2012;
originally announced July 2012.