Showing 1–10 of 10 results for author: Moldovan, D

  1. arXiv:2301.01371

    astro-ph.EP astro-ph.IM cs.LG

    Identifying Exoplanets with Deep Learning. V. Improved Light Curve Classification for TESS Full Frame Image Observations

    Authors: Evan Tey, Dan Moldovan, Michelle Kunimoto, Chelsea X. Huang, Avi Shporer, Tansu Daylan, Daniel Muthukrishna, Andrew Vanderburg, Anne Dattilo, George R. Ricker, S. Seager

    Abstract: The TESS mission produces a large amount of time series data, only a small fraction of which contain detectable exoplanetary transit signals. Deep learning techniques such as neural networks have proved effective at differentiating promising astrophysical eclipsing candidates from other phenomena such as stellar variability and systematic instrumental effects in an efficient, unbiased and sustaina…

    Submitted 3 January, 2023; originally announced January 2023.

    Comments: Accepted for publication in AJ. Code can be found at: https://github.com/mdanatg/Astronet-Triage and data can be found at: https://zenodo.org/record/7411579

  2. Algorithmic decision making methods for fair credit scoring

    Authors: Darie Moldovan

    Abstract: The effectiveness of machine learning in evaluating the creditworthiness of loan applicants has been demonstrated for a long time. However, there is concern that the use of automated decision-making processes may result in unequal treatment of groups or individuals, potentially leading to discriminatory outcomes. This paper seeks to address this issue by evaluating the effectiveness of 12 leading…

    Submitted 22 June, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

    Journal ref: IEEE Access, vol. 11, pp. 59729-59743, 2023

  3. CEREC: A Corpus for Entity Resolution in Email Conversations

    Authors: Parag Pravin Dakle, Dan I. Moldovan

    Abstract: We present the first large scale corpus for entity resolution in email conversations (CEREC). The corpus consists of 6001 email threads from the Enron Email Corpus containing 36,448 email messages and 60,383 entity coreference chains. The annotation is carried out as a two-step process with minimal manual effort. Experiments are carried out for evaluating different features and performance of four…

    Submitted 1 June, 2021; v1 submitted 21 May, 2021; originally announced May 2021.

    Journal ref: Proceedings of the 28th International Conference on Computational Linguistics, pp. 339-349. 2020

  4. arXiv:2105.06422

    cs.LG

    Causally motivated Shortcut Removal Using Auxiliary Labels

    Authors: Maggie Makar, Ben Packer, Dan Moldovan, Davis Blalock, Yoni Halpern, Alexander D'Amour

    Abstract: Shortcut learning, in which models make use of easy-to-represent but unstable associations, is a major failure mode for robust machine learning. We study a flexible, causally-motivated approach to training robust predictors by discouraging the use of specific shortcuts, focusing on a common setting where a robust predictor could achieve optimal iid generalization in principle, but is oversh…

    Submitted 23 February, 2022; v1 submitted 13 May, 2021; originally announced May 2021.

    Journal ref: AISTATS, 2022

  5. arXiv:2011.03395

    cs.LG stat.ML

    Underspecification Presents Challenges for Credibility in Modern Machine Learning

    Authors: Alexander D'Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D. Hoffman, Farhad Hormozdiari, Neil Houlsby, Shaobo Hou, Ghassen Jerfel, Alan Karthikesalingam, Mario Lucic, Yian Ma, Cory McLean, Diana Mincu, Akinori Mitani, Andrea Montanari, Zachary Nado, Vivek Natarajan, Christopher Nielson, Thomas F. Osborne, et al. (15 additional authors not shown)

    Abstract: ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predict…

    Submitted 24 November, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: Updates: Updated statistical analysis in Section 6; Additional citations

  6. arXiv:2007.08558

    cs.CV cs.LG

    On Robustness and Transferability of Convolutional Neural Networks

    Authors: Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander D'Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, Mario Lucic

    Abstract: Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts. However, several recent breakthroughs in transfer learning suggest that these networks can cope with severe distribution shifts and successfully adapt to new tasks from a few training examples. In this work we study the interplay between out-of-distribution and transfer performance of m…

    Submitted 23 March, 2021; v1 submitted 16 July, 2020; originally announced July 2020.

    Comments: Accepted at CVPR 2021

  7. arXiv:1810.08061

    cs.PL cs.LG stat.ML

    AutoGraph: Imperative-style Coding with Graph-based Performance

    Authors: Dan Moldovan, James M Decker, Fei Wang, Andrew A Johnson, Brian K Lee, Zachary Nado, D Sculley, Tiark Rompf, Alexander B Wiltschko

    Abstract: There is a perceived trade-off between machine learning code that is easy to write, and machine learning code that is scalable or fast to execute. In machine learning, imperative style libraries like Autograd and PyTorch are easy to write, but suffer from high interpretive overhead and are not easily deployable in production or mobile settings. Graph-based libraries like TensorFlow and Theano bene…

    Submitted 26 March, 2019; v1 submitted 16 October, 2018; originally announced October 2018.

  8. arXiv:1809.09569

    cs.LG cs.SE stat.ML

    Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming

    Authors: Bart van Merriënboer, Dan Moldovan, Alexander B Wiltschko

    Abstract: The need to efficiently calculate first- and higher-order derivatives of increasingly complex models expressed in Python has stressed or exceeded the capabilities of available tools. In this work, we explore techniques from the field of automatic differentiation (AD) that can give researchers expressive power, performance and strong usability. These include source-code transformation (SCT), flexib…

    Submitted 26 September, 2018; v1 submitted 25 September, 2018; originally announced September 2018.

  9. arXiv:1711.02712

    cs.MS stat.ML

    Tangent: Automatic Differentiation Using Source Code Transformation in Python

    Authors: Bart van Merriënboer, Alexander B. Wiltschko, Dan Moldovan

    Abstract: Automatic differentiation (AD) is an essential primitive for machine learning programming systems. Tangent is a new library that performs AD using source code transformation (SCT) in Python. It takes numeric functions written in a syntactic subset of Python and NumPy as input, and generates new Python functions which calculate a derivative. This approach to automatic differentiation is different f…

    Submitted 7 November, 2017; originally announced November 2017.

  10. Sparse Positional Strategies for Safety Games

    Authors: Rüdiger Ehlers, Daniela Moldovan

    Abstract: We consider the problem of obtaining sparse positional strategies for safety games. Such games are a commonly used model in many formal methods, as they make the interaction of a system with its environment explicit. Often, a winning strategy for one of the players is used as a certificate or as an artefact for further processing in the application. Small such certificates, i.e., strategies that c…

    Submitted 3 July, 2012; originally announced July 2012.

    Comments: In Proceedings SYNT 2012, arXiv:1207.0554

    Journal ref: EPTCS 84, 2012, pp. 1-16