Showing 1–32 of 32 results for author: Brenner, M

  1. arXiv:2506.07919

    cs.LG cs.AI cs.CL nlin.CD physics.comp-ph

    Uncovering the Functional Roles of Nonlinearity in Memory

    Authors: Manuel Brenner, Georgia Koppe

    Abstract: Memory and long-range temporal processing are core requirements for sequence modeling tasks across natural language processing, time-series forecasting, speech recognition, and control. While nonlinear recurrence has long been viewed as essential for enabling such mechanisms, recent work suggests that linear dynamics may often suffice. In this study, we go beyond performance comparisons to systema…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: Preprint under review

  2. arXiv:2505.11774

    cs.LG cs.AI

    HARDMath2: A Benchmark for Applied Mathematics Built by Students as Part of a Graduate Class

    Authors: James V. Roggeveen, Erik Y. Wang, Will Flintoft, Peter Donets, Lucy S. Nathwani, Nickholas Gutierrez, David Ettel, Anton Marius Graf, Siddharth Dandavate, Arjun Nageswaran, Raglan Ward, Ava Williamson, Anne Mykland, Kacper K. Migacz, Yijun Wang, Egemen Bostan, Duy Thuc Nguyen, Zhe He, Marc L. Descoteaux, Felix Yeung, Shida Liu, Jorge García Ponce, Luke Zhu, Yuyang Chen, Ekaterina S. Ivshina, et al. (20 additional authors not shown)

    Abstract: Large language models (LLMs) have shown remarkable progress in mathematical problem-solving, but evaluation has largely focused on problems that have exact analytical solutions or involve formal proofs, often overlooking approximation-based problems ubiquitous in applied science and engineering. To fill this gap, we build on prior work and present HARDMath2, a dataset of 211 original problems cove…

    Submitted 16 May, 2025; originally announced May 2025.

  3. arXiv:2504.06260

    cs.AI cs.CL math.NA

    FEABench: Evaluating Language Models on Multiphysics Reasoning Ability

    Authors: Nayantara Mudur, Hao Cui, Subhashini Venugopalan, Paul Raccuglia, Michael P. Brenner, Peter Norgaard

    Abstract: Building precise simulations of the real world and invoking numerical solvers to answer quantitative problems is an essential requirement in engineering and science. We present FEABench, a benchmark to evaluate the ability of large language models (LLMs) and LLM agents to simulate and solve physics, mathematics and engineering problems using finite element analysis (FEA). We introduce a comprehens…

    Submitted 8 April, 2025; originally announced April 2025.

    Comments: 39 pages. Accepted at the NeurIPS 2024 Workshops on Mathematical Reasoning and AI and Open-World Agents

  4. arXiv:2503.13771

    cs.AI

    Towards AI-assisted Academic Writing

    Authors: Daniel J. Liebling, Malcolm Kane, Madeleine Grunde-Mclaughlin, Ian J. Lang, Subhashini Venugopalan, Michael P. Brenner

    Abstract: We present components of an AI-assisted academic writing system including citation recommendation and introduction writing. The system recommends citations by considering the user's current document context to provide relevant suggestions. It generates introductions in a structured fashion, situating the contributions of the research relative to prior work. We demonstrate the effectiveness of the…

    Submitted 17 March, 2025; originally announced March 2025.

    Comments: accepted to NAACL 2025 Workshop on AI for Scientific Discovery

  5. arXiv:2503.13517

    cs.CL cs.AI

    CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning

    Authors: Hao Cui, Zahra Shamsi, Gowoon Cheon, Xuejian Ma, Shutong Li, Maria Tikhanovskaya, Peter Norgaard, Nayantara Mudur, Martyna Plomecka, Paul Raccuglia, Yasaman Bahri, Victor V. Albert, Pranesh Srinivasan, Haining Pan, Philippe Faist, Brian Rohr, Ekin Dogus Cubuk, Muratahan Aykol, Amil Merchant, Michael J. Statt, Dan Morris, Drew Purves, Elise Kleeman, Ruth Alcantara, Matthew Abraham, et al. (9 additional authors not shown)

    Abstract: Scientific problem-solving involves synthesizing information while applying expert knowledge. We introduce CURIE, a scientific long-Context Understanding, Reasoning and Information Extraction benchmark to measure the potential of Large Language Models (LLMs) in scientific problem-solving and assisting scientists in realistic workflows. This benchmark introduces ten challenging tasks with a total of…

    Submitted 13 May, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

    Comments: Accepted at ICLR 2025 main conference

  6. arXiv:2501.14249

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  7. arXiv:2410.14240

    cs.LG cs.AI math.DS nlin.CD physics.data-an

    Almost-Linear RNNs Yield Highly Interpretable Symbolic Codes in Dynamical Systems Reconstruction

    Authors: Manuel Brenner, Christoph Jürgen Hemmer, Zahra Monfared, Daniel Durstewitz

    Abstract: Dynamical systems (DS) theory is fundamental for many areas of science and engineering. It can provide deep insights into the behavior of systems evolving in time, as typically described by differential or recursive equations. A common approach to facilitate mathematical tractability and interpretability of DS models involves decomposing nonlinear DS into multiple linear DS separated by switching…

    Submitted 18 October, 2024; originally announced October 2024.

    Comments: 38th Conference on Neural Information Processing Systems (NeurIPS 2024)

  8. arXiv:2410.09988

    cs.LG cs.AI

    HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics

    Authors: Jingxuan Fan, Sarah Martinson, Erik Y. Wang, Kaylie Hausknecht, Jonah Brenner, Danxian Liu, Nianli Peng, Corey Wang, Michael P. Brenner

    Abstract: Advanced applied mathematics problems are underrepresented in existing Large Language Model (LLM) benchmark datasets. To address this, we introduce HARDMath, a dataset inspired by a graduate course on asymptotic methods, featuring challenging applied mathematics problems that require analytical approximation techniques. These problems demand a combination of mathematical reasoning, computational t…

    Submitted 13 December, 2024; v1 submitted 13 October, 2024; originally announced October 2024.

    Comments: Code and the HARDMath dataset are available at https://github.com/sarahmart/HARDMath

  9. arXiv:2410.04814

    cs.LG cs.AI math.DS nlin.CD physics.data-an

    Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data

    Authors: Manuel Brenner, Elias Weber, Georgia Koppe, Daniel Durstewitz

    Abstract: In science, we are often interested in obtaining a generative model of the underlying system dynamics from observed time series. While powerful methods for dynamical systems reconstruction (DSR) exist when data come from a single domain, how to best integrate data from multiple dynamical regimes and leverage it for generalization is still an open question. This becomes particularly important when…

    Submitted 17 February, 2025; v1 submitted 7 October, 2024; originally announced October 2024.

    Comments: Published at the Thirteenth International Conference on Learning Representations (ICLR 2025)

  10. arXiv:2407.06295

    q-bio.CB cs.LG

    Engineering morphogenesis of cell clusters with differentiable programming

    Authors: Ramya Deshpande, Francesco Mottes, Ariana-Dalia Vlad, Michael P. Brenner, Alma dal Co

    Abstract: Understanding the rules underlying organismal development is a major unsolved problem in biology. Each cell in a developing organism responds to signals in its local environment by dividing, excreting, consuming, or reorganizing, yet how these individual actions coordinate over a macroscopic number of cells to grow complex structures with exquisite functionality is unknown. Here we use recent adva…

    Submitted 27 February, 2025; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 12 pages, 4 figures

  11. arXiv:2406.04934

    cs.LG cs.AI math.DS nlin.CD

    Optimal Recurrent Network Topologies for Dynamical Systems Reconstruction

    Authors: Christoph Jürgen Hemmer, Manuel Brenner, Florian Hess, Daniel Durstewitz

    Abstract: In dynamical systems reconstruction (DSR) we seek to infer from time series measurements a generative model of the underlying dynamical process. This is a prime objective in any scientific discipline, where we are particularly interested in parsimonious models with a low parameter load. A common strategy here is parameter pruning, removing all parameters with small weights. However, here we find t…

    Submitted 7 June, 2024; originally announced June 2024.

  12. arXiv:2403.03154

    physics.comp-ph cond-mat.other cs.AI

    Quantum Many-Body Physics Calculations with Large Language Models

    Authors: Haining Pan, Nayantara Mudur, Will Taranto, Maria Tikhanovskaya, Subhashini Venugopalan, Yasaman Bahri, Michael P. Brenner, Eun-Ah Kim

    Abstract: Large language models (LLMs) have demonstrated an unprecedented ability to perform complex tasks in multiple domains, including mathematical and scientific reasoning. We demonstrate that with carefully designed prompts, LLMs can accurately carry out key calculations in research papers in theoretical physics. We focus on a broadly used approximation method in quantum physics: the Hartree-Fock metho…

    Submitted 22 August, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: 9 pages, 4 figures. Supplemental material in the source file

    Journal ref: Commun Phys 8, 49 (2025)

  13. arXiv:2402.18377

    cs.LG cs.AI math.DS nlin.CD

    Out-of-Domain Generalization in Dynamical Systems Reconstruction

    Authors: Niclas Göring, Florian Hess, Manuel Brenner, Zahra Monfared, Daniel Durstewitz

    Abstract: In science we are interested in finding the governing equations, the dynamical rules, underlying empirical phenomena. While traditionally scientific models are derived through cycles of human insight and experimentation, recently deep learning (DL) techniques have been advanced to reconstruct dynamical systems (DS) directly from time series data. State-of-the-art dynamical systems reconstruction (…

    Submitted 7 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

  14. arXiv:2312.01532

    cs.HC cs.CL

    Using Large Language Models to Accelerate Communication for Users with Severe Motor Impairments

    Authors: Shanqing Cai, Subhashini Venugopalan, Katie Seaver, Xiang Xiao, Katrin Tomanek, Sri Jalasutram, Meredith Ringel Morris, Shaun Kane, Ajit Narayanan, Robert L. MacDonald, Emily Kornman, Daniel Vance, Blair Casey, Steve M. Gleason, Philip Q. Nelson, Michael P. Brenner

    Abstract: Finding ways to accelerate text input for individuals with profound motor impairments has been a long-standing area of research. Closing the speed gap for augmentative and alternative communication (AAC) devices such as eye-tracking keyboards is important for improving the quality of life for such individuals. Recent advances in neural networks of natural language pose new opportunities for re-thi…

    Submitted 3 December, 2023; originally announced December 2023.

  15. arXiv:2311.07222

    physics.ao-ph cs.LG physics.comp-ph

    Neural General Circulation Models for Weather and Climate

    Authors: Dmitrii Kochkov, Janni Yuval, Ian Langmore, Peter Norgaard, Jamie Smith, Griffin Mooers, Milan Klöwer, James Lottes, Stephan Rasp, Peter Düben, Sam Hatfield, Peter Battaglia, Alvaro Sanchez-Gonzalez, Matthew Willson, Michael P. Brenner, Stephan Hoyer

    Abstract: General circulation models (GCMs) are the foundation of weather and climate prediction. GCMs are physics-based simulators which combine a numerical solver for large-scale dynamics with tuned representations for small-scale processes such as cloud formation. Recently, machine learning (ML) models trained on reanalysis data achieved comparable or better skill than GCMs for deterministic weather fore…

    Submitted 7 March, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: 92 pages, 54 figures. Nature (2024)

  16. arXiv:2310.07106

    cs.CL cs.AI cs.LG q-bio.NC

    The Temporal Structure of Language Processing in the Human Brain Corresponds to The Layered Hierarchy of Deep Language Models

    Authors: Ariel Goldstein, Eric Ham, Mariano Schain, Samuel Nastase, Zaid Zada, Avigail Dabush, Bobbi Aubrey, Harshvardhan Gazula, Amir Feder, Werner K Doyle, Sasha Devore, Patricia Dugan, Daniel Friedman, Roi Reichart, Michael Brenner, Avinatan Hassidim, Orrin Devinsky, Adeen Flinker, Omer Levy, Uri Hasson

    Abstract: Deep Language Models (DLMs) provide a novel computational paradigm for understanding the mechanisms of natural language processing in the human brain. Unlike traditional psycholinguistic models, DLMs use layered sequences of continuous numerical vectors to represent words and context, allowing a plethora of emerging applications such as human-like text generation. In this paper we show evidence th…

    Submitted 10 October, 2023; originally announced October 2023.

  17. arXiv:2306.04406

    cs.LG cs.AI math.DS nlin.CD

    Generalized Teacher Forcing for Learning Chaotic Dynamics

    Authors: Florian Hess, Zahra Monfared, Manuel Brenner, Daniel Durstewitz

    Abstract: Chaotic dynamical systems (DS) are ubiquitous in nature and society. Often we are interested in reconstructing such systems from observed time series for prediction or mechanistic insight, where by reconstruction we mean learning geometrical and invariant temporal properties of the system in question (like attractors). However, training reconstruction algorithms like recurrent neural networks (RNN…

    Submitted 27 October, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: Published in the Proceedings of the 40th International Conference on Machine Learning (ICML 2023)

    Journal ref: PMLR 202:13017-13049, 2023

  18. RGB-D And Thermal Sensor Fusion: A Systematic Literature Review

    Authors: Martin Brenner, Napoleon H. Reyes, Teo Susnjak, Andre L. C. Barczak

    Abstract: In the last decade, the computer vision field has seen significant progress in multimodal data fusion and learning, where multiple sensors, including depth, infrared, and visual, are used to capture the environment across diverse spectral ranges. Despite these advancements, there has been no systematic and comprehensive evaluation of fusing RGB-D and thermal modalities to date. While autonomous dr…

    Submitted 11 July, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: 34 pages, 21 figures

    Report number: Access-2023-19991

  19. arXiv:2303.07533

    eess.AS cs.SD

    Speech Intelligibility Classifiers from 550k Disordered Speech Samples

    Authors: Subhashini Venugopalan, Jimmy Tobin, Samuel J. Yang, Katie Seaver, Richard J. N. Cave, Pan-Pan Jiang, Neil Zeghidour, Rus Heywood, Jordan Green, Michael P. Brenner

    Abstract: We developed dysarthric speech intelligibility classifiers on 551,176 disordered speech samples contributed by a diverse set of 468 speakers, with a range of self-reported speaking disorders and rated for their overall intelligibility on a five-point scale. We trained three models following different deep learning approaches and evaluated them on ~94K utterances from 100 speakers. We further found…

    Submitted 15 March, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: ICASSP 2023 camera-ready

  20. arXiv:2302.01259

    cs.LG

    Geometric Deep Learning for Autonomous Driving: Unlocking the Power of Graph Neural Networks With CommonRoad-Geometric

    Authors: Eivind Meyer, Maurice Brenner, Bowen Zhang, Max Schickert, Bilal Musani, Matthias Althoff

    Abstract: Heterogeneous graphs offer powerful data representations for traffic, given their ability to model the complex interaction effects among a varying number of traffic participants and the underlying road infrastructure. With the recent advent of graph neural networks (GNNs) as the accompanying deep learning framework, the graph structure can be efficiently leveraged for various machine learning appl…

    Submitted 24 April, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Comments: Presented at IV 2023

  21. arXiv:2212.07892

    cs.LG math.DS nlin.CD

    Integrating Multimodal Data for Joint Generative Modeling of Complex Dynamics

    Authors: Manuel Brenner, Florian Hess, Georgia Koppe, Daniel Durstewitz

    Abstract: Many, if not most, systems of interest in science are naturally described as nonlinear dynamical systems. Empirically, we commonly access these systems through time series measurements. Often such time series may consist of discrete random variables rather than continuous measurements, or may be composed of measurements from multiple data modalities observed simultaneously. For instance, in neuros…

    Submitted 7 June, 2024; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: ICML 2024. Previously published as a workshop paper for the AAAI 2023 Workshop MLmDS as "Multimodal Teacher Forcing for Reconstructing Nonlinear Dynamical Systems"

  22. arXiv:2207.02542

    cs.LG math.DS nlin.CD physics.comp-ph

    Tractable Dendritic RNNs for Reconstructing Nonlinear Dynamical Systems

    Authors: Manuel Brenner, Florian Hess, Jonas M. Mikhaeil, Leonard Bereska, Zahra Monfared, Po-Chen Kuo, Daniel Durstewitz

    Abstract: In many scientific disciplines, we are interested in inferring the nonlinear dynamical system underlying a set of observed time series, a challenging task in the face of chaotic behavior and noise. Previous deep learning approaches toward this goal often suffered from a lack of interpretability and tractability. In particular, the high-dimensional latent spaces often required for a faithful embedd…

    Submitted 6 July, 2022; originally announced July 2022.

    Comments: To be published in the Proceedings of the 39th International Conference on Machine Learning (ICML 2022)

  23. arXiv:2207.00556

    cs.LG physics.flu-dyn

    Learning to correct spectral methods for simulating turbulent flows

    Authors: Gideon Dresdner, Dmitrii Kochkov, Peter Norgaard, Leonardo Zepeda-Núñez, Jamie A. Smith, Michael P. Brenner, Stephan Hoyer

    Abstract: Despite their ubiquity throughout science and engineering, only a handful of partial differential equations (PDEs) have analytical, or closed-form solutions. This motivates a vast amount of classical work on numerical simulation of PDEs and more recently, a whirlwind of research into data-driven techniques leveraging machine learning (ML). A recent line of work indicates that a hybrid of classical…

    Submitted 25 June, 2023; v1 submitted 1 July, 2022; originally announced July 2022.

  24. arXiv:2205.03767

    cs.CL

    Context-Aware Abbreviation Expansion Using Large Language Models

    Authors: Shanqing Cai, Subhashini Venugopalan, Katrin Tomanek, Ajit Narayanan, Meredith Ringel Morris, Michael P. Brenner

    Abstract: Motivated by the need for accelerating text entry in augmentative and alternative communication (AAC) for people with severe motor impairments, we propose a paradigm in which phrases are abbreviated aggressively as primarily word-initial letters. Our approach is to expand the abbreviations into full-phrase options by leveraging conversation context with the power of pretrained large language model…

    Submitted 10 May, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: 15 pages, 7 figures, 8 tables. Accepted as a long paper at NAACL 2022

  25. arXiv:2107.11468

    cs.LG cs.CV eess.IV

    Using a Cross-Task Grid of Linear Probes to Interpret CNN Model Predictions On Retinal Images

    Authors: Katy Blumer, Subhashini Venugopalan, Michael P. Brenner, Jon Kleinberg

    Abstract: We analyze a dataset of retinal images using linear probes: linear regression models trained on some "target" task, using embeddings from a deep convolutional (CNN) model trained on some "source" task as input. We use this method across all possible pairings of 93 tasks in the UK Biobank dataset of retinal images, leading to ~164k different models. We analyze the performance of these linear probes…

    Submitted 23 July, 2021; originally announced July 2021.

    Comments: Extended abstract at Interpretable Machine Learning in Healthcare (IMLH) workshop at ICML 2021

  26. arXiv:2107.03985

    eess.AS cs.LG cs.SD

    Comparing Supervised Models And Learned Speech Representations For Classifying Intelligibility Of Disordered Speech On Selected Phrases

    Authors: Subhashini Venugopalan, Joel Shor, Manoj Plakal, Jimmy Tobin, Katrin Tomanek, Jordan R. Green, Michael P. Brenner

    Abstract: Automatic classification of disordered speech can provide an objective tool for identifying the presence and severity of speech impairment. Classification approaches can also help identify hard-to-recognize speech samples to teach ASR systems about the variable manifestations of impaired speech. Here, we develop and compare different deep learning techniques to classify the intelligibility of diso…

    Submitted 8 July, 2021; originally announced July 2021.

    Comments: Accepted at INTERSPEECH 2021

  27. arXiv:2102.11192

    cs.LG physics.ao-ph

    Variational Data Assimilation with a Learned Inverse Observation Operator

    Authors: Thomas Frerix, Dmitrii Kochkov, Jamie A. Smith, Daniel Cremers, Michael P. Brenner, Stephan Hoyer

    Abstract: Variational data assimilation optimizes for an initial state of a dynamical system such that its evolution fits observational data. The physical model can subsequently be evolved into the future to make predictions. This principle is a cornerstone of large scale forecasting applications such as numerical weather prediction. As such, it is implemented in current operational systems of weather forec…

    Submitted 20 May, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

    Comments: Published at the International Conference on Machine Learning (ICML) 2021

  28. arXiv:2102.01010

    physics.flu-dyn cs.LG

    Machine learning accelerated computational fluid dynamics

    Authors: Dmitrii Kochkov, Jamie A. Smith, Ayya Alieva, Qing Wang, Michael P. Brenner, Stephan Hoyer

    Abstract: Numerical simulation of fluids plays an essential role in modeling many physical phenomena, such as weather, climate, aerodynamics and plasma physics. Fluids are well described by the Navier-Stokes equations, but solving these equations at scale remains daunting, limited by the computational cost of resolving the smallest spatiotemporal features. This leads to unfavorable trade-offs between accura…

    Submitted 28 January, 2021; originally announced February 2021.

    Comments: 13 pages, 9 figures

  29. arXiv:2007.05500

    cs.CV cs.LG eess.IV

    Scientific Discovery by Generating Counterfactuals using Image Translation

    Authors: Arunachalam Narayanaswamy, Subhashini Venugopalan, Dale R. Webster, Lily Peng, Greg Corrado, Paisan Ruamviboonsuk, Pinal Bavishi, Rory Sayres, Abigail Huang, Siva Balasubramanian, Michael Brenner, Philip Nelson, Avinash V. Varadarajan

    Abstract: Model explanation techniques play a critical role in understanding the source of a model's performance and making its decisions transparent. Here we investigate if explanation techniques can also be used as a mechanism for scientific discovery. We make three contributions: first, we propose a framework to convert predictions from explanation techniques to a mechanism of discovery. Second, we show…

    Submitted 19 July, 2020; v1 submitted 10 July, 2020; originally announced July 2020.

    Comments: Accepted at MICCAI 2020. This version combines camera-ready and supplement

    Journal ref: MICCAI 2020

  30. arXiv:1912.07661

    cs.LG eess.IV q-bio.QM stat.ML

    It's easy to fool yourself: Case studies on identifying bias and confounding in bio-medical datasets

    Authors: Subhashini Venugopalan, Arunachalam Narayanaswamy, Samuel Yang, Anton Geraschenko, Scott Lipnick, Nina Makhortova, James Hawrot, Christine Marques, Joao Pereira, Michael Brenner, Lee Rubin, Brian Wainger, Marc Berndl

    Abstract: Confounding variables are a well known source of nuisance in biomedical studies. They present an even greater challenge when we combine them with black-box machine learning techniques that operate on raw data. This work presents two case studies. In one, we discovered biases arising from systematic errors in the data generation process. In the other, we found a spurious source of signal unrelated…

    Submitted 6 April, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

    Comments: Accepted at NeurIPS 2019 LMRL workshop -- extended abstract track

  31. arXiv:1907.13511

    cs.CL cs.LG cs.SD eess.AS

    Personalizing ASR for Dysarthric and Accented Speech with Limited Data

    Authors: Joel Shor, Dotan Emanuel, Oran Lang, Omry Tuval, Michael Brenner, Julie Cattiau, Fernando Vieira, Maeve McNally, Taylor Charbonneau, Melissa Nollstadt, Avinatan Hassidim, Yossi Matias

    Abstract: Automatic speech recognition (ASR) systems have dramatically improved over the last few years. ASR systems are most often trained from 'typical' speech, which means that underrepresented groups don't experience the same level of improvement. In this paper, we present and evaluate finetuning techniques to improve ASR for users with non-standard speech. We focus on two types of non-standard speech:…

    Submitted 31 July, 2019; originally announced July 2019.

    Comments: 5 pages

  32. Using Attribution to Decode Dataset Bias in Neural Network Models for Chemistry

    Authors: Kevin McCloskey, Ankur Taly, Federico Monti, Michael P. Brenner, Lucy Colwell

    Abstract: Deep neural networks have achieved state of the art accuracy at classifying molecules with respect to whether they bind to specific protein targets. A key breakthrough would occur if these models could reveal the fragment pharmacophores that are causally involved in binding. Extracting chemical details of binding from the networks could potentially lead to scientific discoveries about the mechanis…

    Submitted 19 May, 2019; v1 submitted 27 November, 2018; originally announced November 2018.