Showing 1–46 of 46 results for author: Brown, B

Searching in archive cs.
  1. arXiv:2504.03583  [pdf, other]

    cs.LG eess.SP

    Scalable Hypergraph Structure Learning with Diverse Smoothness Priors

    Authors: Benjamin T. Brown, Haoxiang Zhang, Daniel L. Lau, Gonzalo R. Arce

    Abstract: In graph signal processing, learning the weighted connections between nodes from a set of sample signals is a fundamental task when the underlying relationships are not known a priori. This task is typically addressed by finding a graph Laplacian on which the observed signals are smooth. With the extension of graphs to hypergraphs - where edges can connect more than two nodes - graph learning meth…

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: 13 pages, 6 figures, submitted to IEEE for possible publication

  2. arXiv:2503.10676  [pdf, other]

    cs.CL cs.AI cs.LG

    Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data

    Authors: Swati Rallapalli, Shannon Gallagher, Andrew O. Mellinger, Jasmine Ratchford, Anusha Sinha, Tyler Brooks, William R. Nichols, Nick Winski, Bryan Brown

    Abstract: We study the efficacy of fine-tuning Large Language Models (LLMs) for the specific task of report (government archives, news, intelligence reports) summarization. While this topic is being very actively researched - our specific application set-up faces two challenges: (i) ground-truth summaries may be unavailable (e.g., for government archives), and (ii) availability of limited compute power - the…

    Submitted 10 March, 2025; originally announced March 2025.

  3. arXiv:2501.14723  [pdf, other]

    cs.LG

    CodeMonkeys: Scaling Test-Time Compute for Software Engineering

    Authors: Ryan Ehrlich, Bradley Brown, Jordan Juravsky, Ronald Clark, Christopher Ré, Azalia Mirhoseini

    Abstract: Scaling test-time compute is a promising axis for improving LLM capabilities. However, test-time compute can be scaled in a variety of ways, and effectively combining different approaches remains an active area of research. Here, we explore this problem in the context of solving real-world GitHub issues from the SWE-bench dataset. Our system, named CodeMonkeys, allows models to iteratively edit a…

    Submitted 3 February, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

  4. arXiv:2412.10599  [pdf, other]

    cs.RO cs.AI

    Advances in Transformers for Robotic Applications: A Review

    Authors: Nikunj Sanghai, Nik Bear Brown

    Abstract: The introduction of the Transformer architecture has brought about significant breakthroughs in Deep Learning (DL), particularly within Natural Language Processing (NLP). Since their inception, Transformers have outperformed many traditional neural network architectures due to their "self-attention" mechanism and their scalability across various applications. In this paper, we cover the use of Transf…

    Submitted 13 December, 2024; originally announced December 2024.

    Comments: Early preprint, focusing primarily on general purpose robots, more updates to come

  5. arXiv:2410.03423  [pdf, other]

    eess.SP cs.LG

    Aircraft Radar Altimeter Interference Mitigation Through a CNN-Layer Only Denoising Autoencoder Architecture

    Authors: Samuel B. Brown, Stephen Young, Adam Wagenknecht, Daniel Jakubisin, Charles E. Thornton, Aaron Orndorff, William C. Headley

    Abstract: Denoising autoencoders for signal processing applications have been shown to experience significant difficulty in learning to reconstruct radio frequency communication signals, particularly in the large sample regime. In communication systems, this challenge is primarily due to the need to reconstruct the modulated data stream which is generally highly stochastic in nature. In this work, we take a…

    Submitted 4 October, 2024; originally announced October 2024.

    Comments: To be presented at MILCOM 2024, Washington DC

  6. arXiv:2407.21787  [pdf, other]

    cs.LG cs.AI

    Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

    Authors: Bradley Brown, Jordan Juravsky, Ryan Ehrlich, Ronald Clark, Quoc V. Le, Christopher Ré, Azalia Mirhoseini

    Abstract: Scaling the amount of compute used to train language models has dramatically improved their capabilities. However, when it comes to inference, we often limit models to making only one attempt at a problem. Here, we explore inference compute as another axis for scaling, using the simple technique of repeatedly sampling candidate solutions from a model. Across multiple tasks and models, we observe t…

    Submitted 30 December, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

  7. arXiv:2407.12687  [pdf, other]

    cs.CY cs.AI cs.LG

    Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach

    Authors: Irina Jurenka, Markus Kunesch, Kevin R. McKee, Daniel Gillick, Shaojian Zhu, Sara Wiltberger, Shubham Milind Phal, Katherine Hermann, Daniel Kasenberg, Avishkar Bhoopchand, Ankit Anand, Miruna Pîslar, Stephanie Chan, Lisa Wang, Jennifer She, Parsa Mahmoudieh, Aliya Rysbek, Wei-Jen Ko, Andrea Huber, Brett Wiltshire, Gal Elidan, Roni Rabin, Jasmin Rubinovitz, Amit Pitaru, Mac McAllister, et al. (49 additional authors not shown)

    Abstract: A major challenge facing the world is the provision of equitable and universal access to quality education. Recent advances in generative AI (gen AI) have created excitement about the potential of new technologies to offer a personal tutor for every learner and a teaching assistant for every teacher. The full extent of this dream, however, has not yet materialised. We argue that this is primarily…

    Submitted 19 July, 2024; v1 submitted 21 May, 2024; originally announced July 2024.

  8. arXiv:2406.01943  [pdf, ps, other]

    cs.CL cs.AI

    Enhancing Trust in LLMs: Algorithms for Comparing and Interpreting LLMs

    Authors: Nik Bear Brown

    Abstract: This paper surveys evaluation techniques to enhance the trustworthiness and understanding of Large Language Models (LLMs). As reliance on LLMs grows, ensuring their reliability, fairness, and transparency is crucial. We explore algorithmic methods and metrics to assess LLM performance, identify weaknesses, and guide development towards more trustworthy applications. Key evaluation metrics include…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: An extensive survey of the literature specifying algorithms and techniques enhancing the trustworthiness and understanding of Large Language Models (LLMs)

    MSC Class: 2020: 68T50; 68Q25 ACM Class: I.2.7; F.2.2

  9. arXiv:2403.04087  [pdf, other]

    cs.AI

    The Cognitive Type Project -- Mapping Typography to Cognition

    Authors: Nik Bear Brown

    Abstract: The Cognitive Type Project is focused on developing computational tools to enable the design of typefaces with varying cognitive properties. This initiative aims to empower typographers to craft fonts that enhance click-through rates for online ads, improve reading levels in children's books, enable dyslexics to create personalized type, or provide insights into customer reactions to textual conte…

    Submitted 6 March, 2024; originally announced March 2024.

  10. arXiv:2402.05099  [pdf, other]

    cs.LG

    Hydragen: High-Throughput LLM Inference with Shared Prefixes

    Authors: Jordan Juravsky, Bradley Brown, Ryan Ehrlich, Daniel Y. Fu, Christopher Ré, Azalia Mirhoseini

    Abstract: Transformer-based large language models (LLMs) are now deployed to hundreds of millions of users. LLM inference is commonly performed on batches of sequences that share a prefix, such as few-shot examples or a chatbot system prompt. Decoding in this large-batch setting can be bottlenecked by the attention operation, which reads large key-value (KV) caches from memory and computes inefficient matri…

    Submitted 13 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  11. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1326 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 9 May, 2025; v1 submitted 18 December, 2023; originally announced December 2023.

  12. arXiv:2310.16982  [pdf, other]

    quant-ph cond-mat.str-el cs.IT hep-th math.GT

    Non-Clifford and parallelizable fault-tolerant logical gates on constant and almost-constant rate homological quantum LDPC codes via higher symmetries

    Authors: Guanyu Zhu, Shehryar Sikander, Elia Portnoy, Andrew W. Cross, Benjamin J. Brown

    Abstract: We study parallel fault-tolerant quantum computing for families of homological quantum low-density parity-check (LDPC) codes defined on 3-manifolds with constant or almost-constant encoding rate. We derive a generic formula for a transversal $T$ gate of color codes on general 3-manifolds, which acts as collective non-Clifford logical CCZ gates on any triplet of logical qubits with their logical-$X$…

    Submitted 23 October, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

    Comments: 40 pages, 32 figures. In the updated version v2, we have simplified the TQFT derivation of the logical gates via an operator-valued cochain formalism in Sec. III, which also gives rise to the explicit construction of constant-depth circuits corresponding to logical CCZ and CZ gates in three copies of identical toric codes defined on arbitrary 3-manifolds

  13. arXiv:2306.08848  [pdf, other]

    cs.LG cs.CY cs.HC

    Datasheets for Machine Learning Sensors: Towards Transparency, Auditability, and Responsibility for Intelligent Sensing

    Authors: Matthew Stewart, Pete Warden, Yasmine Omri, Shvetank Prakash, Joao Santos, Shawn Hymel, Benjamin Brown, Jim MacArthur, Nat Jeffries, Sachin Katti, Brian Plancher, Vijay Janapa Reddi

    Abstract: Machine learning (ML) sensors are enabling intelligence at the edge by empowering end-users with greater control over their data. ML sensors offer a new paradigm for sensing that moves the processing and analysis to the device itself rather than relying on the cloud, bringing benefits like lower latency and greater data privacy. The rise of these intelligent edge devices, while revolutionizing are…

    Submitted 16 February, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

  14. arXiv:2305.12571  [pdf, other]

    cs.LG cs.AI cs.SE

    Reproducibility Requires Consolidated Artifacts

    Authors: Iordanis Fostiropoulos, Bowman Brown, Laurent Itti

    Abstract: Machine learning is facing a 'reproducibility crisis' where a significant number of works report failures when attempting to reproduce previously published results. We evaluate the sources of reproducibility failures using a meta-analysis of 142 replication studies from ReScience C and 204 code repositories. We find that missing experiment details such as hyperparameters are potential causes of un…

    Submitted 21 May, 2023; originally announced May 2023.

  15. Low-field magnetic resonance image enhancement via stochastic image quality transfer

    Authors: Hongxiang Lin, Matteo Figini, Felice D'Arco, Godwin Ogbole, Ryutaro Tanno, Stefano B. Blumberg, Lisa Ronan, Biobele J. Brown, David W. Carmichael, Ikeoluwa Lagunju, Judith Helen Cross, Delmiro Fernandez-Reyes, Daniel C. Alexander

    Abstract: Low-field (<1T) magnetic resonance imaging (MRI) scanners remain in widespread use in low- and middle-income countries (LMICs) and are commonly used for some applications in higher income countries e.g. for small child patients with obesity, claustrophobia, implants, or tattoos. However, low-field MR images commonly have lower resolution and poorer contrast than images from high field (1.5T, 3T, a…

    Submitted 26 April, 2023; originally announced April 2023.

    Comments: Accepted in Medical Image Analysis

  16. arXiv:2304.11681  [pdf, other]

    cs.CR

    Money Over Morals: A Business Analysis of Conti Ransomware

    Authors: Ian W. Gray, Jack Cable, Benjamin Brown, Vlad Cuiujuclu, Damon McCoy

    Abstract: Ransomware operations have evolved from relatively unsophisticated threat actors into highly coordinated cybercrime syndicates that regularly extort millions of dollars in a single attack. Despite dominating headlines and crippling businesses across the globe, there is relatively little in-depth research into the modern structure and economics of ransomware operations. In this paper, we leverage…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: To be published in 2022 APWG Symposium on Electronic Crime Research (eCrime)

  17. arXiv:2304.09787  [pdf, other]

    cs.CV

    NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models

    Authors: Seung Wook Kim, Bradley Brown, Kangxue Yin, Karsten Kreis, Katja Schwarz, Daiqing Li, Robin Rombach, Antonio Torralba, Sanja Fidler

    Abstract: Automatically generating high-quality real world 3D scenes is of enormous interest for applications such as virtual reality and robotics simulation. Towards this goal, we introduce NeuralField-LDM, a generative model capable of synthesizing complex 3D environments. We leverage Latent Diffusion Models that have been successfully utilized for efficient high-quality 2D content creation. We first trai…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  18. arXiv:2211.13239  [pdf, other]

    cs.LG cs.AI

    Relating Regularization and Generalization through the Intrinsic Dimension of Activations

    Authors: Bradley C. A. Brown, Jordan Juravsky, Anthony L. Caterini, Gabriel Loaiza-Ganem

    Abstract: Given a pair of models with similar training set performance, it is natural to assume that the model that possesses simpler internal representations would exhibit better generalization. In this work, we provide empirical evidence for this intuition through an analysis of the intrinsic dimension (ID) of model activations, which can be thought of as the minimal number of factors of variation in the…

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022 OPT and HITY workshops

  19. arXiv:2209.02847   

    cs.CV cs.LG

    DC-Art-GAN: Stable Procedural Content Generation using DC-GANs for Digital Art

    Authors: Rohit Gandikota, Nik Bear Brown

    Abstract: Art is an artistic method of using digital technologies as a part of the generative or creative process. With the advent of digital currency and NFTs (Non-Fungible Token), the demand for digital art is growing aggressively. In this manuscript, we advocate the concept of using deep generative networks with adversarial training for a stable and variant art generation. The work mainly focuses on usin…

    Submitted 13 March, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

    Comments: The project was done as an undergraduate report. In hindsight, it does not contain a full and exhaustive analysis

  20. arXiv:2207.02862  [pdf, other]

    stat.ML cs.AI cs.LG

    Verifying the Union of Manifolds Hypothesis for Image Data

    Authors: Bradley C. A. Brown, Anthony L. Caterini, Brendan Leigh Ross, Jesse C. Cresswell, Gabriel Loaiza-Ganem

    Abstract: Deep learning has had tremendous success at learning low-dimensional representations of high-dimensional data. This success would be impossible if there was no hidden low-dimensional structure in data of interest; this existence is posited by the manifold hypothesis, which states that the data lies on an unknown manifold of low intrinsic dimension. In this paper, we argue that this hypothesis does…

    Submitted 2 March, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

    Comments: ICLR 2023

  21. arXiv:2204.01108  [pdf]

    cs.CV

    Adjusting for Bias with Procedural Data

    Authors: Shesh Narayan Gupta, Nicholas Bear Brown

    Abstract: 3D software is now capable of producing highly realistic images that look nearly indistinguishable from real images. This raises the question: can real datasets be enhanced with 3D rendered data? We investigate this question. In this paper we demonstrate the use of 3D rendered data, procedural data, for the adjustment of bias in image datasets. We perform error analysis of images of animals…

    Submitted 4 April, 2022; v1 submitted 3 April, 2022; originally announced April 2022.

    Comments: 11 pages, 9 figures, 4 tables, presented in RISE 2022 Northeastern University

  22. arXiv:2203.10626  [pdf]

    cs.LG cs.CV

    Automated Detection of Acute Promyelocytic Leukemia in Blood Films and Bone Marrow Aspirates with Annotation-free Deep Learning

    Authors: Petru Manescu, Priya Narayanan, Christopher Bendkowski, Muna Elmi, Remy Claveau, Vijay Pawar, Biobele J. Brown, Mike Shaw, Anupama Rao, Delmiro Fernandez-Reyes

    Abstract: While optical microscopy inspection of blood films and bone marrow aspirates by a hematologist is a crucial step in establishing diagnosis of acute leukemia, especially in low-resource settings where other diagnostic modalities might not be available, the task remains time-consuming and prone to human inconsistencies. This has an impact especially in cases of Acute Promyelocytic Leukemia (APL) tha…

    Submitted 20 March, 2022; originally announced March 2022.

    Comments: 13 pages, 2 tables, 5 figures

    MSC Class: 68T07

  23. arXiv:2111.13786  [pdf, other]

    cs.LG cs.AI

    Learning from learning machines: a new generation of AI technology to meet the needs of science

    Authors: Luca Pion-Tonachini, Kristofer Bouchard, Hector Garcia Martin, Sean Peisert, W. Bradley Holtz, Anil Aswani, Dipankar Dwivedi, Haruko Wainwright, Ghanshyam Pilania, Benjamin Nachman, Babetta L. Marrone, Nicola Falco, Prabhat, Daniel Arnold, Alejandro Wolf-Yadlin, Sarah Powers, Sharlee Climer, Quinn Jackson, Ty Carlson, Michael Sohn, Petrus Zwart, Neeraj Kumar, Amy Justice, Claire Tomlin, Daniel Jacobson, et al. (11 additional authors not shown)

    Abstract: We outline emerging opportunities and challenges to enhance the utility of AI for scientific discovery. The distinct goals of AI for industry versus the goals of AI for science create tension between identifying patterns in data versus discovering patterns in the world from data. If we address the fundamental challenges associated with "bridging the gap" between domain-driven scientific models and…

    Submitted 26 November, 2021; originally announced November 2021.

  24. arXiv:2109.13488  [pdf, other]

    cs.CV

    Towards Rotation Invariance in Object Detection

    Authors: Agastya Kalra, Guy Stoppi, Bradley Brown, Rishav Agarwal, Achuta Kadambi

    Abstract: Rotation augmentations generally improve a model's invariance/equivariance to rotation - except in object detection. In object detection the shape is not known, therefore rotation creates a label ambiguity. We show that the de-facto method for bounding box label rotation, the Largest Box Method, creates very large labels, leading to poor performance and in many cases worse performance than using n…

    Submitted 30 September, 2021; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: Accepted ICCV 2021

  25. arXiv:2106.04008  [pdf, other]

    cs.LG

    Widening Access to Applied Machine Learning with TinyML

    Authors: Vijay Janapa Reddi, Brian Plancher, Susan Kennedy, Laurence Moroney, Pete Warden, Anant Agarwal, Colby Banbury, Massimo Banzi, Matthew Bennett, Benjamin Brown, Sharad Chitlangia, Radhika Ghosal, Sarah Grafman, Rupert Jaeger, Srivatsan Krishnan, Maximilian Lam, Daniel Leiker, Cara Mann, Mark Mazumder, Dominic Pajak, Dhilan Ramaprasad, J. Evan Smith, Matthew Stewart, Dustin Tingley

    Abstract: Broadening access to both computational and educational resources is critical to diffusing machine-learning (ML) innovation. However, today, most ML resources and experts are siloed in a few countries and organizations. In this paper, we describe our pedagogical approach to increasing access to applied ML through a massive open online course (MOOC) on Tiny Machine Learning (TinyML). We suggest tha…

    Submitted 9 June, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: Understanding the underpinnings of the TinyML edX course series: https://www.edx.org/professional-certificate/harvardx-tiny-machine-learning

  26. arXiv:2106.02190  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery

    Authors: Yulun Wu, Mikaela Cashman, Nicholas Choma, Érica T. Prates, Verónica G. Melesse Vergara, Manesh Shah, Andrew Chen, Austin Clyde, Thomas S. Brettin, Wibe A. de Jong, Neeraj Kumar, Martha S. Head, Rick L. Stevens, Peter Nugent, Daniel A. Jacobson, James B. Brown

    Abstract: We developed Distilled Graph Attention Policy Network (DGAPN), a reinforcement learning model to generate novel graph-structured chemical representations that optimize user-defined objectives by efficiently navigating a physically constrained domain. The framework is examined on the task of generating molecules that are designed to bind, noncovalently, to functional sites of SARS-CoV-2 proteins. W…

    Submitted 11 May, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

  27. arXiv:2010.14701  [pdf, other]

    cs.LG cs.CL cs.CV

    Scaling Laws for Autoregressive Generative Modeling

    Authors: Tom Henighan, Jared Kaplan, Mor Katz, Mark Chen, Christopher Hesse, Jacob Jackson, Heewoo Jun, Tom B. Brown, Prafulla Dhariwal, Scott Gray, Chris Hallacy, Benjamin Mann, Alec Radford, Aditya Ramesh, Nick Ryder, Daniel M. Ziegler, John Schulman, Dario Amodei, Sam McCandlish

    Abstract: We identify empirical scaling laws for the cross-entropy loss in four domains: generative image modeling, video modeling, multimodal image$\leftrightarrow$text models, and mathematical problem solving. In all cases autoregressive Transformers smoothly improve in performance as model size and compute budgets increase, following a power-law plus constant scaling law. The optimal model size also depe…

    Submitted 5 November, 2020; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: 20+17 pages, 33 figures; added appendix with additional language results

  28. arXiv:2006.13188  [pdf, other]

    cs.CV cs.GR

    Efficient Spatially Adaptive Convolution and Correlation

    Authors: Thomas W. Mitchel, Benedict Brown, David Koller, Tim Weyrich, Szymon Rusinkiewicz, Michael Kazhdan

    Abstract: Fast methods for convolution and correlation underlie a variety of applications in computer vision and graphics, including efficient filtering, analysis, and simulation. However, standard convolution and correlation are inherently limited to fixed filters: spatial adaptation is impossible without sacrificing efficient computation. In early work, Freeman and Adelson have shown how steerable filters…

    Submitted 28 July, 2020; v1 submitted 23 June, 2020; originally announced June 2020.

  29. arXiv:2005.14165  [pdf, other]

    cs.CL

    Language Models are Few-Shot Learners

    Authors: Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, et al. (6 additional authors not shown)

    Abstract: Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few…

    Submitted 22 July, 2020; v1 submitted 28 May, 2020; originally announced May 2020.

    Comments: 40+32 pages

  30. arXiv:2005.04305  [pdf]

    cs.LG cs.CV stat.ML

    Measuring the Algorithmic Efficiency of Neural Networks

    Authors: Danny Hernandez, Tom B. Brown

    Abstract: Three factors drive the advance of AI: algorithmic innovation, data, and the amount of compute available for training. Algorithmic progress has traditionally been more difficult to quantify than compute and data. In this work, we argue that algorithmic progress has an aspect that is both straightforward to measure and interesting: reductions over time in the compute needed to reach past capabiliti…

    Submitted 8 May, 2020; originally announced May 2020.

    Comments: 20 pages, 5 figures

  31. arXiv:2003.07216  [pdf, other]

    eess.IV cs.CV physics.med-ph

    Image Quality Transfer Enhances Contrast and Resolution of Low-Field Brain MRI in African Paediatric Epilepsy Patients

    Authors: Matteo Figini, Hongxiang Lin, Godwin Ogbole, Felice D Arco, Stefano B. Blumberg, David W. Carmichael, Ryutaro Tanno, Enrico Kaden, Biobele J. Brown, Ikeoluwa Lagunju, Helen J. Cross, Delmiro Fernandez-Reyes, Daniel C. Alexander

    Abstract: 1.5T or 3T scanners are the current standard for clinical MRI, but low-field (<1T) scanners are still common in many lower- and middle-income countries for reasons of cost and robustness to power failures. Compared to modern high-field scanners, low-field scanners provide images with lower signal-to-noise ratio at equivalent resolution, leaving practitioners to compensate by using large slice thic…

    Submitted 18 March, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

    Comments: 6 pages, 3 figures, accepted at ICLR 2020 workshop on Artificial Intelligence for Affordable Healthcare

  32. arXiv:2001.08361  [pdf, other]

    cs.LG stat.ML

    Scaling Laws for Neural Language Models

    Authors: Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei

    Abstract: We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude. Other architectural details such as network width or depth have minimal effects within a wide range. Simple equations govern the dependence…

    Submitted 22 January, 2020; originally announced January 2020.

    Comments: 19 pages, 15 figures

  33. arXiv:1909.08593  [pdf, other]

    cs.CL cs.LG stat.ML

    Fine-Tuning Language Models from Human Preferences

    Authors: Daniel M. Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, Geoffrey Irving

    Abstract: Reward learning enables the application of reinforcement learning (RL) to tasks where reward is defined by human judgment, building a model of reward by asking humans questions. Most work on reward learning has used simulated environments, but complex information about values is often expressed in natural language, and we believe reward learning for language is a key to making RL practical and saf…

    Submitted 8 January, 2020; v1 submitted 18 September, 2019; originally announced September 2019.

  34. arXiv:1909.07947  [pdf, other]

    stat.ML cs.LG

    Sparse Canonical Correlation Analysis via Concave Minimization

    Authors: Omid S. Solari, James B. Brown, Peter J. Bickel

    Abstract: A new approach to the sparse Canonical Correlation Analysis (sCCA) is proposed with the aim of discovering interpretable associations in very high-dimensional multi-view, i.e. observations of multiple sets of variables on the same subjects, problems. Inspired by the sparse PCA approach of Journee et al. (2010), we also show that the sparse CCA formulation, while non-convex, is equivalent to a maximi…

    Submitted 17 September, 2019; originally announced September 2019.

    Comments: 45 Pages

  35. arXiv:1909.07944  [pdf, other]

    stat.ML cs.LG

    BLOCCS: Block Sparse Canonical Correlation Analysis With Application To Interpretable Omics Integration

    Authors: Omid Shams Solari, Rojin Safavi, James B. Brown

    Abstract: We introduce Block Sparse Canonical Correlation Analysis which estimates multiple pairs of canonical directions (together a "block") at once, resulting in significantly improved orthogonality of the sparse directions which, we demonstrate, translates to more interpretable solutions. Our approach builds on the sparse CCA method of (Solari, Brown, and Bickel 2019) in that we also express the bi-conv…

    Submitted 20 January, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

    Comments: 8 pages

  36. arXiv:1909.06763  [pdf, other]

    eess.IV cs.CV

    Deep Learning for Low-Field to High-Field MR: Image Quality Transfer with Probabilistic Decimation Simulator

    Authors: Hongxiang Lin, Matteo Figini, Ryutaro Tanno, Stefano B. Blumberg, Enrico Kaden, Godwin Ogbole, Biobele J. Brown, Felice D'Arco, David W. Carmichael, Ikeoluwa Lagunju, Helen J. Cross, Delmiro Fernandez-Reyes, Daniel C. Alexander

    Abstract: MR images scanned at low magnetic field ($<1$T) have lower resolution in the slice direction and lower contrast, due to a relatively small signal-to-noise ratio (SNR) than those from high field (typically 1.5T and 3T). We adapt the recent idea of Image Quality Transfer (IQT) to enhance very low-field structural images aiming to estimate the resolution, spatial coverage, and contrast of high-field…

    Submitted 15 September, 2019; originally announced September 2019.

  37. arXiv:1906.07502  [pdf]

    cs.LG stat.AP stat.ML

    Data-Driven Malaria Prevalence Prediction in Large Densely-Populated Urban Holoendemic sub-Saharan West Africa: Harnessing Machine Learning Approaches and 22-years of Prospectively Collected Data

    Authors: Biobele J. Brown, Alexander A. Przybylski, Petru Manescu, Fabio Caccioli, Gbeminiyi Oyinloye, Muna Elmi, Michael J. Shaw, Vijay Pawar, Remy Claveau, John Shawe-Taylor, Mandayam A. Srinivasan, Nathaniel K. Afolabi, Adebola E. Orimadegun, Wasiu A. Ajetunmobi, Francis Akinkunmi, Olayinka Kowobari, Kikelomo Osinusi, Felix O. Akinbami, Samuel Omokhodion, Wuraola A. Shokunbi, Ikeoluwa Lagunju, Olugbemiro Sodeinde, Delmiro Fernandez-Reyes

    Abstract: Plasmodium falciparum malaria still poses one of the greatest threats to human life with over 200 million cases globally leading to half-million deaths annually. Of these, 90% of cases and of the mortality occurs in sub-Saharan Africa, mostly among children. Although malaria prediction systems are central to the 2016-2030 malaria Global Technical Strategy, currently these are inadequate at capturi…

    Submitted 18 June, 2019; originally announced June 2019.

    Comments: 40 pages, 10 figures

    ACM Class: J.3; I.5.4

  38. arXiv:1906.07496  [pdf

    eess.IV cs.CV

    Deep Learning Enhanced Extended Depth-of-Field for Thick Blood-Film Malaria High-Throughput Microscopy

    Authors: Petru Manescu, Lydia Neary-Zajiczek, Michael J. Shaw, Muna Elmi, Remy Claveau, Vijay Pawar, John Shawe-Taylor, Iasonas Kokkinos, Mandayam A. Srinivasan, Ikeoluwa Lagunju, Olugbemiro Sodeinde, Biobele J. Brown, Delmiro Fernandez-Reyes

    Abstract: Fast accurate diagnosis of malaria is still a global health challenge for which automated digital-pathology approaches could provide scalable solutions amenable to be deployed in low-to-middle income countries. Here we address the problem of Extended Depth-of-Field (EDoF) in thick blood film microscopy for rapid automated malaria diagnosis. High magnification oil-objectives (100x) with large numer…

    Submitted 18 June, 2019; originally announced June 2019.

    Comments: 10 pages, 4 figures

  39. arXiv:1810.07287  [pdf, other

    stat.ML cs.LG

    Signed iterative random forests to identify enhancer-associated transcription factor binding

    Authors: Karl Kumbier, Sumanta Basu, Erwin Frise, Susan E. Celniker, James B. Brown, Susan Celniker, Bin Yu

    Abstract: Standard ChIP-seq peak calling pipelines seek to differentiate biochemically reproducible signals of individual genomic elements from background noise. However, reproducibility alone does not imply functional regulation (e.g., enhancer activation, alternative splicing). Here we present a general-purpose, interpretable machine learning method: signed iterative random forests (siRF), which we use to…

    Submitted 12 July, 2023; v1 submitted 16 October, 2018; originally announced October 2018.

  40. arXiv:1809.08352  [pdf, other

    stat.ML cs.CV cs.LG

    Unrestricted Adversarial Examples

    Authors: Tom B. Brown, Nicholas Carlini, Chiyuan Zhang, Catherine Olsson, Paul Christiano, Ian Goodfellow

    Abstract: We introduce a two-player contest for evaluating the safety and robustness of machine learning systems, with a large prize pool. Unlike most prior work in ML robustness, which studies norm-constrained adversaries, we shift our focus to unconstrained adversaries. Defenders submit machine learning models, and try to achieve high accuracy and coverage on non-adversarial data while making no confident…

    Submitted 21 September, 2018; originally announced September 2018.

  41. arXiv:1802.08768  [pdf, other

    stat.ML cs.LG

    Is Generator Conditioning Causally Related to GAN Performance?

    Authors: Augustus Odena, Jacob Buckman, Catherine Olsson, Tom B. Brown, Christopher Olah, Colin Raffel, Ian Goodfellow

    Abstract: Recent work (Pennington et al, 2017) suggests that controlling the entire distribution of Jacobian singular values is an important design consideration in deep learning. Motivated by this, we study the distribution of singular values of the Jacobian of the generator in Generative Adversarial Networks (GANs). We find that this Jacobian generally becomes ill-conditioned at the beginning of training.…

    Submitted 18 June, 2018; v1 submitted 23 February, 2018; originally announced February 2018.

  42. arXiv:1712.09665  [pdf, other

    cs.CV

    Adversarial Patch

    Authors: Tom B. Brown, Dandelion Mané, Aurko Roy, Martín Abadi, Justin Gilmer

    Abstract: We present a method to create universal, robust, targeted adversarial image patches in the real world. The patches are universal because they can be used to attack any scene, robust because they work under a wide variety of transformations, and targeted because they can cause a classifier to output any target class. These adversarial patches can be printed, added to any scene, photographed, and pr…

    Submitted 16 May, 2018; v1 submitted 27 December, 2017; originally announced December 2017.

  43. arXiv:1706.03741  [pdf, other

    stat.ML cs.AI cs.HC cs.LG

    Deep reinforcement learning from human preferences

    Authors: Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei

    Abstract: For sophisticated reinforcement learning (RL) systems to interact usefully with real-world environments, we need to communicate complex goals to these systems. In this work, we explore goals defined in terms of (non-expert) human preferences between pairs of trajectory segments. We show that this approach can effectively solve complex RL tasks without access to the reward function, including Atari…

    Submitted 17 February, 2023; v1 submitted 12 June, 2017; originally announced June 2017.

  44. arXiv:1706.02769  [pdf, ps, other

    cs.SE cs.IR cs.PL

    Source Forager: A Search Engine for Similar Source Code

    Authors: Vineeth Kashyap, David Bingham Brown, Ben Liblit, David Melski, Thomas Reps

    Abstract: Developers spend a significant amount of time searching for code: e.g., to understand how to complete, correct, or adapt their own code for a new context. Unfortunately, the state of the art in code search has not evolved much beyond text search over tokenized source. Code has much richer structure and semantics than normal text, and this property can be exploited to specialize the code-search pro…

    Submitted 8 June, 2017; originally announced June 2017.

    Comments: 11 pages

    MSC Class: 68N15

  45. arXiv:1010.5445  [pdf, other

    math.OC cs.CE

    Theory and Applications of Robust Optimization

    Authors: Dimitris Bertsimas, David B. Brown, Constantine Caramanis

    Abstract: In this paper we survey the primary research, both theoretical and applied, in the area of Robust Optimization (RO). Our focus is on the computational attractiveness of RO approaches, as well as the modeling power and broad applicability of the methodology. In addition to surveying prominent theoretical results of RO, we also present some recent results linking RO to adaptable models for multi-sta…

    Submitted 26 October, 2010; originally announced October 2010.

    Comments: 50 pages

    MSC Class: 90C25

  46. Computational Difficulty of Computing the Density of States

    Authors: Brielin Brown, Steven T. Flammia, Norbert Schuch

    Abstract: We study the computational difficulty of computing the ground state degeneracy and the density of states for local Hamiltonians. We show that the difficulty of both problems is exactly captured by a class which we call #BQP, which is the counting version of the quantum complexity class QMA. We show that #BQP is not harder than its classical counting counterpart #P, which in turn implies that compu…

    Submitted 20 July, 2011; v1 submitted 14 October, 2010; originally announced October 2010.

    Comments: v2: Accepted version. 9 pages, 1 figure

    Journal ref: Phys. Rev. Lett. 107, 040501 (2011)