Showing 1–8 of 8 results for author: Aberger, C R

  1. arXiv:2010.06192

    cs.LG stat.ML

    Revisiting BFloat16 Training

    Authors: Pedram Zamirai, Jian Zhang, Christopher R. Aberger, Christopher De Sa

    Abstract: State-of-the-art generic low-precision training algorithms use a mix of 16-bit and 32-bit precision, creating the folklore that 16-bit hardware compute units alone are not enough to maximize model accuracy. As a result, deep learning accelerators are forced to support both 16-bit and 32-bit floating-point units (FPUs), which is more costly than only using 16-bit FPUs for hardware design. We ask: c…

    Submitted 7 March, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

  2. arXiv:2003.04983

    cs.CL cs.LG stat.ML

    Understanding the Downstream Instability of Word Embeddings

    Authors: Megan Leszczynski, Avner May, Jian Zhang, Sen Wu, Christopher R. Aberger, Christopher Ré

    Abstract: Many industrial machine learning (ML) systems require frequent retraining to keep up-to-date with constantly changing data. This retraining exacerbates a large challenge facing ML systems today: model training is unstable, i.e., small changes in training data can cause significant changes in the model's predictions. In this paper, we work on developing a deeper understanding of this instability, w…

    Submitted 28 February, 2020; originally announced March 2020.

    Comments: In Proceedings of the 3rd MLSys Conference, 2020

  3. arXiv:1910.05124

    cs.DC cs.LG stat.ML

    PipeMare: Asynchronous Pipeline Parallel DNN Training

    Authors: Bowen Yang, Jian Zhang, Jonathan Li, Christopher Ré, Christopher R. Aberger, Christopher De Sa

    Abstract: Pipeline parallelism (PP) when training neural networks enables larger models to be partitioned spatially, leading to both lower network communication and overall higher hardware utilization. Unfortunately, to preserve the statistical efficiency of sequential training, existing PP techniques sacrifice hardware efficiency by decreasing pipeline utilization or incurring extra memory costs. In this p…

    Submitted 8 February, 2020; v1 submitted 9 October, 2019; originally announced October 2019.

  4. arXiv:1904.10631

    cs.LG stat.ML

    Low-Memory Neural Network Training: A Technical Report

    Authors: Nimit S. Sohoni, Christopher R. Aberger, Megan Leszczynski, Jian Zhang, Christopher Ré

    Abstract: Memory is increasingly often the bottleneck when training neural network models. Despite this, techniques to lower the overall memory requirements of training have been less widely studied compared to the extensive literature on reducing the memory requirements of inference. In this paper we study a fundamental question: How much memory is actually needed to train a neural network? To answer this…

    Submitted 8 April, 2022; v1 submitted 23 April, 2019; originally announced April 2019.

    Comments: Version notes: Copyedits and citation fixes

  5. arXiv:1803.03383

    cs.LG stat.ML

    High-Accuracy Low-Precision Training

    Authors: Christopher De Sa, Megan Leszczynski, Jian Zhang, Alana Marzoev, Christopher R. Aberger, Kunle Olukotun, Christopher Ré

    Abstract: Low-precision computation is often used to lower the time and energy cost of machine learning, and recently hardware accelerators have been developed to support it. Still, it has been used primarily for inference - not training. Previous low-precision training algorithms suffered from a fundamental tradeoff: as the number of bits of precision is lowered, quantization noise is added to the model, w…

    Submitted 8 March, 2018; originally announced March 2018.

  6. arXiv:1708.07859

    cs.DB

    LevelHeaded: Making Worst-Case Optimal Joins Work in the Common Case

    Authors: Christopher R. Aberger, Andrew Lamb, Kunle Olukotun, Christopher Ré

    Abstract: Pipelines combining SQL-style business intelligence (BI) queries and linear algebra (LA) are becoming increasingly common in industry. As a result, there is a growing need to unify these workloads in a single framework. Unfortunately, existing solutions either sacrifice the inherent benefits of exclusively using a relational database (e.g. logical and physical independence) or incur orders of magn…

    Submitted 25 August, 2017; originally announced August 2017.

  7. arXiv:1602.03557

    cs.DB

    Old Techniques for New Join Algorithms: A Case Study in RDF Processing

    Authors: Christopher R. Aberger, Susan Tu, Kunle Olukotun, Christopher Ré

    Abstract: Recently there has been significant interest around designing specialized RDF engines, as traditional query processing mechanisms incur orders of magnitude performance gaps on many RDF workloads. At the same time researchers have released new worst-case optimal join algorithms which can be asymptotically better than the join algorithms in traditional engines. In this paper we apply worst-case opti…

    Submitted 10 February, 2016; originally announced February 2016.

  8. arXiv:1503.02368

    cs.DB

    EmptyHeaded: A Relational Engine for Graph Processing

    Authors: Christopher R. Aberger, Susan Tu, Kunle Olukotun, Christopher Ré

    Abstract: There are two types of high-performance graph processing engines: low- and high-level engines. Low-level engines (Galois, PowerGraph, Snap) provide optimized data structures and computation models but require users to write low-level imperative code, hence ensuring that efficiency is the burden of the user. In high-level engines, users write in query languages like datalog (SociaLite) or SQL (Grai…

    Submitted 5 January, 2017; v1 submitted 9 March, 2015; originally announced March 2015.