Showing 1–3 of 3 results for author: Kottalam, J

  1. arXiv:1806.01270  [pdf, other]

    cs.DC cs.DB physics.data-an stat.CO

    Alchemist: An Apache Spark <=> MPI Interface

    Authors: Alex Gittens, Kai Rothauge, Shusen Wang, Michael W. Mahoney, Jey Kottalam, Lisa Gerhardt, Prabhat, Michael Ringenburg, Kristyn Maschhoff

    Abstract: The Apache Spark framework for distributed computation is popular in the data analytics community due to its ease of use, but its MapReduce-style programming model can incur significant overheads when performing computations that do not map directly onto this model. One way to mitigate these costs is to off-load computations onto MPI codes. In recent work, we introduced Alchemist, a system for the…

    Submitted 3 June, 2018; originally announced June 2018.

    Comments: Accepted for publication in Concurrency and Computation: Practice and Experience, Special Issue on the Cray User Group 2018. arXiv admin note: text overlap with arXiv:1805.11800

  2. arXiv:1805.11800  [pdf, other]

    cs.DC cs.DB physics.data-an stat.CO

    Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist

    Authors: Alex Gittens, Kai Rothauge, Shusen Wang, Michael W. Mahoney, Lisa Gerhardt, Prabhat, Jey Kottalam, Michael Ringenburg, Kristyn Maschhoff

    Abstract: Apache Spark is a popular system aimed at the analysis of large data sets, but recent studies have shown that certain computations---in particular, many linear algebra computations that are the basis for solving common machine learning problems---are significantly slower in Spark than when done using libraries written in a high-performance computing framework such as the Message-Passing Interface…

    Submitted 30 May, 2018; originally announced May 2018.

    Comments: Accepted for publication in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK, 2018

  3. arXiv:1310.5426  [pdf, other]

    cs.LG cs.DC stat.ML

    MLI: An API for Distributed Machine Learning

    Authors: Evan R. Sparks, Ameet Talwalkar, Virginia Smith, Jey Kottalam, Xinghao Pan, Joseph Gonzalez, Michael J. Franklin, Michael I. Jordan, Tim Kraska

    Abstract: MLI is an Application Programming Interface designed to address the challenges of building Machine Learning algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of high-performance, scalable, distributed algorithms. Our initial results show that, relative to existing systems, this interface can be used to build distributed implement…

    Submitted 25 October, 2013; v1 submitted 21 October, 2013; originally announced October 2013.