Showing 1–3 of 3 results for author: Baunsgaard, S

  1. arXiv:2504.11067

    cs.DB cs.DC cs.LG

    Morphing-based Compression for Data-centric ML Pipelines

    Authors: Sebastian Baunsgaard, Matthias Boehm

    Abstract: Data-centric ML pipelines extend traditional machine learning (ML) pipelines -- of feature transformations and ML model training -- by outer loops for data cleaning, augmentation, and feature engineering to create high-quality input data. Existing lossless matrix compression applies lightweight compression schemes to numeric matrices and performs linear algebra operations such as matrix-vector mul…

    Submitted 15 April, 2025; originally announced April 2025.

    Comments: 20 pages, 28 figures, 4 tables

  2. arXiv:2003.12366

    eess.AS cs.LG cs.SD stat.ML

    Training for Speech Recognition on Coprocessors

    Authors: Sebastian Baunsgaard, Sebastian B. Wrede, Pınar Tozun

    Abstract: Automatic Speech Recognition (ASR) has increased in popularity in recent years. The evolution of processor and storage technologies has enabled more advanced ASR mechanisms, fueling the development of virtual assistants such as Amazon Alexa, Apple Siri, Microsoft Cortana, and Google Home. The interest in such assistants, in turn, has amplified the novel developments in ASR research. However, despi…

    Submitted 3 December, 2024; v1 submitted 22 March, 2020; originally announced March 2020.

    Comments: Published at ADMS 2020

    ACM Class: I.2; C.1; H.2

  3. arXiv:1909.02976

    cs.DB

    SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle

    Authors: Matthias Boehm, Iulian Antonov, Sebastian Baunsgaard, Mark Dokter, Robert Ginthoer, Kevin Innerebner, Florijan Klezin, Stefanie Lindstaedt, Arnab Phani, Benjamin Rath, Berthold Reinwald, Shafaq Siddiqi, Sebastian Benjamin Wrede

    Abstract: Machine learning (ML) applications become increasingly common in many domains. ML systems to execute these workloads include numerical computing frameworks and libraries, ML algorithm libraries, and specialized systems for deep neural networks and distributed ML. These systems focus primarily on efficient model training and scoring. However, the data science process is exploratory, and deals with…

    Submitted 7 January, 2020; v1 submitted 6 September, 2019; originally announced September 2019.

    Comments: CIDR 2020