
Showing 1–4 of 4 results for author: Röhm, U

Searching in archive cs.
  1. arXiv:2409.16544  [pdf, other]

    cs.DB

    First Past the Post: Evaluating Query Optimization in MongoDB

    Authors: Dawei Tao, Enqi Liu, Sidath Randeni Kadupitige, Michael Cahill, Alan Fekete, Uwe Röhm

    Abstract: Query optimization is crucial for every database management system (DBMS) to enable fast execution of declarative queries. Most DBMS designs include cost-based query optimization. However, MongoDB implements a different approach to choose an execution plan that we call "first past the post" (FPTP) query optimization. FPTP does not estimate costs for each execution plan, but rather partially execut…

    Submitted 24 September, 2024; originally announced September 2024.

  2. arXiv:1904.01279  [pdf, other]

    cs.DB

    Learning a Partitioning Advisor with Deep Reinforcement Learning

    Authors: Benjamin Hilprecht, Carsten Binnig, Uwe Roehm

    Abstract: Commercial data analytics products such as Microsoft Azure SQL Data Warehouse or Amazon Redshift provide ready-to-use scale-out database solutions for OLAP-style workloads in the cloud. While the provisioning of a database cluster is usually fully automated by cloud providers, customers typically still have to make important design decisions which were traditionally made by the database administra…

    Submitted 2 April, 2019; originally announced April 2019.

  3. arXiv:1803.10836  [pdf]

    cs.DC

    Technical Report: On the Usability of Hadoop MapReduce, Apache Spark & Apache Flink for Data Science

    Authors: Bilal Akil, Ying Zhou, Uwe Röhm

    Abstract: Distributed data processing platforms for cloud computing are important tools for large-scale data analytics. Apache Hadoop MapReduce has become the de facto standard in this space, though its programming interface is relatively low-level, requiring many implementation steps even for simple analysis tasks. This has led to the development of advanced dataflow oriented platforms, most prominently Ap…

    Submitted 28 March, 2018; originally announced March 2018.

    Report number: School of IT, University of Sydney, Tech. Rep. 709

  4. arXiv:0909.1764  [pdf]

    cs.DB q-bio.GN

    Data Management for High-Throughput Genomics

    Authors: Uwe Roehm, Jose Blakeley

    Abstract: Today's sequencing technology allows sequencing an individual genome within a few weeks for a fraction of the costs of the original Human Genome project. Genomics labs are faced with dozens of TB of data per week that have to be automatically processed and made available to scientists for further analysis. This paper explores the potential and the limitations of using relational database systems…

    Submitted 9 September, 2009; originally announced September 2009.

    Comments: CIDR 2009