Showing 1–12 of 12 results for author: Quiané-Ruiz, J

  1. arXiv:2312.03435  [pdf, ps, other]

    cs.DB

    Counting Butterflies in Fully Dynamic Bipartite Graph Streams

    Authors: Serafeim Papadias, Zoi Kaoudi, Varun Pandey, Jorge-Arnulfo Quiane-Ruiz, Volker Markl

    Abstract: A bipartite graph extensively models relationships between real-world entities of two different types, such as user-product data in e-commerce. Such graph data are inherently becoming more and more streaming, entailing continuous insertions and deletions of edges. A butterfly (i.e., 2x2 bi-clique) is the smallest non-trivial cohesive structure that plays a crucial role. Counting such butterfly pat…

    Submitted 6 December, 2023; originally announced December 2023.

  2. arXiv:2209.06063  [pdf, other]

    cs.DB

    Space-Efficient Random Walks on Streaming Graphs

    Authors: Serafeim Papadias, Zoi Kaoudi, Jorge-Arnulfo Quiane-Ruiz, Volker Markl

    Abstract: Graphs in many applications, such as social networks and IoT, are inherently streaming, involving continuous additions and deletions of vertices and edges at high rates. Constructing random walks in a graph, i.e., sequences of vertices selected with a specific probability distribution, is a prominent task in many of these graph applications as well as machine learning (ML) on graph-structured data…

    Submitted 13 September, 2022; originally announced September 2022.

  3. arXiv:2208.10830  [pdf, other]

    cs.DB cs.CV

    Satellite Image Search in AgoraEO

    Authors: Ahmet Kerem Aksoy, Pavel Dushev, Eleni Tzirita Zacharatou, Holmer Hemsen, Marcela Charfuelan, Jorge-Arnulfo Quiané-Ruiz, Begüm Demir, Volker Markl

    Abstract: The growing operational capability of global Earth Observation (EO) creates new opportunities for data-driven approaches to understand and protect our planet. However, the current use of EO archives is very restricted due to the huge archive sizes and the limited exploration capabilities provided by EO platforms. To address this limitation, we have recently proposed MiLaN, a content-based image re…

    Submitted 23 August, 2022; originally announced August 2022.

    Comments: Accepted in VLDB 2022

    ACM Class: H.2.8; H.3.3; I.4.8; I.4.10

  4. arXiv:2110.00318  [pdf, other]

    cs.DB

    MATE: Multi-Attribute Table Extraction

    Authors: Mahdi Esmailoghli, Jorge-Arnulfo Quiané-Ruiz, Ziawasch Abedjan

    Abstract: A core operation in data discovery is to find joinable tables for a given table. Real-world tables include both unary and n-ary join keys. However, existing table discovery systems are optimized for unary joins and are ineffective and slow in the existence of n-ary keys. In this paper, we introduce MATE, a table discovery system that leverages a novel hash-based index that enables n-ary join disco…

    Submitted 25 April, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

  5. arXiv:1909.03026  [pdf, other]

    cs.DB cs.AI cs.DC eess.SY

    Agora: A Unified Asset Ecosystem Going Beyond Marketplaces and Cloud Services

    Authors: Jonas Traub, Jorge-Arnulfo Quiané-Ruiz, Zoi Kaoudi, Volker Markl

    Abstract: Data, algorithms, and compute/storage infrastructure are key assets that drive data science and artificial intelligence applications. As providing all these assets requires a huge investment, data science and artificial intelligence technologies are currently dominated by a small number of providers who can afford these investments. This leads to lock-in effects and hinders features that require a…

    Submitted 19 July, 2020; v1 submitted 6 September, 2019; originally announced September 2019.

  6. arXiv:1905.09130  [pdf, other]

    cs.AI cs.LG

    AI-CARGO: A Data-Driven Air-Cargo Revenue Management System

    Authors: Stefano Giovanni Rizzo, Ji Lucas, Zoi Kaoudi, Jorge-Arnulfo Quiane-Ruiz, Sanjay Chawla

    Abstract: We propose AI-CARGO, a revenue management system for air-cargo that combines machine learning prediction with decision-making using mathematical optimization methods. AI-CARGO addresses a problem that is unique to the air-cargo business, namely the wide discrepancy between the quantity (weight or volume) that a shipper will book and the actual received amount at departure time by the airline. The…

    Submitted 22 May, 2019; originally announced May 2019.

    Comments: 9 pages, 8 figures

  7. arXiv:1805.11723  [pdf, other]

    cs.DB

    Building your Cross-Platform Application with RHEEM

    Authors: Sanjay Chawla, Bertty Contreras-Rojas, Zoi Kaoudi, Sebastian Kruse, Jorge-Arnulfo Quiané-Ruiz

    Abstract: Today, organizations typically perform tedious and costly tasks to juggle their code and data across different data processing platforms. Addressing this pain and achieving automatic cross-platform data processing is quite challenging because it requires quite good expertise for all the available data processing platforms. In this report, we present Rheem, a general-purpose cross-platform data pro…

    Submitted 29 May, 2018; originally announced May 2018.

  8. RHEEMix in the Data Jungle: A Cost-based Optimizer for Cross-platform Systems

    Authors: Sebastian Kruse, Zoi Kaoudi, Bertty Contreras, Sanjay Chawla, Felix Naumann, Jorge-Arnulfo Quiané-Ruiz

    Abstract: In pursuit of efficient and scalable data analytics, the insight that "one size does not fit all" has given rise to a plethora of specialized data processing platforms and today's complex data analytics are moving beyond the limits of a single platform. In this paper, we present the cost-based optimizer of Rheem, an open-source cross-platform system that copes with these new requirements. The opti…

    Submitted 5 September, 2020; v1 submitted 9 May, 2018; originally announced May 2018.

    Journal ref: VLDB Journal 2020

  9. A Cost-based Optimizer for Gradient Descent Optimization

    Authors: Zoi Kaoudi, Jorge-Arnulfo Quiané-Ruiz, Saravanan Thirumuruganathan, Sanjay Chawla, Divy Agrawal

    Abstract: As the use of machine learning (ML) permeates into diverse application domains, there is an urgent need to support a declarative framework for ML. Ideally, a user will specify an ML task in a high-level and easy-to-use language and the framework will invoke the appropriate algorithms and system configurations to execute it. An important observation towards designing such a framework is that many M…

    Submitted 27 March, 2017; originally announced March 2017.

    Comments: Accepted at SIGMOD 2017

  10. arXiv:1701.06093  [pdf, other]

    cs.DB

    INGESTBASE: A Declarative Data Ingestion System

    Authors: Alekh Jindal, Jorge-Arnulfo Quiane-Ruiz, Samuel Madden

    Abstract: Big data applications have fast arriving data that must be quickly ingested. At the same time, they have specific needs to preprocess and transform the data before it could be put to use. The current practice is to do these preparatory transformations once the data is already ingested, however, this is expensive to run and cumbersome to manage. As a result, there is a need to push data preprocessi…

    Submitted 21 January, 2017; originally announced January 2017.

  11. arXiv:1212.3480  [pdf, other]

    cs.DB cs.DC

    Towards Zero-Overhead Adaptive Indexing in Hadoop

    Authors: Stefan Richter, Jorge-Arnulfo Quiané-Ruiz, Stefan Schuh, Jens Dittrich

    Abstract: Several research works have focused on supporting index access in MapReduce systems. These works have allowed users to significantly speed up selective MapReduce jobs by orders of magnitude. However, all these proposals require users to create indexes upfront, which might be a difficult task in certain applications (such as in scientific and social applications) where workloads are evolving or har…

    Submitted 14 December, 2012; originally announced December 2012.

    Comments: Tech Report, Saarland University

  12. arXiv:1208.0287  [pdf, other]

    cs.DB

    Only Aggressive Elephants are Fast Elephants

    Authors: Jens Dittrich, Jorge-Arnulfo Quiané-Ruiz, Stefan Richter, Stefan Schuh, Alekh Jindal, Jörg Schad

    Abstract: Yellow elephants are slow. A major reason is that they consume their inputs entirely before responding to an elephant rider's orders. Some clever riders have trained their yellow elephants to only consume parts of the inputs before responding. However, the teaching time to make an elephant do that is high. So high that the teaching lessons often do not pay off. We take a different approach. We mak…

    Submitted 1 August, 2012; originally announced August 2012.

    Comments: VLDB 2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 11, pp. 1591-1602 (2012)