PushdownDB: Accelerating a DBMS using S3 Computation
Authors:
Xiangyao Yu,
Matt Youill,
Matthew Woicik,
Abdurrahman Ghanem,
Marco Serafini,
Ashraf Aboulnaga,
Michael Stonebraker
Abstract:
This paper studies the effectiveness of pushing parts of DBMS analytics queries into the Simple Storage Service (S3) engine of Amazon Web Services (AWS), using a recently released capability called S3 Select. We show that some DBMS primitives (filter, projection, aggregation) can always be cost-effectively moved into S3. Other more complex operations (join, top-K, group-by) require reimplementatio…
▽ More
This paper studies the effectiveness of pushing parts of DBMS analytics queries into the Simple Storage Service (S3) engine of Amazon Web Services (AWS), using a recently released capability called S3 Select. We show that some DBMS primitives (filter, projection, aggregation) can always be cost-effectively moved into S3. Other more complex operations (join, top-K, group-by) require reimplementation to take advantage of S3 Select and are often candidates for pushdown. We demonstrate these capabilities through experimentation using a new DBMS that we developed, PushdownDB. Experimentation with a collection of queries including TPC-H queries shows that PushdownDB is on average 30% cheaper and 6.7X faster than a baseline that does not use S3 Select.
△ Less
Submitted 13 February, 2020;
originally announced February 2020.