Showing 1–1 of 1 result for author: Gandhi, L

  1. AutoComp: Automated Data Compaction for Log-Structured Tables in Data Lakes

    Authors: Anja Gruenheid, Jesús Camacho-Rodríguez, Carlo Curino, Raghu Ramakrishnan, Stanislav Pak, Sumedh Sakdeo, Lenisha Gandhi, Sandeep K. Singhal, Pooja Nilangekar, Daniel J. Abadi

    Abstract: The proliferation of small files in data lakes poses significant challenges, including degraded query performance, increased storage costs, and scalability bottlenecks in distributed storage systems. Log-structured table formats (LSTs) such as Delta Lake, Apache Iceberg, and Apache Hudi exacerbate this issue due to their append-only write patterns and metadata-intensive operations. While compactio…

    Submitted 5 April, 2025; originally announced April 2025.

    Journal ref: ACM SIGMOD 2025