-
ECO: An LLM-Driven Efficient Code Optimizer for Warehouse Scale Computers
Authors:
Hannah Lin,
Martin Maas,
Maximilian Roquemore,
Arman Hasanzadeh,
Fred Lewis,
Yusuf Simonson,
Tzu-Wei Yang,
Amir Yazdanbakhsh,
Deniz Altinbüken,
Florin Papa,
Maggie Nolan Edmonds,
Aditya Patil,
Don Schwarz,
Satish Chandra,
Chris Kennelly,
Milad Hashemi,
Parthasarathy Ranganathan
Abstract:
With the end of Moore's Law, optimizing code for performance has become paramount for meeting ever-increasing compute demands, particularly in hyperscale data centers where even small efficiency gains translate to significant resource and energy savings. Traditionally, this process requires significant programmer effort to identify optimization opportunities, modify the code to implement the optimization, and carefully deploy and measure the optimization's impact. Despite a significant amount of work on automating program edits and promising results in small-scale settings, such performance optimizations have remained elusive in large real-world production environments, due to the scale, high degree of complexity, and reliability required.
This paper introduces ECO (Efficient Code Optimizer), a system that automatically refactors source code to improve performance at scale. To achieve these performance gains, ECO searches through historical commits at scale to create a dictionary of performance anti-patterns that these commits addressed. These anti-patterns are used to search for similar patterns in a code base of billions of lines of code, pinpointing other code segments with similar potential optimization opportunities. Using a fine-tuned LLM, ECO then automatically refactors the code to generate and apply similar edits. Next, ECO verifies the transformed code, submits it for code review, and measures the impact of the optimization in production.
Currently deployed on Google's hyperscale production fleet, this system has driven >25k changed lines of production code, across over 6.4k submitted commits, with a >99.5% production success rate. Over the past year, ECO has consistently resulted in significant performance savings every quarter. On average, the savings produced per quarter are equivalent to over 500k normalized CPU cores.
Submitted 19 March, 2025;
originally announced March 2025.
-
Natural Language Outlines for Code: Literate Programming in the LLM Era
Authors:
Kensen Shi,
Deniz Altınbüken,
Saswat Anand,
Mihai Christodorescu,
Katja Grünwedel,
Alexa Koenings,
Sai Naidu,
Anurag Pathak,
Marc Rasi,
Fredde Ribeiro,
Brandon Ruffin,
Siddhant Sanyam,
Maxim Tabachnyk,
Sara Toth,
Roy Tu,
Tobias Welp,
Pengcheng Yin,
Manzil Zaheer,
Satish Chandra,
Charles Sutton
Abstract:
We propose using natural language outlines as a novel modality and interaction surface for providing AI assistance to developers throughout the software development process. An NL outline for a code function comprises multiple statements written in concise prose, which partition the code and summarize its main ideas in the style of literate programming. Crucially, we find that modern LLMs can generate accurate and high-quality NL outlines in practice. Moreover, NL outlines enable a bidirectional sync between code and NL, where a developer can change either code or NL and have the LLM automatically update the other. We discuss many use cases for NL outlines: they can accelerate understanding and navigation of code and diffs, simplify code maintenance, augment code search, steer code generation, and more. We then propose and compare multiple LLM prompting techniques for generating outlines and ask professional developers to judge outline quality. Finally, we present two case studies applying NL outlines toward code review and malware detection.
Submitted 17 April, 2025; v1 submitted 8 August, 2024;
originally announced August 2024.
-
Kepler: Robust Learning for Faster Parametric Query Optimization
Authors:
Lyric Doshi,
Vincent Zhuang,
Gaurav Jain,
Ryan Marcus,
Haoyu Huang,
Deniz Altinbüken,
Eugene Brevdo,
Campbell Fraser
Abstract:
Most existing parametric query optimization (PQO) techniques rely on traditional query optimizer cost models, which are often inaccurate and result in suboptimal query performance. We propose Kepler, an end-to-end learning-based approach to PQO that demonstrates significant speedups in query latency over a traditional query optimizer. Central to our method is Row Count Evolution (RCE), a novel plan generation algorithm based on perturbations in the sub-plan cardinality space. While previous approaches require accurate cost models, we bypass this requirement by evaluating candidate plans via actual execution data and training an ML model to predict the fastest plan given parameter binding values. Our models leverage recent advances in neural network uncertainty in order to robustly predict faster plans while avoiding regressions in query performance. Experimentally, we show that Kepler achieves significant improvements in query runtime on multiple datasets on PostgreSQL.
Submitted 18 October, 2023; v1 submitted 11 June, 2023;
originally announced June 2023.
-
Learned Indexes for a Google-scale Disk-based Database
Authors:
Hussam Abu-Libdeh,
Deniz Altınbüken,
Alex Beutel,
Ed H. Chi,
Lyric Doshi,
Tim Kraska,
Xiaozhou Li,
Andy Ly,
Christopher Olston
Abstract:
There is great excitement about learned index structures, but understandable skepticism about the practicality of a new method uprooting decades of research on B-Trees. In this paper, we work to remove some of that uncertainty by demonstrating how a learned index can be integrated in a distributed, disk-based database system: Google's Bigtable. We detail several design decisions we made to integrate learned indexes in Bigtable. Our results show that integrating a learned index significantly improves the end-to-end read latency and throughput for Bigtable.
Submitted 23 December, 2020;
originally announced December 2020.