MLOS: An Infrastructure for Automated Software Performance Engineering
Authors: Carlo Curino, Neha Godwal, Brian Kroth, Sergiy Kuryata, Greg Lapinski, Siqi Liu, Slava Oks, Olga Poppe, Adam Smiechowski, Ed Thayer, Markus Weimer, Yiwen Zhu
Abstract:
Developing modern systems software is a complex task that combines business logic programming and Software Performance Engineering (SPE). The latter is an experimental and labor-intensive activity focused on optimizing the system for a given hardware, software, and workload (hw/sw/wl) context.
Today's SPE is performed during build/release phases by specialized teams, and is cursed by: 1) a lack of standardized and automated tools, 2) significant repeated work as the hw/sw/wl context changes, 3) fragility induced by "one-size-fits-all" tuning (where improvements on one workload or component may impact others). The net result: despite costly investments, system software often runs outside its optimal operating point, anecdotally leaving 30% to 40% of performance on the table.
Recent developments in Data Science (DS) hint at an opportunity: combining DS tooling and methodologies with a new developer experience to transform the practice of SPE. In this paper we present MLOS, an ML-powered infrastructure and methodology to democratize and automate Software Performance Engineering. MLOS enables continuous, instance-level, robust, and trackable systems optimization. MLOS is being developed and employed within Microsoft to optimize SQL Server performance. Early results indicate that component-level optimizations can lead to 20%-90% improvements when custom-tuning for a specific hw/sw/wl context, hinting at a significant opportunity. However, several research challenges remain that will require community involvement. To this end, we are in the process of open-sourcing the MLOS core infrastructure, and we are engaging with academic institutions to create an educational program around Software 2.0 and MLOS ideas.
Submitted 4 June, 2020; v1 submitted 1 June, 2020;
originally announced June 2020.
Managing Query Compilation Memory Consumption to Improve DBMS Throughput
Authors: Boris Baryshnikov, Cipri Clinciu, Conor Cunningham, Leo Giakoumakis, Slava Oks, Stefano Stefani
Abstract:
While there are known performance trade-offs between database page buffer pool and query execution memory allocation policies, little has been written on the impact of query compilation memory use on overall throughput of the database management system (DBMS). We present a new aspect of the query optimization problem and offer a solution implemented in Microsoft SQL Server 2005. The solution provides stable throughput for a range of workloads even when memory requests outstrip the ability of the hardware to service those requests.
Submitted 21 December, 2006;
originally announced December 2006.