-
Kernel-as-a-Service: A Serverless Interface to GPUs
Authors:
Nathan Pemberton,
Anton Zabreyko,
Zhoujie Ding,
Randy Katz,
Joseph Gonzalez
Abstract:
Serverless computing has made it easier than ever to deploy applications over scalable cloud resources, all the while driving higher utilization for cloud providers. While this technique has worked well for easily divisible resources like CPU and local DRAM, it has struggled to incorporate more expensive and monolithic resources like GPUs or other application accelerators. We cannot simply slap a GPU on a FaaS platform and expect to keep all the benefits serverless promises. We need a more tailored approach if we want to best utilize these critical resources.
In this paper we present Kernel-as-a-Service (KaaS), a serverless interface to GPUs. In KaaS, GPUs are first-class citizens that are invoked just like any other serverless function. Rather than mixing host and GPU code as is typically done, KaaS runs graphs of GPU-only code while host code is run on traditional functions. The KaaS system manages GPU memory and schedules user kernels across the entire pool of available GPUs rather than relying on static allocations. This approach allows us to more effectively share expensive GPU resources, especially in multitenant environments like the cloud. We add support for KaaS to the Ray distributed computing framework and evaluate it with workloads including a TVM-based deep learning compiler and a BLAS library. Our results show that KaaS is able to drive up to 50x higher throughput and 16x lower latency when GPU resources are contended.
Submitted 15 December, 2022;
originally announced December 2022.
-
A FaaS File System for Serverless Computing
Authors:
Johann Schleier-Smith,
Leonhard Holz,
Nathan Pemberton,
Joseph M. Hellerstein
Abstract:
Serverless computing with cloud functions is quickly gaining adoption, but constrains programmers with its limited support for state management. We introduce a shared file system for cloud functions. It offers familiar POSIX semantics while taking advantage of distinctive aspects of cloud functions to achieve scalability and performance beyond what traditional shared file systems can offer. We take advantage of the function-grained fault tolerance model of cloud functions to proceed optimistically using local state, safe in the knowledge that we can restart if cache reads or lock activity cannot be reconciled upon commit. The boundaries of cloud functions provide implicit commit and rollback points, giving us the flexibility to use transaction processing techniques without changing the programming model or API. This allows a variety of stateful server-based applications to benefit from the simplicity and scalability of serverless computing, often with little or no modification.
Submitted 16 September, 2020;
originally announced September 2020.
-
CoVista: A Unified View on Privacy Sensitive Mobile Contact Tracing Effort
Authors:
David Culler,
Prabal Dutta,
Gabe Fierro,
Joseph E. Gonzalez,
Nathan Pemberton,
Johann Schleier-Smith,
K. Shankari,
Alvin Wan,
Thomas Zachariah
Abstract:
Governments around the world have become increasingly frustrated with tech giants dictating public health policy. The software created by Apple and Google enables individuals to track their own potential exposure through collated exposure notifications. However, the same software prohibits location tracking, denying key information needed by public health officials for robust contact tracing. This information is needed to treat and isolate COVID-19 positive people, identify transmission hotspots, and protect against continued spread of infection. In this article, we present two simple ideas, the lighthouse and the covid-commons, that address the needs of public health authorities while preserving the privacy-sensitive goals of the Apple and Google exposure notification protocols.
Submitted 27 May, 2020;
originally announced May 2020.