-
QuAKE: Speeding up Model Inference Using Quick and Approximate Kernels for Exponential Non-Linearities
Authors:
Sai Kiran Narayanaswami,
Gopalakrishnan Srinivasan,
Balaraman Ravindran
Abstract:
As machine learning gets deployed more and more widely, and model sizes continue to grow, improving computational efficiency during model inference has become a key challenge. In many commonly used model architectures, including Transformers, a significant portion of the inference computation is comprised of exponential non-linearities such as Softmax. In this work, we develop QuAKE, a collection…
▽ More
As machine learning gets deployed more and more widely, and model sizes continue to grow, improving computational efficiency during model inference has become a key challenge. In many commonly used model architectures, including Transformers, a significant portion of the inference computation is comprised of exponential non-linearities such as Softmax. In this work, we develop QuAKE, a collection of novel operators that leverage certain properties of IEEE-754 floating point representations to quickly approximate the exponential function without requiring specialized hardware, extra memory, or precomputation. We propose optimizations that enhance the efficiency of QuAKE in commonly used exponential non-linearities such as Softmax, GELU, and the Logistic function. Our benchmarks demonstrate substantial inference speed improvements between 10% and 35% on server CPUs, and 5% and 45% on embedded and mobile-scale CPUs for a variety of model architectures and sizes. Evaluations of model performance on standard datasets and tasks from various domains show that QuAKE operators are able to provide sizable speed benefits with little to no loss of performance on downstream tasks.
△ Less
Submitted 30 November, 2024;
originally announced December 2024.
-
Generalized Random Surfer-Pair Models
Authors:
Sai Kiran Narayanaswami,
Balaraman Ravindran,
Venkatesh Ramaiyan
Abstract:
SimRank is a widely studied link-based similarity measure that is known for its simple, yet powerful philosophy that two nodes are similar if they are referenced by similar nodes. While this philosophy has been the basis of several improvements, there is another useful, albeit less frequently discussed interpretation for SimRank known as the Random Surfer-Pair Model. In this work, we show that oth…
▽ More
SimRank is a widely studied link-based similarity measure that is known for its simple, yet powerful philosophy that two nodes are similar if they are referenced by similar nodes. While this philosophy has been the basis of several improvements, there is another useful, albeit less frequently discussed interpretation for SimRank known as the Random Surfer-Pair Model. In this work, we show that other well known measures related to SimRank can also be reinterpreted using Random Surfer-Pair Models, and establish a mathematically sound, general and unifying framework for several link-based similarity measures. This also serves to provide new insights into their functioning and allows for using these measures in a Monte Carlo framework, which provides several computational benefits. Next, we describe how the framework can be used as a versatile tool for developing measures according to given design requirements. As an illustration of this utility, we develop a new measure by combining the benefits of two existing measures under this framework, and empirically demonstrate that it results in a better performing measure.
△ Less
Submitted 4 July, 2019; v1 submitted 2 July, 2019;
originally announced July 2019.
-
An Active Learning Framework for Efficient Robust Policy Search
Authors:
Sai Kiran Narayanaswami,
Nandan Sudarsanam,
Balaraman Ravindran
Abstract:
Robust Policy Search is the problem of learning policies that do not degrade in performance when subject to unseen environment model parameters. It is particularly relevant for transferring policies learned in a simulation environment to the real world. Several existing approaches involve sampling large batches of trajectories which reflect the differences in various possible environments, and the…
▽ More
Robust Policy Search is the problem of learning policies that do not degrade in performance when subject to unseen environment model parameters. It is particularly relevant for transferring policies learned in a simulation environment to the real world. Several existing approaches involve sampling large batches of trajectories which reflect the differences in various possible environments, and then selecting some subset of these to learn robust policies, such as the ones that result in the worst performance. We propose an active learning based framework, EffAcTS, to selectively choose model parameters for this purpose so as to collect only as much data as necessary to select such a subset. We apply this framework using Linear Bandits, and experimentally validate the gains in sample efficiency and the performance of our approach on standard continuous control tasks. We also present a Multi-Task Learning perspective to the problem of Robust Policy Search, and draw connections from our proposed framework to existing work on Multi-Task Learning.
△ Less
Submitted 20 November, 2021; v1 submitted 1 January, 2019;
originally announced January 2019.