-
Comparative Analysis of Efficient Adapter-Based Fine-Tuning of State-of-the-Art Transformer Models
Authors:
Saad Mashkoor Siddiqui,
Mohammad Ali Sheikh,
Muhammad Aleem,
Kajol R Singh
Abstract:
In this work, we investigate the efficacy of various adapter architectures on supervised binary classification tasks from the SuperGLUE benchmark as well as a supervised multi-class news category classification task from Kaggle. Specifically, we compare classification performance and time complexity of three transformer models, namely DistilBERT, ELECTRA, and BART, using conventional fine-tuning as well as nine state-of-the-art (SoTA) adapter architectures. Our analysis reveals performance differences across adapter architectures, highlighting their ability to achieve comparable or better performance relative to fine-tuning at a fraction of the training time. Similar results are observed on the news classification task, further supporting our findings and demonstrating adapters as efficient and flexible alternatives to fine-tuning. This study provides valuable insights and guidelines for selecting and implementing adapters in diverse natural language processing (NLP) applications.
Submitted 14 January, 2025;
originally announced January 2025.
-
Mesh-Based Affine Abstraction of Nonlinear Systems with Tighter Bounds
Authors:
Kanishka Raj Singh,
Qiang Shen,
Sze Zheng Yong
Abstract:
In this paper, we consider the problem of piecewise affine abstraction of nonlinear systems, i.e., the over-approximation of their nonlinear dynamics by a pair of piecewise affine functions that "includes" the dynamical characteristics of the original system. As such, guarantees for controllers or estimators based on the affine abstraction also apply to the original nonlinear system. Our approach consists of solving a linear programming (LP) problem that over-approximates the nonlinear function at only the grid points of a mesh with a given resolution and then accounting for the entire domain via an appropriate correction term. To achieve a desired approximation accuracy, we also iteratively subdivide the domain into subregions. Our method applies to nonlinear functions with different degrees of smoothness, including Lipschitz continuous functions, and improves on existing approaches by enabling the use of tighter bounds. Finally, we compare the effectiveness of our approach with existing optimization-based methods in simulation and illustrate its applicability for estimator design.
Submitted 6 November, 2018;
originally announced November 2018.
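The LP-plus-correction idea in the abstract above can be illustrated with a minimal one-dimensional sketch. This is not the authors' formulation: the function name `affine_abstraction_1d`, the single-interval setup, and the correction term $\sigma = \lambda\delta/2$ (for a $\lambda$-Lipschitz function on a uniform grid of spacing $\delta$) are assumptions made for illustration. The sketch uses `scipy.optimize.linprog` to find one upper and one lower affine function that bound the samples at the grid points with minimal gap, then shifts both by $\sigma$ so the bounds hold between grid points as well.

```python
import numpy as np
from scipy.optimize import linprog


def affine_abstraction_1d(f, lo, hi, n_grid, lipschitz):
    """Illustrative sketch: affine upper/lower bounds of f on [lo, hi].

    Returns (a_up, b_up), (a_lo, b_lo), gap, where
    a_lo*x + b_lo <= f(x) <= a_up*x + b_up on the whole interval.
    """
    x = np.linspace(lo, hi, n_grid)
    fx = f(x)
    ones = np.ones_like(x)
    zeros = np.zeros_like(x)

    # Decision variables: [a_up, b_up, a_lo, b_lo, theta].
    # Upper function dominates the samples: f(x_i) <= a_up*x_i + b_up.
    A_upper = np.column_stack([-x, -ones, zeros, zeros, zeros])
    # Lower function stays below the samples: a_lo*x_i + b_lo <= f(x_i).
    A_lower = np.column_stack([zeros, zeros, x, ones, zeros])
    # theta bounds the gap between the two functions at every grid point.
    A_gap = np.column_stack([x, ones, -x, -ones, -ones])

    A_ub = np.vstack([A_upper, A_lower, A_gap])
    b_ub = np.concatenate([-fx, fx, zeros])
    cost = np.array([0.0, 0.0, 0.0, 0.0, 1.0])  # minimize theta

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 5)
    if not res.success:
        raise RuntimeError(res.message)
    a_up, b_up, a_lo, b_lo, theta = res.x

    # Assumed correction term: between neighbouring grid points a
    # lambda-Lipschitz f deviates from its linear interpolant by at
    # most lambda * delta / 2, so shifting by sigma covers the interval.
    delta = (hi - lo) / (n_grid - 1)
    sigma = lipschitz * delta / 2.0
    return (a_up, b_up + sigma), (a_lo, b_lo - sigma), theta + 2.0 * sigma


# Example: sin is 1-Lipschitz; abstract it on [0, 2] with 21 grid points.
upper, lower, gap = affine_abstraction_1d(np.sin, 0.0, 2.0, 21, 1.0)
```

A finer grid shrinks $\sigma$ at the cost of more LP constraints; subdividing the domain, as the abstract describes, would shrink the gap itself.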
-
Fast symmetric factorization of hierarchical matrices with applications
Authors:
Sivaram Ambikasaran,
Michael O'Neil,
Karan Raj Singh
Abstract:
We present a fast direct algorithm for computing symmetric factorizations, i.e. $A = WW^T$, of symmetric positive-definite hierarchical matrices with weak-admissibility conditions. The computational cost for the symmetric factorization scales as $\mathcal{O}(n \log^2 n)$ for hierarchically off-diagonal low-rank matrices. Once this factorization is obtained, the cost for inversion, application, and determinant computation scales as $\mathcal{O}(n \log n)$. In particular, this allows for the near-optimal generation of correlated random variates in the case where $A$ is a covariance matrix. This symmetric factorization algorithm depends on two key ingredients. First, we present a novel symmetric factorization formula for low-rank updates to the identity of the form $I+UKU^T$. This factorization can be computed in $\mathcal{O}(n)$ time if the rank of the perturbation is sufficiently small. Second, combining this formula with a recursive divide-and-conquer strategy, near-linear-complexity symmetric factorizations for hierarchically structured matrices can be obtained. We present numerical results for matrices relevant to problems in probability and statistics (Gaussian processes), interpolation (radial basis functions), and Brownian dynamics calculations in fluid mechanics (the Rotne-Prager-Yamakawa tensor).
Submitted 30 December, 2016; v1 submitted 1 May, 2014;
originally announced May 2014.
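One way to obtain a symmetric factorization of a low-rank update $I + UKU^T$, as mentioned in the abstract above, can be sketched in NumPy. This is an illustrative construction and not necessarily the formula derived in the paper. It assumes $K$ is symmetric and $I + UKU^T$ is positive definite; the function names are hypothetical. Writing $U = QR$ with orthonormal $Q$, and letting $L$ be the Cholesky factor of $I + RKR^T$, the matrix $W = I + Q(L - I)Q^T$ satisfies $WW^T = I + UKU^T$, and for fixed rank $r$ the work grows linearly in $n$.

```python
import numpy as np


def factor_low_rank_update(U, K):
    """Return (Q, X) such that W = I + Q X Q^T satisfies W W^T = I + U K U^T.

    Assumes K is symmetric and I + U K U^T is positive definite.
    """
    Q, R = np.linalg.qr(U)          # reduced QR: Q is n x r, R is r x r
    M = R @ K @ R.T                 # U K U^T = Q M Q^T
    M = 0.5 * (M + M.T)             # symmetrize against round-off
    r = M.shape[0]
    L = np.linalg.cholesky(np.eye(r) + M)
    X = L - np.eye(r)               # (I + X)(I + X)^T = I + M
    return Q, X


def apply_factor(Q, X, z):
    """Compute W z without forming the n x n matrix W (O(n r) work)."""
    return z + Q @ (X @ (Q.T @ z))


# Example with a random rank-5 update of size n = 200.
rng = np.random.default_rng(0)
n, r = 200, 5
U = rng.standard_normal((n, r))
B = rng.standard_normal((r, r))
K = B @ B.T                         # symmetric positive semidefinite

Q, X = factor_low_rank_update(U, K)
W = np.eye(n) + Q @ X @ Q.T         # dense W, formed only for checking
A = np.eye(n) + U @ K @ U.T

# If z is standard normal, W z has covariance W W^T = A.
z = rng.standard_normal(n)
sample = apply_factor(Q, X, z)
```

Applying `apply_factor` to a standard normal vector yields a sample with covariance $I + UKU^T$, which is the correlated-variate use case the abstract describes.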