Search | arXiv e-print repository

AdUE: Improving uncertainty estimation head for LoRA adapters in LLMs

Authors: Artem Zabolotnyi, Roman Makarov, Mile Mitrovic, Polina Proskura, Oleg Travkin, Roman Alferov, Alexey Zaytsev

Abstract: Uncertainty estimation remains a critical challenge in adapting pre-trained language models to classification tasks, particularly under parameter-efficient fine-tuning approaches such as adapters. We introduce AdUE1, an efficient post-hoc uncertainty estimation (UE) method, to enhance softmax-based estimates. Our approach (1) uses a differentiable approximation of the maximum function and (2) appl… ▽ More Uncertainty estimation remains a critical challenge in adapting pre-trained language models to classification tasks, particularly under parameter-efficient fine-tuning approaches such as adapters. We introduce AdUE1, an efficient post-hoc uncertainty estimation (UE) method, to enhance softmax-based estimates. Our approach (1) uses a differentiable approximation of the maximum function and (2) applies additional regularization through L2-SP, anchoring the fine-tuned head weights and regularizing the model. Evaluations on five NLP classification datasets across four language models (RoBERTa, ELECTRA, LLaMA-2, Qwen) demonstrate that our method consistently outperforms established baselines such as Mahalanobis distance and softmax response. Our approach is lightweight (no base-model changes) and produces better-calibrated confidence. △ Less

Submitted 21 May, 2025; originally announced May 2025.

Comments: 9 pages, 1 figure

arXiv:2407.20288 [pdf, other]

Supervised Learning based Method for Condition Monitoring of Overhead Line Insulators using Leakage Current Measurement

Authors: Mile Mitrovic, Dmitry Titov, Klim Volkhov, Irina Lukicheva, Andrey Kudryavzev, Petr Vorobev, Qi Li, Vladimir Terzija

Abstract: As a new practical and economical solution to the aging problem of overhead line (OHL) assets, the technical policies of most power grid companies in the world experienced a gradual transition from scheduled preventive maintenance to a risk-based approach in asset management. Even though the accumulation of contamination is predictable within a certain degree, there are currently no effective ways… ▽ More As a new practical and economical solution to the aging problem of overhead line (OHL) assets, the technical policies of most power grid companies in the world experienced a gradual transition from scheduled preventive maintenance to a risk-based approach in asset management. Even though the accumulation of contamination is predictable within a certain degree, there are currently no effective ways to identify the risk of the insulator flashover in order to plan its replacement. This paper presents a novel machine learning (ML) based method for estimating the flashover probability of the cup-and-pin glass insulator string. The proposed method is based on the Extreme Gradient Boosting (XGBoost) supervised ML model, in which the leakage current (LC) features and applied voltage are used as the inputs. The established model can estimate the critical flashover voltage (U50%) for various designs of OHL insulators with different voltage levels. The proposed method is also able to accurately determine the condition of the insulator strings and instruct asset management engineers to take appropriate actions. △ Less

Submitted 26 July, 2024; originally announced July 2024.

Comments: 10 pages, 9 figures

arXiv:2402.11365 [pdf, other]

Data-Driven Stochastic AC-OPF using Gaussian Processes

Authors: Mile Mitrovic

Abstract: The thesis focuses on developing a data-driven algorithm, based on machine learning, to solve the stochastic alternating current (AC) chance-constrained (CC) Optimal Power Flow (OPF) problem. Although the AC CC-OPF problem has been successful in academic circles, it is highly nonlinear and computationally demanding, which limits its practical impact. The proposed approach aims to address this limi… ▽ More The thesis focuses on developing a data-driven algorithm, based on machine learning, to solve the stochastic alternating current (AC) chance-constrained (CC) Optimal Power Flow (OPF) problem. Although the AC CC-OPF problem has been successful in academic circles, it is highly nonlinear and computationally demanding, which limits its practical impact. The proposed approach aims to address this limitation and demonstrate its empirical efficiency through applications to multiple IEEE test cases. To solve the non-convex and computationally challenging CC AC-OPF problem, the proposed approach relies on a machine learning Gaussian process regression (GPR) model. The full Gaussian process (GP) approach is capable of learning a simple yet non-convex data-driven approximation to the AC power flow equations that can incorporate uncertain inputs. The proposed approach uses various approximations for GP-uncertainty propagation. The full GP CC-OPF approach exhibits highly competitive and promising results, outperforming the state-of-the-art sample-based chance constraint approaches. To further improve the robustness and complexity/accuracy trade-off of the full GP CC-OPF, a fast data-driven setup is proposed. This setup relies on the sparse and hybrid Gaussian processes (GP) framework to model the power flow equations with input uncertainty. △ Less

Submitted 17 February, 2024; originally announced February 2024.

Comments: 112 pages, 29 figures

arXiv:2302.08454 [pdf, other]

GP CC-OPF: Gaussian Process based optimization tool for Chance-Constrained Optimal Power Flow

Authors: Mile Mitrovic, Ognjen Kundacina, Aleksandr Lukashevich, Petr Vorobev, Vladimir Terzija, Yury Maximov, Deepjyoti Deka

Abstract: The Gaussian Process (GP) based Chance-Constrained Optimal Power Flow (CC-OPF) is an open-source Python code developed for solving economic dispatch (ED) problem in modern power grids. In recent years, integrating a significant amount of renewables into a power grid causes high fluctuations and thus brings a lot of uncertainty to power grid operations. This fact makes the conventional model-based… ▽ More The Gaussian Process (GP) based Chance-Constrained Optimal Power Flow (CC-OPF) is an open-source Python code developed for solving economic dispatch (ED) problem in modern power grids. In recent years, integrating a significant amount of renewables into a power grid causes high fluctuations and thus brings a lot of uncertainty to power grid operations. This fact makes the conventional model-based CC-OPF problem non-convex and computationally complex to solve. The developed tool presents a novel data-driven approach based on the GP regression model for solving the CC-OPF problem with a trade-off between complexity and accuracy. The proposed approach and developed software can help system operators to effectively perform ED optimization in the presence of large uncertainties in the power grid. △ Less

Submitted 16 February, 2023; originally announced February 2023.

Comments: 6 pages, 2 figures

arXiv:2208.14814 [pdf, other]

Data-Driven Chance Constrained AC-OPF using Hybrid Sparse Gaussian Processes

Authors: Mile Mitrovic, Aleksandr Lukashevich, Petr Vorobev, Vladimir Terzija, Yury Maximov, Deepjyoti Deka

Abstract: The alternating current (AC) chance-constrained optimal power flow (CC-OPF) problem addresses the economic efficiency of electricity generation and delivery under generation uncertainty. The latter is intrinsic to modern power grids because of the high amount of renewables. Despite its academic success, the AC CC-OPF problem is highly nonlinear and computationally demanding, which limits its pract… ▽ More The alternating current (AC) chance-constrained optimal power flow (CC-OPF) problem addresses the economic efficiency of electricity generation and delivery under generation uncertainty. The latter is intrinsic to modern power grids because of the high amount of renewables. Despite its academic success, the AC CC-OPF problem is highly nonlinear and computationally demanding, which limits its practical impact. For improving the AC-OPF problem complexity/accuracy trade-off, the paper proposes a fast data-driven setup that uses the sparse and hybrid Gaussian processes (GP) framework to model the power flow equations with input uncertainty. We advocate the efficiency of the proposed approach by a numerical study over multiple IEEE test cases showing up to two times faster and more accurate solutions compared to the state-of-the-art methods. △ Less

Submitted 30 August, 2022; originally announced August 2022.

Comments: 7 pages, 3 figures

arXiv:2207.10781 [pdf, other]

Data-Driven Stochastic AC-OPF using Gaussian Processes

Authors: Mile Mitrovic, Aleksandr Lukashevich, Petr Vorobev, Vladimir Terzija, Semen Budenny, Yury Maximov, Deepjyoti Deka

Abstract: In recent years, electricity generation has been responsible for more than a quarter of the greenhouse gas emissions in the US. Integrating a significant amount of renewables into a power grid is probably the most accessible way to reduce carbon emissions from power grids and slow down climate change. Unfortunately, the most accessible renewable power sources, such as wind and solar, are highly fl… ▽ More In recent years, electricity generation has been responsible for more than a quarter of the greenhouse gas emissions in the US. Integrating a significant amount of renewables into a power grid is probably the most accessible way to reduce carbon emissions from power grids and slow down climate change. Unfortunately, the most accessible renewable power sources, such as wind and solar, are highly fluctuating and thus bring a lot of uncertainty to power grid operations and challenge existing optimization and control policies. The chance-constrained alternating current (AC) optimal power flow (OPF) framework finds the minimum cost generation dispatch maintaining the power grid operations within security limits with a prescribed probability. Unfortunately, the AC-OPF problem's chance-constrained extension is non-convex, computationally challenging, and requires knowledge of system parameters and additional assumptions on the behavior of renewable distribution. Known linear and convex approximations to the above problems, though tractable, are too conservative for operational practice and do not consider uncertainty in system parameters. This paper presents an alternative data-driven approach based on Gaussian process (GP) regression to close this gap. The GP approach learns a simple yet non-convex data-driven approximation to the AC power flow equations that can incorporate uncertainty inputs. The latter is then used to determine the solution of CC-OPF efficiently, by accounting for both input and parameter uncertainty. The practical efficiency of the proposed approach using different approximations for GP-uncertainty propagation is illustrated over numerous IEEE test cases. △ Less

Submitted 21 July, 2022; originally announced July 2022.

Comments: 29 pages, 5 figures

arXiv:1905.00948 [pdf, other]

Submodular Streaming in All its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity

Authors: Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi, Amin Karbasi

Abstract: Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose Sieve-Streaming++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight… ▽ More Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose Sieve-Streaming++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight $(1/2)$-approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal $(1/4)$-approximation with $Θ(k)$ memory or the optimal $(1/2)$-approximation with $O(k\log k)$ memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering procedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of Sieve-Streaming++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight $(1/2)$-approximation guarantee with $O(k)$ shared memory while minimizing not only the required rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos. △ Less

Submitted 13 May, 2019; v1 submitted 2 May, 2019; originally announced May 2019.

Comments: Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019

arXiv:1902.05981 [pdf, other]

Adaptive Sequence Submodularity

Authors: Marko Mitrovic, Ehsan Kazemi, Moran Feldman, Andreas Krause, Amin Karbasi

Abstract: In many machine learning applications, one needs to interactively select a sequence of items (e.g., recommending movies based on a user's feedback) or make sequential decisions in a certain order (e.g., guiding an agent through a series of states). Not only do sequences already pose a dauntingly large search space, but we must also take into account past observations, as well as the uncertainty of… ▽ More In many machine learning applications, one needs to interactively select a sequence of items (e.g., recommending movies based on a user's feedback) or make sequential decisions in a certain order (e.g., guiding an agent through a series of states). Not only do sequences already pose a dauntingly large search space, but we must also take into account past observations, as well as the uncertainty of future outcomes. Without further structure, finding an optimal sequence is notoriously challenging, if not completely intractable. In this paper, we view the problem of adaptive and sequential decision making through the lens of submodularity and propose an adaptive greedy policy with strong theoretical guarantees. Additionally, to demonstrate the practical utility of our results, we run experiments on Amazon product recommendation and Wikipedia link prediction tasks. △ Less

Submitted 20 June, 2019; v1 submitted 15 February, 2019; originally announced February 2019.

arXiv:1806.02815 [pdf, other]

Data Summarization at Scale: A Two-Stage Submodular Approach

Authors: Marko Mitrovic, Ehsan Kazemi, Morteza Zadimoghaddam, Amin Karbasi

Abstract: The sheer scale of modern datasets has resulted in a dire need for summarization techniques that identify representative elements in a dataset. Fortunately, the vast majority of data summarization tasks satisfy an intuitive diminishing returns condition known as submodularity, which allows us to find nearly-optimal solutions in linear time. We focus on a two-stage submodular framework where the go… ▽ More The sheer scale of modern datasets has resulted in a dire need for summarization techniques that identify representative elements in a dataset. Fortunately, the vast majority of data summarization tasks satisfy an intuitive diminishing returns condition known as submodularity, which allows us to find nearly-optimal solutions in linear time. We focus on a two-stage submodular framework where the goal is to use some given training functions to reduce the ground set so that optimizing new functions (drawn from the same distribution) over the reduced set provides almost as much value as optimizing them over the entire ground set. In this paper, we develop the first streaming and distributed solutions to this problem. In addition to providing strong theoretical guarantees, we demonstrate both the utility and efficiency of our algorithms on real-world tasks including image summarization and ride-share optimization. △ Less

Submitted 7 June, 2018; originally announced June 2018.

Showing 1–9 of 9 results for author: Mitrovic, M