-
Improved Impossible Tuning and Lipschitz-Adaptive Universal Online Learning with Gradient Variations
Authors:
Kei Takemura,
Ryuta Matsuno,
Keita Sakuma
Abstract:
A central goal in online learning is to achieve adaptivity to unknown problem characteristics, such as environmental changes captured by gradient variation (GV), function curvature (universal online learning, UOL), and gradient scales (Lipschitz adaptivity, LA). Simultaneously achieving these with optimal performance is a major challenge, partly due to limitations in algorithms for prediction with…
▽ More
A central goal in online learning is to achieve adaptivity to unknown problem characteristics, such as environmental changes captured by gradient variation (GV), function curvature (universal online learning, UOL), and gradient scales (Lipschitz adaptivity, LA). Simultaneously achieving these with optimal performance is a major challenge, partly due to limitations in algorithms for prediction with expert advice. These algorithms often serve as meta-algorithms in online ensemble frameworks, and their sub-optimality hinders overall UOL performance. Specifically, existing algorithms addressing the ``impossible tuning'' issue incur an excess $\sqrt{\log T}$ factor in their regret bound compared to the lower bound. To solve this problem, we propose a novel optimistic online mirror descent algorithm with an auxiliary initial round using large learning rates. This design enables a refined analysis where a generated negative term cancels the gap-related factor, resolving the impossible tuning issue up to $\log\log T$ factors. Leveraging our improved algorithm as a meta-algorithm, we develop the first UOL algorithm that simultaneously achieves state-of-the-art GV bounds and LA under standard assumptions. Our UOL result overcomes key limitations of prior works, notably resolving the conflict between LA mechanisms and regret analysis for GV bounds -- an open problem highlighted by Xie et al.
△ Less
Submitted 27 May, 2025;
originally announced May 2025.
-
Backward Compatibility in Attributive Explanation and Enhanced Model Training Method
Authors:
Ryuta Matsuno
Abstract:
Model update is a crucial process in the operation of ML/AI systems. While updating a model generally enhances the average prediction performance, it also significantly impacts the explanations of predictions. In real-world applications, even minor changes in explanations can have detrimental consequences. To tackle this issue, this paper introduces BCX, a quantitative metric that evaluates the ba…
▽ More
Model update is a crucial process in the operation of ML/AI systems. While updating a model generally enhances the average prediction performance, it also significantly impacts the explanations of predictions. In real-world applications, even minor changes in explanations can have detrimental consequences. To tackle this issue, this paper introduces BCX, a quantitative metric that evaluates the backward compatibility of feature attribution explanations between pre- and post-update models. BCX utilizes practical agreement metrics to calculate the average agreement between the explanations of pre- and post-update models, specifically among samples on which both models accurately predict. In addition, we propose BCXR, a BCX-aware model training method by designing surrogate losses which theoretically lower bounds agreement scores. Furthermore, we present a universal variant of BCXR that improves all agreement metrics, utilizing L2 distance among the explanations of the models. To validate our approach, we conducted experiments on eight real-world datasets, demonstrating that BCXR achieves superior trade-offs between predictive performances and BCX scores, showcasing the effectiveness of our BCXR methods.
△ Less
Submitted 5 August, 2024;
originally announced August 2024.
-
Improved mixing time for k-subgraph sampling
Authors:
Ryuta Matsuno,
Aristides Gionis
Abstract:
Understanding the local structure of a graph provides valuable insights about the underlying phenomena from which the graph has originated. Sampling and examining k-subgraphs is a widely used approach to understand the local structure of a graph. In this paper, we study the problem of sampling uniformly k-subgraphs from a given graph. We analyze a few different Markov chain Monte Carlo (MCMC) appr…
▽ More
Understanding the local structure of a graph provides valuable insights about the underlying phenomena from which the graph has originated. Sampling and examining k-subgraphs is a widely used approach to understand the local structure of a graph. In this paper, we study the problem of sampling uniformly k-subgraphs from a given graph. We analyze a few different Markov chain Monte Carlo (MCMC) approaches, and obtain analytical results on their mixing times, which improve significantly the state of the art. In particular, we improve the bound on the mixing times of the standard MCMC approach, and the state-of-the-art MCMC sampling method PSRW, using the canonical-paths argument. In addition, we propose a novel sampling method, which we call recursive subgraph sampling, RSS, and its optimized variant RSS+. The proposed methods, RSS and RSS+, are significantly faster than existing approaches.
△ Less
Submitted 26 January, 2020;
originally announced January 2020.