Showing 1–4 of 4 results for author: Takezawa, Y

Searching in archive math.
  1. arXiv:2506.05791  [pdf, ps, other]

    cs.LG math.OC

    Exploiting Similarity for Computation and Communication-Efficient Decentralized Optimization

    Authors: Yuki Takezawa, Xiaowen Jiang, Anton Rodomanov, Sebastian U. Stich

    Abstract: Reducing communication complexity is critical for efficient decentralized optimization. The proximal decentralized optimization (PDO) framework is particularly appealing, as methods within this framework can exploit functional similarity among nodes to reduce communication rounds. Specifically, when local functions at different nodes are similar, these methods achieve faster convergence with fewer…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: ICML 2025

  2. arXiv:2501.15259  [pdf, other]

    cs.LG math.OC stat.ML

    Scalable Decentralized Learning with Teleportation

    Authors: Yuki Takezawa, Sebastian U. Stich

    Abstract: Decentralized SGD can run with low communication costs, but its sparse communication characteristics deteriorate the convergence rate, especially when the number of nodes is large. In decentralized learning settings, communication is assumed to occur on only a given topology, while in many practical cases, the topology merely represents a preferred communication pattern, and connecting to arbitrar…

    Submitted 27 February, 2025; v1 submitted 25 January, 2025; originally announced January 2025.

    Comments: ICLR 2025

  3. arXiv:2405.15010  [pdf, other]

    cs.LG math.OC

    Parameter-free Clipped Gradient Descent Meets Polyak

    Authors: Yuki Takezawa, Han Bao, Ryoma Sato, Kenta Niwa, Makoto Yamada

    Abstract: Gradient descent and its variants are de facto standard algorithms for training machine learning models. As gradient descent is sensitive to its hyperparameters, we need to tune the hyperparameters carefully using a grid search. However, the method is time-consuming, particularly when multiple hyperparameters exist. Therefore, recent studies have analyzed parameter-free methods that adjust the hyp…

    Submitted 31 October, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: NeurIPS 2024

  4. arXiv:2205.11979  [pdf, other]

    math.OC cs.LG

    Theoretical Analysis of Primal-Dual Algorithm for Non-Convex Stochastic Decentralized Optimization

    Authors: Yuki Takezawa, Kenta Niwa, Makoto Yamada

    Abstract: In recent years, decentralized learning has emerged as a powerful tool not only for large-scale machine learning, but also for preserving privacy. One of the key challenges in decentralized learning is that the data distribution held by each node is statistically heterogeneous. To address this challenge, the primal-dual algorithm called the Edge-Consensus Learning (ECL) was proposed and was experi…

    Submitted 22 September, 2022; v1 submitted 23 May, 2022; originally announced May 2022.