Showing 1–8 of 8 results for author: Fujisawa, M

  1. arXiv:2505.19470  [pdf, ps, other]

    stat.ML cs.LG

    Information-theoretic Generalization Analysis for VQ-VAEs: A Role of Latent Variables

    Authors: Futoshi Futami, Masahiro Fujisawa

    Abstract: Latent variables (LVs) play a crucial role in encoder-decoder models by enabling effective data compression, prediction, and generation. Although their theoretical properties, such as generalization, have been extensively studied in supervised learning, similar analyses for unsupervised models such as variational autoencoders (VAEs) remain underexplored. In this work, we extend info…

    Submitted 25 May, 2025; originally announced May 2025.

  2. arXiv:2505.17859  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Scalable Valuation of Human Feedback through Provably Robust Model Alignment

    Authors: Masahiro Fujisawa, Masaki Adachi, Michael A. Osborne

    Abstract: Despite the importance of aligning language models with human preferences, crowd-sourced human feedback is often noisy -- for example, preferring less desirable responses -- posing a fundamental challenge to alignment. A truly robust alignment objective should yield identical model parameters even under severe label noise, a property known as redescending. We prove that no existing alignment metho…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: 38 pages, 7 figures

  3. arXiv:2503.06079  [pdf, ps, other]

    stat.ML cs.LG

    Fixing the Pitfalls of Probabilistic Time-Series Forecasting Evaluation by Kernel Quadrature

    Authors: Masaki Adachi, Masahiro Fujisawa, Michael A Osborne

    Abstract: Despite the significance of probabilistic time-series forecasting models, their evaluation metrics often involve intractable integrations. The most widely used metric, the continuous ranked probability score (CRPS), is a strictly proper scoring function; however, its computation requires approximation. We found that popular CRPS estimators--specifically, the quantile-based estimator implemented in…

    Submitted 23 July, 2025; v1 submitted 8 March, 2025; originally announced March 2025.

    Comments: 11 pages, 6 figures

    MSC Class: 62C10; 62F15

  4. arXiv:2406.06227  [pdf, ps, other]

    cs.LG stat.ML

    PAC-Bayes Analysis for Recalibration in Classification

    Authors: Masahiro Fujisawa, Futoshi Futami

    Abstract: Nonparametric estimation using uniform-width binning is a standard approach for evaluating the calibration performance of machine learning models. However, existing theoretical analyses of the bias induced by binning are limited to binary classification, creating a significant gap with practical applications such as multiclass classification. Additionally, many parametric recalibration algorithms…

    Submitted 10 July, 2025; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted by the 42nd International Conference on Machine Learning (ICML2025), 38 pages, 8 figures

  5. arXiv:2405.15709  [pdf, other]

    cs.LG math.ST stat.ML

    Information-theoretic Generalization Analysis for Expected Calibration Error

    Authors: Futoshi Futami, Masahiro Fujisawa

    Abstract: While the expected calibration error (ECE), which employs binning, is widely adopted to evaluate the calibration performance of machine learning models, theoretical understanding of its estimation bias is limited. In this paper, we present the first comprehensive analysis of the estimation bias in the two common binning strategies, uniform mass and uniform width binning. Our analysis establishes u…

    Submitted 26 May, 2025; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted by the 38th Conference on Neural Information Processing Systems (NeurIPS2024), 52 pages, 6 figures

  6. arXiv:2311.01046  [pdf, ps, other]

    cs.LG stat.ML

    Time-Independent Information-Theoretic Generalization Bounds for SGLD

    Authors: Futoshi Futami, Masahiro Fujisawa

    Abstract: We provide novel information-theoretic generalization bounds for stochastic gradient Langevin dynamics (SGLD) under the assumptions of smoothness and dissipativity, which are widely used in sampling and non-convex optimization studies. Our bounds are time-independent and decay to zero as the sample size increases, regardless of the number of iterations and whether the step size is fixed. Unlike pr…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: Accepted by the Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS2023), 29 pages

  7. arXiv:2006.07571  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    $γ$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator

    Authors: Masahiro Fujisawa, Takeshi Teshima, Issei Sato, Masashi Sugiyama

    Abstract: Approximate Bayesian computation (ABC) is a likelihood-free inference method that has been employed in various applications. However, ABC can be sensitive to outliers if a data discrepancy measure is chosen inappropriately. In this paper, we propose to use a nearest-neighbor-based $γ$-divergence estimator as a data discrepancy measure. We show that our estimator possesses a suitable theoretical ro…

    Submitted 5 March, 2021; v1 submitted 13 June, 2020; originally announced June 2020.

    Comments: The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021); 48 pages, 22 figures

  8. arXiv:1902.00468  [pdf, other]

    stat.ML cs.LG

    Multilevel Monte Carlo Variational Inference

    Authors: Masahiro Fujisawa, Issei Sato

    Abstract: We propose a variance reduction framework for variational inference using the Multilevel Monte Carlo (MLMC) method. Our framework is built on reparameterized gradient estimators and "recycles" parameters obtained from past update history in optimization. In addition, our framework provides a new optimization algorithm based on stochastic gradient descent (SGD) that adaptively estimates the sample…

    Submitted 2 December, 2021; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: 44 pages, 10 figures; Journal of Machine Learning Research (JMLR)