Showing 1–26 of 26 results for author: Imaizumi, M

  1. arXiv:2505.13570  [pdf, ps, other]

    math.ST stat.ML

    Minimax Rates of Estimation for Optimal Transport Map between Infinite-Dimensional Spaces

    Authors: Donlapark Ponnoprat, Masaaki Imaizumi

    Abstract: We investigate the estimation of an optimal transport map between probability measures on an infinite-dimensional space and reveal its minimax optimal rate. Optimal transport theory defines distances within a space of probability measures, utilizing an optimal transport map as its key component. Estimating the optimal transport map from samples finds several applications, such as simulating dynami…

    Submitted 27 May, 2025; v1 submitted 19 May, 2025; originally announced May 2025.

    Comments: 60 pages, 5 figures

    MSC Class: 62G05

  2. arXiv:2505.04898  [pdf, other]

    cs.LG cs.AI math.OC math.ST stat.ML

    Precise gradient descent training dynamics for finite-width multi-layer neural networks

    Authors: Qiyang Han, Masaaki Imaizumi

    Abstract: In this paper, we provide the first precise distributional characterization of gradient descent iterates for general multi-layer neural networks under the canonical single-index regression model, in the 'finite-width proportional regime' where the sample size and feature dimension grow proportionally while the network width and depth remain bounded. Our non-asymptotic state evolution theory captur…

    Submitted 7 May, 2025; originally announced May 2025.

  3. arXiv:2502.11467  [pdf, other]

    cs.LG math.FA

    Approximation of Permutation Invariant Polynomials by Transformers: Efficient Construction in Column-Size

    Authors: Naoki Takeshita, Masaaki Imaizumi

    Abstract: Transformers are a type of neural network that have demonstrated remarkable performance across various domains, particularly in natural language processing tasks. Motivated by this success, research on the theoretical understanding of transformers has garnered significant attention. A notable example is the mathematical analysis of their approximation power, which validates the empirical expressiv…

    Submitted 17 February, 2025; originally announced February 2025.

    Comments: 29 pages

  4. arXiv:2410.19244  [pdf, ps, other]

    math.ST

    Universality of Estimator for High-Dimensional Linear Models with Block Dependency

    Authors: Toshiki Tsuda, Masaaki Imaizumi

    Abstract: We study the universality property of estimators for high-dimensional linear models, under which the distribution of the estimators does not depend on whether the covariates follow a Gaussian distribution. Recent high-dimensional statistics require covariates to strictly follow a Gaussian distribution to reveal precise properties of estimators. To relax the Gaussianity requirement, the existing literatur…

    Submitted 24 October, 2024; originally announced October 2024.

    Comments: 35 pages

  5. arXiv:2410.08709  [pdf, other]

    cs.LG math.NA stat.ML

    Distillation of Discrete Diffusion through Dimensional Correlations

    Authors: Satoshi Hayakawa, Yuhta Takida, Masaaki Imaizumi, Hiromi Wakaki, Yuki Mitsufuji

    Abstract: Diffusion models have demonstrated exceptional performances in various fields of generative modeling, but suffer from slow sampling speed due to their iterative nature. While this issue is being addressed in continuous domains, discrete diffusion models face unique challenges, particularly in capturing dependencies between elements (e.g., pixel relationships in image, sequential dependencies in la…

    Submitted 8 May, 2025; v1 submitted 11 October, 2024; originally announced October 2024.

    Comments: 39 pages, ICML 2025 accepted

  6. arXiv:2404.17812  [pdf, other]

    math.ST stat.ME

    High-Dimensional Single-Index Models: Link Estimation and Marginal Inference

    Authors: Kazuma Sawaya, Yoshimasa Uematsu, Masaaki Imaizumi

    Abstract: This study proposes a novel method for estimation and hypothesis testing in high-dimensional single-index models. We address a common scenario where the sample size and the dimension of regression coefficients are large and comparable. Unlike traditional approaches, which often overlook the estimation of the unknown link function, we introduce a new method for link function estimation. Leveraging…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 42 pages

  7. Uniform Confidence Band for Optimal Transport Map on One-Dimensional Data

    Authors: Donlapark Ponnoprat, Ryo Okano, Masaaki Imaizumi

    Abstract: We develop a statistical inference method for an optimal transport map between distributions on real numbers with uniform confidence bands. The concept of optimal transport (OT) is used to measure distances between distributions, and OT maps are used to construct the distance. OT has been applied in many fields in recent years, and its statistical properties have attracted much interest. In partic…

    Submitted 15 February, 2024; v1 submitted 18 July, 2023; originally announced July 2023.

    Comments: 37 pages

    MSC Class: 62G15; 62G20

    Journal ref: Electronic Journal of Statistics 2024, Vol. 18, No. 1, 515-552

  8. arXiv:2307.06137  [pdf, other]

    stat.ME math.ST

    Distribution-on-Distribution Regression with Wasserstein Metric: Multivariate Gaussian Case

    Authors: Ryo Okano, Masaaki Imaizumi

    Abstract: Distribution data refers to a data set where each sample is represented as a probability distribution, a subject area receiving burgeoning interest in the field of statistics. Although several studies have developed distribution-to-distribution regression models for univariate variables, the multivariate scenario remains under-explored due to technical complexities. In this study, we introduce mod…

    Submitted 8 February, 2024; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: 34 pages

  9. arXiv:2305.17731  [pdf, other]

    math.ST

    Moment-Based Adjustments of Statistical Inference in High-Dimensional Generalized Linear Models

    Authors: Kazuma Sawaya, Yoshimasa Uematsu, Masaaki Imaizumi

    Abstract: We developed a statistical inference method applicable to a broad range of generalized linear models (GLMs) in high-dimensional settings, where the number of unknown coefficients scales proportionally with the sample size. Although a pioneering inference method has been developed for logistic regression, which is a specific instance of GLMs, we cannot apply this method directly to other GLMs becau…

    Submitted 23 May, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: 33 pages

  10. arXiv:2305.15754  [pdf, other]

    math.ST stat.ME stat.ML

    Bayesian Analysis for Over-parameterized Linear Model via Effective Spectra

    Authors: Tomoya Wakayama, Masaaki Imaizumi

    Abstract: In high-dimensional Bayesian statistics, various methods have been developed, including prior distributions that induce parameter sparsity to handle many parameters. Yet, these approaches often overlook the rich spectral structure of the covariate matrix, which can be crucial when true signals are not sparse. To address this gap, we introduce a data-adaptive Gaussian prior whose covariance is alig…

    Submitted 5 May, 2025; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 46 pages

  11. arXiv:2304.04037  [pdf, other]

    math.ST

    Benign Overfitting of Non-Sparse High-Dimensional Linear Regression with Correlated Noise

    Authors: Toshiki Tsuda, Masaaki Imaizumi

    Abstract: We investigate the high-dimensional linear regression problem in the presence of noise correlated with Gaussian covariates. This correlation, known as endogeneity in regression models, often arises from unobserved variables and other factors. It has been a major challenge in causal inference and econometrics. When the covariates are high-dimensional, it has been common to assume sparsity on the tr…

    Submitted 20 October, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

    Comments: 73 pages

  12. arXiv:2302.02988  [pdf, other]

    cs.LG econ.EM math.ST stat.ME stat.ML

    Asymptotically Optimal Fixed-Budget Best Arm Identification with Variance-Dependent Bounds

    Authors: Masahiro Kato, Masaaki Imaizumi, Takuya Ishihara, Toru Kitagawa

    Abstract: We investigate the problem of fixed-budget best arm identification (BAI) for minimizing expected simple regret. In an adaptive experiment, a decision maker draws one of multiple treatment arms based on past observations and observes the outcome of the drawn arm. After the experiment, the decision maker recommends the treatment arm with the highest expected outcome. We evaluate the decision based o…

    Submitted 12 July, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

  13. arXiv:2210.09756  [pdf, ps, other]

    math.ST

    Dimension-free Bounds for Sum of Dependent Matrices and Operators with Heavy-Tailed Distribution

    Authors: Shogo Nakakita, Pierre Alquier, Masaaki Imaizumi

    Abstract: We prove deviation inequalities for sums of high-dimensional random matrices and operators with dependence and heavy tails. Estimation of high-dimensional matrices is a concern for numerous modern applications. However, most results are stated for independent observations. Therefore, it is critical to derive results for dependent and heavy-tailed matrices. In this paper, we derive a dimensio…

    Submitted 25 June, 2025; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: 40 pages in total

    Journal ref: Electron. J. Statist. 18(1): 1130-1159 (2024)

  14. arXiv:2209.07330  [pdf, other]

    cs.LG econ.EM math.ST stat.ME stat.ML

    Best Arm Identification with Contextual Information under a Small Gap

    Authors: Masahiro Kato, Masaaki Imaizumi, Takuya Ishihara, Toru Kitagawa

    Abstract: We study the best-arm identification (BAI) problem with a fixed budget and contextual (covariate) information. In each round of an adaptive experiment, after observing contextual information, we choose a treatment arm using past observations and current context. Our goal is to identify the best treatment arm, which is a treatment arm with the maximal expected reward marginalized over the contextua…

    Submitted 4 January, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: For the sake of completeness, we show a part of the results of Kato et al. (arXiv:2201.04469). arXiv admin note: text overlap with arXiv:2201.04469

  15. arXiv:2204.08369  [pdf, ps, other]

    math.ST stat.ML

    Benign Overfitting in Time Series Linear Models with Over-Parameterization

    Authors: Shogo Nakakita, Masaaki Imaizumi

    Abstract: The success of large-scale models in recent years has increased the importance of statistical models with numerous parameters. Several studies have analyzed over-parameterized linear models with high-dimensional data, which may not be sparse; however, existing results rely on the assumption of sample independence. In this study, we analyze a linear regression model with dependent time-series data…

    Submitted 13 March, 2025; v1 submitted 18 April, 2022; originally announced April 2022.

    Comments: Accepted at Bernoulli

  16. arXiv:2202.05245  [pdf, ps, other]

    econ.EM cs.LG math.ST stat.ML

    Benign-Overfitting in Conditional Average Treatment Effect Prediction with Linear Regression

    Authors: Masahiro Kato, Masaaki Imaizumi

    Abstract: We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE), with linear regression models. With the development of machine learning for causal inference, a wide range of large-scale models for causality are gaining attention. One problem is that suspicions have been raised that the large-scale models are prone to overfitting to observations with sampl…

    Submitted 11 February, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: arXiv admin note: text overlap with arXiv:1906.11300 by other authors

  17. arXiv:2201.04469  [pdf, other]

    stat.ML cs.LG econ.EM math.ST

    Optimal Best Arm Identification in Two-Armed Bandits with a Fixed Budget under a Small Gap

    Authors: Masahiro Kato, Kaito Ariu, Masaaki Imaizumi, Masahiro Nomura, Chao Qin

    Abstract: We consider fixed-budget best-arm identification in two-armed Gaussian bandit problems. One of the longstanding open questions is the existence of an optimal strategy under which the probability of misidentification matches a lower bound. We show that a strategy following the Neyman allocation rule (Neyman, 1934) is asymptotically optimal when the gap between the expected rewards is small. First,…

    Submitted 28 December, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

  18. arXiv:2112.00213  [pdf, other]

    math.ST stat.ML

    Minimax Analysis for Inverse Risk in Nonparametric Planer Invertible Regression

    Authors: Akifumi Okuno, Masaaki Imaizumi

    Abstract: We study a minimax risk of estimating inverse functions on a plane, while keeping the estimator itself invertible. Learning invertibility from data and exploiting an invertible estimator are used in many domains, such as statistics, econometrics, and machine learning. Although the consistency and universality of invertible estimators have been well investigated, analysis of the efficiency of these…

    Submitted 25 December, 2023; v1 submitted 30 November, 2021; originally announced December 2021.

    Comments: 34 pages, 34 figures, accepted to Electronic Journal of Statistics

  19. arXiv:2111.04004  [pdf, other]

    cs.LG math.OC

    Exponential escape efficiency of SGD from sharp minima in non-stationary regime

    Authors: Hikaru Ibayashi, Masaaki Imaizumi

    Abstract: We show that stochastic gradient descent (SGD) escapes from sharp minima exponentially fast even before SGD reaches stationary distribution. SGD has been a de-facto standard training algorithm for various machine learning tasks. However, there still exists an open question as to why SGDs find highly generalizable parameters from non-convex target functions, such as the loss function of neural netw…

    Submitted 18 March, 2022; v1 submitted 7 November, 2021; originally announced November 2021.

  20. arXiv:2104.02978  [pdf, other]

    math.ST

    Fast Convergence on Perfect Classification for Functional Data

    Authors: Tomoya Wakayama, Masaaki Imaizumi

    Abstract: We investigate the availability of approaching perfect classification on functional data with finite samples. The seminal work (Delaigle and Hall (2012)) showed that perfect classification for functional data is easier to achieve than for finite-dimensional data. This result is based on their finding that a sufficient condition for the existence of a perfect classifier, named a Delaigle--Hall cond…

    Submitted 7 January, 2023; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: 32 pages, accepted by Statistica Sinica

  21. arXiv:2103.00500  [pdf, other]

    stat.ML cs.LG math.ST

    Asymptotic Risk of Overparameterized Likelihood Models: Double Descent Theory for Deep Neural Networks

    Authors: Ryumei Nakada, Masaaki Imaizumi

    Abstract: We investigate the asymptotic risk of a general class of overparameterized likelihood models, including deep models. The recent empirical success of large-scale models has motivated several theoretical studies to investigate a scenario wherein both the number of samples, $n$, and parameters, $p$, diverge to infinity and derive an asymptotic risk at the limit. However, these theorems are only valid…

    Submitted 15 March, 2021; v1 submitted 28 February, 2021; originally announced March 2021.

    Comments: 36 pages

  22. arXiv:2102.02981  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency

    Authors: Masatoshi Uehara, Masaaki Imaizumi, Nan Jiang, Nathan Kallus, Wen Sun, Tengyang Xie

    Abstract: We offer a theoretical characterization of off-policy evaluation (OPE) in reinforcement learning using function approximation for marginal importance weights and $q$-functions when these are estimated using recent minimax methods. Under various combinations of realizability and completeness assumptions, we show that the minimax approach enables us to achieve a fast rate of convergence for weights…

    Submitted 24 July, 2022; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: Under Review

  23. arXiv:2012.15678  [pdf, ps, other]

    math.ST

    On Gaussian Approximation for M-Estimator

    Authors: Masaaki Imaizumi, Taisuke Otsu

    Abstract: This study develops a non-asymptotic Gaussian approximation theory for distributions of M-estimators, which are defined as maximizers of empirical criterion functions. In existing mathematical statistics literature, numerous studies have focused on approximating the distributions of the M-estimators for statistical inference. In contrast to the existing approaches, which mainly focus on limiting b…

    Submitted 2 January, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: 48 pages

  24. arXiv:1910.07773  [pdf, other]

    math.ST

    Hypothesis Test and Confidence Analysis with Wasserstein Distance on General Dimension

    Authors: Masaaki Imaizumi, Hirofumi Ota, Takuo Hamaguchi

    Abstract: We develop a general framework for statistical inference with the 1-Wasserstein distance. Recently, the Wasserstein distance has attracted considerable attention and has been widely applied to various machine learning tasks because of its excellent properties. However, hypothesis tests and a confidence analysis for the Wasserstein distance have not been established in a general multivariate settin…

    Submitted 15 February, 2022; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: 36 pages

  25. arXiv:1612.07490  [pdf, other]

    math.ST

    A simple method to construct confidence bands in functional linear regression

    Authors: Masaaki Imaizumi, Kengo Kato

    Abstract: This paper develops a simple method to construct confidence bands, centered at a principal component analysis (PCA) based estimator, for the slope function in a functional linear regression model with a scalar response variable and a functional predictor variable. The PCA-based estimator is a series estimator with estimated basis functions, and so construction of valid confidence bands for it is a…

    Submitted 1 May, 2017; v1 submitted 22 December, 2016; originally announced December 2016.

    Comments: 29 pages

  26. arXiv:1609.00286  [pdf, other]

    math.ST

    PCA-based estimation for functional linear regression with functional responses

    Authors: Masaaki Imaizumi, Kengo Kato

    Abstract: This paper studies a regression model where both predictor and response variables are random functions. We consider a functional linear model where the conditional mean of the response variable at each time point is given by a linear functional of the predictor variable. In this paper, we are interested in estimation of the integral kernel $b(s,t)$ of the conditional expectation operator, where…

    Submitted 22 March, 2017; v1 submitted 1 September, 2016; originally announced September 2016.

    Comments: 30 pages