Showing 1–40 of 40 results for author: Imaizumi, M

  1. arXiv:2506.00846  [pdf, ps, other]

    cs.LG stat.ML

    Infinite-Width Limit of a Single Attention Layer: Analysis via Tensor Programs

    Authors: Mana Sakai, Ryo Karakida, Masaaki Imaizumi

    Abstract: In modern theoretical analyses of neural networks, the infinite-width limit is often invoked to justify Gaussian approximations of neuron preactivations (e.g., via neural network Gaussian processes or Tensor Programs). However, these Gaussian-based asymptotic theories have so far been unable to capture the behavior of attention layers, except under special regimes such as infinitely many heads or…

    Submitted 1 June, 2025; originally announced June 2025.

  2. arXiv:2505.14102  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    High-dimensional Nonparametric Contextual Bandit Problem

    Authors: Shogo Iwazaki, Junpei Komiyama, Masaaki Imaizumi

    Abstract: We consider the kernelized contextual bandit problem with a large feature space. This problem involves $K$ arms, and the goal of the forecaster is to maximize the cumulative rewards through learning the relationship between the contexts and the rewards. It serves as a general framework for various decision-making scenarios, such as personalized online advertising and recommendation systems. Kernel…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: 38 pages

  3. arXiv:2505.13570  [pdf, ps, other]

    math.ST stat.ML

    Minimax Rates of Estimation for Optimal Transport Map between Infinite-Dimensional Spaces

    Authors: Donlapark Ponnoprat, Masaaki Imaizumi

    Abstract: We investigate the estimation of an optimal transport map between probability measures on an infinite-dimensional space and reveal its minimax optimal rate. Optimal transport theory defines distances within a space of probability measures, utilizing an optimal transport map as its key component. Estimating the optimal transport map from samples finds several applications, such as simulating dynami…

    Submitted 27 May, 2025; v1 submitted 19 May, 2025; originally announced May 2025.

    Comments: 60 pages, 5 figures

    MSC Class: 62G05

  4. arXiv:2505.04898  [pdf, other]

    cs.LG cs.AI math.OC math.ST stat.ML

    Precise gradient descent training dynamics for finite-width multi-layer neural networks

    Authors: Qiyang Han, Masaaki Imaizumi

    Abstract: In this paper, we provide the first precise distributional characterization of gradient descent iterates for general multi-layer neural networks under the canonical single-index regression model, in the "finite-width proportional regime" where the sample size and feature dimension grow proportionally while the network width and depth remain bounded. Our non-asymptotic state evolution theory captur…

    Submitted 7 May, 2025; originally announced May 2025.

  5. arXiv:2503.23642  [pdf, other]

    stat.ML cs.LG

    Learning a Single Index Model from Anisotropic Data with vanilla Stochastic Gradient Descent

    Authors: Guillaume Braun, Minh Ha Quang, Masaaki Imaizumi

    Abstract: We investigate the problem of learning a Single Index Model (SIM), a popular model for studying the ability of neural networks to learn features, from anisotropic Gaussian inputs by training a neuron using vanilla Stochastic Gradient Descent (SGD). While the isotropic case has been extensively studied, the anisotropic case has received less attention and the impact of the covariance matrix on the…

    Submitted 30 March, 2025; originally announced March 2025.

    Journal ref: Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS) 2025

  6. arXiv:2503.06393  [pdf, ps, other]

    nlin.CD cond-mat.dis-nn physics.data-an stat.CO

    Landscape computations for the edge of chaos in nonlinear dynamical systems

    Authors: Motoki Nakata, Masaaki Imaizumi

    Abstract: We propose a stochastic sampling approach to identify stability boundaries in general dynamical systems. The global landscape of the Lyapunov exponent in multi-dimensional parameter space provides transition boundaries for stable/unstable trajectories, i.e., the edge of chaos. Despite its usefulness, it is generally difficult to derive analytically. In this study, we reveal the transition boundaries b…

    Submitted 8 March, 2025; originally announced March 2025.

    Comments: 9 pages, 5 figures

    Report number: RIKEN-iTHEMS-Report-24

  7. arXiv:2411.01161  [pdf, other]

    stat.ML cs.CR cs.LG

    Federated Learning with Relative Fairness

    Authors: Shogo Nakakita, Tatsuya Kaneko, Shinya Takamaeda-Yamazaki, Masaaki Imaizumi

    Abstract: This paper proposes a federated learning framework designed to achieve relative fairness for clients. Traditional federated learning frameworks typically ensure absolute fairness by guaranteeing minimum performance across all client subgroups. However, this approach overlooks disparities in model performance between subgroups. The proposed framework uses a minimax problem approach to mini…

    Submitted 2 November, 2024; originally announced November 2024.

    Comments: 43 pages

  8. arXiv:2410.08709  [pdf, other]

    cs.LG math.NA stat.ML

    Distillation of Discrete Diffusion through Dimensional Correlations

    Authors: Satoshi Hayakawa, Yuhta Takida, Masaaki Imaizumi, Hiromi Wakaki, Yuki Mitsufuji

    Abstract: Diffusion models have demonstrated exceptional performance in various fields of generative modeling, but suffer from slow sampling speed due to their iterative nature. While this issue is being addressed in continuous domains, discrete diffusion models face unique challenges, particularly in capturing dependencies between elements (e.g., pixel relationships in images, sequential dependencies in la…

    Submitted 8 May, 2025; v1 submitted 11 October, 2024; originally announced October 2024.

    Comments: 39 pages, ICML 2025 accepted

  9. arXiv:2407.08228  [pdf, ps, other]

    stat.ME

    Wasserstein $k$-Centers Clustering for Distributional Data

    Authors: Ryo Okano, Masaaki Imaizumi

    Abstract: We develop a novel clustering method for distributional data, where each data point is regarded as a probability distribution on the real line. For distributional data, it has been challenging to develop a clustering method that utilizes modes of variation of the data because the space of probability distributions lacks a vector space structure, preventing the application of existing methods devis…

    Submitted 21 June, 2025; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: 40 pages

  10. arXiv:2406.16032  [pdf, other]

    stat.ML cs.LG

    Effect of Random Learning Rate: Theoretical Analysis of SGD Dynamics in Non-Convex Optimization via Stationary Distribution

    Authors: Naoki Yoshida, Shogo Nakakita, Masaaki Imaizumi

    Abstract: We consider a variant of the stochastic gradient descent (SGD) with a random learning rate and reveal its convergence properties. SGD is a widely used stochastic optimization algorithm in machine learning, especially deep learning. Numerous studies reveal the convergence properties of SGD and its simplified variants. Among these, the analysis of convergence using a stationary distribution of updat…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 28 pages

  11. arXiv:2405.16819  [pdf, other]

    cs.LG stat.ML

    Automatic Domain Adaptation by Transformers in In-Context Learning

    Authors: Ryuichiro Hataya, Kota Matsui, Masaaki Imaizumi

    Abstract: Selecting or designing an appropriate domain adaptation algorithm for a given problem remains challenging. This paper presents a Transformer model that can provably approximate and opt for domain adaptation methods for a given dataset in the in-context learning framework, where a foundation model performs new tasks without updating its parameters at test time. Specifically, we prove that Transform…

    Submitted 27 May, 2024; originally announced May 2024.

  12. arXiv:2404.17812  [pdf, other]

    math.ST stat.ME

    High-Dimensional Single-Index Models: Link Estimation and Marginal Inference

    Authors: Kazuma Sawaya, Yoshimasa Uematsu, Masaaki Imaizumi

    Abstract: This study proposes a novel method for estimation and hypothesis testing in high-dimensional single-index models. We address a common scenario where the sample size and the dimension of regression coefficients are large and comparable. Unlike traditional approaches, which often overlook the estimation of the unknown link function, we introduce a new method for link function estimation. Leveraging…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 42 pages

  13. arXiv:2401.17269  [pdf, other]

    stat.ML cs.LG

    Effect of Weight Quantization on Learning Models by Typical Case Analysis

    Authors: Shuhei Kashiwamura, Ayaka Sakata, Masaaki Imaizumi

    Abstract: This paper examines the quantization methods used in large-scale data analysis models and their hyperparameter choices. The recent surge in data analysis scale has significantly increased computational resource requirements. To address this, quantizing model weights has become a prevalent practice in data analysis applications such as deep learning. Quantization is particularly vital for deploying…

    Submitted 30 January, 2024; originally announced January 2024.

  14. arXiv:2310.16819  [pdf, other]

    econ.EM cs.LG stat.AP stat.ME stat.ML

    CATE Lasso: Conditional Average Treatment Effect Estimation with High-Dimensional Linear Regression

    Authors: Masahiro Kato, Masaaki Imaizumi

    Abstract: In causal inference about two treatments, Conditional Average Treatment Effects (CATEs) play an important role as a quantity representing an individualized causal effect, defined as a difference between the expected outcomes of the two treatments conditioned on covariates. This study assumes two linear regression models between a potential outcome and covariates of the two treatments and defines C…

    Submitted 25 October, 2023; originally announced October 2023.

  15. arXiv:2307.06137  [pdf, other]

    stat.ME math.ST

    Distribution-on-Distribution Regression with Wasserstein Metric: Multivariate Gaussian Case

    Authors: Ryo Okano, Masaaki Imaizumi

    Abstract: Distribution data refers to a data set where each sample is represented as a probability distribution, a subject area receiving burgeoning interest in the field of statistics. Although several studies have developed distribution-to-distribution regression models for univariate variables, the multivariate scenario remains under-explored due to technical complexities. In this study, we introduce mod…

    Submitted 8 February, 2024; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: 34 pages

  16. arXiv:2307.04042  [pdf, other]

    stat.ML cs.LG

    Sup-Norm Convergence of Deep Neural Network Estimator for Nonparametric Regression by Adversarial Training

    Authors: Masaaki Imaizumi

    Abstract: We show the sup-norm convergence of deep neural network estimators with a novel adversarial training scheme. For the nonparametric regression problem, it has been shown that an estimator using deep neural networks can achieve better performance in the sense of the $L^2$-norm. In contrast, it is difficult for the neural estimator with least-squares to achieve the sup-norm convergence, due to the de…

    Submitted 8 July, 2023; originally announced July 2023.

    Comments: 38 pages

  17. arXiv:2306.11017  [pdf, ps, other]

    stat.ML cs.LG

    High-dimensional Contextual Bandit Problem without Sparsity

    Authors: Junpei Komiyama, Masaaki Imaizumi

    Abstract: In this research, we investigate the high-dimensional linear contextual bandit problem where the number of features $p$ is greater than the budget $T$, or it may even be infinite. Differing from the majority of previous works in this field, we do not impose sparsity on the regression coefficients. Instead, we rely on recent findings on overparameterized models, which enable us to analyze the perf…

    Submitted 25 June, 2025; v1 submitted 19 June, 2023; originally announced June 2023.

    Journal ref: Advances in Neural Information Processing Systems, 36, 49416-49427, 2023

  18. arXiv:2305.15754  [pdf, other]

    math.ST stat.ME stat.ML

    Bayesian Analysis for Over-parameterized Linear Model via Effective Spectra

    Authors: Tomoya Wakayama, Masaaki Imaizumi

    Abstract: In high-dimensional Bayesian statistics, various methods have been developed, including prior distributions that induce parameter sparsity to handle many parameters. Yet, these approaches often overlook the rich spectral structure of the covariate matrix, which can be crucial when true signals are not sparse. To address this gap, we introduce a data-adaptive Gaussian prior whose covariance is alig…

    Submitted 5 May, 2025; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 46 pages

  19. arXiv:2302.02988  [pdf, other]

    cs.LG econ.EM math.ST stat.ME stat.ML

    Asymptotically Optimal Fixed-Budget Best Arm Identification with Variance-Dependent Bounds

    Authors: Masahiro Kato, Masaaki Imaizumi, Takuya Ishihara, Toru Kitagawa

    Abstract: We investigate the problem of fixed-budget best arm identification (BAI) for minimizing expected simple regret. In an adaptive experiment, a decision maker draws one of multiple treatment arms based on past observations and observes the outcome of the drawn arm. After the experiment, the decision maker recommends the treatment arm with the highest expected outcome. We evaluate the decision based o…

    Submitted 12 July, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

  20. arXiv:2209.07330  [pdf, other]

    cs.LG econ.EM math.ST stat.ME stat.ML

    Best Arm Identification with Contextual Information under a Small Gap

    Authors: Masahiro Kato, Masaaki Imaizumi, Takuya Ishihara, Toru Kitagawa

    Abstract: We study the best-arm identification (BAI) problem with a fixed budget and contextual (covariate) information. In each round of an adaptive experiment, after observing contextual information, we choose a treatment arm using past observations and current context. Our goal is to identify the best treatment arm, which is a treatment arm with the maximal expected reward marginalized over the contextua…

    Submitted 4 January, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: For the sake of completeness, we show a part of the results of Kato et al. (arXiv:2201.04469). arXiv admin note: text overlap with arXiv:2201.04469

  21. arXiv:2204.08369  [pdf, ps, other]

    math.ST stat.ML

    Benign Overfitting in Time Series Linear Models with Over-Parameterization

    Authors: Shogo Nakakita, Masaaki Imaizumi

    Abstract: The success of large-scale models in recent years has increased the importance of statistical models with numerous parameters. Several studies have analyzed over-parameterized linear models with high-dimensional data, which may not be sparse; however, existing results rely on the assumption of sample independence. In this study, we analyze a linear regression model with dependent time-series data…

    Submitted 13 March, 2025; v1 submitted 18 April, 2022; originally announced April 2022.

    Comments: Accepted at Bernoulli

  22. arXiv:2202.05495  [pdf, other]

    stat.ME

    Inference for Projection-Based Wasserstein Distances on Finite Spaces

    Authors: Ryo Okano, Masaaki Imaizumi

    Abstract: The Wasserstein distance is a distance between two probability distributions and has recently gained increasing popularity in statistics and machine learning, owing to its attractive properties. One important approach to extending this distance is using low-dimensional projections of distributions to avoid a high computational cost and the curse of dimensionality in empirical estimation, such as t…

    Submitted 11 February, 2022; originally announced February 2022.

  23. arXiv:2202.05245  [pdf, ps, other]

    econ.EM cs.LG math.ST stat.ML

    Benign-Overfitting in Conditional Average Treatment Effect Prediction with Linear Regression

    Authors: Masahiro Kato, Masaaki Imaizumi

    Abstract: We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE), with linear regression models. With the development of machine learning for causal inference, a wide range of large-scale models for causality are gaining attention. One problem is that suspicions have been raised that the large-scale models are prone to overfitting to observations with sampl…

    Submitted 11 February, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: arXiv admin note: text overlap with arXiv:1906.11300 by other authors

  24. arXiv:2201.13127  [pdf, other]

    cs.LG cs.AI stat.ML

    Unified Perspective on Probability Divergence via Maximum Likelihood Density Ratio Estimation: Bridging KL-Divergence and Integral Probability Metrics

    Authors: Masahiro Kato, Masaaki Imaizumi, Kentaro Minami

    Abstract: This paper provides a unified perspective for the Kullback-Leibler (KL)-divergence and the integral probability metrics (IPMs) from the perspective of maximum likelihood density-ratio estimation (DRE). Both the KL-divergence and the IPMs are widely used in various fields in applications such as generative modeling. However, a unified understanding of these concepts has remained unexplored. In th…

    Submitted 31 January, 2022; originally announced January 2022.

  25. On generalization bounds for deep networks based on loss surface implicit regularization

    Authors: Masaaki Imaizumi, Johannes Schmidt-Hieber

    Abstract: The classical statistical learning theory implies that fitting too many parameters leads to overfitting and poor performance. That modern deep neural networks generalize well despite a large number of parameters contradicts this finding and constitutes a major unsolved problem towards explaining the success of deep learning. While previous work focuses on the implicit regularization induced by sto…

    Submitted 16 October, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

    Comments: To appear in IEEE Transactions on Information Theory

  26. arXiv:2201.04469  [pdf, other]

    stat.ML cs.LG econ.EM math.ST

    Optimal Best Arm Identification in Two-Armed Bandits with a Fixed Budget under a Small Gap

    Authors: Masahiro Kato, Kaito Ariu, Masaaki Imaizumi, Masahiro Nomura, Chao Qin

    Abstract: We consider fixed-budget best-arm identification in two-armed Gaussian bandit problems. One of the longstanding open questions is the existence of an optimal strategy under which the probability of misidentification matches a lower bound. We show that a strategy following the Neyman allocation rule (Neyman, 1934) is asymptotically optimal when the gap between the expected rewards is small. First,…

    Submitted 28 December, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

  27. arXiv:2112.00213  [pdf, other]

    math.ST stat.ML

    Minimax Analysis for Inverse Risk in Nonparametric Planer Invertible Regression

    Authors: Akifumi Okuno, Masaaki Imaizumi

    Abstract: We study a minimax risk of estimating inverse functions on a plane, while keeping the estimator itself invertible. Learning invertibility from data and exploiting an invertible estimator are used in many domains, such as statistics, econometrics, and machine learning. Although the consistency and universality of invertible estimators have been well investigated, analysis of the efficiency of these…

    Submitted 25 December, 2023; v1 submitted 30 November, 2021; originally announced December 2021.

    Comments: 34 pages, 34 figures, accepted to Electronic Journal of Statistics

  28. arXiv:2108.01312  [pdf, other]

    econ.EM cs.LG stat.AP stat.ME stat.ML

    Learning Causal Models from Conditional Moment Restrictions by Importance Weighting

    Authors: Masahiro Kato, Masaaki Imaizumi, Kenichiro McAlinn, Haruo Kakehi, Shota Yasui

    Abstract: We consider learning causal relationships under conditional moment restrictions. Unlike causal inference under unconditional moment restrictions, conditional moment restrictions pose serious challenges for causal inference, especially in high-dimensional settings. To address this issue, we propose a method that transforms conditional moment restrictions to unconditional moment restrictions through…

    Submitted 28 September, 2022; v1 submitted 3 August, 2021; originally announced August 2021.

  29. arXiv:2103.00500  [pdf, other]

    stat.ML cs.LG math.ST

    Asymptotic Risk of Overparameterized Likelihood Models: Double Descent Theory for Deep Neural Networks

    Authors: Ryumei Nakada, Masaaki Imaizumi

    Abstract: We investigate the asymptotic risk of a general class of overparameterized likelihood models, including deep models. The recent empirical success of large-scale models has motivated several theoretical studies to investigate a scenario wherein both the number of samples, $n$, and parameters, $p$, diverge to infinity and derive an asymptotic risk at the limit. However, these theorems are only valid…

    Submitted 15 March, 2021; v1 submitted 28 February, 2021; originally announced March 2021.

    Comments: 36 pages

  30. arXiv:2102.03609  [pdf, other]

    cs.LG stat.ML

    Understanding Higher-order Structures in Evolving Graphs: A Simplicial Complex based Kernel Estimation Approach

    Authors: Manohar Kaul, Masaaki Imaizumi

    Abstract: Dynamic graphs are rife with higher-order interactions, such as co-authorship relationships and protein-protein interactions in biological networks, that naturally arise between more than two nodes at once. In spite of the ubiquitous presence of such higher-order interactions, limited attention has been paid to the higher-order counterpart of the popular pairwise link prediction problem. Existing…

    Submitted 6 February, 2021; originally announced February 2021.

  31. arXiv:2102.02981  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency

    Authors: Masatoshi Uehara, Masaaki Imaizumi, Nan Jiang, Nathan Kallus, Wen Sun, Tengyang Xie

    Abstract: We offer a theoretical characterization of off-policy evaluation (OPE) in reinforcement learning using function approximation for marginal importance weights and $q$-functions when these are estimated using recent minimax methods. Under various combinations of realizability and completeness assumptions, we show that the minimax approach enables us to achieve a fast rate of convergence for weights…

    Submitted 24 July, 2022; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: Under Review

  32. arXiv:2011.02256  [pdf, other]

    stat.ML cs.LG

    Advantage of Deep Neural Networks for Estimating Functions with Singularity on Hypersurfaces

    Authors: Masaaki Imaizumi, Kenji Fukumizu

    Abstract: We develop a minimax rate analysis to describe the reason that deep neural networks (DNNs) perform better than other standard methods. For nonparametric regression problems, it is well known that many standard methods attain the minimax optimal rate of estimation errors for smooth functions, and thus, it is not straightforward to identify the theoretical advantages of DNNs. This study tries to fil…

    Submitted 8 February, 2022; v1 submitted 4 November, 2020; originally announced November 2020.

    Comments: Complete version of arXiv:1802.04474

  33. arXiv:1910.06552  [pdf, other]

    stat.ML cs.LG

    Improved Generalization Bounds of Group Invariant / Equivariant Deep Networks via Quotient Feature Spaces

    Authors: Akiyoshi Sannai, Masaaki Imaizumi, Makoto Kawano

    Abstract: Numerous invariant (or equivariant) neural networks have succeeded in handling invariant data such as point clouds and graphs. However, a generalization theory for the neural networks has not been well developed, because several essential factors for the theory, such as network size and margin distribution, are not deeply connected to the invariance and equivariance. In this study, we develop a no…

    Submitted 19 June, 2021; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: Old title: "Improved Generalization Bound of Permutation Invariant Deep Neural Networks"

  34. arXiv:1907.02177  [pdf, other]

    stat.ML cs.LG

    Adaptive Approximation and Generalization of Deep Neural Network with Intrinsic Dimensionality

    Authors: Ryumei Nakada, Masaaki Imaizumi

    Abstract: In this study, we prove that an intrinsic low dimensionality of covariates is the main factor that determines the performance of deep neural networks (DNNs). DNNs generally provide outstanding empirical performance. Hence, numerous studies have actively investigated the theoretical properties of DNNs to understand their underlying mechanisms. In particular, the behavior of DNNs in terms of high-di…

    Submitted 17 September, 2020; v1 submitted 3 July, 2019; originally announced July 2019.

    Comments: 38 pages

    Journal ref: Journal of Machine Learning Research, 21(174), 2020

  35. arXiv:1901.09541  [pdf, other]

    stat.ML cs.LG

    On Random Subsampling of Gaussian Process Regression: A Graphon-Based Analysis

    Authors: Kohei Hayashi, Masaaki Imaizumi, Yuichi Yoshida

    Abstract: In this paper, we study random subsampling of Gaussian process regression, one of the simplest approximation baselines, from a theoretical perspective. Although subsampling discards a large part of training data, we show provable guarantees on the accuracy of the predictive mean/variance and its generalization ability. For analysis, we consider embedding kernel matrices into graphons, which encaps…

    Submitted 28 January, 2019; originally announced January 2019.

  36. arXiv:1802.04474  [pdf, other]

    stat.ML

    Deep Neural Networks Learn Non-Smooth Functions Effectively

    Authors: Masaaki Imaizumi, Kenji Fukumizu

    Abstract: We theoretically discuss why deep neural networks (DNNs) perform better than other models in some cases by investigating statistical properties of DNNs for non-smooth functions. While DNNs have empirically shown higher performance than other standard methods, understanding their mechanism is still a challenging problem. From an aspect of the statistical theory, it is known that many standard methods att…

    Submitted 7 July, 2018; v1 submitted 13 February, 2018; originally announced February 2018.

    Comments: 31 pages

  37. arXiv:1708.00132  [pdf, other]

    stat.ML

    On Tensor Train Rank Minimization: Statistical Efficiency and Scalable Algorithm

    Authors: Masaaki Imaizumi, Takanori Maehara, Kohei Hayashi

    Abstract: Tensor train (TT) decomposition provides a space-efficient representation for higher-order tensors. Despite its advantage, we face two crucial limitations when we apply the TT decomposition to machine learning problems: the lack of statistical theory and of scalable algorithms. In this paper, we address the limitations. First, we introduce a convex relaxation of the TT decomposition problem and de…

    Submitted 1 August, 2017; v1 submitted 31 July, 2017; originally announced August 2017.

    Comments: 24 pages

  38. arXiv:1707.09688  [pdf, ps, other]

    stat.ML

    Consistent Nonparametric Different-Feature Selection via the Sparsest $k$-Subgraph Problem

    Authors: Satoshi Hara, Takayuki Katsuki, Hiroki Yanagisawa, Masaaki Imaizumi, Takafumi Ono, Ryo Okamoto, Shigeki Takeuchi

    Abstract: Two-sample feature selection is the problem of finding features that describe a difference between two probability distributions, which is a ubiquitous problem in both scientific and engineering studies. However, existing methods have limited applicability because of their restrictive assumptions on data distributions or computational difficulty. In this paper, we resolve these difficulties by for…

    Submitted 31 July, 2017; v1 submitted 30 July, 2017; originally announced July 2017.

    Comments: 32 pages

  39. arXiv:1506.06722  [pdf, ps, other]

    stat.CO

    Approximation method for discrete Markov decision models with a large state space

    Authors: Masaaki Imaizumi

    Abstract: Solving discrete Markov decision models with a large number of dimensions is always difficult (and at times, impossible), because the size of the state space and the computation cost increase exponentially with the number of dimensions. This phenomenon is called "The Curse of Dimensionality," and it prevents us from using models with many state variables. To overcome this problem, we propose a new approxima…

    Submitted 11 November, 2016; v1 submitted 22 June, 2015; originally announced June 2015.

    Comments: 21 pages

  40. arXiv:1506.05967  [pdf, ps, other]

    stat.ML

    Doubly Decomposing Nonparametric Tensor Regression

    Authors: Masaaki Imaizumi, Kohei Hayashi

    Abstract: A nonparametric extension of tensor regression is proposed. Nonlinearity in a high-dimensional tensor space is broken into simple local functions by incorporating low-rank tensor decomposition. Compared to naive nonparametric approaches, our formulation considerably improves the convergence rate of estimation while maintaining consistency with the same function class under specific conditions. To es…

    Submitted 8 March, 2016; v1 submitted 19 June, 2015; originally announced June 2015.

    Comments: 21 pages