-
Dynamical Properties of Dense Associative Memory
Authors:
Kazushi Mimura,
Jun'ichi Takeuchi,
Yuto Sumikawa,
Yoshiyuki Kabashima,
Anthony C. C. Coolen
Abstract:
The dense associative memory is one of the basic modern Hopfield networks and can store large numbers of memory patterns. While the stationary state storage capacity has been investigated so far, its dynamical properties have not been discussed. In this paper, we analyze the dynamics by means of an exact approach based on generating functional analysis. It allows us to investigate convergence properties as well as the size of the attraction basins. We also analyze the stationary state of the updating rule.
Submitted 1 June, 2025;
originally announced June 2025.
-
Quantum effects in rotationally invariant spin glass models
Authors:
Yoshinori Hara,
Yoshiyuki Kabashima
Abstract:
This study investigates the quantum effects in transverse-field Ising spin glass models with rotationally invariant random interactions. The primary aim is to evaluate the validity of a quasi-static approximation that captures the imaginary-time dependence of the order parameters beyond the conventional static approximation. Using the replica method combined with the Suzuki--Trotter decomposition, we established a stability condition for the replica symmetric solution, which is analogous to the de Almeida--Thouless criterion. Numerical analysis of the Sherrington--Kirkpatrick model estimates a value of the critical transverse field, $Γ_\mathrm{c}$, which agrees with previous Monte Carlo-based estimations. For the Hopfield model, it provides an estimate of $Γ_\mathrm{c}$, which has not been previously evaluated. For the random orthogonal model, our analysis suggests that quantum effects alter the random first-order transition scenario in the low-temperature limit. This study supports a quasi-static treatment for analyzing quantum spin glasses and may offer useful insights into the analysis of quantum optimization algorithms.
Submitted 24 April, 2025;
originally announced April 2025.
-
Exact Replica Symmetric solution for transverse field Hopfield model under finite Trotter size
Authors:
Koki Okajima,
Yoshiyuki Kabashima
Abstract:
We analyze the quantum Hopfield model in which an extensive number of patterns are embedded in the presence of a uniform transverse field. This analysis employs the replica method under the replica symmetric ansatz on the Suzuki-Trotter representation of the model, while keeping the number of Trotter slices $M$ finite. The statistical properties of the quantum Hopfield model in imaginary time are reduced to an effective $M$-spin long-range classical Ising model, which can be extensively studied using a dedicated Monte Carlo algorithm. This approach contrasts with the commonly applied static approximation, which ignores the imaginary time dependency of the order parameters, but allows $M \to \infty$ to be taken analytically. During the analysis, we introduce an exact but fundamentally weaker static relation, referred to as the quasi-static relation. We present the phase diagram of the model with respect to the transverse field strength and the number of embedded patterns, indicating a small but quantitative difference from previous results obtained using the static approximation.
Submitted 14 March, 2025; v1 submitted 4 November, 2024;
originally announced November 2024.
-
Forecasting long-time dynamics in quantum many-body systems by dynamic mode decomposition
Authors:
Ryui Kaneko,
Masatoshi Imada,
Yoshiyuki Kabashima,
Tomi Ohtsuki
Abstract:
Reliable numerical computation of quantum dynamics is a fundamental challenge when long-ranged quantum entanglement plays an essential role, as in cases governed by quantum criticality in strongly correlated systems. Here we apply a method that utilizes reliable short-time data of physical quantities to accurately forecast the long-time behavior of strongly entangled systems. We straightforwardly employ the simple dynamic mode decomposition (DMD), which is commonly used in fluid dynamics. Despite the simplicity of the method, the effectiveness and applicability of the DMD in quantum many-body systems such as the Ising model in the transverse field at the critical point are demonstrated, even when the long-time evolution exhibits complicated features such as a volume-law entanglement entropy and consequential power-law decays of correlations, which are characteristic of systems with long-ranged quantum entanglement, unlike fluid dynamics. The present method, though simple, enables accurate forecasts at times nearly an order of magnitude longer than the duration of the short-time training data. Effects of noise on the accuracy of the forecast are also investigated, because they are especially important when dealing with experimental data. We find that noise of a few percent does not severely degrade the prediction accuracy.
Submitted 23 January, 2025; v1 submitted 28 March, 2024;
originally announced March 2024.
-
Detection of diffusion anisotropy from an individual short particle trajectory
Authors:
Kaito Takanami,
Daisuke Taniguchi,
Sawako Enoki,
Masafumi Kuroda,
Yasushi Okada,
Yoshiyuki Kabashima
Abstract:
In parallel with advances in microscale imaging techniques, the fields of biology and materials science have focused on precisely extracting particle properties based on their diffusion behavior. Although the majority of real-world particles exhibit anisotropy, their behavior has been studied less than that of isotropic particles. In this study, we introduce a new method for estimating the diffusion coefficients of individual anisotropic particles using short-trajectory data on the basis of a maximum likelihood framework. Traditional estimation techniques often use mean-squared displacement (MSD) values or other statistical measures that inherently remove angular information. Instead, we treated the angle as a latent variable and used belief propagation to estimate it while maximizing the likelihood using the expectation-maximization algorithm. Compared to conventional methods, this approach facilitates better estimation of shorter trajectories and faster rotations, as confirmed by numerical simulations and experimental data involving bacteria and quantum rods. Additionally, we performed an analytical investigation of the limits of detectability of anisotropy and provided guidelines for the experimental design. In addition to serving as a powerful tool for analyzing complex systems, the proposed method will pave the way for applying maximum likelihood methods to more complex diffusion phenomena.
Submitted 21 May, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Average case analysis of Lasso under ultra-sparse conditions
Authors:
Koki Okajima,
Xiangming Meng,
Takashi Takahashi,
Yoshiyuki Kabashima
Abstract:
We analyze the performance of the least absolute shrinkage and selection operator (Lasso) for the linear model when the number of regressors $N$ grows larger keeping the true support size $d$ finite, i.e., the ultra-sparse case. The result is based on a novel treatment of the non-rigorous replica method in statistical physics, which has been applied only to problem settings where $N$, $d$ and the number of observations $M$ tend to infinity at the same rate. Our analysis makes it possible to assess the average performance of Lasso with Gaussian sensing matrices without assumptions on the scaling of $N$ and $M$, the noise distribution, and the profile of the true signal. Under mild conditions on the noise distribution, the analysis also offers a lower bound on the sample complexity necessary for partial and perfect support recovery when $M$ diverges as $M = O(\log N)$. The obtained bound for perfect support recovery is a generalization of that given in previous literature, which only considers the case of Gaussian noise and diverging $d$. Extensive numerical experiments strongly support our analysis.
Submitted 25 February, 2023;
originally announced February 2023.
-
Statistical mechanics analysis of general multi-dimensional knapsack problems
Authors:
Yuta Nakamura,
Takashi Takahashi,
Yoshiyuki Kabashima
Abstract:
The knapsack problem (KP) is a representative combinatorial optimization problem that aims to maximize the total profit by selecting a subset of items under given constraints on the total weights. In this study, we analyze a generalized version of KP, which is termed the generalized multidimensional knapsack problem (GMDKP). As opposed to the basic KP, GMDKP allows multiple choices per item type under multiple weight constraints. Although several efficient algorithms are known and the properties of their solutions have been examined to a significant extent for basic KPs, there is a paucity of known algorithms and studies on the solution properties of GMDKP. To gain insight into the problem, we assess the typical achievable limit of the total profit for a random ensemble of GMDKP using the replica method. Our findings are summarized as follows: (1) When the profits of item types are normally distributed, the total profit grows in the leading order with respect to the number of item types as the maximum number of choices per item type $x^{\rm max}$ increases, while it depends on $x^{\rm max}$ only in a sub-leading order if the profits are constant among the item types. (2) A greedy-type heuristic can find, at a low computational cost, a nearly optimal solution whose total profit is lower than the optimal value only by a sub-leading order. (3) The sub-leading difference from the optimal total profit can be improved by a heuristic algorithm based on the cavity method. Extensive numerical experiments support these findings.
Submitted 21 August, 2022; v1 submitted 18 January, 2022;
originally announced January 2022.
-
Assessing transfer entropy from biochemical data
Authors:
Takuya Imaizumi,
Nobuhisa Umeki,
Ryo Yoshizawa,
Tomoyuki Obuchi,
Yasushi Sako,
Yoshiyuki Kabashima
Abstract:
We address the problem of evaluating the transfer entropy (TE) produced by biochemical reactions from experimentally measured data. Although these reactions are generally non-linear and non-stationary processes, which makes accurate modeling challenging, Gaussian approximation can facilitate the TE assessment only by estimating covariance matrices using multiple data obtained from simultaneously measured time series representing the activation levels of biomolecules such as proteins. Nevertheless, the non-stationary nature of biochemical signals makes it difficult to theoretically assess the sampling distributions of TE, which are necessary for evaluating the statistical confidence and significance of the data-driven estimates. We resolve this difficulty by computationally assessing the sampling distributions using techniques from computational statistics. The computational methods are tested by using them in analyzing data generated from a theoretically tractable time-varying signal model, which leads to the development of a method to screen only statistically significant estimates. The usefulness of the developed method is examined by applying it to real biological data experimentally measured from the ERBB-RAS-MAPK system that superintends diverse cell fate decisions. A comparison between cells containing wild-type and mutant proteins exhibits a distinct difference in the time evolution of TE, whereas hardly any apparent difference is found in the average profiles of the raw signals. Such a comparison may help in unveiling important pathways of biochemical reactions.
Submitted 8 March, 2022; v1 submitted 8 November, 2021;
originally announced November 2021.
-
Decision Theoretic Cutoff and ROC Analysis for Bayesian Optimal Group Testing
Authors:
Ayaka Sakata,
Yoshiyuki Kabashima
Abstract:
We study the inference problem in group testing to identify defective items from the perspective of decision theory. We introduce Bayesian inference and consider the Bayesian optimal setting in which the true generative process of the test results is known. We demonstrate the adequacy of the posterior marginal probability in the Bayesian optimal setting as a diagnostic variable based on the area under the curve (AUC). Using the posterior marginal probability, we derive the general expression of the optimal cutoff value that yields the minimum expected risk function. Furthermore, we evaluate the performance of the Bayesian group testing without knowing the true states of the items: defective or non-defective. By introducing an analytical method from statistical physics, we derive the receiver operating characteristics curve, and quantify the corresponding AUC under the Bayesian optimal setting. The obtained analytical results precisely describe the actual performance of the belief propagation algorithm defined for single samples when the number of items is sufficiently large.
Submitted 20 October, 2021;
originally announced October 2021.
-
Matrix completion based on Gaussian parameterized belief propagation
Authors:
Koki Okajima,
Yoshiyuki Kabashima
Abstract:
We develop a message-passing algorithm for noisy matrix completion problems based on matrix factorization. The algorithm is derived by approximating message distributions of belief propagation with Gaussian distributions that share the same first and second moments. We also derive a memory-friendly version of the proposed algorithm by applying a perturbation treatment commonly used in the literature of approximate message passing. In addition, a damping technique, which is demonstrated to be crucial for optimal performance, is introduced without computational strain, and the relationship to the message-passing version of alternating least squares, a method reported to be optimal in certain settings, is discussed. Experiments on synthetic datasets show that while the proposed algorithm quantitatively exhibits almost the same performance under settings where the earlier algorithm is optimal, it is advantageous when the observed datasets are corrupted by non-Gaussian noise. Experiments on real-world datasets also emphasize the performance differences between the two algorithms.
Submitted 24 August, 2021; v1 submitted 1 May, 2021;
originally announced May 2021.
-
Ising Model Selection Using $\ell_{1}$-Regularized Linear Regression: A Statistical Mechanics Analysis
Authors:
Xiangming Meng,
Tomoyuki Obuchi,
Yoshiyuki Kabashima
Abstract:
We theoretically analyze the typical learning performance of $\ell_{1}$-regularized linear regression ($\ell_1$-LinR) for Ising model selection using the replica method from statistical mechanics. For typical random regular graphs in the paramagnetic phase, an accurate estimate of the typical sample complexity of $\ell_1$-LinR is obtained. Remarkably, despite the model misspecification, $\ell_1$-LinR is model selection consistent with the same order of sample complexity as $\ell_{1}$-regularized logistic regression ($\ell_1$-LogR), i.e., $M=\mathcal{O}\left(\log N\right)$, where $N$ is the number of variables of the Ising model. Moreover, we provide an efficient method to accurately predict the non-asymptotic behavior of $\ell_1$-LinR for moderate $M, N$, such as precision and recall. Simulations show a fairly good agreement between theoretical predictions and experimental results, even for graphs with many loops, which supports our findings. Although this paper mainly focuses on $\ell_1$-LinR, our method is readily applicable for precisely characterizing the typical learning performances of a wide class of $\ell_{1}$-regularized $M$-estimators including $\ell_1$-LogR and interaction screening.
Submitted 1 November, 2021; v1 submitted 7 February, 2021;
originally announced February 2021.
-
Structure Learning in Inverse Ising Problems Using $\ell_2$-Regularized Linear Estimator
Authors:
Xiangming Meng,
Tomoyuki Obuchi,
Yoshiyuki Kabashima
Abstract:
The inference performance of the pseudolikelihood method is discussed in the framework of the inverse Ising problem when the $\ell_2$-regularized (ridge) linear regression is adopted. This setup is introduced for theoretically investigating the situation where the data generation model is different from the inference one, namely the model mismatch situation. In the teacher-student scenario under the assumption that the teacher couplings are sparse, the analysis is conducted using the replica and cavity methods, with a special focus on whether the presence/absence of teacher couplings is correctly inferred or not. The result indicates that despite the model mismatch, one can perfectly identify the network structure using naive linear regression without regularization when the number of spins $N$ is smaller than the dataset size $M$, in the thermodynamic limit $N\to \infty$. Further, to access the underdetermined region $M < N$, we examine the effect of the $\ell_2$ regularization, and find that biases appear in all the coupling estimates, preventing the perfect identification of the network structure. We, however, find that the biases are shown to decay exponentially fast as the distance from the center spin chosen in the pseudolikelihood method grows. Based on this finding, we propose a two-stage estimator: In the first stage, the ridge regression is used and the estimates are pruned by a relatively small threshold; in the second stage the naive linear regression is conducted only on the remaining couplings, and the resultant estimates are again pruned by another relatively large threshold. This estimator with the appropriate regularization coefficient and thresholds is shown to achieve the perfect identification of the network structure even in $0<M/N<1$. Results of extensive numerical experiments support these findings.
Submitted 23 November, 2020; v1 submitted 19 August, 2020;
originally announced August 2020.
-
Reconstructing Sparse Signals via Greedy Monte-Carlo Search
Authors:
Kao Hayashi,
Tomoyuki Obuchi,
Yoshiyuki Kabashima
Abstract:
We propose a Monte-Carlo-based method for reconstructing sparse signals in the formulation of sparse linear regression in a high-dimensional setting. The basic idea of this algorithm is to explicitly select variables or covariates to represent a given data vector or responses and accept randomly generated updates of that selection if and only if the energy or cost function decreases. This algorithm is called the greedy Monte-Carlo (GMC) search algorithm. Its performance is examined via numerical experiments, which suggests that in the noiseless case, GMC can achieve perfect reconstruction in undersampling situations of a reasonable level: it can outperform the $\ell_1$ relaxation but does not reach the algorithmic limit of MC-based methods theoretically clarified by an earlier analysis. The necessary computational time is also examined and compared with that of an algorithm using simulated annealing. Additionally, experiments on the noisy case are conducted on synthetic datasets and on a real-world dataset, supporting the practicality of GMC.
Submitted 29 January, 2021; v1 submitted 7 August, 2020;
originally announced August 2020.
-
Semi-analytic approximate stability selection for correlated data in generalized linear models
Authors:
Takashi Takahashi,
Yoshiyuki Kabashima
Abstract:
We consider the variable selection problem of generalized linear models (GLMs). Stability selection (SS) is a promising method proposed for solving this problem. Although SS provides practical variable selection criteria, it is computationally demanding because it needs to fit GLMs to many re-sampled datasets. We propose a novel approximate inference algorithm that can conduct SS without the repeated fitting. The algorithm is based on the replica method of statistical mechanics and vector approximate message passing of information theory. For datasets characterized by rotation-invariant matrix ensembles, we derive state evolution equations that macroscopically describe the dynamics of the proposed algorithm. We also show that their fixed points are consistent with the replica symmetric solution obtained by the replica method. Numerical experiments indicate that the algorithm exhibits fast convergence and high approximation accuracy for both synthetic and real-world data.
Submitted 25 June, 2020; v1 submitted 19 March, 2020;
originally announced March 2020.
-
Macroscopic Analysis of Vector Approximate Message Passing in a Model Mismatch Setting
Authors:
Takashi Takahashi,
Yoshiyuki Kabashima
Abstract:
Vector approximate message passing (VAMP) is an efficient approximate inference algorithm used for generalized linear models. Although VAMP exhibits excellent performance, particularly when measurement matrices are sampled from rotationally invariant ensembles, existing convergence and performance analyses have been limited mostly to cases in which the correct posterior distribution is available. Here, we extend the analyses for cases in which the correct posterior distribution is not used in the inference stage. We derive state evolution equations, which macroscopically describe the dynamics of VAMP, and show that their fixed point is consistent with the replica symmetric solution obtained by the replica method of statistical mechanics. We also show that the fixed point of VAMP can exhibit a microscopic instability, the critical condition of which agrees with that for breaking the replica symmetry. The results of numerical experiments support our findings.
Submitted 16 January, 2020; v1 submitted 8 January, 2020;
originally announced January 2020.
-
Learning performance in inverse Ising problems with sparse teacher couplings
Authors:
Alia Abbara,
Yoshiyuki Kabashima,
Tomoyuki Obuchi,
Yingying Xu
Abstract:
We investigate the learning performance of the pseudolikelihood maximization method for inverse Ising problems. In the teacher-student scenario under the assumption that the teacher's couplings are sparse and the student does not know the graphical structure, the learning curve and order parameters are assessed in the typical case using the replica and cavity methods from statistical mechanics. Our formulation is also applicable to a certain class of cost functions having locality; the standard likelihood does not belong to that class. The derived analytical formulas indicate that the perfect inference of the presence/absence of the teacher's couplings is possible in the thermodynamic limit taking the number of spins $N$ as infinity while keeping the dataset size $M$ proportional to $N$, as long as $α=M/N > 2$. Meanwhile, the formulas also show that the estimated coupling values corresponding to the truly existing ones in the teacher tend to be overestimated in the absolute value, manifesting the presence of estimation bias. These results are considered to be exact in the thermodynamic limit on locally tree-like networks, such as the regular random or Erdős--Rényi graphs. Numerical simulation results fully support the theoretical predictions. Additional biases in the estimators on loopy graphs are also discussed.
Submitted 1 May, 2020; v1 submitted 24 December, 2019;
originally announced December 2019.
-
Statistical mechanics of the minimum vertex cover problem in stochastic block models
Authors:
Masato Suzuki,
Yoshiyuki Kabashima
Abstract:
The minimum vertex cover (Min-VC) problem is a well-known NP-hard problem. Earlier studies illustrate that the problem defined over the Erdős-Rényi random graph with a mean degree $c$ exhibits computational difficulty in searching the Min-VC set above a critical point $c = e = 2.718 \ldots$. Here, we address how this difficulty is influenced by the mesoscopic structures of graphs. For this, we evaluate the critical condition of difficulty for the stochastic block model. We perform a detailed examination of the specific cases of two equal-size communities characterized by in- and out-degrees, which are denoted by $c_{\rm in}$ and $c_{\rm out}$, respectively. Our analysis based on the cavity method indicates that the solution search becomes difficult when $c_{\rm in}+c_{\rm out} > e$, but becomes easy again when $c_{\rm out}$ is sufficiently larger than $c_{\rm in}$ in the region $c_{\rm out}>e$. Experiments based on various search algorithms support the theoretical prediction.
Submitted 20 August, 2019;
originally announced August 2019.
-
Approximate matrix completion based on cavity method
Authors:
Chihiro Noguchi,
Yoshiyuki Kabashima
Abstract:
In order to solve large matrix completion problems with practical computational cost, an approximate approach based on matrix factorization has been widely used. Alternating least squares (ALS) and stochastic gradient descent (SGD) are two major algorithms to this end. In this study, we propose two new algorithms, namely cavity-based matrix factorization (CBMF) and approximate cavity-based matrix factorization (ACBMF), which are developed based on the cavity method from statistical mechanics. ALS yields solutions with fewer iterations than SGD. This is because its update rules are described in a closed form, although it entails higher computational cost. The update rules of CBMF can also be written in a closed form, and its computational cost is lower than that of ALS. ACBMF is proposed to compensate for a disadvantage of CBMF, namely its relatively high memory cost. We experimentally illustrate that the proposed methods outperform the two existing algorithms in terms of convergence speed per iteration, and that they can work under conditions where observed entries are relatively fewer. Additionally, in contrast to SGD, (A)CBMF does not require scheduling of the learning rate.
Submitted 28 June, 2019;
originally announced July 2019.
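The closed-form update that the abstract attributes to ALS can be illustrated with a minimal sketch. This is the baseline ALS, not CBMF or ACBMF, and it assumes NumPy is available. The function name `als_complete`, the ridge weight `lam`, and the toy rank-1 matrix are all hypothetical choices made for illustration.

```python
import numpy as np

def als_complete(M, mask, rank=1, lam=1e-3, n_iters=100, seed=0):
    """Approximate M (observed where mask is True) by a low-rank product U @ V.T."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = M.shape
    U = rng.random((n_rows, rank))
    V = rng.random((n_cols, rank))
    ridge = lam * np.eye(rank)
    for _ in range(n_iters):
        # Closed-form ridge regression for each row factor, with V held fixed.
        for i in range(n_rows):
            Vo = V[mask[i]]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ M[i, mask[i]])
        # Same closed-form update for each column factor, with U held fixed.
        for j in range(n_cols):
            Uo = U[mask[:, j]]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ M[mask[:, j], j])
    return U @ V.T

# Toy example (hypothetical data): a rank-1 matrix with two hidden entries.
truth = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 0.5, 2.0, 1.5])
mask = np.ones_like(truth, dtype=bool)
mask[0, 1] = mask[2, 3] = False
estimate = als_complete(np.where(mask, truth, 0.0), mask)
```

Each inner step solves a small linear system exactly, which is why ALS needs no learning-rate schedule, at the price of one solve per row and per column in every sweep.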
-
Replicated Vector Approximate Message Passing For Resampling Problem
Authors:
Takashi Takahashi,
Yoshiyuki Kabashima
Abstract:
Resampling techniques are widely used in statistical inference and ensemble learning, in which estimators' statistical properties are essential. However, existing methods are computationally demanding, because repetitions of estimation/learning via numerical optimization/integration for each resampled dataset are required. In this study, we introduce a computationally efficient method to resolve such a problem: replicated vector approximate message passing. This is based on a combination of the replica method of statistical physics and an accurate approximate inference algorithm, namely the vector approximate message passing of information theory. The method provides tractable densities without repeating estimation/learning, and the densities approximately yield moments of the estimators of arbitrary degree in practical time. In the experiment, we apply the proposed method to the stability selection method, which is commonly used in variable selection problems. The numerical results show its fast convergence and high approximation accuracy for problems involving both synthetic and real-world datasets.
Submitted 23 May, 2019;
originally announced May 2019.
-
Statistical mechanical analysis of sparse linear regression as a variable selection problem
Authors:
Tomoyuki Obuchi,
Yoshinori Nakanishi-Ohno,
Masato Okada,
Yoshiyuki Kabashima
Abstract:
An algorithmic limit of compressed sensing or related variable-selection problems is analytically evaluated when a design matrix is given by an overcomplete random matrix. The replica method from statistical mechanics is employed to derive the result. The analysis is conducted through evaluation of the entropy, an exponential rate of the number of combinations of variables giving a specific value of fit error to given data which is assumed to be generated from a linear process using the design matrix. This yields the typical achievable limit of the fit error when solving a representative $\ell_0$ problem and includes the presence of unfavourable phase transitions preventing local search algorithms from reaching the minimum-error configuration. The associated phase diagrams are presented. A noteworthy outcome of the phase diagrams is that there exists a wide parameter region where any phase transition is absent from the high temperature to the lowest temperature at which the minimum-error configuration or the ground state is reached. This implies that certain local search algorithms can find the ground state with moderate computational costs in that region. Another noteworthy result is the presence of the random first-order transition in the strong noise case. The theoretical evaluation of the entropy is confirmed by extensive numerical methods using the exchange Monte Carlo and the multi-histogram methods. Another numerical test based on a metaheuristic optimisation algorithm called simulated annealing is conducted, which well supports the theoretical predictions on the local search algorithms. In the successful region with no phase transition, the computational cost of the simulated annealing to reach the ground state is estimated as the third order polynomial of the model dimensionality.
Submitted 10 September, 2018; v1 submitted 29 May, 2018;
originally announced May 2018.
-
Objective and efficient inference for couplings in neuronal networks
Authors:
Yu Terada,
Tomoyuki Obuchi,
Takuya Isomura,
Yoshiyuki Kabashima
Abstract:
Inferring directional couplings from the spike data of networks is desired in various scientific fields such as neuroscience. Here, we apply a recently proposed objective procedure to the spike data obtained from the Hodgkin--Huxley type models and in vitro neuronal networks cultured in a circular structure. As a result, we succeed in reconstructing synaptic connections accurately from the evoked activity as well as the spontaneous one. To obtain the results, we invent an analytic formula approximately implementing a method of screening relevant couplings. This significantly reduces the computational cost of the screening method employed in the proposed objective procedure, making it possible to treat large-size systems as in this study.
Submitted 18 May, 2018;
originally announced May 2018.
-
A statistical mechanics approach to de-biasing and uncertainty estimation in LASSO for random measurements
Authors:
Takashi Takahashi,
Yoshiyuki Kabashima
Abstract:
In high-dimensional statistical inference in which the number of parameters to be estimated is larger than that of the holding data, regularized linear estimation techniques are widely used. These techniques have, however, some drawbacks. First, estimators are biased in the sense that their absolute values are shrunk toward zero because of the regularization effect. Second, their statistical properties are difficult to characterize as they are given as numerical solutions to certain optimization problems. In this manuscript, we tackle such problems concerning LASSO, which is a widely used method for sparse linear estimation, when the measurement matrix is regarded as a sample from a rotationally invariant ensemble. We develop a new computationally feasible scheme to construct a de-biased estimator with a confidence interval and conduct hypothesis testing for the null hypothesis that a certain parameter vanishes. It is numerically confirmed that the proposed method successfully de-biases the LASSO estimator and constructs confidence intervals and p-values by experiments for noisy linear measurements.
Submitted 27 March, 2018;
originally announced March 2018.
-
Inferring neuronal couplings from spiking data using a systematic procedure with a statistical criterion
Authors:
Yu Terada,
Tomoyuki Obuchi,
Takuya Isomura,
Yoshiyuki Kabashima
Abstract:
Recent remarkable advances in experimental techniques have provided a background for inferring neuronal couplings from point process data that include a great number of neurons. Here, we propose a systematic procedure for pre- and post-processing generic point process data in an objective manner, to handle data in the framework of a simple binary statistical model, the Ising or generalized McCulloch--Pitts model. The procedure involves two steps: (1) determining the time-bin size for transforming the point-process data into discrete-time binary data and (2) screening relevant couplings from the estimated couplings. For the first step, we decide the optimal time-bin size by introducing the null hypothesis that all neurons would fire independently, then choosing a time-bin size so that the null hypothesis is rejected with the strictest criterion. The likelihood associated with the null hypothesis is analytically evaluated and used for the rejection process. For the second post-processing step, after a certain estimator of coupling is obtained based on the pre-processed dataset, the estimate is compared with many other estimates derived from datasets obtained by randomizing the original dataset in the time direction. We accept the original estimate as relevant only if its absolute value is sufficiently larger than those from the randomized datasets. These manipulations suppress false positive couplings induced by statistical noise. We apply this inference procedure to spiking data from synthetic and in vitro neuronal networks. The results show that the proposed procedure identifies the presence/absence of synaptic couplings fairly well, including their signs, for the synthetic and experimental data. In particular, the results support that we can infer the physical connections of underlying systems in favorable situations, even when using the simple statistical model.
Submitted 7 July, 2020; v1 submitted 13 March, 2018;
originally announced March 2018.
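The second, post-processing step of the abstract above (screening an estimate against time-randomized data) can be sketched roughly as follows. This is only an illustration under loud assumptions: the toy estimator `lagged_covariance` stands in for the Ising or McCulloch--Pitts coupling estimator, and "sufficiently larger" is simplified to "larger than every shuffled estimate". All names are hypothetical and not taken from the paper.

```python
import random

def lagged_covariance(pre, post):
    """Toy coupling estimate (an assumption): covariance of pre[t] with post[t + 1]."""
    n = len(pre) - 1
    mean_pre = sum(pre[:-1]) / n
    mean_post = sum(post[1:]) / n
    return sum((pre[t] - mean_pre) * (post[t + 1] - mean_post) for t in range(n)) / n

def is_relevant(pre, post, n_shuffles=200, seed=0):
    """Accept the estimate only if it beats every estimate from time-shuffled data."""
    rng = random.Random(seed)
    original = abs(lagged_covariance(pre, post))
    null_values = []
    for _ in range(n_shuffles):
        shuffled = pre[:]          # copy, then destroy the temporal order
        rng.shuffle(shuffled)
        null_values.append(abs(lagged_covariance(shuffled, post)))
    return original > max(null_values)

# Toy data (hypothetical): the "post" neuron copies the "pre" neuron one bin later.
data_rng = random.Random(1)
pre = [data_rng.randint(0, 1) for _ in range(500)]
post = [0] + pre[:-1]
coupled = is_relevant(pre, post)
```

Shuffling in the time direction keeps each neuron's firing rate but removes temporal structure, so the shuffled estimates indicate how large a coupling can appear from statistical noise alone.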
-
Semi-Analytic Resampling in Lasso
Authors:
Tomoyuki Obuchi,
Yoshiyuki Kabashima
Abstract:
An approximate method for conducting resampling in Lasso, the $\ell_1$ penalized linear regression, in a semi-analytic manner is developed, whereby the average over the resampled datasets is directly computed without repeated numerical sampling, thus enabling an inference free of the statistical fluctuations due to sampling finiteness, as well as a significant reduction of computational time. The proposed method is based on a message passing type algorithm, and its fast convergence is guaranteed by the state evolution analysis, when covariates are provided as zero-mean independently and identically distributed Gaussian random variables. It is employed to implement bootstrapped Lasso (Bolasso) and stability selection, both of which are variable selection methods using resampling in conjunction with Lasso, and resolves their disadvantage regarding computational cost. To examine approximation accuracy and efficiency, numerical experiments were carried out using simulated datasets. Moreover, an application to a real-world dataset, the wine quality dataset, is presented. To process such real-world datasets, an objective criterion for determining the relevance of selected variables is also introduced by the addition of noise variables and resampling.
Submitted 10 December, 2018; v1 submitted 27 February, 2018;
originally announced February 2018.
-
Accelerating Cross-Validation in Multinomial Logistic Regression with $\ell_1$-Regularization
Authors:
Tomoyuki Obuchi,
Yoshiyuki Kabashima
Abstract:
We develop an approximate formula for evaluating a cross-validation estimator of predictive likelihood for multinomial logistic regression regularized by an $\ell_1$-norm. This allows us to avoid repeated optimizations required for literally conducting cross-validation; hence, the computational time can be significantly reduced. The formula is derived through a perturbative approach employing the largeness of the data size and the model dimensionality. An extension to the elastic net regularization is also addressed. The usefulness of the approximate formula is demonstrated on simulated data and the ISOLET dataset from the UCI machine learning repository.
Submitted 18 September, 2018; v1 submitted 15 November, 2017;
originally announced November 2017.
-
Statistical properties of interaction parameter estimates in direct coupling analysis
Authors:
Yingying Xu,
Erik Aurell,
Jukka Corander,
Yoshiyuki Kabashima
Abstract:
We consider the statistical properties of interaction parameter estimates obtained by the direct coupling analysis (DCA) approach to learning interactions from large data sets. Assuming that the data are generated from a random background distribution, we determine the distribution of inferred interactions. Two inference methods are considered: the L2 regularized naive mean-field inference procedure (regularized least squares, RLS), and the pseudo-likelihood maximization (plmDCA). For RLS we also study a model where the data matrix elements are real numbers, identically and independently generated from a Gaussian distribution; in this setting we analytically find that the distribution of the inferred interactions is Gaussian. For data of Boolean type, more realistic in practice, the inferred interactions do not generally follow a Gaussian. However, extensive numerical simulations indicate that their distribution can be characterized by a single function determined by a few system parameters after normalization by the standard deviation. This property holds for both RLS and plmDCA and may be exploitable for inferring the distribution of extremely large interactions from simulations for smaller system sizes.
Submitted 5 April, 2017;
originally announced April 2017.
-
Accelerating cross-validation with total variation and its application to super-resolution imaging
Authors:
Tomoyuki Obuchi,
Shiro Ikeda,
Kazunori Akiyama,
Yoshiyuki Kabashima
Abstract:
We develop an approximation formula for the cross-validation error (CVE) of a sparse linear regression penalized by $\ell_1$-norm and total variation terms, which is based on a perturbative expansion utilizing the largeness of both the data dimensionality and the model. The developed formula allows us to reduce the necessary computational cost of the CVE evaluation significantly. The practicality of the formula is tested through application to simulated black-hole image reconstruction on the event-horizon scale with super resolution. The results demonstrate that our approximation reproduces the CVE values obtained via literally conducted cross-validation with reasonably good precision.
Submitted 20 November, 2017; v1 submitted 22 November, 2016;
originally announced November 2016.
-
Relative species abundance of replicator dynamics with sparse interactions
Authors:
Tomoyuki Obuchi,
Yoshiyuki Kabashima,
Kei Tokita
Abstract:
A theory of relative species abundance on sparsely-connected networks is presented by investigating the replicator dynamics with symmetric interactions. Sparseness of a network involves difficulty in analyzing the fixed points of the equation, and we avoid this problem by treating large self interaction $u$, which allows us to construct a perturbative expansion. Based on this perturbation, we find that the nature of the interactions is directly connected to the abundance distribution, and some characteristic behaviors, such as multiple peaks in the abundance distribution and all species coexistence at moderate values of $u$, are discovered in a wide class of the distribution of the interactions. The all species coexistence collapses at a critical value of $u$, $u_c$, and this collapsing is regarded as a phase transition. To get more quantitative information, we also construct a non-perturbative theory on random graphs based on techniques of statistical mechanics. The result shows those characteristic behaviors are sustained well even for not large $u$. For even smaller values of $u$, extinct species start to appear and the abundance distribution becomes rounded and closer to a standard functional form. Another interesting finding is the non-monotonic behavior of diversity, which quantifies the number of coexisting species, when changing the ratio of mutualistic relations $Δ$. These results are examined by numerical simulations, and the multiple peaks in the abundance distribution are confirmed to be robust against a certain level of modifications of the problem. The numerical results also show that our theory is exact for the case without extinct species, but becomes less and less precise as the proportion of extinct species grows.
Submitted 31 May, 2016;
originally announced May 2016.
-
Multiple peaks of species abundance distributions induced by sparse interactions
Authors:
Tomoyuki Obuchi,
Yoshiyuki Kabashima,
Kei Tokita
Abstract:
We investigate the replicator dynamics with "sparse" symmetric interactions which represent specialist-specialist interactions in ecological communities. By considering a large self interaction $u$, we conduct a perturbative expansion which manifests that the nature of the interactions has a direct impact on the species abundance distribution. The central results are all species coexistence in a realistic range of the model parameters and that a certain discrete nature of the interactions induces multiple peaks in the species abundance distribution, providing the possibility of theoretically explaining multiple peaks observed in various field studies. To get more quantitative information, we also construct a non-perturbative theory which becomes exact on tree-like networks if all the species coexist, providing exact critical values of $u$ below which extinct species emerge. Numerical simulations in various different situations are conducted and they clarify the robustness of the presented mechanism of all species coexistence and multiple peaks in the species abundance distributions.
Submitted 30 May, 2016;
originally announced May 2016.
-
Resilience of antagonistic networks with regard to the effects of initial failures and degree-degree correlations
Authors:
Shunsuke Watanabe,
Yoshiyuki Kabashima
Abstract:
In this study, we investigate the resilience of duplex networked layers ($α$ and $β$) coupled with antagonistic interlinks, each layer of which inhibits its counterpart at the microscopic level, changing the following factors: whether the influence of the initial failures in $α$ remains (quenched (Case Q)) or not (free (Case F)); the effect of intralayer degree-degree correlations in each layer and interlayer degree-degree correlations; and the type of the initial failures, such as random failures (RFs) or targeted attacks (TAs). We illustrate that the percolation processes repeat in both Cases Q and F, although only in Case F are nodes that initially failed reactivated. To analytically evaluate the resilience of each layer, we develop a methodology based on the cavity method for deriving the size of a giant component (GC). Strong hysteresis, which is ignored in the standard cavity analysis, is observed in the repetition of the percolation processes particularly in Case F. To handle this, we heuristically modify interlayer messages for macroscopic analysis, the utility of which is verified by numerical experiments. The percolation transition in each layer is continuous in both Cases Q and F. We also analyze the influences of degree-degree correlations on the robustness of layer $α$, in particular for the case of TAs. The analysis indicates that the critical fraction of initial failures that makes the GC size in layer $α$ vanish depends only on its intralayer degree-degree correlations. Although our model is defined in a somewhat abstract manner, it may have relevance to ecological systems that are composed of endangered species (layer $α$) and invaders (layer $β$), the former of which are damaged by the latter whereas the latter are exterminated in the areas where the former are active.
Submitted 28 July, 2016; v1 submitted 18 May, 2016;
originally announced May 2016.
-
Sampling approach to sparse approximation problem: determining degrees of freedom by simulated annealing
Authors:
Tomoyuki Obuchi,
Yoshiyuki Kabashima
Abstract:
The approximation of a high-dimensional vector by a small combination of column vectors selected from a fixed matrix has been actively debated in several different disciplines. In this paper, a sampling approach based on the Monte Carlo method is presented as an efficient solver for such problems. Especially, the use of simulated annealing (SA), a metaheuristic optimization algorithm, for determining degrees of freedom (the number of used columns) by cross validation is focused on and tested. Test on a synthetic model indicates that our SA-based approach can find a nearly optimal solution for the approximation problem and, when combined with the CV framework, it can optimize the generalization ability. Its utility is also confirmed by application to a real-world supernova data set.
Submitted 4 October, 2016; v1 submitted 4 March, 2016;
originally announced March 2016.
-
Sparse approximation problem: how rapid simulated annealing succeeds and fails
Authors:
Tomoyuki Obuchi,
Yoshiyuki Kabashima
Abstract:
Information processing techniques based on sparseness have been actively studied in several disciplines. Among them, a mathematical framework to approximately express a given dataset by a combination of a small number of basis vectors of an overcomplete basis is termed the {\em sparse approximation}. In this paper, we apply simulated annealing, a metaheuristic algorithm for general optimization problems, to sparse approximation in the situation where the given data have a planted sparse representation and noise is present. The result in the noiseless case shows that our simulated annealing works well in a reasonable parameter region: the planted solution is found fairly rapidly. This is true even in the case where a common relaxation of the sparse approximation problem, the $\ell_1$-relaxation, is ineffective. On the other hand, when the dimensionality of the data is close to the number of non-zero components, another metastable state emerges, and our algorithm fails to find the planted solution. This phenomenon is associated with a first-order phase transition. In the case of very strong noise, it is no longer meaningful to search for the planted solution. In this situation, our algorithm determines a solution with close-to-minimum distortion fairly quickly.
Submitted 4 March, 2016; v1 submitted 5 January, 2016;
originally announced January 2016.
-
Cross validation in LASSO and its acceleration
Authors:
Tomoyuki Obuchi,
Yoshiyuki Kabashima
Abstract:
We investigate leave-one-out cross validation (CV) as a determinator of the weight of the penalty term in the least absolute shrinkage and selection operator (LASSO). First, on the basis of the message passing algorithm and a perturbative discussion assuming that the number of observations is sufficiently large, we provide simple formulas for approximately assessing two types of CV errors, which enable us to significantly reduce the necessary cost of computation. These formulas also provide a simple connection of the CV errors to the residual sums of squares between the reconstructed and the given measurements. Second, on the basis of this finding, we analytically evaluate the CV errors when the design matrix is given as a simple random matrix in the large size limit by using the replica method. Finally, these results are compared with those of numerical simulations on finite-size systems and are confirmed to be correct. We also apply the simple formulas of the first type of CV error to an actual dataset of the supernovae.
Submitted 4 March, 2016; v1 submitted 28 December, 2015;
originally announced January 2016.
-
Sparse approximation based on a random overcomplete basis
Authors:
Yoshinori Nakanishi-Ohno,
Tomoyuki Obuchi,
Masato Okada,
Yoshiyuki Kabashima
Abstract:
We discuss a strategy of sparse approximation that is based on the use of an overcomplete basis, and evaluate its performance when a random matrix is used as this basis. A small combination of basis vectors is chosen from a given overcomplete basis, according to a given compression rate, such that they compactly represent the target data with as small a distortion as possible. As a selection method, we study the $\ell_0$- and $\ell_1$-based methods, which employ the exhaustive search and $\ell_1$-norm regularization techniques, respectively. The performance is assessed in terms of the trade-off relation between the representation distortion and the compression rate. First, we evaluate the performance analytically in the case that the methods are carried out ideally, using methods of statistical mechanics. Our result clarifies the fact that the $\ell_0$-based method greatly outperforms the $\ell_1$-based one. Second, we examine the practical performances of two well-known algorithms, orthogonal matching pursuit and approximate message passing, when they are used to execute the $\ell_0$- and $\ell_1$-based methods, respectively. Our examination shows that orthogonal matching pursuit achieves a much better performance than the exact execution of the $\ell_1$-based method, as well as approximate message passing. However, regarding the $\ell_0$-based method, there is still room to design more effective greedy algorithms than orthogonal matching pursuit. Finally, we evaluate the performances of the algorithms when they are applied to image data compression.
Submitted 2 March, 2016; v1 submitted 7 October, 2015;
originally announced October 2015.
-
Detectability of the spectral method for sparse graph partitioning
Authors:
Tatsuro Kawamoto,
Yoshiyuki Kabashima
Abstract:
We show that modularity maximization with the resolution parameter offers a unifying framework of graph partitioning. In this framework, we demonstrate that the spectral method exhibits universal detectability, irrespective of the value of the resolution parameter, as long as the graph is partitioned. Furthermore, we show that when the resolution parameter is sufficiently small, a first-order phase transition occurs, resulting in the graph being unpartitioned.
Submitted 18 December, 2015; v1 submitted 22 September, 2015;
originally announced September 2015.
-
Online compressed sensing
Authors:
Paulo V. Rossi,
Yoshiyuki Kabashima,
Jun-ichi Inoue
Abstract:
In this paper, we explore the possibilities and limitations of recovering sparse signals in an online fashion. Employing a mean field approximation to the Bayes recursion formula yields an online signal recovery algorithm that can be performed with a computational cost that is linearly proportional to the signal length per update. Analysis of the resulting algorithm indicates that the online algorithm asymptotically saturates the optimal performance limit achieved by the offline method in the presence of Gaussian measurement noise, while differences in the allowable computational costs may result in fundamental gaps of the achievable performance in the absence of noise.
Submitted 16 September, 2015;
originally announced September 2015.
-
Limitations in the spectral method for graph partitioning: detectability threshold and localization of eigenvectors
Authors:
Tatsuro Kawamoto,
Yoshiyuki Kabashima
Abstract:
Investigating the performance of different methods is a fundamental problem in graph partitioning. In this paper, we estimate the so-called detectability threshold for the spectral method with both unnormalized and normalized Laplacians in sparse graphs. The detectability threshold is the critical point at which the result of the spectral method is completely uncorrelated to the planted partition. We also analyze whether the localization of eigenvectors affects the partitioning performance in the detectable region. We use the replica method, which is often used in the field of spin-glass theory, and focus on the case of bisection. We show that the gap between the estimated threshold for the spectral method and the threshold obtained from Bayesian inference is considerable in sparse graphs, even without eigenvector localization. This gap closes in a dense limit.
Submitted 9 June, 2015; v1 submitted 24 February, 2015;
originally announced February 2015.
-
Replica Symmetric Bound for Restricted Isometry Constant
Authors:
Ayaka Sakata,
Yoshiyuki Kabashima
Abstract:
We develop a method for evaluating restricted isometry constants (RICs). This evaluation is reduced to the identification of the zero-points of entropy, which is defined for submatrices that are composed of columns selected from a given measurement matrix. Using the replica method developed in statistical mechanics, we assess RICs for Gaussian random matrices under the replica symmetric (RS) assumption. In order to numerically validate the adequacy of our analysis, we employ the exchange Monte Carlo (EMC) method, which has been empirically demonstrated to achieve much higher numerical accuracy than naive Monte Carlo methods. The EMC method suggests that our theoretical estimation of an RIC corresponds to an upper bound that is tighter than in preceding studies. Physical consideration indicates that our assessment of the RIC could be improved by taking into account the replica symmetry breaking.
Submitted 23 June, 2015; v1 submitted 26 January, 2015;
originally announced January 2015.
-
Replica analysis of Franz-Parisi potential for sparse systems
Authors:
Masahiko Ueda,
Yoshiyuki Kabashima
Abstract:
We propose a method for calculating the Franz-Parisi potential for spin glass models on sparse random graphs using the replica method under the replica symmetric ansatz. The resulting self-consistent equations have the solution with the characteristic structure of multi-body overlaps, and the self-consistent equations under this solution are equivalent to the one-step replica symmetry breaking (1RSB) cavity equation with Parisi parameter $x=1$. This method is useful for the evaluation of transition temperatures of the $p$-spin model on regular random graphs under a uniform magnetic field.
Submitted 10 March, 2015; v1 submitted 5 December, 2014;
originally announced December 2014.
-
Origin of the computational hardness for learning with binary synapses
Authors:
Haiping Huang,
Yoshiyuki Kabashima
Abstract:
Supervised learning in a binary perceptron is able to classify an extensive number of random patterns by a proper assignment of binary synaptic weights. However, finding such assignments in practice is quite a nontrivial task. The relation between the weight space structure and the algorithmic hardness has not yet been fully understood. To this end, we analytically derive the Franz-Parisi potential for the binary perceptron problem, by starting from an equilibrium solution of weights and exploring the weight space structure around it. Our result reveals the geometrical organization of the weight space: it is composed of isolated solutions, rather than clusters of exponentially many close-by solutions. The point-like clusters far apart from each other in the weight space explain the previously observed glassy behavior of stochastic local search heuristics.
Submitted 8 August, 2014;
originally announced August 2014.
-
Phase transitions and sample complexity in Bayes-optimal matrix factorization
Authors:
Yoshiyuki Kabashima,
Florent Krzakala,
Marc Mézard,
Ayaka Sakata,
Lenka Zdeborová
Abstract:
We analyse the matrix factorization problem. Given a noisy measurement of a product of two matrices, the problem is to estimate back the original matrices. It arises in many applications such as dictionary learning, blind matrix calibration, sparse principal component analysis, blind source separation, low rank matrix completion, robust principal component analysis or factor analysis. It is also important in machine learning: unsupervised representation learning can often be studied through matrix factorization. We use the tools of statistical mechanics - the cavity and replica methods - to analyze the achievability and computational tractability of the inference problems in the setting of Bayes-optimal inference, which amounts to assuming that the two matrices have random independent elements generated from some known distribution, and this information is available to the inference algorithm. In this setting, we compute the minimal mean-squared-error achievable in principle in any computational time, and the error that can be achieved by an efficient approximate message passing algorithm. The computation is based on the asymptotic state-evolution analysis of the algorithm. The performance that our analysis predicts, both in terms of the achieved mean-squared-error, and in terms of sample complexity, is extremely promising and motivating for a further development of the algorithm.
Submitted 21 March, 2016; v1 submitted 6 February, 2014;
originally announced February 2014.
-
Signal recovery using expectation consistent approximation for linear observations
Authors:
Yoshiyuki Kabashima,
Mikko Vehkapera
Abstract:
A signal recovery scheme is developed for linear observation systems based on expectation consistent (EC) mean field approximation. Approximate message passing (AMP) is known to be consistent with the results obtained using the replica theory, which is supposed to be exact in the large system limit, when each entry of the observation matrix is independently generated from an identical distribution. However, this is not necessarily the case for general matrices. We show that EC recovery exhibits consistency with the replica theory for a wider class of random observation matrices. This is numerically confirmed by experiments for the Bayesian optimal signal recovery of compressed sensing using random row-orthogonal matrices.
Submitted 6 July, 2014; v1 submitted 20 January, 2014;
originally announced January 2014.
-
Reconstruction algorithm in compressed sensing based on maximum a posteriori estimation
Authors:
Koujin Takeda,
Yoshiyuki Kabashima
Abstract:
We propose a systematic method for constructing a sparse data reconstruction algorithm in compressed sensing at a relatively low computational cost for a general observation matrix. It is known that the cost of l1-norm minimization using a standard linear programming algorithm is O(N^3). We show that this cost can be reduced to O(N^2) by applying the approach of posterior maximization. Furthermore, in principle, the algorithm from our approach is expected to achieve the widest successful reconstruction region, as evaluated by a theoretical argument. We also discuss the relation between the belief propagation-based reconstruction algorithm introduced in preceding works and our approach.
Submitted 1 November, 2013;
originally announced November 2013.
-
Dynamics of asymmetric kinetic Ising systems revisited
Authors:
Haiping Huang,
Yoshiyuki Kabashima
Abstract:
The dynamics of an asymmetric kinetic Ising model is studied. Two schemes for improving the existing mean-field description are proposed. In the first scheme, we derive the formulas for instantaneous magnetization, equal-time correlation, and time-delayed correlation, considering the correlation between different local fields. To derive the time-delayed correlation, we emphasize that the small correlation assumption adopted in previous work [M. Mézard and J. Sakellariou, J. Stat. Mech., L07001 (2011)] is in fact not required. To confirm the inference efficiency of our method, we perform extensive simulations on single instances with either temporally constant external driving fields or sinusoidal external fields. In the second scheme, we develop an improved mean-field theory for instantaneous magnetization prediction utilizing the notion of the cavity system in conjunction with a perturbative expansion approach. Its efficiency is numerically confirmed by comparison with the existing mean-field theory when partially asymmetric couplings are present.
Submitted 15 April, 2014; v1 submitted 18 October, 2013;
originally announced October 2013.
-
Cavity-based robustness analysis of interdependent networks: Influences of intranetwork and internetwork degree-degree correlations
Authors:
Shunsuke Watanabe,
Yoshiyuki Kabashima
Abstract:
We develop a methodology for analyzing the percolation phenomena of two mutually coupled (interdependent) networks based on the cavity method of statistical mechanics. In particular, we take into account the influence of degree-degree correlations inside and between the networks on the network robustness against targeted attacks and random failures. We show that the developed methodology is reduced to the well-known generating function formalism in the absence of degree-degree correlations. The validity of the developed methodology is confirmed by a comparison with the results of numerical experiments. Our analytical results imply that the robustness of the interdependent networks depends considerably on both the intra- and internetwork degree-degree correlations in the case of targeted attacks, whereas the significance of the degree-degree correlations is relatively low for random failures.
Submitted 9 February, 2014; v1 submitted 6 August, 2013;
originally announced August 2013.
-
Entropy landscape of solutions in the binary perceptron problem
Authors:
Haiping Huang,
K. Y. Michael Wong,
Yoshiyuki Kabashima
Abstract:
The statistical picture of the solution space for a binary perceptron is studied. The binary perceptron learns a random classification of input random patterns by a set of binary synaptic weights. The learning of this network is difficult especially when the pattern (constraint) density is close to the capacity, which is supposed to be intimately related to the structure of the solution space. The geometrical organization is elucidated by the entropy landscape from a reference configuration and of solution-pairs separated by a given Hamming distance in the solution space. We evaluate the entropy at the annealed level as well as replica symmetric level and the mean field result is confirmed by the numerical simulations on single instances using the proposed message passing algorithms. From the first landscape (a random configuration as a reference), we see clearly how the solution space shrinks as more constraints are added. From the second landscape of solution-pairs, we deduce the coexistence of clustering and freezing in the solution space.
Submitted 8 August, 2013; v1 submitted 10 April, 2013;
originally announced April 2013.
-
Adaptive Thouless-Anderson-Palmer approach to inverse Ising problems with quenched random fields
Authors:
Haiping Huang,
Yoshiyuki Kabashima
Abstract:
The adaptive Thouless-Anderson-Palmer equation is derived for inverse Ising problems in the presence of quenched random fields. We test the proposed scheme on Sherrington-Kirkpatrick, Hopfield, and random orthogonal models and find that the adaptive Thouless-Anderson-Palmer approach allows surprisingly accurate inference of quenched random fields whose distribution can be either Gaussian or bimodal, compared with other existing mean-field methods.
Submitted 5 May, 2013; v1 submitted 12 March, 2013;
originally announced March 2013.
-
Sample Complexity of Bayesian Optimal Dictionary Learning
Authors:
Ayaka Sakata,
Yoshiyuki Kabashima
Abstract:
We consider a learning problem of identifying a dictionary matrix $D$ ($M \times N$ dimension) from a sample set of $M$-dimensional vectors $Y = N^{-1/2} DX$, where $X$ is a sparse matrix ($N \times P$ dimension) in which the density of non-zero entries is $0 < \rho < 1$. In particular, we focus on the minimum sample size $P_c$ (sample complexity) necessary for perfectly identifying $D$ of the optimal learning scheme when $D$ and $X$ are independently generated from certain distributions. By using the replica method of statistical mechanics, we show that $P_c = O(N)$ holds as long as $\alpha = M/N > \rho$ is satisfied in the limit of $N \to \infty$. Our analysis also implies that the posterior distribution given $Y$ is condensed only at the correct dictionary $D$ when the compression rate $\alpha$ is greater than a certain critical value $\alpha_M(\rho)$. This suggests that belief propagation may allow us to learn $D$ with a low computational complexity using $O(N)$ samples.
Submitted 5 February, 2014; v1 submitted 25 January, 2013;
originally announced January 2013.
-
Statistical mechanics approach to 1-bit compressed sensing
Authors:
Yingying Xu,
Yoshiyuki Kabashima
Abstract:
Compressed sensing is a technique for recovering a high-dimensional signal from lower-dimensional data, whose components represent partial information about the signal, utilizing prior knowledge on the sparsity of the signal. To further reduce the data size of the compressed expression, a scheme to recover the original signal utilizing only the sign of each entry of the linearly transformed vector was recently proposed. This approach is often termed 1-bit compressed sensing. Here we analyze the typical performance of an L1-norm based signal recovery scheme for 1-bit compressed sensing using statistical mechanics methods. We show that the signal recovery performance predicted by the replica method under the replica symmetric ansatz, which turns out to be locally unstable for modes breaking the replica symmetry, is in good agreement with experimental results of an approximate recovery algorithm developed earlier. This suggests that the L1-based recovery problem typically has many local optima of a similar recovery accuracy, which can be achieved by the approximate algorithm. We also develop another approximate recovery algorithm inspired by the cavity method. Numerical experiments show that when the density of nonzero entries in the original signal is relatively large, the new algorithm offers better performance than the above-mentioned scheme and does so with a lower computational cost.
Submitted 15 February, 2014; v1 submitted 8 January, 2013;
originally announced January 2013.
-
Typical $l_1$-recovery limit of sparse vectors represented by concatenations of random orthogonal matrices
Authors:
Yoshiyuki Kabashima,
Mikko Vehkapera,
Saikat Chatterjee
Abstract:
We consider the problem of recovering an $N$-dimensional sparse vector $\vm{x}$ from its linear transformation $\vm{y}=\vm{D} \vm{x}$ of $M(< N)$ dimension. Minimizing the $l_{1}$-norm of $\vm{x}$ under the constraint $\vm{y} = \vm{D} \vm{x}$ is a standard approach for the recovery problem, and earlier studies report that the critical condition for typically successful $l_1$-recovery is universal over a variety of randomly constructed matrices $\vm{D}$. For examining the extent of the universality, we focus on the case in which $\vm{D}$ is provided by concatenating $\nb=N/M$ matrices $\vm{O}_{1}, \vm{O}_{2},..., \vm{O}_\nb$ drawn uniformly according to the Haar measure on the $M \times M$ orthogonal matrices. By using the replica method in conjunction with the development of an integral formula for handling the random orthogonal matrices, we show that the concatenated matrices can result in better recovery performance than what the universality predicts when the density of non-zero signals is not uniform among the $\nb$ matrix modules. The universal condition is reproduced for the special case of uniform non-zero signal densities. Extensive numerical experiments support the theoretical predictions.
Submitted 6 December, 2012; v1 submitted 23 August, 2012;
originally announced August 2012.