
Playing it safe: information constrains collective betting strategies

Authors: Philipp Fleig, Vijay Balasubramanian

Abstract: Every interaction of a living organism with its environment involves the placement of a bet. Armed with partial knowledge about a stochastic world, the organism must decide its next step or near-term strategy, an act that implicitly or explicitly involves the assumption of a model of the world. Better information about environmental statistics can improve the bet quality, but in practice resources for information gathering are always limited. We argue that theories of optimal inference dictate that "complex" models are harder to infer with bounded information and lead to larger prediction errors. Thus, we propose a principle of "playing it safe" where, given finite information gathering capacity, biological systems should be biased towards simpler models of the world, and thereby to less risky betting strategies. In the framework of Bayesian inference, we show that there is an optimally safe adaptation strategy determined by the Bayesian prior. We then demonstrate that, in the context of stochastic phenotypic switching by bacteria, implementation of our principle of "playing it safe" increases fitness (population growth rate) of the bacterial collective. We suggest that the principle applies broadly to problems of adaptation, learning and evolution, and illuminates the types of environments in which organisms are able to thrive.

Submitted 28 May, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

Comments: 23 pages, 10 figures; replotted figs S1-S3; corrected typos; added references

arXiv:2212.02987

Generative random latent features models and statistics of natural images

Authors: Philipp Fleig, Ilya Nemenman

Abstract: Complex, multivariable systems are often analyzed by grouping their constituent units into components, sometimes referred to as latent features, which afford physical or biological interpretation. However, a priori many different types of latent features and data decompositions can be defined, and one typically uses a trial and error approach to determine a decomposition that is natural to the system and its data. It is highly desirable to develop a principled understanding of which decomposition is appropriate for a given data set. In this work, we take a step in this direction and argue that sample-sample correlations in the data carry important information to this effect. For this we construct a generative random latent feature matrix model of large data based on linear mixing of latent features. A key ingredient of our model is that we allow for statistical dependence between the mixing coefficients and argue that the model captures characteristic properties found in many types of natural data. Latent dimensionality and correlation patterns of the data are controlled by only two model parameters. The model's data patterns include (overlapping) clusters, sparse mixing, and constrained (non-negative) mixing. We describe the characteristic correlation and eigenvalue distributions of each pattern. Finally, we fit the model on correlation data from natural images and find a near perfect match with the sparse mixing regime of our model. This finding is in line with the well-known sparse coding structure in natural scene images and provides information about the appropriate data decomposition, namely a sparse coding scheme. We believe that our work will deliver similar insights for diverse data of biological systems.

Submitted 13 June, 2024; v1 submitted 6 December, 2022; originally announced December 2022.

Comments: 15 pages, 8 figures

arXiv:2111.04641

doi 10.1103/PhysRevE.106.014102

Statistical properties of large data sets with linear latent features

Authors: Philipp Fleig, Ilya Nemenman

Abstract: Analytical understanding of how low-dimensional latent features reveal themselves in large-dimensional data is still lacking. We study this by defining a linear latent feature model with additive noise constructed from probabilistic matrices, and analytically and numerically computing the statistical distributions of pairwise correlations and eigenvalues of the correlation matrix. This allows us to resolve the latent feature structure across a wide range of data regimes set by the number of recorded variables, observations, latent features and the signal-to-noise ratio. We find a characteristic imprint of latent features in the distribution of correlations and eigenvalues and provide an analytic estimate for the boundary between signal and noise even in the absence of a clear spectral gap.

Submitted 8 November, 2021; originally announced November 2021.

Comments: 20 pages, 6 figures

Showing 1–3 of 3 results for author: Fleig, P