-
Exploring the Energy Landscape of RBMs: Reciprocal Space Insights into Bosons, Hierarchical Learning and Symmetry Breaking
Authors:
J. Quetzalcóatl Toledo-Marin,
Anindita Maiti,
Geoffrey C. Fox,
Roger G. Melko
Abstract:
Deep generative models have become ubiquitous due to their ability to learn and sample from complex distributions. Despite the proliferation of various frameworks, the relationships among these models remain largely unexplored, a gap that hinders the development of a unified theory of AI learning. We address two central challenges: clarifying the connections between different deep generative models and deepening our understanding of their learning mechanisms. We focus on Restricted Boltzmann Machines (RBMs), known for their universal approximation capabilities for discrete distributions. By introducing a reciprocal space formulation, we reveal a connection between RBMs, diffusion processes, and coupled Bosons. We show that at initialization, the RBM operates at a saddle point, where the local curvature is determined by the singular values, whose distribution follows the Marcenko-Pastur law and exhibits rotational symmetry. During training, this rotational symmetry is broken due to hierarchical learning, where different degrees of freedom progressively capture features at multiple levels of abstraction. This leads to a symmetry breaking in the energy landscape, reminiscent of Landau theory. This symmetry breaking in the energy landscape is characterized by the singular values and the weight matrix eigenvector matrix. We derive the corresponding free energy in a mean-field approximation. We show that in the limit of infinite size RBM, the reciprocal variables are Gaussian distributed. Our findings indicate that in this regime, there will be some modes for which the diffusion process will not converge to the Boltzmann distribution. To illustrate our results, we trained replicas of RBMs with different hidden layer sizes using the MNIST dataset. Our findings bridge the gap between disparate generative frameworks and also shed light on the processes underpinning learning in generative models.
Submitted 27 March, 2025;
originally announced March 2025.
-
Propagation of Enzyme-driven Active Fluctuations in Crowded Milieu
Authors:
Rik Chakraborty,
Arnab Maiti,
Diptangshu Paul,
Rajnandan Borthakur,
K. R. Jayaprakash,
Uddipta Ghosh,
Krishna Kanti Dey
Abstract:
We investigated the energy transfer from active enzymes to their surroundings in crowded environments by measuring the diffusion of passive microscopic tracers in active solutions of ficoll and glycerol. Despite observing lower rates of substrate turnover and relatively smaller enhancement of passive tracer diffusion in artificial crowded media compared to those in aqueous solutions, we found a significantly higher relative diffusion enhancement in crowded environments in the presence of enzymatic activity. Our experimental observations, coupled with supporting analytical estimations, underscored the critical role of the intervening media in facilitating mechanical energy distribution around active enzymes.
Submitted 1 August, 2024;
originally announced August 2024.
-
Bayesian RG Flow in Neural Network Field Theories
Authors:
Jessica N. Howard,
Marc S. Klinger,
Anindita Maiti,
Alexander G. Stapleton
Abstract:
The Neural Network Field Theory correspondence (NNFT) is a mapping from neural network (NN) architectures into the space of statistical field theories (SFTs). The Bayesian renormalization group (BRG) is an information-theoretic coarse graining scheme that generalizes the principles of the exact renormalization group (ERG) to arbitrarily parameterized probability distributions, including those of NNs. In BRG, coarse graining is performed in parameter space with respect to an information-theoretic distinguishability scale set by the Fisher information metric. In this paper, we unify NNFT and BRG to form a powerful new framework for exploring the space of NNs and SFTs, which we coin BRG-NNFT. With BRG-NNFT, NN training dynamics can be interpreted as inducing a flow in the space of SFTs from the information-theoretic `IR' $\rightarrow$ `UV'. Conversely, applying an information-shell coarse graining to the trained network's parameters induces a flow in the space of SFTs from the information-theoretic `UV' $\rightarrow$ `IR'. When the information-theoretic cutoff scale coincides with a standard momentum scale, BRG is equivalent to ERG. We demonstrate the BRG-NNFT correspondence on two analytically tractable examples. First, we construct BRG flows for trained, infinite-width NNs, of arbitrary depth, with generic activation functions. As a special case, we then restrict to architectures with a single infinitely-wide layer, scalar outputs, and generalized cos-net activations. In this case, we show that BRG coarse-graining corresponds exactly to the momentum-shell ERG flow of a free scalar SFT. Our analytic results are corroborated by a numerical experiment in which an ensemble of asymptotically wide NNs are trained and subsequently renormalized using an information-shell BRG scheme.
Submitted 5 February, 2025; v1 submitted 27 May, 2024;
originally announced May 2024.
-
Asymptotic theory of in-context learning by linear attention
Authors:
Yue M. Lu,
Mary I. Letey,
Jacob A. Zavatone-Veth,
Anindita Maiti,
Cengiz Pehlevan
Abstract:
Transformers have a remarkable ability to learn and execute tasks based on examples provided within the input itself, without explicit prior training. It has been argued that this capability, known as in-context learning (ICL), is a cornerstone of Transformers' success, yet questions about the necessary sample complexity, pretraining task diversity, and context length for successful ICL remain unresolved. Here, we provide a precise answer to these questions in an exactly solvable model of ICL of a linear regression task by linear attention. We derive sharp asymptotics for the learning curve in a phenomenologically-rich scaling regime where the token dimension is taken to infinity; the context length and pretraining task diversity scale proportionally with the token dimension; and the number of pretraining examples scales quadratically. We demonstrate a double-descent learning curve with increasing pretraining examples, and uncover a phase transition in the model's behavior between low and high task diversity regimes: In the low diversity regime, the model tends toward memorization of training tasks, whereas in the high diversity regime, it achieves genuine in-context learning and generalization beyond the scope of pretrained tasks. These theoretical insights are empirically validated through experiments with both linear attention and full nonlinear Transformer architectures.
Submitted 4 February, 2025; v1 submitted 19 May, 2024;
originally announced May 2024.
-
Wilsonian Renormalization of Neural Network Gaussian Processes
Authors:
Jessica N. Howard,
Ro Jefferson,
Anindita Maiti,
Zohar Ringel
Abstract:
Separating relevant and irrelevant information is key to any modeling process or scientific inquiry. Theoretical physics offers a powerful tool for achieving this in the form of the renormalization group (RG). Here we demonstrate a practical approach to performing Wilsonian RG in the context of Gaussian Process (GP) Regression. We systematically integrate out the unlearnable modes of the GP kernel, thereby obtaining an RG flow of the GP in which the data sets the IR scale. In simple cases, this results in a universal flow of the ridge parameter, which becomes input-dependent in the richer scenario in which non-Gaussianities are included. In addition to being analytically tractable, this approach goes beyond structural analogies between RG and neural networks by providing a natural connection between RG flow and learnable vs. unlearnable modes. Studying such flows may improve our understanding of feature learning in deep neural networks, and enable us to identify potential universality classes in these models.
Submitted 13 May, 2025; v1 submitted 9 May, 2024;
originally announced May 2024.
-
Activity Induced Diffusion Recovery in Crowded Colloidal Suspension
Authors:
Arnab Maiti,
Yuki Koyano,
Hiroyuki Kitahata,
Krishna Kanti Dey
Abstract:
We show that the forces generated by active enzyme molecules are strong enough to influence the dynamics of their surroundings in artificially crowded environments. We measured the behavior of polymer microparticles in a quasi-two-dimensional system in an aqueous environment, at various particle area fractions. In the presence of enzymatic activity, not only was the diffusion of the suspended particles enhanced in the shorter time-scale regime, but the system also showed a transition from sub-diffusive to diffusive dynamics at longer time scales. Similar observations were also recorded with enzyme-functionalized microparticles. Brownian dynamics simulations were performed to support the experimental observations.
Submitted 9 May, 2023;
originally announced May 2023.
-
Design of acoustic diffraction plates for manipulating ultrasound in liquid Helium
Authors:
Ayanesh Maiti,
Dillip K. Pradhan,
Ambarish Ghosh
Abstract:
Many experiments in liquid Helium, such as the optical imaging of exploding electron bubbles, which enables research on individual particles under applied conditions, involve the usage of ultrasound generated by piezoelectric transducers. Previous studies either use planar transducers, which limits the maximum sound intensity and the spatial resolution, or curved transducers, which only allow observations at fixed foci and make it difficult to apply uniform electric fields. In this paper, we introduce the usage of acoustic diffraction plates in liquid Helium to amplify ultrasonic pressure oscillations at an arbitrary set of primary foci coupled with large counts of secondary foci, all of which can be freely moved around by changing the ultrasound frequency. The frequency dependence also allows us to generate controlled Faraday instabilities at the surface, which enables the generation of multi-electron bubbles with desired parameters.
Submitted 11 March, 2022;
originally announced March 2022.
-
Anomalies in the electronic structure of a 5$d$ transition metal oxide, IrO$_2$
Authors:
Swapnil Patil,
Aniket Maiti,
Surajit Dutta,
Khadiza Ali,
Pramita Mishra,
Ram Prakash Pandeya,
Arindam Pramanik,
Sawani Datta,
Srinivas C. Kandukuri,
Kalobaran Maiti
Abstract:
Ir-based materials have drawn much attention due to the observation of an insulating phase believed to be driven by spin-orbit coupling, while Ir 5$d$ states are expected to be weakly correlated due to their large orbital extensions. IrO$_2$, a simple binary material, shows a metallic ground state, which seems to deviate from the behavior of most other Ir-based materials and the varied predictions for this material class. We studied the electronic structure of IrO$_2$ at different temperatures employing high-resolution photoemission spectroscopy with photon energies spanning from the ultraviolet to the hard $x$-ray range. Experimental spectra exhibit a signature of enhancement of Ir-O covalency in the bulk compared to the surface electronic structure. The branching ratio of the spin-orbit split Ir core level peaks is found to be larger than its atomic value, and it is enhanced further in the bulk electronic structure. Such deviation from the atomic description of the core level spectroscopy manifests the enhancement of the orbital moment due to solid state effects. The valence band spectra could be captured well within density functional theory. The photon energy dependence of the features in the valence band spectra and their comparison with the calculated results show dominant Ir 5$d$ character of the features near the Fermi level; O 2$p$ peaks appear at higher binding energies. Interestingly, the O 2$p$ contribution to the feature at the Fermi level is significant, and it is enhanced at low temperatures. This reveals an orbital-selective enhancement of the covalency with cooling, which is evidence against the purely spin-orbit-coupling-based scenario proposed for these systems.
Submitted 25 September, 2021;
originally announced September 2021.
-
Growth and characterization of high-quality single-crystalline SnTe retaining cubic symmetry down to the lowest temperature studied
Authors:
Ayanesh Maiti,
Ankita Singh,
Kartik K. Iyer,
Arumugam Thamizhavel
Abstract:
SnTe, an archetypical topological crystalline insulator, often shows a transition from a highly symmetric cubic phase to a rhombohedral structure at low temperatures. In order to achieve the highly symmetric cubic phase at low temperatures suitable for quantum behaviour, we have employed the modified Bridgman method to grow a high-quality single-crystalline sample of SnTe. Analysis of the crystal structure using Laue diffraction and rocking curve measurements shows a very high degree of single crystallinity of the sample. Resistivity and specific heat data do not show any signature of a structural transition down to the lowest temperature studied. The magnetic susceptibility shows diamagnetic behaviour. All these properties manifest the behaviour of a typical bulk semiconductor with conducting surface states, as expected in a topological material. Detailed powder x-ray diffraction measurements show a cubic structure over the whole temperature range studied.
Submitted 17 November, 2021; v1 submitted 3 September, 2021;
originally announced September 2021.
-
Anomalies in the temperature evolution of the Dirac states in a topological crystalline insulator SnTe
Authors:
Ayanesh Maiti,
Ram Prakash Pandeya,
Bahadur Singh,
Kartik K Iyer,
A Thamizhavel,
Kalobaran Maiti
Abstract:
The discovery of topologically protected surface states, believed to be immune to weak disorder and thermal effects, opened up a new avenue to reveal exotic fundamental science and advanced technology. While time-reversal symmetry plays the key role in most such materials, bulk crystalline symmetries such as mirror symmetry preserve the topological properties of topological crystalline insulators (TCIs). It is apparent that any structural change may alter the topological properties of TCIs. To investigate this relatively unexplored landscape, we study the temperature evolution of the Dirac fermion states in an archetypical mirror-symmetry protected TCI, SnTe, employing high-resolution angle-resolved photoemission spectroscopy and density functional theory studies. Experimental results reveal a perplexing scenario; the bulk bands observed at 22 K move nearer to the Fermi level at 60 K and again shift back to higher binding energies at 120 K. The slope of the surface Dirac bands at 22 K becomes smaller at 60 K and changes back to a larger value at 120 K. Our results from the first-principles calculations suggest that these anomalies can be attributed to the evolution of the hybridization physics with complex structural changes induced by temperature. In addition, we discover a drastically reduced intensity of the Dirac states at the Fermi level at high temperatures, which may be due to the complex evolution of anharmonicity, strain, etc. These results address the robustness of the topologically protected surface states against thermal effects and emphasize the importance of covalency and anharmonicity in the topological properties of such emerging quantum materials.
Submitted 14 July, 2021;
originally announced July 2021.
-
Neural Networks and Quantum Field Theory
Authors:
James Halverson,
Anindita Maiti,
Keegan Stoner
Abstract:
We propose a theoretical understanding of neural networks in terms of Wilsonian effective field theory. The correspondence relies on the fact that many asymptotic neural networks are drawn from Gaussian processes, the analog of non-interacting field theories. Moving away from the asymptotic limit yields a non-Gaussian process and corresponds to turning on particle interactions, allowing for the computation of correlation functions of neural network outputs with Feynman diagrams. Minimal non-Gaussian process likelihoods are determined by the most relevant non-Gaussian terms, according to the flow in their coefficients induced by the Wilsonian renormalization group. This yields a direct connection between overparameterization and simplicity of neural network likelihoods. Whether the coefficients are constants or functions may be understood in terms of GP limit symmetries, as expected from 't Hooft's technical naturalness. General theoretical calculations are matched to neural network experiments in the simplest class of models allowing the correspondence. Our formalism is valid for any of the many architectures that become a GP in an asymptotic limit, a property preserved under certain types of training.
Submitted 15 March, 2021; v1 submitted 19 August, 2020;
originally announced August 2020.
-
Probing Localized Surface Plasmons of Trisoctahedral Gold Nanocrystals for Surface Enhanced Raman Scattering
Authors:
Achyut Maity,
Arpan Maiti,
Biswarup Satpati,
Avinash Patsha,
Sandip Dhara,
Tapas Kumar Chini
Abstract:
Trisoctahedral (TOH) shaped Au nanocrystals (NCs) have emerged as a new class of metal nanoparticles (MNPs) due to their superior catalytic and SERS activities caused by the presence of a high density of atomic steps and dangling bonds on their high-index facets. We examine the radiative localized surface plasmon resonance (LSPR) modes of an isolated single TOH Au NC using cathodoluminescence (CL), with high-resolution spatial information of the local density of optical states (LDOS) across the visible spectral range. Further, we show a pronounced enhancement factor in the Raman scattering by performing Raman spectroscopic measurements on Rhodamine 6G (R6G) covered TOH Au NP aggregates on a Si substrate. We believe that the hot spots between two adjacent MNP surfaces (nanogaps) can be significantly stronger than single-particle LSPRs. Such nanogap hot spots may play a crucial role in the substantial SERS enhancement observed in this report. Consequently, the present study indicates that MNP aggregates are more desirable than individual plasmonic nanoparticles for possible applications in SERS-based biosensing.
Submitted 12 May, 2017;
originally announced May 2017.
-
Electronic transport through carbon nanotubes -- effects of structural deformation and tube chirality
Authors:
Amitesh Maiti,
Alexei Svizhenko,
M. P. Anantram
Abstract:
Atomistic simulations using a combination of classical forcefield and Density-Functional-Theory (DFT) show that carbon atoms remain essentially sp2 coordinated in either bent tubes or tubes pushed by an atomically sharp AFM tip. Subsequent Green's-function-based transport calculations reveal that for armchair tubes there is no significant drop in conductance, while for zigzag tubes the conductance can drop by several orders of magnitude in AFM-pushed tubes. The effect can be attributed to simple stretching of the tube under tip deformation, which opens up an energy gap at the Fermi surface.
Submitted 20 February, 2002;
originally announced February 2002.