Showing 1–8 of 8 results for author: Advani, M

  1. arXiv:2008.08653  [pdf, other]

    cond-mat.dis-nn cond-mat.stat-mech cs.LG q-bio.NC stat.ML

    A new role for circuit expansion for learning in neural networks

    Authors: Julia Steinberg, Madhu Advani, Haim Sompolinsky

    Abstract: Many sensory pathways in the brain rely on sparsely active populations of neurons downstream from the input stimuli. The biological reason for the occurrence of expanded structure in the brain is unclear, but may be because expansion can increase the expressive power of a neural network. In this work, we show that expanding a neural network can improve its generalization performance even in cases…

    Submitted 21 December, 2020; v1 submitted 19 August, 2020; originally announced August 2020.

    Comments: 13+10 pages, 13 figures

    Journal ref: Phys. Rev. E 103, 022404 (2021)

  2. arXiv:1906.08632  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.LG

    Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup

    Authors: Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová

    Abstract: Deep neural networks achieve stellar generalisation even when they have enough parameters to easily fit all their training data. We study this phenomenon by analysing the dynamics and the performance of over-parameterised two-layer neural networks in the teacher-student setup, where one network, the student, is trained on data generated by another network, called the teacher. We show how the dynam…

    Submitted 27 October, 2019; v1 submitted 18 June, 2019; originally announced June 2019.

    Comments: 9 pages + references + supplemental material. Oral presentation at NeurIPS 2019. arXiv admin note: substantial text overlap with arXiv:1901.09085

    Journal ref: J. Stat. Mech. 2020 124010 & NeurIPS 2019

  3. arXiv:1901.09085  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.LG

    Generalisation dynamics of online learning in over-parameterised neural networks

    Authors: Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová

    Abstract: Deep neural networks achieve stellar generalisation on a variety of problems, despite often being large enough to easily fit all their training data. Here we study the generalisation dynamics of two-layer neural networks in a teacher-student setup, where one network, the student, is trained using stochastic gradient descent (SGD) on data generated by another network, called the teacher. We show ho…

    Submitted 25 January, 2019; originally announced January 2019.

    Comments: 25 pages, 13 figures

    Journal ref: Presented at the ICML 2019 Workshop on Theoretical Physics for Deep Learning

  4. arXiv:1803.01927  [pdf, other]

    cs.LG cond-mat.stat-mech stat.ML

    Energy-entropy competition and the effectiveness of stochastic gradient descent in machine learning

    Authors: Yao Zhang, Andrew M. Saxe, Madhu S. Advani, Alpha A. Lee

    Abstract: Finding parameters that minimise a loss function is at the core of many machine learning methods. The Stochastic Gradient Descent algorithm is widely used and delivers state of the art results for many problems. Nonetheless, Stochastic Gradient Descent typically cannot find the global minimum, thus its empirical effectiveness is hitherto mysterious. We derive a correspondence between parameter inf…

    Submitted 5 March, 2018; originally announced March 2018.

  5. arXiv:1707.03957  [pdf, other]

    q-bio.PE cond-mat.dis-nn cond-mat.stat-mech

    Environmental engineering is an emergent feature of diverse ecosystems and drives community structure

    Authors: Madhu Advani, Guy Bunin, Pankaj Mehta

    Abstract: A central question in ecology is to understand the ecological processes that shape community structure. Niche-based theories have emphasized the important role played by competition for maintaining species diversity. Many of these insights have been derived using MacArthur's consumer resource model (MCRM) or its generalizations. Most theoretical work on the MCRM has focused on small ecosystems wit…

    Submitted 12 July, 2017; originally announced July 2017.

    Comments: 14 pages, 5 figures

  6. arXiv:1609.07060  [pdf, other]

    stat.ML cond-mat.dis-nn math.ST q-bio.NC

    An equivalence between high dimensional Bayes optimal inference and M-estimation

    Authors: Madhu Advani, Surya Ganguli

    Abstract: When recovering an unknown signal from noisy measurements, the computational difficulty of performing optimal Bayesian MMSE (minimum mean squared error) inference often necessitates the use of maximum a posteriori (MAP) inference, a special case of regularized M-estimation, as a surrogate. However, MAP is suboptimal in high dimensions, when the number of unknown signal components is similar to the…

    Submitted 22 September, 2016; originally announced September 2016.

    Comments: To appear in NIPS 2016

  7. arXiv:1601.04650  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech math.ST q-bio.QM

    Statistical Mechanics of High-Dimensional Inference

    Authors: Madhu Advani, Surya Ganguli

    Abstract: To model modern large-scale datasets, we need efficient algorithms to infer a set of $P$ unknown model parameters from $N$ noisy measurements. What are fundamental limits on the accuracy of parameter inference, given finite signal-to-noise ratios, limited measurements, prior information, and computational tractability requirements? How can we combine prior information with measurements to achieve…

    Submitted 21 February, 2016; v1 submitted 18 January, 2016; originally announced January 2016.

    Comments: See http://ganguli-gang.stanford.edu/pdf/HighDimInf.Supp.pdf for supplementary material

    Journal ref: Phys. Rev. X 6, 031034 (2016)

  8. arXiv:1301.7115  [pdf, ps, other]

    q-bio.NC cond-mat.dis-nn stat.ML

    Statistical mechanics of complex neural systems and high dimensional data

    Authors: Madhu Advani, Subhaneil Lahiri, Surya Ganguli

    Abstract: Recent experimental advances in neuroscience have opened new vistas into the immense complexity of neuronal networks. This proliferation of data challenges us on two parallel fronts. First, how can we form adequate theoretical frameworks for understanding how dynamical network processes cooperate across widely disparate spatiotemporal scales to solve important computational problems? And second, h…

    Submitted 29 January, 2013; originally announced January 2013.

    Comments: 72 pages, 8 figures, iopart.cls, to appear in JSTAT