Sparse Coding and Autoencoders

Rangamani, Akshay; Mukherjee, Anirbit; Basu, Amitabh; Ganapathy, Tejaswini; Arora, Ashish; Chin, Sang; Tran, Trac D.


arXiv:1708.03735 (cs)

[Submitted on 12 Aug 2017 (v1), last revised 20 Oct 2017 (this version, v2)]


Abstract: In "Dictionary Learning" one tries to recover incoherent matrices $A^* \in \mathbb{R}^{n \times h}$ (typically overcomplete and whose columns are assumed to be normalized) and sparse vectors $x^* \in \mathbb{R}^h$ with a small support of size $h^p$ for some $0 < p < 1$ while having access to observations $y \in \mathbb{R}^n$ where $y = A^*x^*$. In this work we undertake a rigorous analysis of whether gradient descent on the squared loss of an autoencoder can solve the dictionary learning problem. The "Autoencoder" architecture we consider is a $\mathbb{R}^n \rightarrow \mathbb{R}^n$ mapping with a single ReLU activation layer of size $h$.
Under very mild distributional assumptions on $x^*$, we prove that the norm of the expected gradient of the standard squared loss function is asymptotically (in sparse code dimension) negligible for all points in a small neighborhood of $A^*$. This is supported with experimental evidence using synthetic data. We also conduct experiments to suggest that $A^*$ is a local minimum. Along the way we prove that a layer of ReLU gates can be set up to automatically recover the support of the sparse codes. This property holds independent of the loss function. We believe that it could be of independent interest.
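
The setup described in the abstract can be illustrated with a small synthetic experiment. The following is a minimal sketch, not the authors' code: it assumes a random Gaussian dictionary with normalized columns, nonzero code entries drawn uniformly from [1, 2], a hidden layer of the form ReLU(W y - eps) with W set to the transpose of the ground truth dictionary, and a hand-picked threshold eps = 0.5. The sizes, the amplitude range, the threshold, and all function names are assumptions made for illustration only.

```python
import numpy as np


def make_dictionary(n, h, rng):
    """Random n x h dictionary with unit-norm columns (an assumed stand-in for A*)."""
    A = rng.standard_normal((n, h))
    return A / np.linalg.norm(A, axis=0, keepdims=True)


def sample_sparse_codes(h, k, num, rng, low=1.0, high=2.0):
    """Draw `num` codes in R^h, each with exactly k nonzero entries in [low, high]."""
    X = np.zeros((h, num))
    for j in range(num):
        support = rng.choice(h, size=k, replace=False)
        X[support, j] = rng.uniform(low, high, size=k)
    return X


def relu_support(W, Y, eps):
    """Indices where the hidden layer ReLU(W y - eps) is active."""
    hidden = np.maximum(W @ Y - eps, 0.0)
    return hidden > 0


rng = np.random.default_rng(0)

# Illustrative sizes: overcomplete dictionary (h > n), support size about h^p.
n, h, p = 1024, 2048, 0.15
k = max(1, int(round(h ** p)))

A_star = make_dictionary(n, h, rng)
X = sample_sparse_codes(h, k, num=100, rng=rng)
Y = A_star @ X  # observations y = A* x*

# Assumption: encoder weights set exactly at the ground truth, W = A*^T.
recovered = relu_support(A_star.T, Y, eps=0.5)
true_support = X != 0

# Fraction of (coordinate, sample) pairs where the ReLU pattern matches the support.
agreement = (recovered == true_support).mean()
print(f"support size k = {k}, agreement = {agreement:.6f}")
```

The threshold works here because, for a nearly incoherent dictionary, the pre-activation at an on-support coordinate is close to the code value itself, while at off-support coordinates it is a small sum of cross-correlations. The abstract's claim concerns a neighborhood of $A^*$; perturbing `A_star.T` slightly before calling `relu_support` is one way to probe that, under the same assumptions.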

Comments:	In this new version of the paper, with a small change in the distributional assumptions, we are able to prove the asymptotic criticality of a neighbourhood of the ground truth dictionary even for the standard squared loss of the ReLU autoencoder (unlike the regularized loss in the older version)
Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:1708.03735 [cs.LG]
	(or arXiv:1708.03735v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1708.03735

Submission history

From: Anirbit Mukherjee
[v1] Sat, 12 Aug 2017 01:02:47 UTC (1,348 KB)
[v2] Fri, 20 Oct 2017 18:07:53 UTC (1,523 KB)
