Showing 1–3 of 3 results for author: Jermyn, A S

Searching in archive cs.
  1. arXiv:2211.09169 [pdf, other]

    cs.LG cs.AI

    Engineering Monosemanticity in Toy Models

    Authors: Adam S. Jermyn, Nicholas Schiefer, Evan Hubinger

    Abstract: In some neural networks, individual neurons correspond to natural "features" in the input. Such monosemantic neurons are of great help in interpretability studies, as they can be cleanly understood. In this work we report preliminary attempts to engineer monosemanticity in toy models. We find that models can be made more monosemantic without increasing the loss by just changing which loca…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: 31 pages, 26 figures

  2. arXiv:2210.01892 [pdf, other]

    cs.NE cs.AI cs.LG

    Polysemanticity and Capacity in Neural Networks

    Authors: Adam Scherlis, Kshitij Sachan, Adam S. Jermyn, Joe Benton, Buck Shlegeris

    Abstract: Individual neurons in neural networks often represent a mixture of unrelated features. This phenomenon, called polysemanticity, can make interpreting neural networks more difficult and so we aim to understand its causes. We propose doing so through the lens of feature capacity, which is the fractional dimension each feature consumes in the embedding space. We show that in a toy model the op…

    Submitted 25 March, 2025; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: 22 pages, 7 figures. Improved notation and corrected an error in the description of the most general efficient matrices

  3. arXiv:2001.08063 [pdf, other]

    cs.NE math.NA physics.comp-ph quant-ph

    Algorithms for Tensor Network Contraction Ordering

    Authors: Frank Schindler, Adam S. Jermyn

    Abstract: Contracting tensor networks is often computationally demanding. Well-designed contraction sequences can dramatically reduce the contraction cost. We explore the performance of simulated annealing and genetic algorithms, two common discrete optimization techniques, to this ordering problem. We benchmark their performance as well as that of the commonly-used greedy search on physically relevant tens…

    Submitted 15 January, 2020; originally announced January 2020.

    Comments: 10 pages, 10 figures

    Journal ref: Mach. Learn.: Sci. Technol. 1 035001 (2020)