Showing 1–2 of 2 results for author: AlquBoj, H V

  1. arXiv:2505.15624  [pdf, ps, other]

    cs.LG cs.CL

    Mechanistic Insights into Grokking from the Embedding Layer

    Authors: H. V. AlquBoj, Hilal AlQuabeh, Velibor Bojkovic, Munachiso Nwadike, Kentaro Inui

    Abstract: Grokking, a delayed generalization in neural networks after perfect training performance, has been observed in Transformers and MLPs, but the components driving it remain underexplored. We show that embeddings are central to grokking: introducing them into MLPs induces delayed generalization in modular arithmetic tasks, whereas MLPs without embeddings can generalize immediately. Our analysis ident…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: Mechanistic view of embedding layers

  2. arXiv:2502.16147  [pdf, other]

    cs.CL

    Number Representations in LLMs: A Computational Parallel to Human Perception

    Authors: H. V. AlquBoj, Hilal AlQuabeh, Velibor Bojkovic, Tatsuya Hiraoka, Ahmed Oumar El-Shangiti, Munachiso Nwadike, Kentaro Inui

    Abstract: Humans are believed to perceive numbers on a logarithmic mental number line, where smaller values are represented with greater resolution than larger ones. This cognitive bias, supported by neuroscience and behavioral studies, suggests that numerical magnitudes are processed in a sublinear fashion rather than on a uniform linear scale. Inspired by this hypothesis, we investigate whether large lang…

    Submitted 22 February, 2025; originally announced February 2025.

    Comments: The number line of LLMs

    MSC Class: 68T50