Showing 1–7 of 7 results for author: Bojkovic, V

Searching in archive cs.
  1. arXiv:2505.15624  [pdf, ps, other]

    cs.LG cs.CL

    Mechanistic Insights into Grokking from the Embedding Layer

    Authors: H. V. AlquBoj, Hilal AlQuabeh, Velibor Bojkovic, Munachiso Nwadike, Kentaro Inui

    Abstract: Grokking, a delayed generalization in neural networks after perfect training performance, has been observed in Transformers and MLPs, but the components driving it remain underexplored. We show that embeddings are central to grokking: introducing them into MLPs induces delayed generalization in modular arithmetic tasks, whereas MLPs without embeddings can generalize immediately. Our analysis ident…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: Mechanistic view of embedding layers

  2. arXiv:2502.16147  [pdf, other]

    cs.CL

    Number Representations in LLMs: A Computational Parallel to Human Perception

    Authors: H. V. AlquBoj, Hilal AlQuabeh, Velibor Bojkovic, Tatsuya Hiraoka, Ahmed Oumar El-Shangiti, Munachiso Nwadike, Kentaro Inui

    Abstract: Humans are believed to perceive numbers on a logarithmic mental number line, where smaller values are represented with greater resolution than larger ones. This cognitive bias, supported by neuroscience and behavioral studies, suggests that numerical magnitudes are processed in a sublinear fashion rather than on a uniform linear scale. Inspired by this hypothesis, we investigate whether large lang…

    Submitted 22 February, 2025; originally announced February 2025.

    Comments: The number line of LLMs

    MSC Class: 68T50

  3. arXiv:2502.14487  [pdf, other]

    cs.LG cs.AI cs.CV

    Temporal Misalignment in ANN-SNN Conversion and Its Mitigation via Probabilistic Spiking Neurons

    Authors: Velibor Bojković, Xiaofeng Wu, Bin Gu

    Abstract: Spiking Neural Networks (SNNs) offer a more energy-efficient alternative to Artificial Neural Networks (ANNs) by mimicking biological neural principles, establishing them as a promising approach to mitigate the increasing energy demands of large-scale neural models. However, fully harnessing the capabilities of SNNs remains challenging due to their discrete signal processing and temporal dynamics.…

    Submitted 21 February, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

  4. arXiv:2501.13491  [pdf, other]

    cs.CL cs.AI

    RECALL: Library-Like Behavior In Language Models is Enhanced by Self-Referencing Causal Cycles

    Authors: Munachiso Nwadike, Zangir Iklassov, Toluwani Aremu, Tatsuya Hiraoka, Velibor Bojkovic, Benjamin Heinzerling, Hilal Alqaubeh, Martin Takáč, Kentaro Inui

    Abstract: We introduce the concept of the self-referencing causal cycle (abbreviated RECALL) - a mechanism that enables large language models (LLMs) to bypass the limitations of unidirectional causality, which underlies a phenomenon known as the reversal curse. When an LLM is prompted with sequential data, it often fails to recall preceding context. For example, when we ask an LLM to recall the line precedi…

    Submitted 23 January, 2025; originally announced January 2025.

  5. arXiv:2412.16188  [pdf, other]

    cs.LG cs.AI

    A Decade of Deep Learning: A Survey on The Magnificent Seven

    Authors: Dilshod Azizov, Muhammad Arslan Manzoor, Velibor Bojkovic, Yingxu Wang, Zixiao Wang, Zangir Iklassov, Kailong Zhao, Liang Li, Siwei Liu, Yu Zhong, Wei Liu, Shangsong Liang

    Abstract: Deep learning has fundamentally reshaped the landscape of artificial intelligence over the past decade, enabling remarkable achievements across diverse domains. At the heart of these developments lie multi-layered neural network architectures that excel at automatic feature extraction, leading to significant improvements in machine learning tasks. To demystify these advances and offer accessible g…

    Submitted 13 December, 2024; originally announced December 2024.

  6. arXiv:2403.18388  [pdf, other]

    cs.AI cs.CV

    FTBC: Forward Temporal Bias Correction for Optimizing ANN-SNN Conversion

    Authors: Xiaofeng Wu, Velibor Bojkovic, Bin Gu, Kun Suo, Kai Zou

    Abstract: Spiking Neural Networks (SNNs) offer a promising avenue for energy-efficient computing compared with Artificial Neural Networks (ANNs), closely mirroring biological neural processes. However, this potential comes with inherent challenges in directly training SNNs through spatio-temporal backpropagation -- stemming from the temporal dynamics of spiking neurons and their discrete signal processing -…

    Submitted 27 March, 2024; originally announced March 2024.

  7. arXiv:2302.00910  [pdf, other]

    cs.LG cs.AI

    Energy Efficient Training of SNN using Local Zeroth Order Method

    Authors: Bhaskar Mukhoty, Velibor Bojkovic, William de Vazelhes, Giulia De Masi, Huan Xiong, Bin Gu

    Abstract: Spiking neural networks are becoming increasingly popular for their low energy requirement in real-world tasks with accuracy comparable to the traditional ANNs. SNN training algorithms face the loss of gradient information and non-differentiability due to the Heaviside function in minimizing the model loss over model parameters. To circumvent the problem surrogate method uses a differentiable appr…

    Submitted 5 February, 2023; v1 submitted 2 February, 2023; originally announced February 2023.