Showing 1–6 of 6 results for author: Vashisht, R

  1. arXiv:2411.13264  [pdf, other]

    cs.LG

    Transformers with Sparse Attention for Granger Causality

    Authors: Riya Mahesh, Rahul Vashisht, Chandrashekar Lakshminarayanan

    Abstract: Temporal causal analysis means understanding the underlying causes behind observed variables over time. Deep learning based methods such as transformers are increasingly used to capture temporal dynamics and causal relationships beyond mere correlations. Recent works suggest self-attention weights of transformers as a useful indicator of causal links. We leverage this to propose a novel modificati…

    Submitted 20 November, 2024; originally announced November 2024.

  2. arXiv:2411.04569  [pdf, other]

    cs.LG cs.AI

    Impact of Label Noise on Learning Complex Features

    Authors: Rahul Vashisht, P. Krishna Kumar, Harsha Vardhan Govind, Harish G. Ramaswamy

    Abstract: Neural networks trained with stochastic gradient descent exhibit an inductive bias towards simpler decision boundaries, typically converging to a narrow family of functions, and often fail to capture more complex features. This phenomenon raises concerns about the capacity of deep models to adequately learn and represent real-world datasets. Traditional approaches such as explicit regularization,…

    Submitted 7 November, 2024; originally announced November 2024.

    Comments: Accepted at Workshop on Scientific Methods for Understanding Deep Learning, NeurIPS 2024

  3. On the Learning Dynamics of Attention Networks

    Authors: Rahul Vashisht, Harish G. Ramaswamy

    Abstract: Attention models are typically learned by optimizing one of three standard loss functions that are variously called soft attention, hard attention, and latent variable marginal likelihood (LVML) attention. All three paradigms are motivated by the same goal of finding two models: a 'focus' model that 'selects' the right segment of the input and a 'classification' model that processes…

    Submitted 12 October, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: Proceedings at ECAI-2023 IOS Press

  4. arXiv:2212.14776  [pdf, ps, other]

    cs.LG

    On the Interpretability of Attention Networks

    Authors: Lakshmi Narayan Pandey, Rahul Vashisht, Harish G. Ramaswamy

    Abstract: Attention mechanisms form a core component of several successful deep learning architectures, and are based on one key idea: "The output depends only on a small (but unknown) segment of the input." In several practical applications like image captioning and language translation, this is mostly true. In trained models with an attention mechanism, the outputs of an intermediate module that encodes…

    Submitted 14 May, 2023; v1 submitted 30 December, 2022; originally announced December 2022.

    Comments: ACML 2022, PMLR, Volume 189, https://proceedings.mlr.press/v189/pandey23a/pandey23a.pdf

    Journal ref: Proceedings of The 14th Asian Conference on Machine Learning, PMLR, Volume 189, pp. 832--847, 2023

  5. arXiv:2012.08854  [pdf, ps, other]

    cs.LG stat.ML

    Using noise resilience for ranking generalization of deep neural networks

    Authors: Depen Morwani, Rahul Vashisht, Harish G. Ramaswamy

    Abstract: Recent papers have shown that sufficiently overparameterized neural networks can perfectly fit even random labels. Thus, it is crucial to understand the underlying reason behind the generalization performance of a network on real-world data. In this work, we propose several measures to predict the generalization error of a network given the training data and its parameters. Using one of these meas…

    Submitted 16 December, 2020; originally announced December 2020.

    ACM Class: I.5.1

  6. Structural Health Monitoring of Cantilever Beam, a Case Study -- Using Bayesian Neural Network AND Deep Learning

    Authors: Rahul Vashisht, H. Viji, T. Sundararajan, D. Mohankumar, S. Sumitra

    Abstract: The advancement of machine learning algorithms has opened a wide scope for vibration-based SHM (Structural Health Monitoring). Vibration-based SHM is based on the fact that damage will alter the dynamic properties of the structure, viz., structural response, frequencies, mode shapes, etc. The responses measured using sensors, which are high dimensional in nature, can be intelligently analyzed using…

    Submitted 17 August, 2019; originally announced August 2019.

    Comments: 10 Pages

    Report number: 11