Log-Linear Attention

Authors: Han Guo, Songlin Yang, Tarushii Goel, Eric P. Xing, Tri Dao, Yoon Kim

Abstract: The attention mechanism in Transformers is an important primitive for accurate and scalable sequence modeling. Its quadratic-compute and linear-memory complexity however remain significant bottlenecks. Linear attention and state-space models enable linear-time, constant-memory sequence modeling and can moreover be trained efficiently through matmul-rich parallelization across sequence length. However, at their core these models are still RNNs, and thus their use of a fixed-size hidden state to model the context is a fundamental limitation. This paper develops log-linear attention, an attention mechanism that balances linear attention's efficiency and the expressiveness of softmax attention. Log-linear attention replaces the fixed-size hidden state with a logarithmically growing set of hidden states. We show that with a particular growth function, log-linear attention admits a similarly matmul-rich parallel form whose compute cost is log-linear in sequence length. Log-linear attention is a general framework and can be applied on top of existing linear attention variants. As case studies, we instantiate log-linear variants of two recent architectures -- Mamba-2 and Gated DeltaNet -- and find they perform well compared to their linear-time variants.

Submitted 25 June, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

arXiv:2504.00408 [pdf, other]

From Intuition to Understanding: Using AI Peers to Overcome Physics Misconceptions

Authors: Ruben Weijers, Denton Wu, Hannah Betts, Tamara Jacod, Yuxiang Guan, Vidya Sujaya, Kushal Dev, Toshali Goel, William Delooze, Reihaneh Rabbany, Ying Wu, Jean-François Godbout, Kellin Pelrine

Abstract: Generative AI has the potential to transform personalization and accessibility of education. However, it raises serious concerns about accuracy and helping students become independent critical thinkers. In this study, we designed a helpful AI "Peer" to help students correct fundamental physics misconceptions related to Newtonian mechanics concepts. In contrast to approaches that seek near-perfect accuracy to create an authoritative AI tutor or teacher, we directly inform students that this AI can answer up to 40% of questions incorrectly. In a randomized controlled trial with 165 students, those who engaged in targeted dialogue with the AI Peer achieved post-test scores that were, on average, 10.5 percentage points higher - with over 20 percentage points higher normalized gain - than a control group that discussed physics history. Qualitative feedback indicated that 91% of the treatment group's AI interactions were rated as helpful. Furthermore, by comparing student performance on pre- and post-test questions about the same concept, along with experts' annotations of the AI interactions, we find initial evidence suggesting the improvement in performance does not depend on the correctness of the AI. With further research, the AI Peer paradigm described here could open new possibilities for how we learn, adapt to, and grow with AI.

Submitted 1 April, 2025; originally announced April 2025.

arXiv:2401.16472 [pdf, ps, other]

doi 10.1103/PhysRevResearch.6.013246

Optimal function estimation with photonic quantum sensor networks

Authors: Jacob Bringewatt, Adam Ehrenberg, Tarushii Goel, Alexey V. Gorshkov

Abstract: The problem of optimally measuring an analytic function of unknown local parameters each linearly coupled to a qubit sensor is well understood, with applications ranging from field interpolation to noise characterization. Here, we resolve a number of open questions that arise when extending this framework to Mach-Zehnder interferometers and quadrature displacement sensing. In particular, we derive lower bounds on the achievable mean square error in estimating a linear function of either local phase shifts or quadrature displacements. In the case of local phase shifts, these results prove, and somewhat generalize, a conjecture by Proctor et al. [arXiv:1702.04271 (2017)]. For quadrature displacements, we extend proofs of lower bounds to the case of arbitrary linear functions. We provide optimal protocols achieving these bounds up to small (multiplicative) constants and describe an algebraic approach to deriving new optimal protocols, possibly subject to additional constraints. Using this approach, we prove necessary conditions for the amount of entanglement needed for any optimal protocol for both local phase and displacement sensing.

Submitted 20 March, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: 20 pages

Journal ref: Phys. Rev. Research 6, 013246 (2024)

arXiv:2301.07496 [pdf]

Machine learning techniques for the Schizophrenia diagnosis: A comprehensive review and future research directions

Authors: Shradha Verma, Tripti Goel, M Tanveer, Weiping Ding, Rahul Sharma, R Murugan

Abstract: Schizophrenia (SCZ) is a brain disorder where different people experience different symptoms, such as hallucination, delusion, flat-talk, disorganized thinking, etc. In the long term, this can cause severe effects and diminish life expectancy by more than ten years. Therefore, early and accurate diagnosis of SCZ is prevalent, and modalities like structural magnetic resonance imaging (sMRI), functional MRI (fMRI), diffusion tensor imaging (DTI), and electroencephalogram (EEG) assist in witnessing the brain abnormalities of the patients. Moreover, for accurate diagnosis of SCZ, researchers have used machine learning (ML) algorithms for the past decade to distinguish the brain patterns of healthy and SCZ brains using MRI and fMRI images. This paper seeks to acquaint SCZ researchers with ML and to discuss its recent applications to the field of SCZ study. This paper comprehensively reviews state-of-the-art techniques such as ML classifiers, artificial neural network (ANN), deep learning (DL) models, methodological fundamentals, and applications with previous studies. The motivation of this paper is to benefit from finding the research gaps that may lead to the development of a new model for accurate SCZ diagnosis. The paper concludes with the research finding, followed by the future scope that directly contributes to new research directions.

Submitted 16 January, 2023; originally announced January 2023.

arXiv:2212.03868 [pdf, other]

doi 10.1016/j.inffus.2023.03.007

Deep Learning for Brain Age Estimation: A Systematic Review

Authors: M. Tanveer, M. A. Ganaie, Iman Beheshti, Tripti Goel, Nehal Ahmad, Kuan-Ting Lai, Kaizhu Huang, Yu-Dong Zhang, Javier Del Ser, Chin-Teng Lin

Abstract: Over the years, Machine Learning models have been successfully employed on neuroimaging data for accurately predicting brain age. Deviations from the healthy brain aging pattern are associated with accelerated brain aging and brain abnormalities. Hence, efficient and accurate diagnosis techniques are required for eliciting accurate brain age estimations. Several contributions have been reported in the past for this purpose, resorting to different data-driven modeling methods. Recently, deep neural networks (also referred to as deep learning) have become prevalent in manifold neuroimaging studies, including brain age estimation. In this review, we offer a comprehensive analysis of the literature related to the adoption of deep learning for brain age estimation with neuroimaging data. We detail and analyze different deep learning architectures used for this application, pausing at research works published to date quantitatively exploring their application. We also examine different brain age estimation frameworks, comparatively exposing their advantages and weaknesses. Finally, the review concludes with an outlook towards future directions that should be followed by prospective studies. The ultimate goal of this paper is to establish a common and informed reference for newcomers and experienced researchers willing to approach brain age estimation by using deep learning models.

Submitted 7 December, 2022; originally announced December 2022.

arXiv:2211.04764 [pdf]

Quantitative Susceptibility Mapping in Cognitive Decline: A Review of Technical Aspects and Applications

Authors: Shradha Verma, Tripti Goel, M Tanveer

Abstract: In the human brain, essential iron molecules for proper neurological functioning exist in transferrin (tf) and ferritin (Fe3) forms. However, its unusual increment manifests iron overload, which reacts with hydrogen peroxide. This reaction will generate hydroxyl radicals and iron's higher oxidation states. Further, this reaction causes tissue damage or cognitive decline in the brain and also leads to neurodegenerative diseases. The susceptibility difference due to iron overload within the volume of interest (VOI) is responsible for field perturbation of MRI and can benefit in estimating the neural disorder. The quantitative susceptibility mapping (QSM) technique can estimate susceptibility alteration and assist in quantifying the local tissue susceptibility differences. It has attracted many researchers and clinicians to diagnose and detect neural disorders such as Parkinson's, Alzheimer's, Multiple Sclerosis, and aging. The paper presents a systematic review illustrating QSM fundamentals and its processing steps, including phase unwrapping, background field removal, and susceptibility inversion. Using QSM, the present work delivers novel predictive biomarkers for various neural disorders. It can strengthen new researchers' fundamental knowledge and provides insight into its applicability for cognitive decline disclosure. The paper discusses the future scope of QSM processing stages and their applications in identifying new biomarkers for neural disorders.

Submitted 9 November, 2022; originally announced November 2022.

arXiv:2211.02868 [pdf]

Lightweight 3D Convolutional Neural Network for Schizophrenia diagnosis using MRI Images and Ensemble Bagging Classifier

Authors: P Supriya Patro, Tripti Goel, S A VaraPrasad, M Tanveer, R Murugan

Abstract: Structural alterations have been thoroughly investigated in the brain during the early onset of schizophrenia (SCZ) with the development of neuroimaging methods. The objective of the paper is an efficient classification of SCZ in 2 different classes: Cognitive Normal (CN), and SCZ using magnetic resonance imaging (MRI) images. This paper proposed a lightweight 3D convolutional neural network (CNN) based framework for SCZ diagnosis using MRI images. In the proposed model, lightweight 3D CNN is used to extract both spatial and spectral features simultaneously from 3D volume MRI scans, and classification is done using an ensemble bagging classifier. Ensemble bagging classifier contributes to preventing overfitting, reduces variance, and improves the model's accuracy. The proposed algorithm is tested on datasets taken from three benchmark databases available as open-source: MCICShare, COBRE, and fBRINPhase-II. These datasets have undergone preprocessing steps to register all the MRI images to the standard template and reduce the artifacts. The model achieves the highest accuracy 92.22%, sensitivity 94.44%, specificity 90%, precision 90.43%, recall 94.44%, F1-score 92.39% and G-mean 92.19% as compared to the current state-of-the-art techniques. The performance metrics evidenced the use of this model to assist the clinicians for automatic accurate diagnosis of SCZ.

Submitted 5 November, 2022; originally announced November 2022.

arXiv:2007.04297 [pdf, other]

Open Domain Suggestion Mining Leveraging Fine-Grained Analysis

Authors: Shreya Singal, Tanishq Goel, Shivang Chopra, Sonika Dahiya

Abstract: Suggestion mining tasks are often semantically complex and lack sophisticated methodologies that can be applied to real-world data. The presence of suggestions across a large diversity of domains and the absence of large labelled and balanced datasets render this task particularly challenging to deal with. In an attempt to overcome these challenges, we propose a two-tier pipeline that leverages Discourse Marker based oversampling and fine-grained suggestion mining techniques to retrieve suggestions from online forums. Through extensive comparison on a real-world open-domain suggestion dataset, we demonstrate how the oversampling technique combined with transformer based fine-grained analysis can beat the state of the art. Additionally, we perform extensive quantitative and qualitative analysis to give construct validity to our proposed pipeline. Finally, we discuss the practical, computational and reproducibility aspects of the deployment of our pipeline across the web.

Submitted 11 July, 2020; v1 submitted 27 June, 2020; originally announced July 2020.

Showing 1–8 of 8 results for author: Goel, T