Search | arXiv e-print repository

Learning the RoPEs: Better 2D and 3D Position Encodings with STRING

Authors: Connor Schenck, Isaac Reid, Mithun George Jacob, Alex Bewley, Joshua Ainslie, David Rendleman, Deepali Jain, Mohit Sharma, Avinava Dubey, Ayzaan Wahid, Sumeet Singh, René Wagner, Tianli Ding, Chuyuan Fu, Arunkumar Byravan, Jake Varley, Alexey Gritsenko, Matthias Minderer, Dmitry Kalashnikov, Jonathan Tompson, Vikas Sindhwani, Krzysztof Choromanski

Abstract: We introduce STRING: Separable Translationally Invariant Position Encodings. STRING extends Rotary Position Encodings, a recently proposed and widely used algorithm in large language models, via a unifying theoretical framework. Importantly, STRING still provides exact translation invariance, including token coordinates of arbitrary dimensionality, whilst maintaining a low computational footprint.… ▽ More We introduce STRING: Separable Translationally Invariant Position Encodings. STRING extends Rotary Position Encodings, a recently proposed and widely used algorithm in large language models, via a unifying theoretical framework. Importantly, STRING still provides exact translation invariance, including token coordinates of arbitrary dimensionality, whilst maintaining a low computational footprint. These properties are especially important in robotics, where efficient 3D token representation is key. We integrate STRING into Vision Transformers with RGB(-D) inputs (color plus optional depth), showing substantial gains, e.g. in open-vocabulary object detection and for robotics controllers. We complement our experiments with a rigorous mathematical analysis, proving the universality of our methods. △ Less

Submitted 4 February, 2025; originally announced February 2025.

Comments: Videos of STRING-based robotics controllers can be found here: https://sites.google.com/view/string-robotics

arXiv:2410.03462 [pdf, other]

Linear Transformer Topological Masking with Graph Random Features

Authors: Isaac Reid, Kumar Avinava Dubey, Deepali Jain, Will Whitney, Amr Ahmed, Joshua Ainslie, Alex Bewley, Mithun Jacob, Aranyak Mehta, David Rendleman, Connor Schenck, Richard E. Turner, René Wagner, Adrian Weller, Krzysztof Choromanski

Abstract: When training transformers on graph-structured data, incorporating information about the underlying topology is crucial for good performance. Topological masking, a type of relative position encoding, achieves this by upweighting or downweighting attention depending on the relationship between the query and keys in a graph. In this paper, we propose to parameterise topological masks as a learnable… ▽ More When training transformers on graph-structured data, incorporating information about the underlying topology is crucial for good performance. Topological masking, a type of relative position encoding, achieves this by upweighting or downweighting attention depending on the relationship between the query and keys in a graph. In this paper, we propose to parameterise topological masks as a learnable function of a weighted adjacency matrix -- a novel, flexible approach which incorporates a strong structural inductive bias. By approximating this mask with graph random features (for which we prove the first known concentration bounds), we show how this can be made fully compatible with linear attention, preserving $\mathcal{O}(N)$ time and space complexity with respect to the number of input tokens. The fastest previous alternative was $\mathcal{O}(N \log N)$ and only suitable for specific graphs. Our efficient masking algorithms provide strong performance gains for tasks on image and point cloud data, including with $>30$k nodes. △ Less

Submitted 15 October, 2024; v1 submitted 4 October, 2024; originally announced October 2024.

arXiv:2010.10631 [pdf, other]

doi 10.1109/TMI.2022.3224359

ENSURE: A General Approach for Unsupervised Training of Deep Image Reconstruction Algorithms

Authors: Hemant Kumar Aggarwal, Aniket Pramanik, Maneesh John, Mathews Jacob

Abstract: Image reconstruction using deep learning algorithms offers improved reconstruction quality and lower reconstruction time than classical compressed sensing and model-based algorithms. Unfortunately, clean and fully sampled ground-truth data to train the deep networks is often unavailable in several applications, restricting the applicability of the above methods. We introduce a novel metric termed… ▽ More Image reconstruction using deep learning algorithms offers improved reconstruction quality and lower reconstruction time than classical compressed sensing and model-based algorithms. Unfortunately, clean and fully sampled ground-truth data to train the deep networks is often unavailable in several applications, restricting the applicability of the above methods. We introduce a novel metric termed the ENsemble Stein's Unbiased Risk Estimate (ENSURE) framework, which can be used to train deep image reconstruction algorithms without fully sampled and noise-free images. The proposed framework is the generalization of the classical SURE and GSURE formulation to the setting where the images are sampled by different measurement operators, chosen randomly from a set. We evaluate the expectation of the GSURE loss functions over the sampling patterns to obtain the ENSURE loss function. We show that this loss is an unbiased estimate for the true mean-square error, which offers a better alternative to GSURE, which only offers an unbiased estimate for the projected error. Our experiments show that the networks trained with this loss function can offer reconstructions comparable to the supervised setting. While we demonstrate this framework in the context of MR image recovery, the ENSURE framework is generally applicable to arbitrary inverse problems. △ Less

Submitted 2 December, 2022; v1 submitted 20 October, 2020; originally announced October 2020.

Journal ref: IEEE Transactions on Medical Imaging, 2022

arXiv:1912.03433 [pdf, other]

Deep Generalization of Structured Low-Rank Algorithms (Deep-SLR)

Authors: Aniket Pramanik, Hemant Aggarwal, Mathews Jacob

Abstract: Structured low-rank (SLR) algorithms, which exploit annihilation relations between the Fourier samples of a signal resulting from different properties, is a powerful image reconstruction framework in several applications. This scheme relies on low-rank matrix completion to estimate the annihilation relations from the measurements. The main challenge with this strategy is the high computational com… ▽ More Structured low-rank (SLR) algorithms, which exploit annihilation relations between the Fourier samples of a signal resulting from different properties, is a powerful image reconstruction framework in several applications. This scheme relies on low-rank matrix completion to estimate the annihilation relations from the measurements. The main challenge with this strategy is the high computational complexity of matrix completion. We introduce a deep learning (DL) approach to significantly reduce the computational complexity. Specifically, we use a convolutional neural network (CNN)-based filterbank that is trained to estimate the annihilation relations from imperfect (under-sampled and noisy) k-space measurements of Magnetic Resonance Imaging (MRI). The main reason for the computational efficiency is the pre-learning of the parameters of the non-linear CNN from exemplar data, compared to SLR schemes that learn the linear filterbank parameters from the dataset itself. Experimental comparisons show that the proposed scheme can enable calibration-less parallel MRI; it can offer performance similar to SLR schemes while reducing the runtime by around three orders of magnitude. Unlike pre-calibrated and self-calibrated approaches, the proposed uncalibrated approach is insensitive to motion errors and affords higher acceleration. The proposed scheme also incorporates image domain priors that are complementary, thus significantly improving the performance over that of SLR schemes. △ Less

Submitted 8 August, 2020; v1 submitted 6 December, 2019; originally announced December 2019.

arXiv:1911.12443 [pdf, other]

Calibrationless Parallel MRI using Model based Deep Learning (C-MODL)

Authors: Aniket Pramanik, Hemant Aggarwal, Mathews Jacob

Abstract: We introduce a fast model based deep learning approach for calibrationless parallel MRI reconstruction. The proposed scheme is a non-linear generalization of structured low rank (SLR) methods that self learn linear annihilation filters from the same subject. It pre-learns non-linear annihilation relations in the Fourier domain from exemplar data. The pre-learning strategy significantly reduces the… ▽ More We introduce a fast model based deep learning approach for calibrationless parallel MRI reconstruction. The proposed scheme is a non-linear generalization of structured low rank (SLR) methods that self learn linear annihilation filters from the same subject. It pre-learns non-linear annihilation relations in the Fourier domain from exemplar data. The pre-learning strategy significantly reduces the computational complexity, making the proposed scheme three orders of magnitude faster than SLR schemes. The proposed framework also allows the use of a complementary spatial domain prior; the hybrid regularization scheme offers improved performance over calibrated image domain MoDL approach. The calibrationless strategy minimizes potential mismatches between calibration data and the main scan, while eliminating the need for a fully sampled calibration region. △ Less

Submitted 21 January, 2020; v1 submitted 27 November, 2019; originally announced November 2019.

arXiv:1910.12162 [pdf, other]

doi 10.1109/MSP.2019.2950432

Structured Low-Rank Algorithms: Theory, MR Applications, and Links to Machine Learning

Authors: Mathews Jacob, Merry P. Mani, Jong Chul Ye

Abstract: In this survey, we provide a detailed review of recent advances in the recovery of continuous domain multidimensional signals from their few non-uniform (multichannel) measurements using structured low-rank matrix completion formulation. This framework is centered on the fundamental duality between the compactness (e.g., sparsity) of the continuous signal and the rank of a structured matrix, whose… ▽ More In this survey, we provide a detailed review of recent advances in the recovery of continuous domain multidimensional signals from their few non-uniform (multichannel) measurements using structured low-rank matrix completion formulation. This framework is centered on the fundamental duality between the compactness (e.g., sparsity) of the continuous signal and the rank of a structured matrix, whose entries are functions of the signal. This property enables the reformulation of the signal recovery as a low-rank structured matrix completion, which comes with performance guarantees. We will also review fast algorithms that are comparable in complexity to current compressed sensing methods, which enables the application of the framework to large-scale magnetic resonance (MR) recovery problems. The remarkable flexibility of the formulation can be used to exploit signal properties that are difficult to capture by current sparse and low-rank optimization strategies. We demonstrate the utility of the framework in a wide range of MR imaging (MRI) applications, including highly accelerated imaging, calibration-free acquisition, MR artifact correction, and ungated dynamic MRI. △ Less

Submitted 26 October, 2019; originally announced October 2019.

Comments: Accepted for IEEE Signal Processing Magazine

arXiv:1812.10747 [pdf, other]

Off-the-grid model based deep learning (O-MODL)

Authors: Aniket Pramanik, Hemant Kumar Aggarwal, Mathews Jacob

Abstract: We introduce a model based off-the-grid image reconstruction algorithm using deep learned priors. The main difference of the proposed scheme with current deep learning strategies is the learning of non-linear annihilation relations in Fourier space. We rely on a model based framework, which allows us to use a significantly smaller deep network, compared to direct approaches that also learn how to… ▽ More We introduce a model based off-the-grid image reconstruction algorithm using deep learned priors. The main difference of the proposed scheme with current deep learning strategies is the learning of non-linear annihilation relations in Fourier space. We rely on a model based framework, which allows us to use a significantly smaller deep network, compared to direct approaches that also learn how to invert the forward model. Preliminary comparisons against image domain MoDL approach demonstrates the potential of the off-the-grid formulation. The main benefit of the proposed scheme compared to structured low-rank methods is the quite significant reduction in computational complexity. △ Less

Submitted 27 December, 2018; originally announced December 2018.

Comments: ISBI 2019

arXiv:1807.03845 [pdf, other]

Model-based free-breathing cardiac MRI reconstruction using deep learned \& STORM priors: MoDL-STORM

Authors: Sampurna Biswas, Hemant K. Aggarwal, Sunrita Poddar, Mathews Jacob

Abstract: We introduce a model-based reconstruction framework with deep learned (DL) and smoothness regularization on manifolds (STORM) priors to recover free breathing and ungated (FBU) cardiac MRI from highly undersampled measurements. The DL priors enable us to exploit the local correlations, while the STORM prior enables us to make use of the extensive non-local similarities that are subject dependent.… ▽ More We introduce a model-based reconstruction framework with deep learned (DL) and smoothness regularization on manifolds (STORM) priors to recover free breathing and ungated (FBU) cardiac MRI from highly undersampled measurements. The DL priors enable us to exploit the local correlations, while the STORM prior enables us to make use of the extensive non-local similarities that are subject dependent. We introduce a novel model-based formulation that allows the seamless integration of deep learning methods with available prior information, which current deep learning algorithms are not capable of. The experimental results demonstrate the preliminary potential of this work in accelerating FBU cardiac MRI. △ Less

Submitted 10 July, 2018; originally announced July 2018.

arXiv:1801.01455 [pdf, other]

Clustering of Data with Missing Entries

Authors: Sunrita Poddar, Mathews Jacob

Abstract: The analysis of large datasets is often complicated by the presence of missing entries, mainly because most of the current machine learning algorithms are designed to work with full data. The main focus of this work is to introduce a clustering algorithm, that will provide good clustering even in the presence of missing data. The proposed technique solves an $\ell_0$ fusion penalty based optimizat… ▽ More The analysis of large datasets is often complicated by the presence of missing entries, mainly because most of the current machine learning algorithms are designed to work with full data. The main focus of this work is to introduce a clustering algorithm, that will provide good clustering even in the presence of missing data. The proposed technique solves an $\ell_0$ fusion penalty based optimization problem to recover the clusters. We theoretically analyze the conditions needed for the successful recovery of the clusters. We also propose an algorithm to solve a relaxation of this problem using saturating non-convex fusion penalties. The method is demonstrated on simulated and real datasets, and is observed to perform well in the presence of large fractions of missing entries. △ Less

Submitted 3 January, 2018; originally announced January 2018.

Comments: arXiv admin note: substantial text overlap with arXiv:1709.01870

arXiv:1709.01870 [pdf, other]

Clustering of Data with Missing Entries using Non-convex Fusion Penalties

Authors: Sunrita Poddar, Mathews Jacob

Abstract: The presence of missing entries in data often creates challenges for pattern recognition algorithms. Traditional algorithms for clustering data assume that all the feature values are known for every data point. We propose a method to cluster data in the presence of missing information. Unlike conventional clustering techniques where every feature is known for each point, our algorithm can handle c… ▽ More The presence of missing entries in data often creates challenges for pattern recognition algorithms. Traditional algorithms for clustering data assume that all the feature values are known for every data point. We propose a method to cluster data in the presence of missing information. Unlike conventional clustering techniques where every feature is known for each point, our algorithm can handle cases where a few feature values are unknown for every point. For this more challenging problem, we provide theoretical guarantees for clustering using a $\ell_0$ fusion penalty based optimization problem. Furthermore, we propose an algorithm to solve a relaxation of this problem using saturating non-convex fusion penalties. It is observed that this algorithm produces solutions that degrade gradually with an increase in the fraction of missing feature values. We demonstrate the utility of the proposed method using a simulated dataset, the Wine dataset and also an under-sampled cardiac MRI dataset. It is shown that the proposed method is a promising clustering technique for datasets with large fractions of missing entries. △ Less

Submitted 6 September, 2017; originally announced September 2017.

arXiv:1412.2669 [pdf, other]

Two step recovery of jointly sparse and low-rank matrices: theoretical guarantees

Authors: Sampurna Biswas, Sunrita Poddar, Soura Dasgupta, Raghuraman Mudumbai, Mathews Jacob

Abstract: We introduce a two step algorithm with theoretical guarantees to recover a jointly sparse and low-rank matrix from undersampled measurements of its columns. The algorithm first estimates the row subspace of the matrix using a set of common measurements of the columns. In the second step, the subspace aware recovery of the matrix is solved using a simple least square algorithm. The results are veri… ▽ More We introduce a two step algorithm with theoretical guarantees to recover a jointly sparse and low-rank matrix from undersampled measurements of its columns. The algorithm first estimates the row subspace of the matrix using a set of common measurements of the columns. In the second step, the subspace aware recovery of the matrix is solved using a simple least square algorithm. The results are verified in the context of recovering CINE data from undersampled measurements; we obtain good recovery when the sampling conditions are satisfied. △ Less

Submitted 2 June, 2015; v1 submitted 5 December, 2014; originally announced December 2014.

Comments: 4 pages, 4 figures, ISBI 2015 conference submission

Showing 1–11 of 11 results for author: Jacob, M