-
A Certifably Correct Algorithm for Generalized Robot-World and Hand-Eye Calibration
Authors:
Emmett Wise,
Pushyami Kaveti,
Qilong Chen,
Wenhao Wang,
Hanumant Singh,
Jonathan Kelly,
David M. Rosen,
Matthew Giamou
Abstract:
Automatic extrinsic sensor calibration is a fundamental problem for multi-sensor platforms. Reliable and general-purpose solutions should be computationally efficient, require few assumptions about the structure of the sensing environment, and demand little effort from human operators. Since the engineering effort required to obtain accurate calibration parameters increases with the number of sens…
▽ More
Automatic extrinsic sensor calibration is a fundamental problem for multi-sensor platforms. Reliable and general-purpose solutions should be computationally efficient, require few assumptions about the structure of the sensing environment, and demand little effort from human operators. Since the engineering effort required to obtain accurate calibration parameters increases with the number of sensors deployed, robotics researchers have pursued methods requiring few assumptions about the sensing environment and minimal effort from human operators. In this work, we introduce a fast and certifiably globally optimal algorithm for solving a generalized formulation of the $\textit{robot-world and hand-eye calibration}$ (RWHEC) problem. The formulation of RWHEC presented is "generalized" in that it supports the simultaneous estimation of multiple sensor and target poses, and permits the use of monocular cameras that, alone, are unable to measure the scale of their environments. In addition to demonstrating our method's superior performance over existing solutions, we derive novel identifiability criteria and establish $\textit{a priori}$ guarantees of global optimality for problem instances with bounded measurement errors. We also introduce a complementary Lie-algebraic local solver for RWHEC and compare its performance with our global method and prior art. Finally, we provide a free and open-source implementation of our algorithms and experiments.
△ Less
Submitted 30 July, 2025;
originally announced July 2025.
-
From Low Field to High Value: Robust Cortical Mapping from Low-Field MRI
Authors:
Karthik Gopinath,
Annabel Sorby-Adams,
Jonathan W. Ramirez,
Dina Zemlyanker,
Jennifer Guo,
David Hunt,
Christine L. Mac Donald,
C. Dirk Keene,
Timothy Coalson,
Matthew F. Glasser,
David Van Essen,
Matthew S. Rosen,
Oula Puonti,
W. Taylor Kimberly,
Juan Eugenio Iglesias
Abstract:
Three-dimensional reconstruction of cortical surfaces from MRI for morphometric analysis is fundamental for understanding brain structure. While high-field MRI (HF-MRI) is standard in research and clinical settings, its limited availability hinders widespread use. Low-field MRI (LF-MRI), particularly portable systems, offers a cost-effective and accessible alternative. However, existing cortical s…
▽ More
Three-dimensional reconstruction of cortical surfaces from MRI for morphometric analysis is fundamental for understanding brain structure. While high-field MRI (HF-MRI) is standard in research and clinical settings, its limited availability hinders widespread use. Low-field MRI (LF-MRI), particularly portable systems, offers a cost-effective and accessible alternative. However, existing cortical surface analysis tools are optimized for high-resolution HF-MRI and struggle with the lower signal-to-noise ratio and resolution of LF-MRI. In this work, we present a machine learning method for 3D reconstruction and analysis of portable LF-MRI across a range of contrasts and resolutions. Our method works "out of the box" without retraining. It uses a 3D U-Net trained on synthetic LF-MRI to predict signed distance functions of cortical surfaces, followed by geometric processing to ensure topological accuracy. We evaluate our method using paired HF/LF-MRI scans of the same subjects, showing that LF-MRI surface reconstruction accuracy depends on acquisition parameters, including contrast type (T1 vs T2), orientation (axial vs isotropic), and resolution. A 3mm isotropic T2-weighted scan acquired in under 4 minutes, yields strong agreement with HF-derived surfaces: surface area correlates at r=0.96, cortical parcellations reach Dice=0.98, and gray matter volume achieves r=0.93. Cortical thickness remains more challenging with correlations up to r=0.70, reflecting the difficulty of sub-mm precision with 3mm voxels. We further validate our method on challenging postmortem LF-MRI, demonstrating its robustness. Our method represents a step toward enabling cortical surface analysis on portable LF-MRI. Code is available at https://surfer.nmr.mgh.harvard.edu/fswiki/ReconAny
△ Less
Submitted 18 May, 2025;
originally announced May 2025.
-
Deep learning of personalized priors from past MRI scans enables fast, quality-enhanced point-of-care MRI with low-cost systems
Authors:
Tal Oved,
Beatrice Lena,
Chloé F. Najac,
Sheng Shen,
Matthew S. Rosen,
Andrew Webb,
Efrat Shimron
Abstract:
Magnetic resonance imaging (MRI) offers superb-quality images, but its accessibility is limited by high costs, posing challenges for patients requiring longitudinal care. Low-field MRI provides affordable imaging with low-cost devices but is hindered by long scans and degraded image quality, including low signal-to-noise ratio (SNR) and tissue contrast. We propose a novel healthcare paradigm: usin…
▽ More
Magnetic resonance imaging (MRI) offers superb-quality images, but its accessibility is limited by high costs, posing challenges for patients requiring longitudinal care. Low-field MRI provides affordable imaging with low-cost devices but is hindered by long scans and degraded image quality, including low signal-to-noise ratio (SNR) and tissue contrast. We propose a novel healthcare paradigm: using deep learning to extract personalized features from past standard high-field MRI scans and harnessing them to enable accelerated, enhanced-quality follow-up scans with low-cost systems. To overcome the SNR and contrast differences, we introduce ViT-Fuser, a feature-fusion vision transformer that learns features from past scans, e.g. those stored in standard DICOM CDs. We show that \textit{a single prior scan is sufficient}, and this scan can come from various MRI vendors, field strengths, and pulse sequences. Experiments with four datasets, including glioblastoma data, low-field ($50mT$), and ultra-low-field ($6.5mT$) data, demonstrate that ViT-Fuser outperforms state-of-the-art methods, providing enhanced-quality images from accelerated low-field scans, with robustness to out-of-distribution data. Our freely available framework thus enables rapid, diagnostic-quality, low-cost imaging for wide healthcare applications.
△ Less
Submitted 5 May, 2025;
originally announced May 2025.
-
Reference-Free 3D Reconstruction of Brain Dissection Photographs with Machine Learning
Authors:
Lin Tian,
Sean I. Young,
Jonathan Williams Ramirez,
Dina Zemlyanker,
Lucas Jacob Deden Binder,
Rogeny Herisse,
Theresa R. Connors,
Derek H. Oakley,
Bradley T. Hyman,
Oula Puonti,
Matthew S. Rosen,
Juan Eugenio Iglesias
Abstract:
Correlation of neuropathology with MRI has the potential to transfer microscopic signatures of pathology to invivo scans. Recently, a classical registration method has been proposed, to build these correlations from 3D reconstructed stacks of dissection photographs, which are routinely taken at brain banks. These photographs bypass the need for exvivo MRI, which is not widely accessible. However,…
▽ More
Correlation of neuropathology with MRI has the potential to transfer microscopic signatures of pathology to invivo scans. Recently, a classical registration method has been proposed, to build these correlations from 3D reconstructed stacks of dissection photographs, which are routinely taken at brain banks. These photographs bypass the need for exvivo MRI, which is not widely accessible. However, this method requires a full stack of brain slabs and a reference mask (e.g., acquired with a surface scanner), which severely limits the applicability of the technique. Here we propose RefFree, a dissection photograph reconstruction method without external reference. RefFree is a learning approach that estimates the 3D coordinates in the atlas space for every pixel in every photograph; simple least-squares fitting can then be used to compute the 3D reconstruction. As a by-product, RefFree also produces an atlas-based segmentation of the reconstructed stack. RefFree is trained on synthetic photographs generated from digitally sliced 3D MRI data, with randomized appearance for enhanced generalization ability. Experiments on simulated and real data show that RefFree achieves performance comparable to the baseline method without an explicit reference while also enabling reconstruction of partial stacks. Our code is available at https://github.com/lintian-a/reffree.
△ Less
Submitted 12 March, 2025;
originally announced March 2025.
-
Distributed Certifiably Correct Range-Aided SLAM
Authors:
Alexander Thoms,
Alan Papalia,
Jared Velasquez,
David M. Rosen,
Sriram Narasimhan
Abstract:
Reliable simultaneous localization and mapping (SLAM) algorithms are necessary for safety-critical autonomous navigation. In the communication-constrained multi-agent setting, navigation systems increasingly use point-to-point range sensors as they afford measurements with low bandwidth requirements and known data association. The state estimation problem for these systems takes the form of range-…
▽ More
Reliable simultaneous localization and mapping (SLAM) algorithms are necessary for safety-critical autonomous navigation. In the communication-constrained multi-agent setting, navigation systems increasingly use point-to-point range sensors as they afford measurements with low bandwidth requirements and known data association. The state estimation problem for these systems takes the form of range-aided (RA) SLAM. However, distributed algorithms for solving the RA-SLAM problem lack formal guarantees on the quality of the returned estimate. To this end, we present the first distributed algorithm for RA-SLAM that can efficiently recover certifiably globally optimal solutions. Our algorithm, distributed certifiably correct RA-SLAM (DCORA), achieves this via the Riemannian Staircase method, where computational procedures developed for distributed certifiably correct pose graph optimization are generalized to the RA-SLAM problem. We demonstrate DCORA's efficacy on real-world multi-agent datasets by achieving absolute trajectory errors comparable to those of a state-of-the-art centralized certifiably correct RA-SLAM algorithm. Additionally, we perform a parametric study on the structure of the RA-SLAM problem using synthetic data, revealing how common parameters affect DCORA's performance.
△ Less
Submitted 13 May, 2025; v1 submitted 5 March, 2025;
originally announced March 2025.
-
A Trust-Region Method for Graphical Stein Variational Inference
Authors:
Liam Pavlovic,
David M. Rosen
Abstract:
Stein variational inference (SVI) is a sample-based approximate Bayesian inference technique that generates a sample set by jointly optimizing the samples' locations to minimize an information-theoretic measure of discrepancy with the target probability distribution. SVI thus provides a fast and significantly more sample-efficient approach to Bayesian inference than traditional (random-sampling-ba…
▽ More
Stein variational inference (SVI) is a sample-based approximate Bayesian inference technique that generates a sample set by jointly optimizing the samples' locations to minimize an information-theoretic measure of discrepancy with the target probability distribution. SVI thus provides a fast and significantly more sample-efficient approach to Bayesian inference than traditional (random-sampling-based) alternatives. However, the optimization techniques employed in existing SVI methods struggle to address problems in which the target distribution is high-dimensional, poorly-conditioned, or non-convex, which severely limits the range of their practical applicability. In this paper, we propose a novel trust-region optimization approach for SVI that successfully addresses each of these challenges. Our method builds upon prior work in SVI by leveraging conditional independences in the target distribution (to achieve high-dimensional scaling) and second-order information (to address poor conditioning), while additionally providing an effective adaptive step control procedure, which is essential for ensuring convergence on challenging non-convex optimization problems. Experimental results show our method achieves superior numerical performance, both in convergence rate and sample accuracy, and scales better in high-dimensional distributions, than previous SVI techniques.
△ Less
Submitted 21 October, 2024;
originally announced October 2024.
-
TV-based Deep 3D Self Super-Resolution for fMRI
Authors:
Fernando Pérez-Bueno,
Hongwei Bran Li,
Matthew S. Rosen,
Shahin Nasr,
Cesar Caballero-Gaudes,
Juan Eugenio Iglesias
Abstract:
While functional Magnetic Resonance Imaging (fMRI) offers valuable insights into cognitive processes, its inherent spatial limitations pose challenges for detailed analysis of the fine-grained functional architecture of the brain. More specifically, MRI scanner and sequence specifications impose a trade-off between temporal resolution, spatial resolution, signal-to-noise ratio, and scan time. Deep…
▽ More
While functional Magnetic Resonance Imaging (fMRI) offers valuable insights into cognitive processes, its inherent spatial limitations pose challenges for detailed analysis of the fine-grained functional architecture of the brain. More specifically, MRI scanner and sequence specifications impose a trade-off between temporal resolution, spatial resolution, signal-to-noise ratio, and scan time. Deep Learning (DL) Super-Resolution (SR) methods have emerged as a promising solution to enhance fMRI resolution, generating high-resolution (HR) images from low-resolution (LR) images typically acquired with lower scanning times. However, most existing SR approaches depend on supervised DL techniques, which require training ground truth (GT) HR data, which is often difficult to acquire and simultaneously sets a bound for how far SR can go. In this paper, we introduce a novel self-supervised DL SR model that combines a DL network with an analytical approach and Total Variation (TV) regularization. Our method eliminates the need for external GT images, achieving competitive performance compared to supervised DL techniques and preserving the functional maps.
△ Less
Submitted 24 February, 2025; v1 submitted 5 October, 2024;
originally announced October 2024.
-
An Overview of the Burer-Monteiro Method for Certifiable Robot Perception
Authors:
Alan Papalia,
Yulun Tian,
David M. Rosen,
Jonathan P. How,
John J. Leonard
Abstract:
This paper presents an overview of the Burer-Monteiro method (BM), a technique that has been applied to solve robot perception problems to certifiable optimality in real-time. BM is often used to solve semidefinite programming relaxations, which can be used to perform global optimization for non-convex perception problems. Specifically, BM leverages the low-rank structure of typical semidefinite p…
▽ More
This paper presents an overview of the Burer-Monteiro method (BM), a technique that has been applied to solve robot perception problems to certifiable optimality in real-time. BM is often used to solve semidefinite programming relaxations, which can be used to perform global optimization for non-convex perception problems. Specifically, BM leverages the low-rank structure of typical semidefinite programs to dramatically reduce the computational cost of performing optimization. This paper discusses BM in certifiable perception, with three main objectives: (i) to consolidate information from the literature into a unified presentation, (ii) to elucidate the role of the linear independence constraint qualification (LICQ), a concept not yet well-covered in certifiable perception literature, and (iii) to share practical considerations that are discussed among practitioners but not thoroughly covered in the literature. Our general aim is to offer a practical primer for applying BM towards certifiable perception.
△ Less
Submitted 9 June, 2025; v1 submitted 30 September, 2024;
originally announced October 2024.
-
Probabilistic Contrastive Learning with Explicit Concentration on the Hypersphere
Authors:
Hongwei Bran Li,
Cheng Ouyang,
Tamaz Amiranashvili,
Matthew S. Rosen,
Bjoern Menze,
Juan Eugenio Iglesias
Abstract:
Self-supervised contrastive learning has predominantly adopted deterministic methods, which are not suited for environments characterized by uncertainty and noise. This paper introduces a new perspective on incorporating uncertainty into contrastive learning by embedding representations within a spherical space, inspired by the von Mises-Fisher distribution (vMF). We introduce an unnormalized form…
▽ More
Self-supervised contrastive learning has predominantly adopted deterministic methods, which are not suited for environments characterized by uncertainty and noise. This paper introduces a new perspective on incorporating uncertainty into contrastive learning by embedding representations within a spherical space, inspired by the von Mises-Fisher distribution (vMF). We introduce an unnormalized form of vMF and leverage the concentration parameter, kappa, as a direct, interpretable measure to quantify uncertainty explicitly. This approach not only provides a probabilistic interpretation of the embedding space but also offers a method to calibrate model confidence against varying levels of data corruption and characteristics. Our empirical results demonstrate that the estimated concentration parameter correlates strongly with the degree of unforeseen data corruption encountered at test time, enables failure analysis, and enhances existing out-of-distribution detection methods.
△ Less
Submitted 26 May, 2024;
originally announced May 2024.
-
LawInstruct: A Resource for Studying Language Model Adaptation to the Legal Domain
Authors:
Joel Niklaus,
Lucia Zheng,
Arya D. McCarthy,
Christopher Hahn,
Brian M. Rosen,
Peter Henderson,
Daniel E. Ho,
Garrett Honke,
Percy Liang,
Christopher Manning
Abstract:
Instruction tuning is an important step in making language models useful for direct user interaction. However, the legal domain is underrepresented in typical instruction datasets (e.g., only 10 out of 1600+ tasks in Super-NaturalInstructions). To study whether instruction tuning on legal datasets is necessary for strong legal reasoning, we aggregate 58 annotated legal datasets and write instructi…
▽ More
Instruction tuning is an important step in making language models useful for direct user interaction. However, the legal domain is underrepresented in typical instruction datasets (e.g., only 10 out of 1600+ tasks in Super-NaturalInstructions). To study whether instruction tuning on legal datasets is necessary for strong legal reasoning, we aggregate 58 annotated legal datasets and write instructions for each, creating LawInstruct. LawInstruct covers 17 global jurisdictions, 24 languages and a total of 12M examples across diverse tasks such as legal QA, summarization of court cases, and legal argument mining. We evaluate our models on LegalBench, measuring legal reasoning across five categories in 162 challenging and realistic legal tasks, and MMLU, to measure potential drops in general reasoning capabilities. We find that legal-specific instruction tuning on Flan-T5 - yielding FLawN-T5 - improves performance on LegalBench across all model sizes, with an aggregate increase of 15 points or 50% over Flan-T5 for the base size. No model size shows performance drops in MMLU. We publish LawInstruct as a resource for further study of instruction tuning in the legal domain.
△ Less
Submitted 23 January, 2025; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Generation of Synthetic Images for Pedestrian Detection Using a Sequence of GANs
Authors:
Viktor Seib,
Malte Roosen,
Ida Germann,
Stefan Wirtz,
Dietrich Paulus
Abstract:
Creating annotated datasets demands a substantial amount of manual effort. In this proof-of-concept work, we address this issue by proposing a novel image generation pipeline. The pipeline consists of three distinct generative adversarial networks (previously published), combined in a novel way to augment a dataset for pedestrian detection. Despite the fact that the generated images are not always…
▽ More
Creating annotated datasets demands a substantial amount of manual effort. In this proof-of-concept work, we address this issue by proposing a novel image generation pipeline. The pipeline consists of three distinct generative adversarial networks (previously published), combined in a novel way to augment a dataset for pedestrian detection. Despite the fact that the generated images are not always visually pleasant to the human eye, our detection benchmark reveals that the results substantially surpass the baseline. The presented proof-of-concept work was done in 2020 and is now published as a technical report after a three years retention period.
△ Less
Submitted 14 January, 2024;
originally announced January 2024.
-
Quantifying white matter hyperintensity and brain volumes in heterogeneous clinical and low-field portable MRI
Authors:
Pablo Laso,
Stefano Cerri,
Annabel Sorby-Adams,
Jennifer Guo,
Farrah Mateen,
Philipp Goebl,
Jiaming Wu,
Peirong Liu,
Hongwei Li,
Sean I. Young,
Benjamin Billot,
Oula Puonti,
Gordon Sze,
Sam Payabavash,
Adam DeHavenon,
Kevin N. Sheth,
Matthew S. Rosen,
John Kirsch,
Nicola Strisciuglio,
Jelmer M. Wolterink,
Arman Eshaghi,
Frederik Barkhof,
W. Taylor Kimberly,
Juan Eugenio Iglesias
Abstract:
Brain atrophy and white matter hyperintensity (WMH) are critical neuroimaging features for ascertaining brain injury in cerebrovascular disease and multiple sclerosis. Automated segmentation and quantification is desirable but existing methods require high-resolution MRI with good signal-to-noise ratio (SNR). This precludes application to clinical and low-field portable MRI (pMRI) scans, thus hamp…
▽ More
Brain atrophy and white matter hyperintensity (WMH) are critical neuroimaging features for ascertaining brain injury in cerebrovascular disease and multiple sclerosis. Automated segmentation and quantification is desirable but existing methods require high-resolution MRI with good signal-to-noise ratio (SNR). This precludes application to clinical and low-field portable MRI (pMRI) scans, thus hampering large-scale tracking of atrophy and WMH progression, especially in underserved areas where pMRI has huge potential. Here we present a method that segments white matter hyperintensity and 36 brain regions from scans of any resolution and contrast (including pMRI) without retraining. We show results on eight public datasets and on a private dataset with paired high- and low-field scans (3T and 64mT), where we attain strong correlation between the WMH ($ρ$=.85) and hippocampal volumes (r=.89) estimated at both fields. Our method is publicly available as part of FreeSurfer, at: http://surfer.nmr.mgh.harvard.edu/fswiki/WMH-SynthSeg.
△ Less
Submitted 15 February, 2024; v1 submitted 8 December, 2023;
originally announced December 2023.
-
Resolution- and Stimulus-agnostic Super-Resolution of Ultra-High-Field Functional MRI: Application to Visual Studies
Authors:
Hongwei Bran Li,
Matthew S. Rosen,
Shahin Nasr,
Juan Eugenio Iglesias
Abstract:
High-resolution fMRI provides a window into the brain's mesoscale organization. Yet, higher spatial resolution increases scan times, to compensate for the low signal and contrast-to-noise ratio. This work introduces a deep learning-based 3D super-resolution (SR) method for fMRI. By incorporating a resolution-agnostic image augmentation framework, our method adapts to varying voxel sizes without re…
▽ More
High-resolution fMRI provides a window into the brain's mesoscale organization. Yet, higher spatial resolution increases scan times, to compensate for the low signal and contrast-to-noise ratio. This work introduces a deep learning-based 3D super-resolution (SR) method for fMRI. By incorporating a resolution-agnostic image augmentation framework, our method adapts to varying voxel sizes without retraining. We apply this innovative technique to localize fine-scale motion-selective sites in the early visual areas. Detection of these sites typically requires a resolution higher than 1 mm isotropic, whereas here, we visualize them based on lower resolution (2-3mm isotropic) fMRI data. Remarkably, the super-resolved fMRI is able to recover high-frequency detail of the interdigitated organization of these sites (relative to the color-selective sites), even with training data sourced from different subjects and experimental paradigms -- including non-visual resting-state fMRI, underscoring its robustness and versatility. Quantitative and qualitative results indicate that our method has the potential to enhance the spatial resolution of fMRI, leading to a drastic reduction in acquisition time.
△ Less
Submitted 19 March, 2024; v1 submitted 24 November, 2023;
originally announced November 2023.
-
OASIS: Optimal Arrangements for Sensing in SLAM
Authors:
Pushyami Kaveti,
Matthew Giamou,
Hanumant Singh,
David M. Rosen
Abstract:
The number and arrangement of sensors on mobile robot dramatically influence its perception capabilities. Ensuring that sensors are mounted in a manner that enables accurate detection, localization, and mapping is essential for the success of downstream control tasks. However, when designing a new robotic platform, researchers and practitioners alike usually mimic standard configurations or maximi…
▽ More
The number and arrangement of sensors on mobile robot dramatically influence its perception capabilities. Ensuring that sensors are mounted in a manner that enables accurate detection, localization, and mapping is essential for the success of downstream control tasks. However, when designing a new robotic platform, researchers and practitioners alike usually mimic standard configurations or maximize simple heuristics like field-of-view (FOV) coverage to decide where to place exteroceptive sensors. In this work, we conduct an information-theoretic investigation of this overlooked element of robotic perception in the context of simultaneous localization and mapping (SLAM). We show how to formalize the sensor arrangement problem as a form of subset selection under the E-optimality performance criterion. While this formulation is NP-hard in general, we show that a combination of greedy sensor selection and fast convex relaxation-based post-hoc verification enables the efficient recovery of certifiably optimal sensor designs in practice. Results from synthetic experiments reveal that sensors placed with OASIS outperform benchmarks in terms of mean squared error of visual SLAM estimates.
△ Less
Submitted 21 March, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
-
Uncertainty Estimation and Out-of-Distribution Detection for Deep Learning-Based Image Reconstruction using the Local Lipschitz
Authors:
Danyal F. Bhutto,
Bo Zhu,
Jeremiah Z. Liu,
Neha Koonjoo,
Hongwei B. Li,
Bruce R. Rosen,
Matthew S. Rosen
Abstract:
Accurate image reconstruction is at the heart of diagnostics in medical imaging. Supervised deep learning-based approaches have been investigated for solving inverse problems including image reconstruction. However, these trained models encounter unseen data distributions that are widely shifted from training data during deployment. Therefore, it is essential to assess whether a given input falls…
▽ More
Accurate image reconstruction is at the heart of diagnostics in medical imaging. Supervised deep learning-based approaches have been investigated for solving inverse problems including image reconstruction. However, these trained models encounter unseen data distributions that are widely shifted from training data during deployment. Therefore, it is essential to assess whether a given input falls within the training data distribution for diagnostic purposes. Uncertainty estimation approaches exist but focus on providing an uncertainty map to radiologists, rather than assessing the training distribution fit. In this work, we propose a method based on the local Lipschitz-based metric to distinguish out-of-distribution images from in-distribution with an area under the curve of 99.94%. Empirically, we demonstrate a very strong relationship between the local Lipschitz value and mean absolute error (MAE), supported by a high Spearman's rank correlation coefficient of 0.8475, which determines the uncertainty estimation threshold for optimal model performance. Through the identification of false positives, the local Lipschitz and MAE relationship was used to guide data augmentation and reduce model uncertainty. Our study was validated using the AUTOMAP architecture for sensor-to-image Magnetic Resonance Imaging (MRI) reconstruction. We compare our proposed approach with baseline methods: Monte-Carlo dropout and deep ensembles, and further analysis included MRI denoising and Computed Tomography (CT) sparse-to-full view reconstruction using UNET architectures. We show that our approach is applicable to various architectures and learned functions, especially in the realm of medical image reconstruction, where preserving the diagnostic accuracy of reconstructed images remains paramount.
△ Less
Submitted 1 December, 2023; v1 submitted 12 May, 2023;
originally announced May 2023.
-
Certifiably Correct Range-Aided SLAM
Authors:
Alan Papalia,
Andrew Fishberg,
Brendan W. O'Neill,
Jonathan P. How,
David M. Rosen,
John J. Leonard
Abstract:
We present the first algorithm to efficiently compute certifiably optimal solutions to range-aided simultaneous localization and mapping (RA-SLAM) problems. Robotic navigation systems increasingly incorporate point-to-point ranging sensors, leading to state estimation problems in the form of RA-SLAM. However, the RA-SLAM problem is significantly more difficult to solve than traditional pose-graph…
▽ More
We present the first algorithm to efficiently compute certifiably optimal solutions to range-aided simultaneous localization and mapping (RA-SLAM) problems. Robotic navigation systems increasingly incorporate point-to-point ranging sensors, leading to state estimation problems in the form of RA-SLAM. However, the RA-SLAM problem is significantly more difficult to solve than traditional pose-graph SLAM: ranging sensor models introduce non-convexity and single range measurements do not uniquely determine the transform between the involved sensors. As a result, RA-SLAM inference is sensitive to initial estimates yet lacks reliable initialization techniques. Our approach, certifiably correct RA-SLAM (CORA), leverages a novel quadratically constrained quadratic programming (QCQP) formulation of RA-SLAM to relax the RA-SLAM problem to a semidefinite program (SDP). CORA solves the SDP efficiently using the Riemannian Staircase methodology; the SDP solution provides both (i) a lower bound on the RA-SLAM problem's optimal value, and (ii) an approximate solution of the RA-SLAM problem, which can be subsequently refined using local optimization. CORA applies to problems with arbitrary pose-pose, pose-landmark, and ranging measurements and, due to using convex relaxation, is insensitive to initialization. We evaluate CORA on several real-world problems. In contrast to state-of-the-art approaches, CORA is able to obtain high-quality solutions on all problems despite being initialized with random values. Additionally, we study the tightness of the SDP relaxation with respect to important problem parameters: the number of (i) robots, (ii) landmarks, and (iii) range measurements. These experiments demonstrate that the SDP relaxation is often tight and reveal relationships between graph connectivity and the tightness of the SDP relaxation.
△ Less
Submitted 28 September, 2024; v1 submitted 22 February, 2023;
originally announced February 2023.
-
Synthetic Low-Field MRI Super-Resolution Via Nested U-Net Architecture
Authors:
Aryan Kalluvila,
Neha Koonjoo,
Danyal Bhutto,
Marcio Rockenbach,
Matthew S. Rosen
Abstract:
Low-field (LF) MRI scanners have the power to revolutionize medical imaging by providing a portable and cheaper alternative to high-field MRI scanners. However, such scanners are usually significantly noisier and lower quality than their high-field counterparts. The aim of this paper is to improve the SNR and overall image quality of low-field MRI scans to improve diagnostic capability. To address…
▽ More
Low-field (LF) MRI scanners have the power to revolutionize medical imaging by providing a portable and cheaper alternative to high-field MRI scanners. However, such scanners are usually significantly noisier and lower quality than their high-field counterparts. The aim of this paper is to improve the SNR and overall image quality of low-field MRI scans to improve diagnostic capability. To address this issue, we propose a Nested U-Net neural network architecture super-resolution algorithm that outperforms previously suggested deep learning methods with an average PSNR of 78.83 and SSIM of 0.9551. We tested our network on artificial noisy downsampled synthetic data from a major T1 weighted MRI image dataset called the T1-mix dataset. One board-certified radiologist scored 25 images on the Likert scale (1-5) assessing overall image quality, anatomical structure, and diagnostic confidence across our architecture and other published works (SR DenseNet, Generator Block, SRCNN, etc.). We also introduce a new type of loss function called natural log mean squared error (NLMSE). In conclusion, we present a more accurate deep learning method for single image super-resolution applied to synthetic low-field MRI via a Nested U-Net architecture.
△ Less
Submitted 27 November, 2022;
originally announced November 2022.
-
SCORE: A Second-Order Conic Initialization for Range-Aided SLAM
Authors:
Alan Papalia,
Joseph Morales,
Kevin J. Doherty,
David M. Rosen,
John J. Leonard
Abstract:
We present a novel initialization technique for the range-aided simultaneous localization and mapping (RA-SLAM) problem. In RA-SLAM we consider measurements of point-to-point distances in addition to measurements of rigid transformations to landmark or pose variables. Standard formulations of RA-SLAM approach the problem as non-convex optimization, which requires a good initialization to obtain qu…
▽ More
We present a novel initialization technique for the range-aided simultaneous localization and mapping (RA-SLAM) problem. In RA-SLAM we consider measurements of point-to-point distances in addition to measurements of rigid transformations to landmark or pose variables. Standard formulations of RA-SLAM approach the problem as non-convex optimization, which requires a good initialization to obtain quality results. The initialization technique proposed here relaxes the RA-SLAM problem to a convex problem which is then solved to determine an initialization for the original, non-convex problem. The relaxation is a second-order cone program (SOCP), which is derived from a quadratically constrained quadratic program (QCQP) formulation of the RA-SLAM problem. As a SOCP, the method is highly scalable. We name this relaxation Second-order COnic RElaxation for RA-SLAM (SCORE). To our knowledge, this work represents the first convex relaxation for RA-SLAM. We present real-world and simulated experiments which show SCORE initialization permits the efficient recovery of quality solutions for a variety of challenging single- and multi-robot RA-SLAM problems with thousands of poses and range measurements.
△ Less
Submitted 6 October, 2022;
originally announced October 2022.
-
Accelerating Certifiable Estimation with Preconditioned Eigensolvers
Authors:
David M. Rosen
Abstract:
Convex (specifically semidefinite) relaxation provides a powerful approach to constructing robust machine perception systems, enabling the recovery of certifiably globally optimal solutions of challenging estimation problems in many practical settings. However, solving the large-scale semidefinite relaxations underpinning this approach remains a formidable computational challenge. A dominant cost…
▽ More
Convex (specifically semidefinite) relaxation provides a powerful approach to constructing robust machine perception systems, enabling the recovery of certifiably globally optimal solutions of challenging estimation problems in many practical settings. However, solving the large-scale semidefinite relaxations underpinning this approach remains a formidable computational challenge. A dominant cost in many state-of-the-art (Burer-Monteiro factorization-based) certifiable estimation methods is solution verification (testing the global optimality of a given candidate solution), which entails computing a minimum eigenpair of a certain symmetric certificate matrix. In this letter, we show how to significantly accelerate this verification step, and thereby the overall speed of certifiable estimation methods. First, we show that the certificate matrices arising in the Burer-Monteiro approach generically possess spectra that make the verification problem expensive to solve using standard iterative eigenvalue methods. We then show how to address this challenge using preconditioned eigensolvers; specifically, we design a specialized solution verification algorithm based upon the locally optimal block preconditioned conjugate gradient (LOBPCG) method together with a simple yet highly effective algebraic preconditioner. Experimental evaluation on a variety of simulated and real-world examples shows that our proposed verification scheme is very effective in practice, accelerating solution verification by up to 280x, and the overall Burer-Monteiro method by up to 16x, versus the standard Lanczos method when applied to relaxations derived from large-scale SLAM benchmarks.
△ Less
Submitted 13 November, 2022; v1 submitted 11 July, 2022;
originally announced July 2022.
-
Artificial Intelligence-Assisted Optimization and Multiphase Analysis of Polygon PEM Fuel Cells
Authors:
Ali Jabbary,
Nader Pourmahmoud,
Mir Ali Asghar Abdollahi,
Marc A. Rosen
Abstract:
This article presents new hexagonal and pentagonal PEM fuel cell models. The models have been optimized after achieving improved cell performance. The input parameters of the multi-objective optimization algorithm were pressure and temperature at the inlet, and consumption and output powers were the objective parameters. The output data of the numerical simulation has been trained using deep neura…
▽ More
This article presents new hexagonal and pentagonal PEM fuel cell models. The models have been optimized after achieving improved cell performance. The input parameters of the multi-objective optimization algorithm were pressure and temperature at the inlet, and consumption and output powers were the objective parameters. The output data of the numerical simulation has been trained using deep neural networks and then modeled with polynomial regression. The target functions have been extracted using the RSM (Response Surface Method), and the targets were optimized using the multi-objective genetic algorithm (NSGA-II). Compared to the base model, the optimized Pentagonal and Hexagonal models increase the output current density by 21.8% and 39.9%, respectively.
△ Less
Submitted 6 July, 2022; v1 submitted 10 April, 2022;
originally announced May 2022.
-
Spectral Measurement Sparsification for Pose-Graph SLAM
Authors:
Kevin J. Doherty,
David M. Rosen,
John J. Leonard
Abstract:
Simultaneous localization and mapping (SLAM) is a critical capability in autonomous navigation, but in order to scale SLAM to the setting of "lifelong" SLAM, particularly under memory or computation constraints, a robot must be able to determine what information should be retained and what can safely be forgotten. In graph-based SLAM, the number of edges (measurements) in a pose graph determines b…
▽ More
Simultaneous localization and mapping (SLAM) is a critical capability in autonomous navigation, but in order to scale SLAM to the setting of "lifelong" SLAM, particularly under memory or computation constraints, a robot must be able to determine what information should be retained and what can safely be forgotten. In graph-based SLAM, the number of edges (measurements) in a pose graph determines both the memory requirements of storing a robot's observations and the computational expense of algorithms deployed for performing state estimation using those observations; both of which can grow unbounded during long-term navigation. To address this, we propose a spectral approach for pose graph sparsification which maximizes the algebraic connectivity of the sparsified measurement graphs, a key quantity which has been shown to control the estimation error of pose graph SLAM solutions. Our algorithm, MAC (for "maximizing algebraic connectivity"), which is based on convex relaxation, is simple and computationally inexpensive, and admits formal post hoc performance guarantees on the quality of the solutions it provides. In experiments on benchmark pose-graph SLAM datasets, we show that our approach quickly produces high-quality sparsification results which retain the connectivity of the graph and, in turn, the quality of corresponding SLAM solutions, as compared to a baseline approach which does not consider graph connectivity.
△ Less
Submitted 25 March, 2022;
originally announced March 2022.
-
Distributed Riemannian Optimization with Lazy Communication for Collaborative Geometric Estimation
Authors:
Yulun Tian,
Amrit Singh Bedi,
Alec Koppel,
Miguel Calvo-Fullana,
David M. Rosen,
Jonathan P. How
Abstract:
We present the first distributed optimization algorithm with lazy communication for collaborative geometric estimation, the backbone of modern collaborative simultaneous localization and mapping (SLAM) and structure-from-motion (SfM) applications. Our method allows agents to cooperatively reconstruct a shared geometric model on a central server by fusing individual observations, but without the ne…
▽ More
We present the first distributed optimization algorithm with lazy communication for collaborative geometric estimation, the backbone of modern collaborative simultaneous localization and mapping (SLAM) and structure-from-motion (SfM) applications. Our method allows agents to cooperatively reconstruct a shared geometric model on a central server by fusing individual observations, but without the need to transmit potentially sensitive information about the agents themselves (such as their locations). Furthermore, to alleviate the burden of communication during iterative optimization, we design a set of communication triggering conditions that enable agents to selectively upload a targeted subset of local information that is useful to global optimization. Our approach thus achieves significant communication reduction with minimal impact on optimization performance. As our main theoretical contribution, we prove that our method converges to first-order critical points with a global sublinear convergence rate. Numerical evaluations on bundle adjustment problems from collaborative SLAM and SfM datasets show that our method performs competitively against existing distributed techniques, while achieving up to 78% total communication reduction.
△ Less
Submitted 29 July, 2022; v1 submitted 1 March, 2022;
originally announced March 2022.
-
On Real-time Image Reconstruction with Neural Networks for MRI-guided Radiotherapy
Authors:
David E. J. Waddington,
Nicholas Hindley,
Neha Koonjoo,
Christopher Chiu,
Tess Reynolds,
Paul Z. Y. Liu,
Bo Zhu,
Danyal Bhutto,
Chiara Paganelli,
Paul J. Keall,
Matthew S. Rosen
Abstract:
MRI-guidance techniques that dynamically adapt radiation beams to follow tumor motion in real-time will lead to more accurate cancer treatments and reduced collateral healthy tissue damage. The gold-standard for reconstruction of undersampled MR data is compressed sensing (CS) which is computationally slow and limits the rate that images can be available for real-time adaptation. Here, we demonstr…
▽ More
MRI-guidance techniques that dynamically adapt radiation beams to follow tumor motion in real-time will lead to more accurate cancer treatments and reduced collateral healthy tissue damage. The gold-standard for reconstruction of undersampled MR data is compressed sensing (CS) which is computationally slow and limits the rate that images can be available for real-time adaptation. Here, we demonstrate the use of automated transform by manifold approximation (AUTOMAP), a generalized framework that maps raw MR signal to the target image domain, to rapidly reconstruct images from undersampled radial k-space data. The AUTOMAP neural network was trained to reconstruct images from a golden-angle radial acquisition, a benchmark for motion-sensitive imaging, on lung cancer patient data and generic images from ImageNet. Model training was subsequently augmented with motion-encoded k-space data derived from videos in the YouTube-8M dataset to encourage motion robust reconstruction. We find that AUTOMAP-reconstructed radial k-space has equivalent accuracy to CS but with much shorter processing times after initial fine-tuning on retrospectively acquired lung cancer patient data. Validation of motion-trained models with a virtual dynamic lung tumor phantom showed that the generalized motion properties learned from YouTube lead to improved target tracking accuracy. Our work shows that AUTOMAP can achieve real-time, accurate reconstruction of radial data. These findings imply that neural-network-based reconstruction is potentially superior to existing approaches for real-time image guidance applications.
△ Less
Submitted 18 May, 2022; v1 submitted 9 February, 2022;
originally announced February 2022.
-
Accurate super-resolution low-field brain MRI
Authors:
Juan Eugenio Iglesias,
Riana Schleicher,
Sonia Laguna,
Benjamin Billot,
Pamela Schaefer,
Brenna McKaig,
Joshua N. Goldstein,
Kevin N. Sheth,
Matthew S. Rosen,
W. Taylor Kimberly
Abstract:
The recent introduction of portable, low-field MRI (LF-MRI) into the clinical setting has the potential to transform neuroimaging. However, LF-MRI is limited by lower resolution and signal-to-noise ratio, leading to incomplete characterization of brain regions. To address this challenge, recent advances in machine learning facilitate the synthesis of higher resolution images derived from one or mu…
▽ More
The recent introduction of portable, low-field MRI (LF-MRI) into the clinical setting has the potential to transform neuroimaging. However, LF-MRI is limited by lower resolution and signal-to-noise ratio, leading to incomplete characterization of brain regions. To address this challenge, recent advances in machine learning facilitate the synthesis of higher resolution images derived from one or multiple lower resolution scans. Here, we report the extension of a machine learning super-resolution (SR) algorithm to synthesize 1 mm isotropic MPRAGE-like scans from LF-MRI T1-weighted and T2-weighted sequences. Our initial results on a paired dataset of LF and high-field (HF, 1.5T-3T) clinical scans show that: (i) application of available automated segmentation tools directly to LF-MRI images falters; but (ii) segmentation tools succeed when applied to SR images with high correlation to gold standard measurements from HF-MRI (e.g., r = 0.85 for hippocampal volume, r = 0.84 for the thalamus, r = 0.92 for the whole cerebrum). This work demonstrates proof-of-principle post-processing image enhancement from lower resolution LF-MRI sequences. These results lay the foundation for future work to enhance the detection of normal and abnormal image findings at LF and ultimately improve the diagnostic performance of LF-MRI. Our tools are publicly available on FreeSurfer (surfer.nmr.mgh.harvard.edu/).
△ Less
Submitted 7 February, 2022;
originally announced February 2022.
-
Performance Guarantees for Spectral Initialization in Rotation Averaging and Pose-Graph SLAM
Authors:
Kevin J. Doherty,
David M. Rosen,
John J. Leonard
Abstract:
In this work we present the first initialization methods equipped with explicit performance guarantees adapted to the pose-graph simultaneous localization and mapping (SLAM) and rotation averaging (RA) problems. SLAM and rotation averaging are typically formalized as large-scale nonconvex point estimation problems, with many bad local minima that can entrap the smooth optimization methods typicall…
▽ More
In this work we present the first initialization methods equipped with explicit performance guarantees adapted to the pose-graph simultaneous localization and mapping (SLAM) and rotation averaging (RA) problems. SLAM and rotation averaging are typically formalized as large-scale nonconvex point estimation problems, with many bad local minima that can entrap the smooth optimization methods typically applied to solve them; the performance of standard SLAM and RA algorithms thus crucially depends upon the quality of the estimates used to initialize this local search. While many initialization methods for SLAM and RA have appeared in the literature, these are typically obtained as purely heuristic approximations, making it difficult to determine whether (or under what circumstances) these techniques can be reliably deployed. In contrast, in this work we study the problem of initialization through the lens of spectral relaxation. Specifically, we derive a simple spectral relaxation of SLAM and RA, the form of which enables us to exploit classical linear-algebraic techniques (eigenvector perturbation bounds) to control the distance from our spectral estimate to both the (unknown) ground-truth and the global minimizer of the estimation problem as a function of measurement noise. Our results reveal the critical role that spectral graph-theoretic properties of the measurement network play in controlling estimation accuracy; moreover, as a by-product of our analysis we obtain new bounds on the estimation error for the maximum likelihood estimators in SLAM and RA, which are likely to be of independent interest. Finally, we show experimentally that our spectral estimator is very effective in practice, producing initializations of comparable or superior quality at lower computational cost compared to existing state-of-the-art techniques.
△ Less
Submitted 10 January, 2022;
originally announced January 2022.
-
Multiple Randomization Designs
Authors:
Patrick Bajari,
Brian Burdick,
Guido W. Imbens,
Lorenzo Masoero,
James McQueen,
Thomas Richardson,
Ido M. Rosen
Abstract:
In this study we introduce a new class of experimental designs. In a classical randomized controlled trial (RCT), or A/B test, a randomly selected subset of a population of units (e.g., individuals, plots of land, or experiences) is assigned to a treatment (treatment A), and the remainder of the population is assigned to the control treatment (treatment B). The difference in average outcome by tre…
▽ More
In this study we introduce a new class of experimental designs. In a classical randomized controlled trial (RCT), or A/B test, a randomly selected subset of a population of units (e.g., individuals, plots of land, or experiences) is assigned to a treatment (treatment A), and the remainder of the population is assigned to the control treatment (treatment B). The difference in average outcome by treatment group is an estimate of the average effect of the treatment. However, motivating our study, the setting for modern experiments is often different, with the outcomes and treatment assignments indexed by multiple populations. For example, outcomes may be indexed by buyers and sellers, by content creators and subscribers, by drivers and riders, or by travelers and airlines and travel agents, with treatments potentially varying across these indices. Spillovers or interference can arise from interactions between units across populations. For example, sellers' behavior may depend on buyers' treatment assignment, or vice versa. This can invalidate the simple comparison of means as an estimator for the average effect of the treatment in classical RCTs. We propose new experiment designs for settings in which multiple populations interact. We show how these designs allow us to study questions about interference that cannot be answered by classical randomized experiments. Finally, we develop new statistical methods for analyzing these Multiple Randomization Designs.
△ Less
Submitted 26 December, 2021;
originally announced December 2021.
-
Incremental Non-Gaussian Inference for SLAM Using Normalizing Flows
Authors:
Qiangqiang Huang,
Can Pu,
Kasra Khosoussi,
David M. Rosen,
Dehann Fourie,
Jonathan P. How,
John J. Leonard
Abstract:
This paper presents normalizing flows for incremental smoothing and mapping (NF-iSAM), a novel algorithm for inferring the full posterior distribution in SLAM problems with nonlinear measurement models and non-Gaussian factors. NF-iSAM exploits the expressive power of neural networks, and trains normalizing flows to model and sample the full posterior. By leveraging the Bayes tree, NF-iSAM enables…
▽ More
This paper presents normalizing flows for incremental smoothing and mapping (NF-iSAM), a novel algorithm for inferring the full posterior distribution in SLAM problems with nonlinear measurement models and non-Gaussian factors. NF-iSAM exploits the expressive power of neural networks, and trains normalizing flows to model and sample the full posterior. By leveraging the Bayes tree, NF-iSAM enables efficient incremental updates similar to iSAM2, albeit in the more challenging non-Gaussian setting. We demonstrate the advantages of NF-iSAM over state-of-the-art point and distribution estimation algorithms using range-only SLAM problems with data association ambiguity. NF-iSAM presents superior accuracy in describing the posterior beliefs of continuous variables (e.g., position) and discrete variables (e.g., data association).
△ Less
Submitted 2 July, 2022; v1 submitted 2 October, 2021;
originally announced October 2021.
-
Convex Iteration for Distance-Geometric Inverse Kinematics
Authors:
Matthew Giamou,
Filip Marić,
David M. Rosen,
Valentin Peretroukhin,
Nicholas Roy,
Ivan Petrović,
Jonathan Kelly
Abstract:
Inverse kinematics (IK) is the problem of finding robot joint configurations that satisfy constraints on the position or pose of one or more end-effectors. For robots with redundant degrees of freedom, there is often an infinite, nonconvex set of solutions. The IK problem is further complicated when collision avoidance constraints are imposed by obstacles in the workspace. In general, closed-form…
▽ More
Inverse kinematics (IK) is the problem of finding robot joint configurations that satisfy constraints on the position or pose of one or more end-effectors. For robots with redundant degrees of freedom, there is often an infinite, nonconvex set of solutions. The IK problem is further complicated when collision avoidance constraints are imposed by obstacles in the workspace. In general, closed-form expressions yielding feasible configurations do not exist, motivating the use of numerical solution methods. However, these approaches rely on local optimization of nonconvex problems, often requiring an accurate initialization or numerous re-initializations to converge to a valid solution. In this work, we first formulate inverse kinematics with complex workspace constraints as a convex feasibility problem whose low-rank feasible points provide exact IK solutions. We then present \texttt{CIDGIK} (Convex Iteration for Distance-Geometric Inverse Kinematics), an algorithm that solves this feasibility problem with a sequence of semidefinite programs whose objectives are designed to encourage low-rank minimizers. Our problem formulation elegantly unifies the configuration space and workspace constraints of a robot: intrinsic robot geometry and obstacle avoidance are both expressed as simple linear matrix equations and inequalities. Our experimental results for a variety of popular manipulator models demonstrate faster and more accurate convergence than a conventional nonlinear optimization-based approach, especially in environments with many obstacles.
△ Less
Submitted 6 July, 2022; v1 submitted 7 September, 2021;
originally announced September 2021.
-
Balboa: Bobbing and Weaving around Network Censorship
Authors:
Marc B. Rosen,
James Parker,
Alex J. Malozemoff
Abstract:
We introduce Balboa, a link obfuscation framework for censorship circumvention. Balboa provides a general framework for tunneling data through existing applications. Balboa sits between an application and the operating system, intercepting outgoing network traffic and rewriting it to embed data. To avoid introducing any distinguishable divergence from the expected application behavior, Balboa only…
▽ More
We introduce Balboa, a link obfuscation framework for censorship circumvention. Balboa provides a general framework for tunneling data through existing applications. Balboa sits between an application and the operating system, intercepting outgoing network traffic and rewriting it to embed data. To avoid introducing any distinguishable divergence from the expected application behavior, Balboa only rewrites traffic that matches an externally specified \emph{traffic model} pre-shared between the communicating parties. The traffic model captures some subset of the network traffic (e.g., some subset of music an audio streaming server streams). The sender uses this model to replace outgoing data with a pointer to the associated location in the model and embed data in the freed up space. The receiver then extracts the data, replacing the pointer with the original data from the model before passing the data on to the application. When using TLS, this approach means that application behavior with Balboa is \emph{equivalent}, modulo small (protocol-dependent) timing differences, to if the application was running without Balboa.
Balboa differs from prior approaches in that it (1) provides a framework for tunneling data through arbitrary (TLS-protected) protocols/applications, and (2) runs the unaltered application binaries on standard inputs, as opposed to most prior tunneling approaches which run the application on non-standard -- and thus potentially distinguishable -- inputs.
We present two instantiations of Balboa -- one for audio streaming and one for web browsing -- and demonstrate the difficulty of identifying Balboa by a machine learning classifier.
△ Less
Submitted 12 April, 2021;
originally announced April 2021.
-
Advances in Inference and Representation for Simultaneous Localization and Mapping
Authors:
David M. Rosen,
Kevin J. Doherty,
Antonio Teran Espinoza,
John J. Leonard
Abstract:
Simultaneous localization and mapping (SLAM) is the process of constructing a global model of an environment from local observations of it; this is a foundational capability for mobile robots, supporting such core functions as planning, navigation, and control. This article reviews recent progress in SLAM, focusing on advances in the expressive capacity of the environmental models used in SLAM sys…
▽ More
Simultaneous localization and mapping (SLAM) is the process of constructing a global model of an environment from local observations of it; this is a foundational capability for mobile robots, supporting such core functions as planning, navigation, and control. This article reviews recent progress in SLAM, focusing on advances in the expressive capacity of the environmental models used in SLAM systems (representation) and the performance of the algorithms used to estimate these models from data (inference). A prominent theme of recent SLAM research is the pursuit of environmental representations (including learned representations) that go beyond the classical attributes of geometry and appearance to model properties such as hierarchical organization, affordance, dynamics, and semantics; these advances equip autonomous agents with a more comprehensive understanding of the world, enabling more versatile and intelligent operation. A second major theme is a revitalized interest in the mathematical properties of the SLAM estimation problem itself (including its computational and information-theoretic performance limits); this work has led to the development of novel classes of certifiable and robust inference methods that dramatically improve the reliability of SLAM systems in real-world operation. We survey these advances with an emphasis on their ramifications for achieving robust, long-duration autonomy, and conclude with a discussion of open challenges and a perspective on future research directions.
△ Less
Submitted 8 March, 2021;
originally announced March 2021.
-
Shonan Rotation Averaging: Global Optimality by Surfing $SO(p)^n$
Authors:
Frank Dellaert,
David M. Rosen,
Jing Wu,
Robert Mahony,
Luca Carlone
Abstract:
Shonan Rotation Averaging is a fast, simple, and elegant rotation averaging algorithm that is guaranteed to recover globally optimal solutions under mild assumptions on the measurement noise. Our method employs semidefinite relaxation in order to recover provably globally optimal solutions of the rotation averaging problem. In contrast to prior work, we show how to solve large-scale instances of t…
▽ More
Shonan Rotation Averaging is a fast, simple, and elegant rotation averaging algorithm that is guaranteed to recover globally optimal solutions under mild assumptions on the measurement noise. Our method employs semidefinite relaxation in order to recover provably globally optimal solutions of the rotation averaging problem. In contrast to prior work, we show how to solve large-scale instances of these relaxations using manifold minimization on (only slightly) higher-dimensional rotation manifolds, re-using existing high-performance (but local) structure-from-motion pipelines. Our method thus preserves the speed and scalability of current SFM methods, while recovering globally optimal solutions.
△ Less
Submitted 6 August, 2020;
originally announced August 2020.
-
A Smooth Representation of Belief over SO(3) for Deep Rotation Learning with Uncertainty
Authors:
Valentin Peretroukhin,
Matthew Giamou,
David M. Rosen,
W. Nicholas Greene,
Nicholas Roy,
Jonathan Kelly
Abstract:
Accurate rotation estimation is at the heart of robot perception tasks such as visual odometry and object pose estimation. Deep neural networks have provided a new way to perform these tasks, and the choice of rotation representation is an important part of network design. In this work, we present a novel symmetric matrix representation of the 3D rotation group, SO(3), with two important propertie…
▽ More
Accurate rotation estimation is at the heart of robot perception tasks such as visual odometry and object pose estimation. Deep neural networks have provided a new way to perform these tasks, and the choice of rotation representation is an important part of network design. In this work, we present a novel symmetric matrix representation of the 3D rotation group, SO(3), with two important properties that make it particularly suitable for learned models: (1) it satisfies a smoothness property that improves convergence and generalization when regressing large rotation targets, and (2) it encodes a symmetric Bingham belief over the space of unit quaternions, permitting the training of uncertainty-aware models. We empirically validate the benefits of our formulation by training deep neural rotation regressors on two data modalities. First, we use synthetic point-cloud data to show that our representation leads to superior predictive accuracy over existing representations for arbitrary rotation targets. Second, we use image data collected onboard ground and aerial vehicles to demonstrate that our representation is amenable to an effective out-of-distribution (OOD) rejection technique that significantly improves the robustness of rotation estimates to unseen environmental effects and corrupted input images, without requiring the use of an explicit likelihood loss, stochastic sampling, or an auxiliary classifier. This capability is key for safety-critical applications where detecting novel inputs can prevent catastrophic failure of learned models.
△ Less
Submitted 17 January, 2021; v1 submitted 1 June, 2020;
originally announced June 2020.
-
Distributed Certifiably Correct Pose-Graph Optimization
Authors:
Yulun Tian,
Kasra Khosoussi,
David M. Rosen,
Jonathan P. How
Abstract:
This paper presents the first certifiably correct algorithm for distributed pose-graph optimization (PGO), the backbone of modern collaborative simultaneous localization and mapping (CSLAM) and camera network localization (CNL) systems. Our method is based upon a sparse semidefinite relaxation that we prove provides globally-optimal PGO solutions under moderate measurement noise (matching the guar…
▽ More
This paper presents the first certifiably correct algorithm for distributed pose-graph optimization (PGO), the backbone of modern collaborative simultaneous localization and mapping (CSLAM) and camera network localization (CNL) systems. Our method is based upon a sparse semidefinite relaxation that we prove provides globally-optimal PGO solutions under moderate measurement noise (matching the guarantees enjoyed by state-of-the-art centralized methods), but is amenable to distributed optimization using the low-rank Riemannian Staircase framework. To implement the Riemannian Staircase in the distributed setting, we develop Riemannian block coordinate descent (RBCD), a novel method for (locally) minimizing a function over a product of Riemannian manifolds. We also propose the first distributed solution verification and saddle escape methods to certify the global optimality of critical points recovered via RBCD, and to descend from suboptimal critical points (if necessary). All components of our approach are inherently decentralized: they require only local communication, provide privacy protection, and are easily parallelizable. Extensive evaluations on synthetic and real-world datasets demonstrate that the proposed method correctly recovers globally optimal solutions under moderate noise, and outperforms alternative distributed techniques in terms of solution precision and convergence speed.
△ Less
Submitted 18 May, 2021; v1 submitted 9 November, 2019;
originally announced November 2019.
-
MR fingerprinting Deep RecOnstruction NEtwork (DRONE)
Authors:
Ouri Cohen,
Bo Zhu,
Matthew S. Rosen
Abstract:
PURPOSE: Demonstrate a novel fast method for reconstruction of multi-dimensional MR Fingerprinting (MRF) data using Deep Learning methods.
METHODS: A neural network (NN) is defined using the TensorFlow framework and trained on simulated MRF data computed using the Bloch equations. The accuracy of the NN reconstruction of noisy data is compared to conventional MRF template matching as a function…
▽ More
PURPOSE: Demonstrate a novel fast method for reconstruction of multi-dimensional MR Fingerprinting (MRF) data using Deep Learning methods.
METHODS: A neural network (NN) is defined using the TensorFlow framework and trained on simulated MRF data computed using the Bloch equations. The accuracy of the NN reconstruction of noisy data is compared to conventional MRF template matching as a function of training data size, and quantified in a both simulated numerical brain phantom data and acquired data from the ISMRM/NIST phantom. The utility of the method is demonstrated in a healthy subject in vivo at 1.5 T.
RESULTS: Network training required 10 minutes and once trained, data reconstruction required approximately 10 ms. Reconstruction of simulated brain data using the NN resulted in a root-mean-square error (RMSE) of 3.5 ms for T1 and 7.8 ms for T2. The RMSE for the NN trained on sparse dictionaries was approximately 6 fold lower for T1 and 2 fold lower for T2 than conventional MRF dot-product dictionary matching on the same dictionaries. Phantom measurements yielded good agreement (R2=0.99) between the T1 and T2 estimated by the NN and reference values from the ISMRM/NIST phantom.
CONCLUSION: Reconstruction of MRF data with a NN is accurate, 300 fold faster and more robust to noise and undersampling than conventional MRF dictionary matching.
△ Less
Submitted 24 April, 2018; v1 submitted 14 October, 2017;
originally announced October 2017.
-
Image reconstruction by domain transform manifold learning
Authors:
Bo Zhu,
Jeremiah Z. Liu,
Bruce R. Rosen,
Matthew S. Rosen
Abstract:
Image reconstruction plays a critical role in the implementation of all contemporary imaging modalities across the physical and life sciences including optical, MRI, CT, PET, and radio astronomy. During an image acquisition, the sensor encodes an intermediate representation of an object in the sensor domain, which is subsequently reconstructed into an image by an inversion of the encoding function…
▽ More
Image reconstruction plays a critical role in the implementation of all contemporary imaging modalities across the physical and life sciences including optical, MRI, CT, PET, and radio astronomy. During an image acquisition, the sensor encodes an intermediate representation of an object in the sensor domain, which is subsequently reconstructed into an image by an inversion of the encoding function. Image reconstruction is challenging because analytic knowledge of the inverse transform may not exist a priori, especially in the presence of sensor non-idealities and noise. Thus, the standard reconstruction approach involves approximating the inverse function with multiple ad hoc stages in a signal processing chain whose composition depends on the details of each acquisition strategy, and often requires expert parameter tuning to optimize reconstruction performance. We present here a unified framework for image reconstruction, AUtomated TransfOrm by Manifold APproximation (AUTOMAP), which recasts image reconstruction as a data-driven, supervised learning task that allows a mapping between sensor and image domain to emerge from an appropriate corpus of training data. We implement AUTOMAP with a deep neural network and exhibit its flexibility in learning reconstruction transforms for a variety of MRI acquisition strategies, using the same network architecture and hyperparameters. We further demonstrate its efficiency in sparsely representing transforms along low-dimensional manifolds, resulting in superior immunity to noise and reconstruction artifacts compared with conventional handcrafted reconstruction methods. In addition to improving the reconstruction performance of existing acquisition methodologies, we anticipate accelerating the discovery of new acquisition strategies across modalities as the burden of reconstruction becomes lifted by AUTOMAP and learned-reconstruction approaches.
△ Less
Submitted 28 April, 2017;
originally announced April 2017.
-
SE-Sync: A Certifiably Correct Algorithm for Synchronization over the Special Euclidean Group
Authors:
David M. Rosen,
Luca Carlone,
Afonso S. Bandeira,
John J. Leonard
Abstract:
Many important geometric estimation problems take the form of synchronization over the special Euclidean group: estimate the values of a set of poses given a set of relative measurements between them. This problem is typically formulated as a nonconvex maximum-likelihood estimation that is computationally hard to solve in general. Nevertheless, in this paper we present an algorithm that is able to…
▽ More
Many important geometric estimation problems take the form of synchronization over the special Euclidean group: estimate the values of a set of poses given a set of relative measurements between them. This problem is typically formulated as a nonconvex maximum-likelihood estimation that is computationally hard to solve in general. Nevertheless, in this paper we present an algorithm that is able to efficiently recover certifiably globally optimal solutions of the special Euclidean synchronization problem in a non-adversarial noise regime. The crux of our approach is the development of a semidefinite relaxation of the maximum-likelihood estimation whose minimizer provides an exact MLE so long as the magnitude of the noise corrupting the available measurements falls below a certain critical threshold; furthermore, whenever exactness obtains, it is possible to verify this fact a posteriori, thereby certifying the optimality of the recovered estimate. We develop a specialized optimization scheme for solving large-scale instances of this relaxation by exploiting its low-rank, geometric, and graph-theoretic structure to reduce it to an equivalent optimization problem on a low-dimensional Riemannian manifold, and design a truncated-Newton trust-region method to solve this reduction efficiently. Finally, we combine this fast optimization approach with a simple rounding procedure to produce our algorithm, SE-Sync. Experimental evaluation on a variety of simulated and real-world pose-graph SLAM datasets shows that SE-Sync is able to recover certifiably globally optimal solutions when the available measurements are corrupted by noise up to an order of magnitude greater than that typically encountered in robotics and computer vision applications, and does so more than an order of magnitude faster than the Gauss-Newton-based approach that forms the basis of current state-of-the-art techniques.
△ Less
Submitted 4 February, 2017; v1 submitted 21 December, 2016;
originally announced December 2016.
-
A Certifiably Correct Algorithm for Synchronization over the Special Euclidean Group
Authors:
David M. Rosen,
Luca Carlone,
Afonso S. Bandeira,
John J. Leonard
Abstract:
Many geometric estimation problems take the form of synchronization over the special Euclidean group: estimate the values of a set of poses given noisy measurements of a subset of their pairwise relative transforms. This problem is typically formulated as a maximum-likelihood estimation that requires solving a nonconvex nonlinear program, which is computationally intractable in general. Neverthele…
▽ More
Many geometric estimation problems take the form of synchronization over the special Euclidean group: estimate the values of a set of poses given noisy measurements of a subset of their pairwise relative transforms. This problem is typically formulated as a maximum-likelihood estimation that requires solving a nonconvex nonlinear program, which is computationally intractable in general. Nevertheless, in this paper we present an algorithm that is able to efficiently recover certifiably globally optimal solutions of this estimation problem in a non-adversarial noise regime. The crux of our approach is the development of a semidefinite relaxation of the maximum-likelihood estimation whose minimizer provides the exact MLE so long as the magnitude of the noise corrupting the available measurements falls below a certain critical threshold; furthermore, whenever exactness obtains, it is possible to verify this fact a posteriori, thereby certifying the optimality of the recovered estimate. We develop a specialized optimization scheme for solving large-scale instances of this semidefinite relaxation by exploiting its low-rank, geometric, and graph-theoretic structure to reduce it to an equivalent optimization problem on a low-dimensional Riemannian manifold, and then design a Riemannian truncated-Newton trust-region method to solve this reduction efficiently. We combine this fast optimization approach with a simple rounding procedure to produce our algorithm, SE-Sync. Experimental evaluation on a variety of simulated and real-world pose-graph SLAM datasets shows that SE-Sync is capable of recovering globally optimal solutions when the available measurements are corrupted by noise up to an order of magnitude greater than that typically encountered in robotics applications, and does so at a computational cost that scales comparably with that of direct Newton-type local search techniques.
△ Less
Submitted 9 February, 2017; v1 submitted 1 November, 2016;
originally announced November 2016.
-
A Near-Optimal Sublinear-Time Algorithm for Approximating the Minimum Vertex Cover Size
Authors:
Krzysztof Onak,
Dana Ron,
Michal Rosen,
Ronitt Rubinfeld
Abstract:
We give a nearly optimal sublinear-time algorithm for approximating the size of a minimum vertex cover in a graph G. The algorithm may query the degree deg(v) of any vertex v of its choice, and for each 1 <= i <= deg(v), it may ask for the i-th neighbor of v. Letting VC_opt(G) denote the minimum size of vertex cover in G, the algorithm outputs, with high constant success probability, an estimate V…
▽ More
We give a nearly optimal sublinear-time algorithm for approximating the size of a minimum vertex cover in a graph G. The algorithm may query the degree deg(v) of any vertex v of its choice, and for each 1 <= i <= deg(v), it may ask for the i-th neighbor of v. Letting VC_opt(G) denote the minimum size of vertex cover in G, the algorithm outputs, with high constant success probability, an estimate VC_estimate(G) such that VC_opt(G) <= VC_estimate(G) <= 2 * VC_opt(G) + epsilon*n, where epsilon is a given additive approximation parameter. We refer to such an estimate as a (2,epsilon)-estimate. The query complexity and running time of the algorithm are ~O(avg_deg * poly(1/epsilon)), where avg_deg denotes the average vertex degree in the graph. The best previously known sublinear algorithm, of Yoshida et al. (STOC 2009), has query complexity and running time O(d^4/epsilon^2), where d is the maximum degree in the graph. Given the lower bound of Omega(avg_deg) (for constant epsilon) for obtaining such an estimate (with any constant multiplicative factor) due to Parnas and Ron (TCS 2007), our result is nearly optimal.
In the case that the graph is dense, that is, the number of edges is Theta(n^2), we consider another model, in which the algorithm may ask, for any pair of vertices u and v, whether there is an edge between u and v. We show how to adapt the algorithm that uses neighbor queries to this model and obtain an algorithm that outputs a (2,epsilon)-estimate of the size of a minimum vertex cover whose query complexity and running time are ~O(n) * poly(1/epsilon).
△ Less
Submitted 5 October, 2011;
originally announced October 2011.