Search | arXiv e-print repository

A Certifably Correct Algorithm for Generalized Robot-World and Hand-Eye Calibration

Authors: Emmett Wise, Pushyami Kaveti, Qilong Chen, Wenhao Wang, Hanumant Singh, Jonathan Kelly, David M. Rosen, Matthew Giamou

Abstract: Automatic extrinsic sensor calibration is a fundamental problem for multi-sensor platforms. Reliable and general-purpose solutions should be computationally efficient, require few assumptions about the structure of the sensing environment, and demand little effort from human operators. Since the engineering effort required to obtain accurate calibration parameters increases with the number of sens… ▽ More Automatic extrinsic sensor calibration is a fundamental problem for multi-sensor platforms. Reliable and general-purpose solutions should be computationally efficient, require few assumptions about the structure of the sensing environment, and demand little effort from human operators. Since the engineering effort required to obtain accurate calibration parameters increases with the number of sensors deployed, robotics researchers have pursued methods requiring few assumptions about the sensing environment and minimal effort from human operators. In this work, we introduce a fast and certifiably globally optimal algorithm for solving a generalized formulation of the $\textit{robot-world and hand-eye calibration}$ (RWHEC) problem. The formulation of RWHEC presented is "generalized" in that it supports the simultaneous estimation of multiple sensor and target poses, and permits the use of monocular cameras that, alone, are unable to measure the scale of their environments. In addition to demonstrating our method's superior performance over existing solutions, we derive novel identifiability criteria and establish $\textit{a priori}$ guarantees of global optimality for problem instances with bounded measurement errors. We also introduce a complementary Lie-algebraic local solver for RWHEC and compare its performance with our global method and prior art. Finally, we provide a free and open-source implementation of our algorithms and experiments. △ Less

Submitted 30 July, 2025; originally announced July 2025.

Comments: 25 pages, 10 figures, submitted to the International Journal of Robotics Research

arXiv:2505.12228 [pdf, ps, other]

From Low Field to High Value: Robust Cortical Mapping from Low-Field MRI

Authors: Karthik Gopinath, Annabel Sorby-Adams, Jonathan W. Ramirez, Dina Zemlyanker, Jennifer Guo, David Hunt, Christine L. Mac Donald, C. Dirk Keene, Timothy Coalson, Matthew F. Glasser, David Van Essen, Matthew S. Rosen, Oula Puonti, W. Taylor Kimberly, Juan Eugenio Iglesias

Abstract: Three-dimensional reconstruction of cortical surfaces from MRI for morphometric analysis is fundamental for understanding brain structure. While high-field MRI (HF-MRI) is standard in research and clinical settings, its limited availability hinders widespread use. Low-field MRI (LF-MRI), particularly portable systems, offers a cost-effective and accessible alternative. However, existing cortical s… ▽ More Three-dimensional reconstruction of cortical surfaces from MRI for morphometric analysis is fundamental for understanding brain structure. While high-field MRI (HF-MRI) is standard in research and clinical settings, its limited availability hinders widespread use. Low-field MRI (LF-MRI), particularly portable systems, offers a cost-effective and accessible alternative. However, existing cortical surface analysis tools are optimized for high-resolution HF-MRI and struggle with the lower signal-to-noise ratio and resolution of LF-MRI. In this work, we present a machine learning method for 3D reconstruction and analysis of portable LF-MRI across a range of contrasts and resolutions. Our method works "out of the box" without retraining. It uses a 3D U-Net trained on synthetic LF-MRI to predict signed distance functions of cortical surfaces, followed by geometric processing to ensure topological accuracy. We evaluate our method using paired HF/LF-MRI scans of the same subjects, showing that LF-MRI surface reconstruction accuracy depends on acquisition parameters, including contrast type (T1 vs T2), orientation (axial vs isotropic), and resolution. A 3mm isotropic T2-weighted scan acquired in under 4 minutes, yields strong agreement with HF-derived surfaces: surface area correlates at r=0.96, cortical parcellations reach Dice=0.98, and gray matter volume achieves r=0.93. Cortical thickness remains more challenging with correlations up to r=0.70, reflecting the difficulty of sub-mm precision with 3mm voxels. We further validate our method on challenging postmortem LF-MRI, demonstrating its robustness. Our method represents a step toward enabling cortical surface analysis on portable LF-MRI. Code is available at https://surfer.nmr.mgh.harvard.edu/fswiki/ReconAny △ Less

Submitted 18 May, 2025; originally announced May 2025.

Comments: 32 pages

arXiv:2505.02470 [pdf, other]

Deep learning of personalized priors from past MRI scans enables fast, quality-enhanced point-of-care MRI with low-cost systems

Authors: Tal Oved, Beatrice Lena, Chloé F. Najac, Sheng Shen, Matthew S. Rosen, Andrew Webb, Efrat Shimron

Abstract: Magnetic resonance imaging (MRI) offers superb-quality images, but its accessibility is limited by high costs, posing challenges for patients requiring longitudinal care. Low-field MRI provides affordable imaging with low-cost devices but is hindered by long scans and degraded image quality, including low signal-to-noise ratio (SNR) and tissue contrast. We propose a novel healthcare paradigm: usin… ▽ More Magnetic resonance imaging (MRI) offers superb-quality images, but its accessibility is limited by high costs, posing challenges for patients requiring longitudinal care. Low-field MRI provides affordable imaging with low-cost devices but is hindered by long scans and degraded image quality, including low signal-to-noise ratio (SNR) and tissue contrast. We propose a novel healthcare paradigm: using deep learning to extract personalized features from past standard high-field MRI scans and harnessing them to enable accelerated, enhanced-quality follow-up scans with low-cost systems. To overcome the SNR and contrast differences, we introduce ViT-Fuser, a feature-fusion vision transformer that learns features from past scans, e.g. those stored in standard DICOM CDs. We show that \textit{a single prior scan is sufficient}, and this scan can come from various MRI vendors, field strengths, and pulse sequences. Experiments with four datasets, including glioblastoma data, low-field ($50mT$), and ultra-low-field ($6.5mT$) data, demonstrate that ViT-Fuser outperforms state-of-the-art methods, providing enhanced-quality images from accelerated low-field scans, with robustness to out-of-distribution data. Our freely available framework thus enables rapid, diagnostic-quality, low-cost imaging for wide healthcare applications. △ Less

Submitted 5 May, 2025; originally announced May 2025.

arXiv:2503.09963 [pdf, other]

Reference-Free 3D Reconstruction of Brain Dissection Photographs with Machine Learning

Authors: Lin Tian, Sean I. Young, Jonathan Williams Ramirez, Dina Zemlyanker, Lucas Jacob Deden Binder, Rogeny Herisse, Theresa R. Connors, Derek H. Oakley, Bradley T. Hyman, Oula Puonti, Matthew S. Rosen, Juan Eugenio Iglesias

Abstract: Correlation of neuropathology with MRI has the potential to transfer microscopic signatures of pathology to invivo scans. Recently, a classical registration method has been proposed, to build these correlations from 3D reconstructed stacks of dissection photographs, which are routinely taken at brain banks. These photographs bypass the need for exvivo MRI, which is not widely accessible. However,… ▽ More Correlation of neuropathology with MRI has the potential to transfer microscopic signatures of pathology to invivo scans. Recently, a classical registration method has been proposed, to build these correlations from 3D reconstructed stacks of dissection photographs, which are routinely taken at brain banks. These photographs bypass the need for exvivo MRI, which is not widely accessible. However, this method requires a full stack of brain slabs and a reference mask (e.g., acquired with a surface scanner), which severely limits the applicability of the technique. Here we propose RefFree, a dissection photograph reconstruction method without external reference. RefFree is a learning approach that estimates the 3D coordinates in the atlas space for every pixel in every photograph; simple least-squares fitting can then be used to compute the 3D reconstruction. As a by-product, RefFree also produces an atlas-based segmentation of the reconstructed stack. RefFree is trained on synthetic photographs generated from digitally sliced 3D MRI data, with randomized appearance for enhanced generalization ability. Experiments on simulated and real data show that RefFree achieves performance comparable to the baseline method without an explicit reference while also enabling reconstruction of partial stacks. Our code is available at https://github.com/lintian-a/reffree. △ Less

Submitted 12 March, 2025; originally announced March 2025.

arXiv:2503.03192 [pdf, other]

Distributed Certifiably Correct Range-Aided SLAM

Authors: Alexander Thoms, Alan Papalia, Jared Velasquez, David M. Rosen, Sriram Narasimhan

Abstract: Reliable simultaneous localization and mapping (SLAM) algorithms are necessary for safety-critical autonomous navigation. In the communication-constrained multi-agent setting, navigation systems increasingly use point-to-point range sensors as they afford measurements with low bandwidth requirements and known data association. The state estimation problem for these systems takes the form of range-… ▽ More Reliable simultaneous localization and mapping (SLAM) algorithms are necessary for safety-critical autonomous navigation. In the communication-constrained multi-agent setting, navigation systems increasingly use point-to-point range sensors as they afford measurements with low bandwidth requirements and known data association. The state estimation problem for these systems takes the form of range-aided (RA) SLAM. However, distributed algorithms for solving the RA-SLAM problem lack formal guarantees on the quality of the returned estimate. To this end, we present the first distributed algorithm for RA-SLAM that can efficiently recover certifiably globally optimal solutions. Our algorithm, distributed certifiably correct RA-SLAM (DCORA), achieves this via the Riemannian Staircase method, where computational procedures developed for distributed certifiably correct pose graph optimization are generalized to the RA-SLAM problem. We demonstrate DCORA's efficacy on real-world multi-agent datasets by achieving absolute trajectory errors comparable to those of a state-of-the-art centralized certifiably correct RA-SLAM algorithm. Additionally, we perform a parametric study on the structure of the RA-SLAM problem using synthetic data, revealing how common parameters affect DCORA's performance. △ Less

Submitted 13 May, 2025; v1 submitted 5 March, 2025; originally announced March 2025.

Comments: 8 pages, 3 figures, accepted to the 2025 International Conference on Robotics and Automation (ICRA). This version includes minor clerical edits to the published version in the conference proceedings

arXiv:2410.16195 [pdf, other]

A Trust-Region Method for Graphical Stein Variational Inference

Authors: Liam Pavlovic, David M. Rosen

Abstract: Stein variational inference (SVI) is a sample-based approximate Bayesian inference technique that generates a sample set by jointly optimizing the samples' locations to minimize an information-theoretic measure of discrepancy with the target probability distribution. SVI thus provides a fast and significantly more sample-efficient approach to Bayesian inference than traditional (random-sampling-ba… ▽ More Stein variational inference (SVI) is a sample-based approximate Bayesian inference technique that generates a sample set by jointly optimizing the samples' locations to minimize an information-theoretic measure of discrepancy with the target probability distribution. SVI thus provides a fast and significantly more sample-efficient approach to Bayesian inference than traditional (random-sampling-based) alternatives. However, the optimization techniques employed in existing SVI methods struggle to address problems in which the target distribution is high-dimensional, poorly-conditioned, or non-convex, which severely limits the range of their practical applicability. In this paper, we propose a novel trust-region optimization approach for SVI that successfully addresses each of these challenges. Our method builds upon prior work in SVI by leveraging conditional independences in the target distribution (to achieve high-dimensional scaling) and second-order information (to address poor conditioning), while additionally providing an effective adaptive step control procedure, which is essential for ensuring convergence on challenging non-convex optimization problems. Experimental results show our method achieves superior numerical performance, both in convergence rate and sample accuracy, and scales better in high-dimensional distributions, than previous SVI techniques. △ Less

Submitted 21 October, 2024; originally announced October 2024.

arXiv:2410.04097 [pdf, other]

TV-based Deep 3D Self Super-Resolution for fMRI

Authors: Fernando Pérez-Bueno, Hongwei Bran Li, Matthew S. Rosen, Shahin Nasr, Cesar Caballero-Gaudes, Juan Eugenio Iglesias

Abstract: While functional Magnetic Resonance Imaging (fMRI) offers valuable insights into cognitive processes, its inherent spatial limitations pose challenges for detailed analysis of the fine-grained functional architecture of the brain. More specifically, MRI scanner and sequence specifications impose a trade-off between temporal resolution, spatial resolution, signal-to-noise ratio, and scan time. Deep… ▽ More While functional Magnetic Resonance Imaging (fMRI) offers valuable insights into cognitive processes, its inherent spatial limitations pose challenges for detailed analysis of the fine-grained functional architecture of the brain. More specifically, MRI scanner and sequence specifications impose a trade-off between temporal resolution, spatial resolution, signal-to-noise ratio, and scan time. Deep Learning (DL) Super-Resolution (SR) methods have emerged as a promising solution to enhance fMRI resolution, generating high-resolution (HR) images from low-resolution (LR) images typically acquired with lower scanning times. However, most existing SR approaches depend on supervised DL techniques, which require training ground truth (GT) HR data, which is often difficult to acquire and simultaneously sets a bound for how far SR can go. In this paper, we introduce a novel self-supervised DL SR model that combines a DL network with an analytical approach and Total Variation (TV) regularization. Our method eliminates the need for external GT images, achieving competitive performance compared to supervised DL techniques and preserving the functional maps. △ Less

Submitted 24 February, 2025; v1 submitted 5 October, 2024; originally announced October 2024.

Comments: Preprint Submitted to ISBI 2025 (Accepted)

arXiv:2410.00117 [pdf, ps, other]

An Overview of the Burer-Monteiro Method for Certifiable Robot Perception

Authors: Alan Papalia, Yulun Tian, David M. Rosen, Jonathan P. How, John J. Leonard

Abstract: This paper presents an overview of the Burer-Monteiro method (BM), a technique that has been applied to solve robot perception problems to certifiable optimality in real-time. BM is often used to solve semidefinite programming relaxations, which can be used to perform global optimization for non-convex perception problems. Specifically, BM leverages the low-rank structure of typical semidefinite p… ▽ More This paper presents an overview of the Burer-Monteiro method (BM), a technique that has been applied to solve robot perception problems to certifiable optimality in real-time. BM is often used to solve semidefinite programming relaxations, which can be used to perform global optimization for non-convex perception problems. Specifically, BM leverages the low-rank structure of typical semidefinite programs to dramatically reduce the computational cost of performing optimization. This paper discusses BM in certifiable perception, with three main objectives: (i) to consolidate information from the literature into a unified presentation, (ii) to elucidate the role of the linear independence constraint qualification (LICQ), a concept not yet well-covered in certifiable perception literature, and (iii) to share practical considerations that are discussed among practitioners but not thoroughly covered in the literature. Our general aim is to offer a practical primer for applying BM towards certifiable perception. △ Less

Submitted 9 June, 2025; v1 submitted 30 September, 2024; originally announced October 2024.

Comments: Accepted to 2024 Robotics: Science and Systems (RSS) Safe Autonomy Workshop

MSC Class: 49; 68 ACM Class: I.4.0; I.5.0; J.2

arXiv:2405.16460 [pdf, other]

Probabilistic Contrastive Learning with Explicit Concentration on the Hypersphere

Authors: Hongwei Bran Li, Cheng Ouyang, Tamaz Amiranashvili, Matthew S. Rosen, Bjoern Menze, Juan Eugenio Iglesias

Abstract: Self-supervised contrastive learning has predominantly adopted deterministic methods, which are not suited for environments characterized by uncertainty and noise. This paper introduces a new perspective on incorporating uncertainty into contrastive learning by embedding representations within a spherical space, inspired by the von Mises-Fisher distribution (vMF). We introduce an unnormalized form… ▽ More Self-supervised contrastive learning has predominantly adopted deterministic methods, which are not suited for environments characterized by uncertainty and noise. This paper introduces a new perspective on incorporating uncertainty into contrastive learning by embedding representations within a spherical space, inspired by the von Mises-Fisher distribution (vMF). We introduce an unnormalized form of vMF and leverage the concentration parameter, kappa, as a direct, interpretable measure to quantify uncertainty explicitly. This approach not only provides a probabilistic interpretation of the embedding space but also offers a method to calibrate model confidence against varying levels of data corruption and characteristics. Our empirical results demonstrate that the estimated concentration parameter correlates strongly with the degree of unforeseen data corruption encountered at test time, enables failure analysis, and enhances existing out-of-distribution detection methods. △ Less

Submitted 26 May, 2024; originally announced May 2024.

Comments: technical report

arXiv:2404.02127 [pdf, other]

LawInstruct: A Resource for Studying Language Model Adaptation to the Legal Domain

Authors: Joel Niklaus, Lucia Zheng, Arya D. McCarthy, Christopher Hahn, Brian M. Rosen, Peter Henderson, Daniel E. Ho, Garrett Honke, Percy Liang, Christopher Manning

Abstract: Instruction tuning is an important step in making language models useful for direct user interaction. However, the legal domain is underrepresented in typical instruction datasets (e.g., only 10 out of 1600+ tasks in Super-NaturalInstructions). To study whether instruction tuning on legal datasets is necessary for strong legal reasoning, we aggregate 58 annotated legal datasets and write instructi… ▽ More Instruction tuning is an important step in making language models useful for direct user interaction. However, the legal domain is underrepresented in typical instruction datasets (e.g., only 10 out of 1600+ tasks in Super-NaturalInstructions). To study whether instruction tuning on legal datasets is necessary for strong legal reasoning, we aggregate 58 annotated legal datasets and write instructions for each, creating LawInstruct. LawInstruct covers 17 global jurisdictions, 24 languages and a total of 12M examples across diverse tasks such as legal QA, summarization of court cases, and legal argument mining. We evaluate our models on LegalBench, measuring legal reasoning across five categories in 162 challenging and realistic legal tasks, and MMLU, to measure potential drops in general reasoning capabilities. We find that legal-specific instruction tuning on Flan-T5 - yielding FLawN-T5 - improves performance on LegalBench across all model sizes, with an aggregate increase of 15 points or 50% over Flan-T5 for the base size. No model size shows performance drops in MMLU. We publish LawInstruct as a resource for further study of instruction tuning in the legal domain. △ Less

Submitted 23 January, 2025; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: Accepted at Findings of NAACL 2025

MSC Class: 68T50 ACM Class: I.2

arXiv:2401.07370 [pdf, other]

Generation of Synthetic Images for Pedestrian Detection Using a Sequence of GANs

Authors: Viktor Seib, Malte Roosen, Ida Germann, Stefan Wirtz, Dietrich Paulus

Abstract: Creating annotated datasets demands a substantial amount of manual effort. In this proof-of-concept work, we address this issue by proposing a novel image generation pipeline. The pipeline consists of three distinct generative adversarial networks (previously published), combined in a novel way to augment a dataset for pedestrian detection. Despite the fact that the generated images are not always… ▽ More Creating annotated datasets demands a substantial amount of manual effort. In this proof-of-concept work, we address this issue by proposing a novel image generation pipeline. The pipeline consists of three distinct generative adversarial networks (previously published), combined in a novel way to augment a dataset for pedestrian detection. Despite the fact that the generated images are not always visually pleasant to the human eye, our detection benchmark reveals that the results substantially surpass the baseline. The presented proof-of-concept work was done in 2020 and is now published as a technical report after a three years retention period. △ Less

Submitted 14 January, 2024; originally announced January 2024.

arXiv:2312.05119 [pdf, other]

Quantifying white matter hyperintensity and brain volumes in heterogeneous clinical and low-field portable MRI

Authors: Pablo Laso, Stefano Cerri, Annabel Sorby-Adams, Jennifer Guo, Farrah Mateen, Philipp Goebl, Jiaming Wu, Peirong Liu, Hongwei Li, Sean I. Young, Benjamin Billot, Oula Puonti, Gordon Sze, Sam Payabavash, Adam DeHavenon, Kevin N. Sheth, Matthew S. Rosen, John Kirsch, Nicola Strisciuglio, Jelmer M. Wolterink, Arman Eshaghi, Frederik Barkhof, W. Taylor Kimberly, Juan Eugenio Iglesias

Abstract: Brain atrophy and white matter hyperintensity (WMH) are critical neuroimaging features for ascertaining brain injury in cerebrovascular disease and multiple sclerosis. Automated segmentation and quantification is desirable but existing methods require high-resolution MRI with good signal-to-noise ratio (SNR). This precludes application to clinical and low-field portable MRI (pMRI) scans, thus hamp… ▽ More Brain atrophy and white matter hyperintensity (WMH) are critical neuroimaging features for ascertaining brain injury in cerebrovascular disease and multiple sclerosis. Automated segmentation and quantification is desirable but existing methods require high-resolution MRI with good signal-to-noise ratio (SNR). This precludes application to clinical and low-field portable MRI (pMRI) scans, thus hampering large-scale tracking of atrophy and WMH progression, especially in underserved areas where pMRI has huge potential. Here we present a method that segments white matter hyperintensity and 36 brain regions from scans of any resolution and contrast (including pMRI) without retraining. We show results on eight public datasets and on a private dataset with paired high- and low-field scans (3T and 64mT), where we attain strong correlation between the WMH ($ρ$=.85) and hippocampal volumes (r=.89) estimated at both fields. Our method is publicly available as part of FreeSurfer, at: http://surfer.nmr.mgh.harvard.edu/fswiki/WMH-SynthSeg. △ Less

Submitted 15 February, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

arXiv:2311.14918 [pdf, other]

Resolution- and Stimulus-agnostic Super-Resolution of Ultra-High-Field Functional MRI: Application to Visual Studies

Authors: Hongwei Bran Li, Matthew S. Rosen, Shahin Nasr, Juan Eugenio Iglesias

Abstract: High-resolution fMRI provides a window into the brain's mesoscale organization. Yet, higher spatial resolution increases scan times, to compensate for the low signal and contrast-to-noise ratio. This work introduces a deep learning-based 3D super-resolution (SR) method for fMRI. By incorporating a resolution-agnostic image augmentation framework, our method adapts to varying voxel sizes without re… ▽ More High-resolution fMRI provides a window into the brain's mesoscale organization. Yet, higher spatial resolution increases scan times, to compensate for the low signal and contrast-to-noise ratio. This work introduces a deep learning-based 3D super-resolution (SR) method for fMRI. By incorporating a resolution-agnostic image augmentation framework, our method adapts to varying voxel sizes without retraining. We apply this innovative technique to localize fine-scale motion-selective sites in the early visual areas. Detection of these sites typically requires a resolution higher than 1 mm isotropic, whereas here, we visualize them based on lower resolution (2-3mm isotropic) fMRI data. Remarkably, the super-resolved fMRI is able to recover high-frequency detail of the interdigitated organization of these sites (relative to the color-selective sites), even with training data sourced from different subjects and experimental paradigms -- including non-visual resting-state fMRI, underscoring its robustness and versatility. Quantitative and qualitative results indicate that our method has the potential to enhance the spatial resolution of fMRI, leading to a drastic reduction in acquisition time. △ Less

Submitted 19 March, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

Comments: ISBI2024 final version

arXiv:2309.10698 [pdf, other]

OASIS: Optimal Arrangements for Sensing in SLAM

Authors: Pushyami Kaveti, Matthew Giamou, Hanumant Singh, David M. Rosen

Abstract: The number and arrangement of sensors on mobile robot dramatically influence its perception capabilities. Ensuring that sensors are mounted in a manner that enables accurate detection, localization, and mapping is essential for the success of downstream control tasks. However, when designing a new robotic platform, researchers and practitioners alike usually mimic standard configurations or maximi… ▽ More The number and arrangement of sensors on mobile robot dramatically influence its perception capabilities. Ensuring that sensors are mounted in a manner that enables accurate detection, localization, and mapping is essential for the success of downstream control tasks. However, when designing a new robotic platform, researchers and practitioners alike usually mimic standard configurations or maximize simple heuristics like field-of-view (FOV) coverage to decide where to place exteroceptive sensors. In this work, we conduct an information-theoretic investigation of this overlooked element of robotic perception in the context of simultaneous localization and mapping (SLAM). We show how to formalize the sensor arrangement problem as a form of subset selection under the E-optimality performance criterion. While this formulation is NP-hard in general, we show that a combination of greedy sensor selection and fast convex relaxation-based post-hoc verification enables the efficient recovery of certifiably optimal sensor designs in practice. Results from synthetic experiments reveal that sensors placed with OASIS outperform benchmarks in terms of mean squared error of visual SLAM estimates. △ Less

Submitted 21 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

arXiv:2305.07618 [pdf]

Uncertainty Estimation and Out-of-Distribution Detection for Deep Learning-Based Image Reconstruction using the Local Lipschitz

Authors: Danyal F. Bhutto, Bo Zhu, Jeremiah Z. Liu, Neha Koonjoo, Hongwei B. Li, Bruce R. Rosen, Matthew S. Rosen

Abstract: Accurate image reconstruction is at the heart of diagnostics in medical imaging. Supervised deep learning-based approaches have been investigated for solving inverse problems including image reconstruction. However, these trained models encounter unseen data distributions that are widely shifted from training data during deployment. Therefore, it is essential to assess whether a given input falls… ▽ More Accurate image reconstruction is at the heart of diagnostics in medical imaging. Supervised deep learning-based approaches have been investigated for solving inverse problems including image reconstruction. However, these trained models encounter unseen data distributions that are widely shifted from training data during deployment. Therefore, it is essential to assess whether a given input falls within the training data distribution for diagnostic purposes. Uncertainty estimation approaches exist but focus on providing an uncertainty map to radiologists, rather than assessing the training distribution fit. In this work, we propose a method based on the local Lipschitz-based metric to distinguish out-of-distribution images from in-distribution with an area under the curve of 99.94%. Empirically, we demonstrate a very strong relationship between the local Lipschitz value and mean absolute error (MAE), supported by a high Spearman's rank correlation coefficient of 0.8475, which determines the uncertainty estimation threshold for optimal model performance. Through the identification of false positives, the local Lipschitz and MAE relationship was used to guide data augmentation and reduce model uncertainty. Our study was validated using the AUTOMAP architecture for sensor-to-image Magnetic Resonance Imaging (MRI) reconstruction. We compare our proposed approach with baseline methods: Monte-Carlo dropout and deep ensembles, and further analysis included MRI denoising and Computed Tomography (CT) sparse-to-full view reconstruction using UNET architectures. We show that our approach is applicable to various architectures and learned functions, especially in the realm of medical image reconstruction, where preserving the diagnostic accuracy of reconstructed images remains paramount. △ Less

Submitted 1 December, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

arXiv:2302.11614 [pdf, other]

doi 10.1109/TRO.2024.3454430

Certifiably Correct Range-Aided SLAM

Authors: Alan Papalia, Andrew Fishberg, Brendan W. O'Neill, Jonathan P. How, David M. Rosen, John J. Leonard

Abstract: We present the first algorithm to efficiently compute certifiably optimal solutions to range-aided simultaneous localization and mapping (RA-SLAM) problems. Robotic navigation systems increasingly incorporate point-to-point ranging sensors, leading to state estimation problems in the form of RA-SLAM. However, the RA-SLAM problem is significantly more difficult to solve than traditional pose-graph… ▽ More We present the first algorithm to efficiently compute certifiably optimal solutions to range-aided simultaneous localization and mapping (RA-SLAM) problems. Robotic navigation systems increasingly incorporate point-to-point ranging sensors, leading to state estimation problems in the form of RA-SLAM. However, the RA-SLAM problem is significantly more difficult to solve than traditional pose-graph SLAM: ranging sensor models introduce non-convexity and single range measurements do not uniquely determine the transform between the involved sensors. As a result, RA-SLAM inference is sensitive to initial estimates yet lacks reliable initialization techniques. Our approach, certifiably correct RA-SLAM (CORA), leverages a novel quadratically constrained quadratic programming (QCQP) formulation of RA-SLAM to relax the RA-SLAM problem to a semidefinite program (SDP). CORA solves the SDP efficiently using the Riemannian Staircase methodology; the SDP solution provides both (i) a lower bound on the RA-SLAM problem's optimal value, and (ii) an approximate solution of the RA-SLAM problem, which can be subsequently refined using local optimization. CORA applies to problems with arbitrary pose-pose, pose-landmark, and ranging measurements and, due to using convex relaxation, is insensitive to initialization. We evaluate CORA on several real-world problems. In contrast to state-of-the-art approaches, CORA is able to obtain high-quality solutions on all problems despite being initialized with random values. Additionally, we study the tightness of the SDP relaxation with respect to important problem parameters: the number of (i) robots, (ii) landmarks, and (iii) range measurements. These experiments demonstrate that the SDP relaxation is often tight and reveal relationships between graph connectivity and the tightness of the SDP relaxation. △ Less

Submitted 28 September, 2024; v1 submitted 22 February, 2023; originally announced February 2023.

Comments: Accepted to Transactions on Robotics (T-RO)

arXiv:2211.15047 [pdf]

Synthetic Low-Field MRI Super-Resolution Via Nested U-Net Architecture

Authors: Aryan Kalluvila, Neha Koonjoo, Danyal Bhutto, Marcio Rockenbach, Matthew S. Rosen

Abstract: Low-field (LF) MRI scanners have the power to revolutionize medical imaging by providing a portable and cheaper alternative to high-field MRI scanners. However, such scanners are usually significantly noisier and lower quality than their high-field counterparts. The aim of this paper is to improve the SNR and overall image quality of low-field MRI scans to improve diagnostic capability. To address… ▽ More Low-field (LF) MRI scanners have the power to revolutionize medical imaging by providing a portable and cheaper alternative to high-field MRI scanners. However, such scanners are usually significantly noisier and lower quality than their high-field counterparts. The aim of this paper is to improve the SNR and overall image quality of low-field MRI scans to improve diagnostic capability. To address this issue, we propose a Nested U-Net neural network architecture super-resolution algorithm that outperforms previously suggested deep learning methods with an average PSNR of 78.83 and SSIM of 0.9551. We tested our network on artificial noisy downsampled synthetic data from a major T1 weighted MRI image dataset called the T1-mix dataset. One board-certified radiologist scored 25 images on the Likert scale (1-5) assessing overall image quality, anatomical structure, and diagnostic confidence across our architecture and other published works (SR DenseNet, Generator Block, SRCNN, etc.). We also introduce a new type of loss function called natural log mean squared error (NLMSE). In conclusion, we present a more accurate deep learning method for single image super-resolution applied to synthetic low-field MRI via a Nested U-Net architecture. △ Less

Submitted 27 November, 2022; originally announced November 2022.

arXiv:2210.03177 [pdf, other]

SCORE: A Second-Order Conic Initialization for Range-Aided SLAM

Authors: Alan Papalia, Joseph Morales, Kevin J. Doherty, David M. Rosen, John J. Leonard

Abstract: We present a novel initialization technique for the range-aided simultaneous localization and mapping (RA-SLAM) problem. In RA-SLAM we consider measurements of point-to-point distances in addition to measurements of rigid transformations to landmark or pose variables. Standard formulations of RA-SLAM approach the problem as non-convex optimization, which requires a good initialization to obtain qu… ▽ More We present a novel initialization technique for the range-aided simultaneous localization and mapping (RA-SLAM) problem. In RA-SLAM we consider measurements of point-to-point distances in addition to measurements of rigid transformations to landmark or pose variables. Standard formulations of RA-SLAM approach the problem as non-convex optimization, which requires a good initialization to obtain quality results. The initialization technique proposed here relaxes the RA-SLAM problem to a convex problem which is then solved to determine an initialization for the original, non-convex problem. The relaxation is a second-order cone program (SOCP), which is derived from a quadratically constrained quadratic program (QCQP) formulation of the RA-SLAM problem. As a SOCP, the method is highly scalable. We name this relaxation Second-order COnic RElaxation for RA-SLAM (SCORE). To our knowledge, this work represents the first convex relaxation for RA-SLAM. We present real-world and simulated experiments which show SCORE initialization permits the efficient recovery of quality solutions for a variety of challenging single- and multi-robot RA-SLAM problems with thousands of poses and range measurements. △ Less

Submitted 6 October, 2022; originally announced October 2022.

Comments: 9 pages, 8 figures, extended version of paper submitted to ICRA 2023

arXiv:2207.05257 [pdf, other]

doi 10.1109/LRA.2022.3220154

Accelerating Certifiable Estimation with Preconditioned Eigensolvers

Authors: David M. Rosen

Abstract: Convex (specifically semidefinite) relaxation provides a powerful approach to constructing robust machine perception systems, enabling the recovery of certifiably globally optimal solutions of challenging estimation problems in many practical settings. However, solving the large-scale semidefinite relaxations underpinning this approach remains a formidable computational challenge. A dominant cost… ▽ More Convex (specifically semidefinite) relaxation provides a powerful approach to constructing robust machine perception systems, enabling the recovery of certifiably globally optimal solutions of challenging estimation problems in many practical settings. However, solving the large-scale semidefinite relaxations underpinning this approach remains a formidable computational challenge. A dominant cost in many state-of-the-art (Burer-Monteiro factorization-based) certifiable estimation methods is solution verification (testing the global optimality of a given candidate solution), which entails computing a minimum eigenpair of a certain symmetric certificate matrix. In this letter, we show how to significantly accelerate this verification step, and thereby the overall speed of certifiable estimation methods. First, we show that the certificate matrices arising in the Burer-Monteiro approach generically possess spectra that make the verification problem expensive to solve using standard iterative eigenvalue methods. We then show how to address this challenge using preconditioned eigensolvers; specifically, we design a specialized solution verification algorithm based upon the locally optimal block preconditioned conjugate gradient (LOBPCG) method together with a simple yet highly effective algebraic preconditioner. Experimental evaluation on a variety of simulated and real-world examples shows that our proposed verification scheme is very effective in practice, accelerating solution verification by up to 280x, and the overall Burer-Monteiro method by up to 16x, versus the standard Lanczos method when applied to relaxations derived from large-scale SLAM benchmarks. △ Less

Submitted 13 November, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

Comments: 8 pages, 6 figures

arXiv:2205.06768 [pdf, other]

doi 10.1080/15435075.2023.2262006

Artificial Intelligence-Assisted Optimization and Multiphase Analysis of Polygon PEM Fuel Cells

Authors: Ali Jabbary, Nader Pourmahmoud, Mir Ali Asghar Abdollahi, Marc A. Rosen

Abstract: This article presents new hexagonal and pentagonal PEM fuel cell models. The models have been optimized after achieving improved cell performance. The input parameters of the multi-objective optimization algorithm were pressure and temperature at the inlet, and consumption and output powers were the objective parameters. The output data of the numerical simulation has been trained using deep neura… ▽ More This article presents new hexagonal and pentagonal PEM fuel cell models. The models have been optimized after achieving improved cell performance. The input parameters of the multi-objective optimization algorithm were pressure and temperature at the inlet, and consumption and output powers were the objective parameters. The output data of the numerical simulation has been trained using deep neural networks and then modeled with polynomial regression. The target functions have been extracted using the RSM (Response Surface Method), and the targets were optimized using the multi-objective genetic algorithm (NSGA-II). Compared to the base model, the optimized Pentagonal and Hexagonal models increase the output current density by 21.8% and 39.9%, respectively. △ Less

Submitted 6 July, 2022; v1 submitted 10 April, 2022; originally announced May 2022.

Comments: Int. J. Green Energy, vol. 0, no. 0, pp. 1-17, 2023

arXiv:2203.13897 [pdf, other]

Spectral Measurement Sparsification for Pose-Graph SLAM

Authors: Kevin J. Doherty, David M. Rosen, John J. Leonard

Abstract: Simultaneous localization and mapping (SLAM) is a critical capability in autonomous navigation, but in order to scale SLAM to the setting of "lifelong" SLAM, particularly under memory or computation constraints, a robot must be able to determine what information should be retained and what can safely be forgotten. In graph-based SLAM, the number of edges (measurements) in a pose graph determines b… ▽ More Simultaneous localization and mapping (SLAM) is a critical capability in autonomous navigation, but in order to scale SLAM to the setting of "lifelong" SLAM, particularly under memory or computation constraints, a robot must be able to determine what information should be retained and what can safely be forgotten. In graph-based SLAM, the number of edges (measurements) in a pose graph determines both the memory requirements of storing a robot's observations and the computational expense of algorithms deployed for performing state estimation using those observations; both of which can grow unbounded during long-term navigation. To address this, we propose a spectral approach for pose graph sparsification which maximizes the algebraic connectivity of the sparsified measurement graphs, a key quantity which has been shown to control the estimation error of pose graph SLAM solutions. Our algorithm, MAC (for "maximizing algebraic connectivity"), which is based on convex relaxation, is simple and computationally inexpensive, and admits formal post hoc performance guarantees on the quality of the solutions it provides. In experiments on benchmark pose-graph SLAM datasets, we show that our approach quickly produces high-quality sparsification results which retain the connectivity of the graph and, in turn, the quality of corresponding SLAM solutions, as compared to a baseline approach which does not consider graph connectivity. △ Less

Submitted 25 March, 2022; originally announced March 2022.

Comments: Submitted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022

arXiv:2203.00851 [pdf, other]

Distributed Riemannian Optimization with Lazy Communication for Collaborative Geometric Estimation

Authors: Yulun Tian, Amrit Singh Bedi, Alec Koppel, Miguel Calvo-Fullana, David M. Rosen, Jonathan P. How

Abstract: We present the first distributed optimization algorithm with lazy communication for collaborative geometric estimation, the backbone of modern collaborative simultaneous localization and mapping (SLAM) and structure-from-motion (SfM) applications. Our method allows agents to cooperatively reconstruct a shared geometric model on a central server by fusing individual observations, but without the ne… ▽ More We present the first distributed optimization algorithm with lazy communication for collaborative geometric estimation, the backbone of modern collaborative simultaneous localization and mapping (SLAM) and structure-from-motion (SfM) applications. Our method allows agents to cooperatively reconstruct a shared geometric model on a central server by fusing individual observations, but without the need to transmit potentially sensitive information about the agents themselves (such as their locations). Furthermore, to alleviate the burden of communication during iterative optimization, we design a set of communication triggering conditions that enable agents to selectively upload a targeted subset of local information that is useful to global optimization. Our approach thus achieves significant communication reduction with minimal impact on optimization performance. As our main theoretical contribution, we prove that our method converges to first-order critical points with a global sublinear convergence rate. Numerical evaluations on bundle adjustment problems from collaborative SLAM and SfM datasets show that our method performs competitively against existing distributed techniques, while achieving up to 78% total communication reduction. △ Less

Submitted 29 July, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

Comments: technical report (17 pages, 3 figures); to appear at IROS 2022

arXiv:2202.05267 [pdf, other]

doi 10.1002/mp.16224

On Real-time Image Reconstruction with Neural Networks for MRI-guided Radiotherapy

Authors: David E. J. Waddington, Nicholas Hindley, Neha Koonjoo, Christopher Chiu, Tess Reynolds, Paul Z. Y. Liu, Bo Zhu, Danyal Bhutto, Chiara Paganelli, Paul J. Keall, Matthew S. Rosen

Abstract: MRI-guidance techniques that dynamically adapt radiation beams to follow tumor motion in real-time will lead to more accurate cancer treatments and reduced collateral healthy tissue damage. The gold-standard for reconstruction of undersampled MR data is compressed sensing (CS) which is computationally slow and limits the rate that images can be available for real-time adaptation. Here, we demonstr… ▽ More MRI-guidance techniques that dynamically adapt radiation beams to follow tumor motion in real-time will lead to more accurate cancer treatments and reduced collateral healthy tissue damage. The gold-standard for reconstruction of undersampled MR data is compressed sensing (CS) which is computationally slow and limits the rate that images can be available for real-time adaptation. Here, we demonstrate the use of automated transform by manifold approximation (AUTOMAP), a generalized framework that maps raw MR signal to the target image domain, to rapidly reconstruct images from undersampled radial k-space data. The AUTOMAP neural network was trained to reconstruct images from a golden-angle radial acquisition, a benchmark for motion-sensitive imaging, on lung cancer patient data and generic images from ImageNet. Model training was subsequently augmented with motion-encoded k-space data derived from videos in the YouTube-8M dataset to encourage motion robust reconstruction. We find that AUTOMAP-reconstructed radial k-space has equivalent accuracy to CS but with much shorter processing times after initial fine-tuning on retrospectively acquired lung cancer patient data. Validation of motion-trained models with a virtual dynamic lung tumor phantom showed that the generalized motion properties learned from YouTube lead to improved target tracking accuracy. Our work shows that AUTOMAP can achieve real-time, accurate reconstruction of radial data. These findings imply that neural-network-based reconstruction is potentially superior to existing approaches for real-time image guidance applications. △ Less

Submitted 18 May, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

Comments: 12 pages, 6 figures, 1 table. v2 has a typo in eqn 1 corrected and references added to the discussion

arXiv:2202.03564 [pdf]

Accurate super-resolution low-field brain MRI

Authors: Juan Eugenio Iglesias, Riana Schleicher, Sonia Laguna, Benjamin Billot, Pamela Schaefer, Brenna McKaig, Joshua N. Goldstein, Kevin N. Sheth, Matthew S. Rosen, W. Taylor Kimberly

Abstract: The recent introduction of portable, low-field MRI (LF-MRI) into the clinical setting has the potential to transform neuroimaging. However, LF-MRI is limited by lower resolution and signal-to-noise ratio, leading to incomplete characterization of brain regions. To address this challenge, recent advances in machine learning facilitate the synthesis of higher resolution images derived from one or mu… ▽ More The recent introduction of portable, low-field MRI (LF-MRI) into the clinical setting has the potential to transform neuroimaging. However, LF-MRI is limited by lower resolution and signal-to-noise ratio, leading to incomplete characterization of brain regions. To address this challenge, recent advances in machine learning facilitate the synthesis of higher resolution images derived from one or multiple lower resolution scans. Here, we report the extension of a machine learning super-resolution (SR) algorithm to synthesize 1 mm isotropic MPRAGE-like scans from LF-MRI T1-weighted and T2-weighted sequences. Our initial results on a paired dataset of LF and high-field (HF, 1.5T-3T) clinical scans show that: (i) application of available automated segmentation tools directly to LF-MRI images falters; but (ii) segmentation tools succeed when applied to SR images with high correlation to gold standard measurements from HF-MRI (e.g., r = 0.85 for hippocampal volume, r = 0.84 for the thalamus, r = 0.92 for the whole cerebrum). This work demonstrates proof-of-principle post-processing image enhancement from lower resolution LF-MRI sequences. These results lay the foundation for future work to enhance the detection of normal and abnormal image findings at LF and ultimately improve the diagnostic performance of LF-MRI. Our tools are publicly available on FreeSurfer (surfer.nmr.mgh.harvard.edu/). △ Less

Submitted 7 February, 2022; originally announced February 2022.

arXiv:2201.03773 [pdf, other]

Performance Guarantees for Spectral Initialization in Rotation Averaging and Pose-Graph SLAM

Authors: Kevin J. Doherty, David M. Rosen, John J. Leonard

Abstract: In this work we present the first initialization methods equipped with explicit performance guarantees adapted to the pose-graph simultaneous localization and mapping (SLAM) and rotation averaging (RA) problems. SLAM and rotation averaging are typically formalized as large-scale nonconvex point estimation problems, with many bad local minima that can entrap the smooth optimization methods typicall… ▽ More In this work we present the first initialization methods equipped with explicit performance guarantees adapted to the pose-graph simultaneous localization and mapping (SLAM) and rotation averaging (RA) problems. SLAM and rotation averaging are typically formalized as large-scale nonconvex point estimation problems, with many bad local minima that can entrap the smooth optimization methods typically applied to solve them; the performance of standard SLAM and RA algorithms thus crucially depends upon the quality of the estimates used to initialize this local search. While many initialization methods for SLAM and RA have appeared in the literature, these are typically obtained as purely heuristic approximations, making it difficult to determine whether (or under what circumstances) these techniques can be reliably deployed. In contrast, in this work we study the problem of initialization through the lens of spectral relaxation. Specifically, we derive a simple spectral relaxation of SLAM and RA, the form of which enables us to exploit classical linear-algebraic techniques (eigenvector perturbation bounds) to control the distance from our spectral estimate to both the (unknown) ground-truth and the global minimizer of the estimation problem as a function of measurement noise. Our results reveal the critical role that spectral graph-theoretic properties of the measurement network play in controlling estimation accuracy; moreover, as a by-product of our analysis we obtain new bounds on the estimation error for the maximum likelihood estimators in SLAM and RA, which are likely to be of independent interest. Finally, we show experimentally that our spectral estimator is very effective in practice, producing initializations of comparable or superior quality at lower computational cost compared to existing state-of-the-art techniques. △ Less

Submitted 10 January, 2022; originally announced January 2022.

arXiv:2112.13495 [pdf, other]

Multiple Randomization Designs

Authors: Patrick Bajari, Brian Burdick, Guido W. Imbens, Lorenzo Masoero, James McQueen, Thomas Richardson, Ido M. Rosen

Abstract: In this study we introduce a new class of experimental designs. In a classical randomized controlled trial (RCT), or A/B test, a randomly selected subset of a population of units (e.g., individuals, plots of land, or experiences) is assigned to a treatment (treatment A), and the remainder of the population is assigned to the control treatment (treatment B). The difference in average outcome by tre… ▽ More In this study we introduce a new class of experimental designs. In a classical randomized controlled trial (RCT), or A/B test, a randomly selected subset of a population of units (e.g., individuals, plots of land, or experiences) is assigned to a treatment (treatment A), and the remainder of the population is assigned to the control treatment (treatment B). The difference in average outcome by treatment group is an estimate of the average effect of the treatment. However, motivating our study, the setting for modern experiments is often different, with the outcomes and treatment assignments indexed by multiple populations. For example, outcomes may be indexed by buyers and sellers, by content creators and subscribers, by drivers and riders, or by travelers and airlines and travel agents, with treatments potentially varying across these indices. Spillovers or interference can arise from interactions between units across populations. For example, sellers' behavior may depend on buyers' treatment assignment, or vice versa. This can invalidate the simple comparison of means as an estimator for the average effect of the treatment in classical RCTs. We propose new experiment designs for settings in which multiple populations interact. We show how these designs allow us to study questions about interference that cannot be answered by classical randomized experiments. Finally, we develop new statistical methods for analyzing these Multiple Randomization Designs. △ Less

Submitted 26 December, 2021; originally announced December 2021.

Comments: 57 pages, 7 figures

MSC Class: 62B15 (Primary) 91B82; 91B26; 91C20; 91B80; 91C20 (Secondary) ACM Class: J.4; G.3; I.2.6

arXiv:2110.00876 [pdf, other]

Incremental Non-Gaussian Inference for SLAM Using Normalizing Flows

Authors: Qiangqiang Huang, Can Pu, Kasra Khosoussi, David M. Rosen, Dehann Fourie, Jonathan P. How, John J. Leonard

Abstract: This paper presents normalizing flows for incremental smoothing and mapping (NF-iSAM), a novel algorithm for inferring the full posterior distribution in SLAM problems with nonlinear measurement models and non-Gaussian factors. NF-iSAM exploits the expressive power of neural networks, and trains normalizing flows to model and sample the full posterior. By leveraging the Bayes tree, NF-iSAM enables… ▽ More This paper presents normalizing flows for incremental smoothing and mapping (NF-iSAM), a novel algorithm for inferring the full posterior distribution in SLAM problems with nonlinear measurement models and non-Gaussian factors. NF-iSAM exploits the expressive power of neural networks, and trains normalizing flows to model and sample the full posterior. By leveraging the Bayes tree, NF-iSAM enables efficient incremental updates similar to iSAM2, albeit in the more challenging non-Gaussian setting. We demonstrate the advantages of NF-iSAM over state-of-the-art point and distribution estimation algorithms using range-only SLAM problems with data association ambiguity. NF-iSAM presents superior accuracy in describing the posterior beliefs of continuous variables (e.g., position) and discrete variables (e.g., data association). △ Less

Submitted 2 July, 2022; v1 submitted 2 October, 2021; originally announced October 2021.

Comments: Extension of work published at arXiv:2105.05045

arXiv:2109.03374 [pdf, other]

doi 10.1109/LRA.2022.3141763

Convex Iteration for Distance-Geometric Inverse Kinematics

Authors: Matthew Giamou, Filip Marić, David M. Rosen, Valentin Peretroukhin, Nicholas Roy, Ivan Petrović, Jonathan Kelly

Abstract: Inverse kinematics (IK) is the problem of finding robot joint configurations that satisfy constraints on the position or pose of one or more end-effectors. For robots with redundant degrees of freedom, there is often an infinite, nonconvex set of solutions. The IK problem is further complicated when collision avoidance constraints are imposed by obstacles in the workspace. In general, closed-form… ▽ More Inverse kinematics (IK) is the problem of finding robot joint configurations that satisfy constraints on the position or pose of one or more end-effectors. For robots with redundant degrees of freedom, there is often an infinite, nonconvex set of solutions. The IK problem is further complicated when collision avoidance constraints are imposed by obstacles in the workspace. In general, closed-form expressions yielding feasible configurations do not exist, motivating the use of numerical solution methods. However, these approaches rely on local optimization of nonconvex problems, often requiring an accurate initialization or numerous re-initializations to converge to a valid solution. In this work, we first formulate inverse kinematics with complex workspace constraints as a convex feasibility problem whose low-rank feasible points provide exact IK solutions. We then present \texttt{CIDGIK} (Convex Iteration for Distance-Geometric Inverse Kinematics), an algorithm that solves this feasibility problem with a sequence of semidefinite programs whose objectives are designed to encourage low-rank minimizers. Our problem formulation elegantly unifies the configuration space and workspace constraints of a robot: intrinsic robot geometry and obstacle avoidance are both expressed as simple linear matrix equations and inequalities. Our experimental results for a variety of popular manipulator models demonstrate faster and more accurate convergence than a conventional nonlinear optimization-based approach, especially in environments with many obstacles. △ Less

Submitted 6 July, 2022; v1 submitted 7 September, 2021; originally announced September 2021.

Comments: In IEEE Robotics and Automation Letters (RA-L) and presented at the IEEE International Conference on Robotics and Automation (ICRA'22), Philadelphia, USA, May 23-27, 2022

Journal ref: IEEE Robotics and Automation Letters (RA-L), Vol. 7, No. 2, pp. 1952-1959, Apr. 2022

arXiv:2104.05871 [pdf, other]

Balboa: Bobbing and Weaving around Network Censorship

Authors: Marc B. Rosen, James Parker, Alex J. Malozemoff

Abstract: We introduce Balboa, a link obfuscation framework for censorship circumvention. Balboa provides a general framework for tunneling data through existing applications. Balboa sits between an application and the operating system, intercepting outgoing network traffic and rewriting it to embed data. To avoid introducing any distinguishable divergence from the expected application behavior, Balboa only… ▽ More We introduce Balboa, a link obfuscation framework for censorship circumvention. Balboa provides a general framework for tunneling data through existing applications. Balboa sits between an application and the operating system, intercepting outgoing network traffic and rewriting it to embed data. To avoid introducing any distinguishable divergence from the expected application behavior, Balboa only rewrites traffic that matches an externally specified \emph{traffic model} pre-shared between the communicating parties. The traffic model captures some subset of the network traffic (e.g., some subset of music an audio streaming server streams). The sender uses this model to replace outgoing data with a pointer to the associated location in the model and embed data in the freed up space. The receiver then extracts the data, replacing the pointer with the original data from the model before passing the data on to the application. When using TLS, this approach means that application behavior with Balboa is \emph{equivalent}, modulo small (protocol-dependent) timing differences, to if the application was running without Balboa. Balboa differs from prior approaches in that it (1) provides a framework for tunneling data through arbitrary (TLS-protected) protocols/applications, and (2) runs the unaltered application binaries on standard inputs, as opposed to most prior tunneling approaches which run the application on non-standard -- and thus potentially distinguishable -- inputs. We present two instantiations of Balboa -- one for audio streaming and one for web browsing -- and demonstrate the difficulty of identifying Balboa by a machine learning classifier. △ Less

Submitted 12 April, 2021; originally announced April 2021.

arXiv:2103.05041 [pdf, other]

Advances in Inference and Representation for Simultaneous Localization and Mapping

Authors: David M. Rosen, Kevin J. Doherty, Antonio Teran Espinoza, John J. Leonard

Abstract: Simultaneous localization and mapping (SLAM) is the process of constructing a global model of an environment from local observations of it; this is a foundational capability for mobile robots, supporting such core functions as planning, navigation, and control. This article reviews recent progress in SLAM, focusing on advances in the expressive capacity of the environmental models used in SLAM sys… ▽ More Simultaneous localization and mapping (SLAM) is the process of constructing a global model of an environment from local observations of it; this is a foundational capability for mobile robots, supporting such core functions as planning, navigation, and control. This article reviews recent progress in SLAM, focusing on advances in the expressive capacity of the environmental models used in SLAM systems (representation) and the performance of the algorithms used to estimate these models from data (inference). A prominent theme of recent SLAM research is the pursuit of environmental representations (including learned representations) that go beyond the classical attributes of geometry and appearance to model properties such as hierarchical organization, affordance, dynamics, and semantics; these advances equip autonomous agents with a more comprehensive understanding of the world, enabling more versatile and intelligent operation. A second major theme is a revitalized interest in the mathematical properties of the SLAM estimation problem itself (including its computational and information-theoretic performance limits); this work has led to the development of novel classes of certifiable and robust inference methods that dramatically improve the reliability of SLAM systems in real-world operation. We survey these advances with an emphasis on their ramifications for achieving robust, long-duration autonomy, and conclude with a discussion of open challenges and a perspective on future research directions. △ Less

Submitted 8 March, 2021; originally announced March 2021.

Comments: 30 pages, 4 figures. To appear in Annual Review of Control, Robotics, and Autonomous Systems 2021

arXiv:2008.02737 [pdf, ps, other]

Shonan Rotation Averaging: Global Optimality by Surfing $SO(p)^n$

Authors: Frank Dellaert, David M. Rosen, Jing Wu, Robert Mahony, Luca Carlone

Abstract: Shonan Rotation Averaging is a fast, simple, and elegant rotation averaging algorithm that is guaranteed to recover globally optimal solutions under mild assumptions on the measurement noise. Our method employs semidefinite relaxation in order to recover provably globally optimal solutions of the rotation averaging problem. In contrast to prior work, we show how to solve large-scale instances of t… ▽ More Shonan Rotation Averaging is a fast, simple, and elegant rotation averaging algorithm that is guaranteed to recover globally optimal solutions under mild assumptions on the measurement noise. Our method employs semidefinite relaxation in order to recover provably globally optimal solutions of the rotation averaging problem. In contrast to prior work, we show how to solve large-scale instances of these relaxations using manifold minimization on (only slightly) higher-dimensional rotation manifolds, re-using existing high-performance (but local) structure-from-motion pipelines. Our method thus preserves the speed and scalability of current SFM methods, while recovering globally optimal solutions. △ Less

Submitted 6 August, 2020; originally announced August 2020.

Comments: 30 pages (paper + supplementary material). To appear at the European Conference on Computer Vision (ECCV) 2020

arXiv:2006.01031 [pdf, other]

doi 10.15607/RSS.2020.XVI.007

A Smooth Representation of Belief over SO(3) for Deep Rotation Learning with Uncertainty

Authors: Valentin Peretroukhin, Matthew Giamou, David M. Rosen, W. Nicholas Greene, Nicholas Roy, Jonathan Kelly

Abstract: Accurate rotation estimation is at the heart of robot perception tasks such as visual odometry and object pose estimation. Deep neural networks have provided a new way to perform these tasks, and the choice of rotation representation is an important part of network design. In this work, we present a novel symmetric matrix representation of the 3D rotation group, SO(3), with two important propertie… ▽ More Accurate rotation estimation is at the heart of robot perception tasks such as visual odometry and object pose estimation. Deep neural networks have provided a new way to perform these tasks, and the choice of rotation representation is an important part of network design. In this work, we present a novel symmetric matrix representation of the 3D rotation group, SO(3), with two important properties that make it particularly suitable for learned models: (1) it satisfies a smoothness property that improves convergence and generalization when regressing large rotation targets, and (2) it encodes a symmetric Bingham belief over the space of unit quaternions, permitting the training of uncertainty-aware models. We empirically validate the benefits of our formulation by training deep neural rotation regressors on two data modalities. First, we use synthetic point-cloud data to show that our representation leads to superior predictive accuracy over existing representations for arbitrary rotation targets. Second, we use image data collected onboard ground and aerial vehicles to demonstrate that our representation is amenable to an effective out-of-distribution (OOD) rejection technique that significantly improves the robustness of rotation estimates to unseen environmental effects and corrupted input images, without requiring the use of an explicit likelihood loss, stochastic sampling, or an auxiliary classifier. This capability is key for safety-critical applications where detecting novel inputs can prevent catastrophic failure of learned models. △ Less

Submitted 17 January, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

Comments: In Proceedings of Robotics: Science and Systems (RSS'20), Corvallis , Oregon, USA, Jul. 12-16, 2020

arXiv:1911.03721 [pdf, other]

Distributed Certifiably Correct Pose-Graph Optimization

Authors: Yulun Tian, Kasra Khosoussi, David M. Rosen, Jonathan P. How

Abstract: This paper presents the first certifiably correct algorithm for distributed pose-graph optimization (PGO), the backbone of modern collaborative simultaneous localization and mapping (CSLAM) and camera network localization (CNL) systems. Our method is based upon a sparse semidefinite relaxation that we prove provides globally-optimal PGO solutions under moderate measurement noise (matching the guar… ▽ More This paper presents the first certifiably correct algorithm for distributed pose-graph optimization (PGO), the backbone of modern collaborative simultaneous localization and mapping (CSLAM) and camera network localization (CNL) systems. Our method is based upon a sparse semidefinite relaxation that we prove provides globally-optimal PGO solutions under moderate measurement noise (matching the guarantees enjoyed by state-of-the-art centralized methods), but is amenable to distributed optimization using the low-rank Riemannian Staircase framework. To implement the Riemannian Staircase in the distributed setting, we develop Riemannian block coordinate descent (RBCD), a novel method for (locally) minimizing a function over a product of Riemannian manifolds. We also propose the first distributed solution verification and saddle escape methods to certify the global optimality of critical points recovered via RBCD, and to descend from suboptimal critical points (if necessary). All components of our approach are inherently decentralized: they require only local communication, provide privacy protection, and are easily parallelizable. Extensive evaluations on synthetic and real-world datasets demonstrate that the proposed method correctly recovers globally optimal solutions under moderate noise, and outperforms alternative distributed techniques in terms of solution precision and convergence speed. △ Less

Submitted 18 May, 2021; v1 submitted 9 November, 2019; originally announced November 2019.

Comments: Updated convergence proofs. Paper accepted at T-RO

arXiv:1710.05267 [pdf]

doi 10.1002/mrm.27198

MR fingerprinting Deep RecOnstruction NEtwork (DRONE)

Authors: Ouri Cohen, Bo Zhu, Matthew S. Rosen

Abstract: PURPOSE: Demonstrate a novel fast method for reconstruction of multi-dimensional MR Fingerprinting (MRF) data using Deep Learning methods. METHODS: A neural network (NN) is defined using the TensorFlow framework and trained on simulated MRF data computed using the Bloch equations. The accuracy of the NN reconstruction of noisy data is compared to conventional MRF template matching as a function… ▽ More PURPOSE: Demonstrate a novel fast method for reconstruction of multi-dimensional MR Fingerprinting (MRF) data using Deep Learning methods. METHODS: A neural network (NN) is defined using the TensorFlow framework and trained on simulated MRF data computed using the Bloch equations. The accuracy of the NN reconstruction of noisy data is compared to conventional MRF template matching as a function of training data size, and quantified in a both simulated numerical brain phantom data and acquired data from the ISMRM/NIST phantom. The utility of the method is demonstrated in a healthy subject in vivo at 1.5 T. RESULTS: Network training required 10 minutes and once trained, data reconstruction required approximately 10 ms. Reconstruction of simulated brain data using the NN resulted in a root-mean-square error (RMSE) of 3.5 ms for T1 and 7.8 ms for T2. The RMSE for the NN trained on sparse dictionaries was approximately 6 fold lower for T1 and 2 fold lower for T2 than conventional MRF dot-product dictionary matching on the same dictionaries. Phantom measurements yielded good agreement (R2=0.99) between the T1 and T2 estimated by the NN and reference values from the ISMRM/NIST phantom. CONCLUSION: Reconstruction of MRF data with a NN is accurate, 300 fold faster and more robust to noise and undersampling than conventional MRF dictionary matching. △ Less

Submitted 24 April, 2018; v1 submitted 14 October, 2017; originally announced October 2017.

Comments: 21 pages, 7 figures

arXiv:1704.08841 [pdf]

doi 10.1038/nature25988

Image reconstruction by domain transform manifold learning

Authors: Bo Zhu, Jeremiah Z. Liu, Bruce R. Rosen, Matthew S. Rosen

Abstract: Image reconstruction plays a critical role in the implementation of all contemporary imaging modalities across the physical and life sciences including optical, MRI, CT, PET, and radio astronomy. During an image acquisition, the sensor encodes an intermediate representation of an object in the sensor domain, which is subsequently reconstructed into an image by an inversion of the encoding function… ▽ More Image reconstruction plays a critical role in the implementation of all contemporary imaging modalities across the physical and life sciences including optical, MRI, CT, PET, and radio astronomy. During an image acquisition, the sensor encodes an intermediate representation of an object in the sensor domain, which is subsequently reconstructed into an image by an inversion of the encoding function. Image reconstruction is challenging because analytic knowledge of the inverse transform may not exist a priori, especially in the presence of sensor non-idealities and noise. Thus, the standard reconstruction approach involves approximating the inverse function with multiple ad hoc stages in a signal processing chain whose composition depends on the details of each acquisition strategy, and often requires expert parameter tuning to optimize reconstruction performance. We present here a unified framework for image reconstruction, AUtomated TransfOrm by Manifold APproximation (AUTOMAP), which recasts image reconstruction as a data-driven, supervised learning task that allows a mapping between sensor and image domain to emerge from an appropriate corpus of training data. We implement AUTOMAP with a deep neural network and exhibit its flexibility in learning reconstruction transforms for a variety of MRI acquisition strategies, using the same network architecture and hyperparameters. We further demonstrate its efficiency in sparsely representing transforms along low-dimensional manifolds, resulting in superior immunity to noise and reconstruction artifacts compared with conventional handcrafted reconstruction methods. In addition to improving the reconstruction performance of existing acquisition methodologies, we anticipate accelerating the discovery of new acquisition strategies across modalities as the burden of reconstruction becomes lifted by AUTOMAP and learned-reconstruction approaches. △ Less

Submitted 28 April, 2017; originally announced April 2017.

Comments: 18 pages, 4 figures

arXiv:1612.07386 [pdf, other]

SE-Sync: A Certifiably Correct Algorithm for Synchronization over the Special Euclidean Group

Authors: David M. Rosen, Luca Carlone, Afonso S. Bandeira, John J. Leonard

Abstract: Many important geometric estimation problems take the form of synchronization over the special Euclidean group: estimate the values of a set of poses given a set of relative measurements between them. This problem is typically formulated as a nonconvex maximum-likelihood estimation that is computationally hard to solve in general. Nevertheless, in this paper we present an algorithm that is able to… ▽ More Many important geometric estimation problems take the form of synchronization over the special Euclidean group: estimate the values of a set of poses given a set of relative measurements between them. This problem is typically formulated as a nonconvex maximum-likelihood estimation that is computationally hard to solve in general. Nevertheless, in this paper we present an algorithm that is able to efficiently recover certifiably globally optimal solutions of the special Euclidean synchronization problem in a non-adversarial noise regime. The crux of our approach is the development of a semidefinite relaxation of the maximum-likelihood estimation whose minimizer provides an exact MLE so long as the magnitude of the noise corrupting the available measurements falls below a certain critical threshold; furthermore, whenever exactness obtains, it is possible to verify this fact a posteriori, thereby certifying the optimality of the recovered estimate. We develop a specialized optimization scheme for solving large-scale instances of this relaxation by exploiting its low-rank, geometric, and graph-theoretic structure to reduce it to an equivalent optimization problem on a low-dimensional Riemannian manifold, and design a truncated-Newton trust-region method to solve this reduction efficiently. Finally, we combine this fast optimization approach with a simple rounding procedure to produce our algorithm, SE-Sync. Experimental evaluation on a variety of simulated and real-world pose-graph SLAM datasets shows that SE-Sync is able to recover certifiably globally optimal solutions when the available measurements are corrupted by noise up to an order of magnitude greater than that typically encountered in robotics and computer vision applications, and does so more than an order of magnitude faster than the Gauss-Newton-based approach that forms the basis of current state-of-the-art techniques. △ Less

Submitted 4 February, 2017; v1 submitted 21 December, 2016; originally announced December 2016.

Comments: 49 Pages, 20 figures

arXiv:1611.00128 [pdf, other]

A Certifiably Correct Algorithm for Synchronization over the Special Euclidean Group

Authors: David M. Rosen, Luca Carlone, Afonso S. Bandeira, John J. Leonard

Abstract: Many geometric estimation problems take the form of synchronization over the special Euclidean group: estimate the values of a set of poses given noisy measurements of a subset of their pairwise relative transforms. This problem is typically formulated as a maximum-likelihood estimation that requires solving a nonconvex nonlinear program, which is computationally intractable in general. Neverthele… ▽ More Many geometric estimation problems take the form of synchronization over the special Euclidean group: estimate the values of a set of poses given noisy measurements of a subset of their pairwise relative transforms. This problem is typically formulated as a maximum-likelihood estimation that requires solving a nonconvex nonlinear program, which is computationally intractable in general. Nevertheless, in this paper we present an algorithm that is able to efficiently recover certifiably globally optimal solutions of this estimation problem in a non-adversarial noise regime. The crux of our approach is the development of a semidefinite relaxation of the maximum-likelihood estimation whose minimizer provides the exact MLE so long as the magnitude of the noise corrupting the available measurements falls below a certain critical threshold; furthermore, whenever exactness obtains, it is possible to verify this fact a posteriori, thereby certifying the optimality of the recovered estimate. We develop a specialized optimization scheme for solving large-scale instances of this semidefinite relaxation by exploiting its low-rank, geometric, and graph-theoretic structure to reduce it to an equivalent optimization problem on a low-dimensional Riemannian manifold, and then design a Riemannian truncated-Newton trust-region method to solve this reduction efficiently. We combine this fast optimization approach with a simple rounding procedure to produce our algorithm, SE-Sync. Experimental evaluation on a variety of simulated and real-world pose-graph SLAM datasets shows that SE-Sync is capable of recovering globally optimal solutions when the available measurements are corrupted by noise up to an order of magnitude greater than that typically encountered in robotics applications, and does so at a computational cost that scales comparably with that of direct Newton-type local search techniques. △ Less

Submitted 9 February, 2017; v1 submitted 1 November, 2016; originally announced November 2016.

Comments: 16 pages, 8 figures, to appear in the International Workshop on the Algorithmic Foundations of Robotics (WAFR), Dec 2016

arXiv:1110.1079 [pdf, ps, other]

A Near-Optimal Sublinear-Time Algorithm for Approximating the Minimum Vertex Cover Size

Authors: Krzysztof Onak, Dana Ron, Michal Rosen, Ronitt Rubinfeld

Abstract: We give a nearly optimal sublinear-time algorithm for approximating the size of a minimum vertex cover in a graph G. The algorithm may query the degree deg(v) of any vertex v of its choice, and for each 1 <= i <= deg(v), it may ask for the i-th neighbor of v. Letting VC_opt(G) denote the minimum size of vertex cover in G, the algorithm outputs, with high constant success probability, an estimate V… ▽ More We give a nearly optimal sublinear-time algorithm for approximating the size of a minimum vertex cover in a graph G. The algorithm may query the degree deg(v) of any vertex v of its choice, and for each 1 <= i <= deg(v), it may ask for the i-th neighbor of v. Letting VC_opt(G) denote the minimum size of vertex cover in G, the algorithm outputs, with high constant success probability, an estimate VC_estimate(G) such that VC_opt(G) <= VC_estimate(G) <= 2 * VC_opt(G) + epsilon*n, where epsilon is a given additive approximation parameter. We refer to such an estimate as a (2,epsilon)-estimate. The query complexity and running time of the algorithm are ~O(avg_deg * poly(1/epsilon)), where avg_deg denotes the average vertex degree in the graph. The best previously known sublinear algorithm, of Yoshida et al. (STOC 2009), has query complexity and running time O(d^4/epsilon^2), where d is the maximum degree in the graph. Given the lower bound of Omega(avg_deg) (for constant epsilon) for obtaining such an estimate (with any constant multiplicative factor) due to Parnas and Ron (TCS 2007), our result is nearly optimal. In the case that the graph is dense, that is, the number of edges is Theta(n^2), we consider another model, in which the algorithm may ask, for any pair of vertices u and v, whether there is an edge between u and v. We show how to adapt the algorithm that uses neighbor queries to this model and obtain an algorithm that outputs a (2,epsilon)-estimate of the size of a minimum vertex cover whose query complexity and running time are ~O(n) * poly(1/epsilon). △ Less

Submitted 5 October, 2011; originally announced October 2011.

Showing 1–38 of 38 results for author: Rosen, M