-
Benchmarking Multi-Organ Segmentation Tools for Multi-Parametric T1-weighted Abdominal MRI
Authors:
Nicole Tran,
Anisa Prasad,
Yan Zhuang,
Tejas Sudharshan Mathai,
Boah Kim,
Sydney Lewis,
Pritam Mukherjee,
Jianfei Liu,
Ronald M. Summers
Abstract:
The segmentation of multiple organs in multi-parametric MRI studies is critical for many applications in radiology, such as correlating imaging biomarkers with disease status (e.g., cirrhosis, diabetes). Recently, three publicly available tools, such as MRSegmentator (MRSeg), TotalSegmentator MRI (TS), and TotalVibeSegmentator (VIBE), have been proposed for multi-organ segmentation in MRI. However…
▽ More
The segmentation of multiple organs in multi-parametric MRI studies is critical for many applications in radiology, such as correlating imaging biomarkers with disease status (e.g., cirrhosis, diabetes). Recently, three publicly available tools, such as MRSegmentator (MRSeg), TotalSegmentator MRI (TS), and TotalVibeSegmentator (VIBE), have been proposed for multi-organ segmentation in MRI. However, the performance of these tools on specific MRI sequence types has not yet been quantified. In this work, a subset of 40 volumes from the public Duke Liver Dataset was curated. The curated dataset contained 10 volumes each from the pre-contrast fat saturated T1, arterial T1w, venous T1w, and delayed T1w phases, respectively. Ten abdominal structures were manually annotated in these volumes. Next, the performance of the three public tools was benchmarked on this curated dataset. The results indicated that MRSeg obtained a Dice score of 80.7 $\pm$ 18.6 and Hausdorff Distance (HD) error of 8.9 $\pm$ 10.4 mm. It fared the best ($p < .05$) across the different sequence types in contrast to TS and VIBE.
△ Less
Submitted 10 April, 2025;
originally announced April 2025.
-
A subgradient splitting algorithm for optimization on nonpositively curved metric spaces
Authors:
Ariel Goodwin,
Adrian S. Lewis,
Genaro López-Acedo,
Adriana Nicolae
Abstract:
Many of the primal ingredients of convex optimization extend naturally from Euclidean to Hadamard spaces $\unicode{x2014}$ nonpositively curved metric spaces like Euclidean, Hilbert, and hyperbolic spaces, metric trees, and more general CAT(0) cubical complexes. Linear structure, however, and the duality theory it supports are absent. Nonetheless, we introduce a new type of subgradient for convex…
▽ More
Many of the primal ingredients of convex optimization extend naturally from Euclidean to Hadamard spaces $\unicode{x2014}$ nonpositively curved metric spaces like Euclidean, Hilbert, and hyperbolic spaces, metric trees, and more general CAT(0) cubical complexes. Linear structure, however, and the duality theory it supports are absent. Nonetheless, we introduce a new type of subgradient for convex functions on Hadamard spaces, based on Busemann functions. This notion supports a splitting subgradient method with guaranteed complexity bounds. In particular, the algorithm solves $p$-mean problems in general Hadamard spaces: we illustrate by computing medians in BHV tree space.
△ Less
Submitted 18 December, 2024; v1 submitted 9 December, 2024;
originally announced December 2024.
-
NARF24: Estimating Articulated Object Structure for Implicit Rendering
Authors:
Stanley Lewis,
Tom Gao,
Odest Chadwicke Jenkins
Abstract:
Articulated objects and their representations pose a difficult problem for robots. These objects require not only representations of geometry and texture, but also of the various connections and joint parameters that make up each articulation. We propose a method that learns a common Neural Radiance Field (NeRF) representation across a small number of collected scenes. This representation is combi…
▽ More
Articulated objects and their representations pose a difficult problem for robots. These objects require not only representations of geometry and texture, but also of the various connections and joint parameters that make up each articulation. We propose a method that learns a common Neural Radiance Field (NeRF) representation across a small number of collected scenes. This representation is combined with a parts-based image segmentation to produce an implicit space part localization, from which the connectivity and joint parameters of the articulated object can be estimated, thus enabling configuration-conditioned rendering.
△ Less
Submitted 15 September, 2024;
originally announced September 2024.
-
Journalists, Emotions, and the Introduction of Generative AI Chatbots: A Large-Scale Analysis of Tweets Before and After the Launch of ChatGPT
Authors:
Seth C. Lewis,
David M. Markowitz,
Jon Benedik Bunquin
Abstract:
As part of a broader look at the impact of generative AI, this study investigated the emotional responses of journalists to the release of ChatGPT at the time of its launch. By analyzing nearly 1 million Tweets from journalists at major U.S. news outlets, we tracked changes in emotional tone and sentiment before and after the introduction of ChatGPT in November 2022. Using various computational an…
▽ More
As part of a broader look at the impact of generative AI, this study investigated the emotional responses of journalists to the release of ChatGPT at the time of its launch. By analyzing nearly 1 million Tweets from journalists at major U.S. news outlets, we tracked changes in emotional tone and sentiment before and after the introduction of ChatGPT in November 2022. Using various computational and natural language processing techniques to measure emotional shifts in response to ChatGPT's release, we found an increase in positive emotion and a more favorable tone post-launch, suggesting initial optimism toward AI's potential. This research underscores the pivotal role of journalists as interpreters of technological innovation and disruption, highlighting how their emotional reactions may shape public narratives around emerging technologies. The study contributes to understanding the intersection of journalism, emotion, and AI, offering insights into the broader societal impact of generative AI tools.
△ Less
Submitted 13 September, 2024;
originally announced September 2024.
-
Single-View 3D Reconstruction via SO(2)-Equivariant Gaussian Sculpting Networks
Authors:
Ruihan Xu,
Anthony Opipari,
Joshua Mah,
Stanley Lewis,
Haoran Zhang,
Hanzhe Guo,
Odest Chadwicke Jenkins
Abstract:
This paper introduces SO(2)-Equivariant Gaussian Sculpting Networks (GSNs) as an approach for SO(2)-Equivariant 3D object reconstruction from single-view image observations.
GSNs take a single observation as input to generate a Gaussian splat representation describing the observed object's geometry and texture. By using a shared feature extractor before decoding Gaussian colors, covariances, pos…
▽ More
This paper introduces SO(2)-Equivariant Gaussian Sculpting Networks (GSNs) as an approach for SO(2)-Equivariant 3D object reconstruction from single-view image observations.
GSNs take a single observation as input to generate a Gaussian splat representation describing the observed object's geometry and texture. By using a shared feature extractor before decoding Gaussian colors, covariances, positions, and opacities, GSNs achieve extremely high throughput (>150FPS). Experiments demonstrate that GSNs can be trained efficiently using a multi-view rendering loss and are competitive, in quality, with expensive diffusion-based reconstruction algorithms. The GSN model is validated on multiple benchmark experiments. Moreover, we demonstrate the potential for GSNs to be used within a robotic manipulation pipeline for object-centric grasping.
△ Less
Submitted 11 September, 2024;
originally announced September 2024.
-
OVAL-Prompt: Open-Vocabulary Affordance Localization for Robot Manipulation through LLM Affordance-Grounding
Authors:
Edmond Tong,
Anthony Opipari,
Stanley Lewis,
Zhen Zeng,
Odest Chadwicke Jenkins
Abstract:
In order for robots to interact with objects effectively, they must understand the form and function of each object they encounter. Essentially, robots need to understand which actions each object affords, and where those affordances can be acted on. Robots are ultimately expected to operate in unstructured human environments, where the set of objects and affordances is not known to the robot befo…
▽ More
In order for robots to interact with objects effectively, they must understand the form and function of each object they encounter. Essentially, robots need to understand which actions each object affords, and where those affordances can be acted on. Robots are ultimately expected to operate in unstructured human environments, where the set of objects and affordances is not known to the robot before deployment (i.e. the open-vocabulary setting). In this work, we introduce OVAL-Prompt, a prompt-based approach for open-vocabulary affordance localization in RGB-D images. By leveraging a Vision Language Model (VLM) for open-vocabulary object part segmentation and a Large Language Model (LLM) to ground each part-segment-affordance, OVAL-Prompt demonstrates generalizability to novel object instances, categories, and affordances without domain-specific finetuning. Quantitative experiments demonstrate that without any finetuning, OVAL-Prompt achieves localization accuracy that is competitive with supervised baseline models. Moreover, qualitative experiments show that OVAL-Prompt enables affordance-based robot manipulation of open-vocabulary object instances and categories. Project Page: https://ekjt.github.io/OVAL-Prompt/
△ Less
Submitted 25 May, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
A Novel Corpus of Annotated Medical Imaging Reports and Information Extraction Results Using BERT-based Language Models
Authors:
Namu Park,
Kevin Lybarger,
Giridhar Kaushik Ramachandran,
Spencer Lewis,
Aashka Damani,
Ozlem Uzuner,
Martin Gunn,
Meliha Yetisgen
Abstract:
Medical imaging is critical to the diagnosis, surveillance, and treatment of many health conditions, including oncological, neurological, cardiovascular, and musculoskeletal disorders, among others. Radiologists interpret these complex, unstructured images and articulate their assessments through narrative reports that remain largely unstructured. This unstructured narrative must be converted into…
▽ More
Medical imaging is critical to the diagnosis, surveillance, and treatment of many health conditions, including oncological, neurological, cardiovascular, and musculoskeletal disorders, among others. Radiologists interpret these complex, unstructured images and articulate their assessments through narrative reports that remain largely unstructured. This unstructured narrative must be converted into a structured semantic representation to facilitate secondary applications such as retrospective analyses or clinical decision support. Here, we introduce the Corpus of Annotated Medical Imaging Reports (CAMIR), which includes 609 annotated radiology reports from three imaging modality types: Computed Tomography, Magnetic Resonance Imaging, and Positron Emission Tomography-Computed Tomography. Reports were annotated using an event-based schema that captures clinical indications, lesions, and medical problems. Each event consists of a trigger and multiple arguments, and a majority of the argument types, including anatomy, normalize the spans to pre-defined concepts to facilitate secondary use. CAMIR uniquely combines a granular event structure and concept normalization. To extract CAMIR events, we explored two BERT (Bi-directional Encoder Representation from Transformers)-based architectures, including an existing architecture (mSpERT) that jointly extracts all event information and a multi-step approach (PL-Marker++) that we augmented for the CAMIR schema.
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
Horoballs and the subgradient method
Authors:
Adrian S. Lewis,
Genaro Lopez-Acedo,
Adriana Nicolae
Abstract:
To explore convex optimization on Hadamard spaces, we consider an iteration in the style of a subgradient algorithm. Traditionally, such methods assume that the underlying spaces are manifolds and that the objectives are geodesically convex: the methods are described using tangent spaces and exponential maps. By contrast, our iteration applies in a general Hadamard space, is framed in the underlyi…
▽ More
To explore convex optimization on Hadamard spaces, we consider an iteration in the style of a subgradient algorithm. Traditionally, such methods assume that the underlying spaces are manifolds and that the objectives are geodesically convex: the methods are described using tangent spaces and exponential maps. By contrast, our iteration applies in a general Hadamard space, is framed in the underlying space itself, and relies instead on horospherical convexity of the objective level sets. For this restricted class of objectives, we prove a complexity result of the usual form. Notably, the complexity does not depend on a lower bound on the space curvature. We illustrate our subgradient algorithm on the minimal enclosing ball problem in Hadamard spaces.
△ Less
Submitted 2 April, 2024; v1 submitted 23 March, 2024;
originally announced March 2024.
-
Improved motif-scaffolding with SE(3) flow matching
Authors:
Jason Yim,
Andrew Campbell,
Emile Mathieu,
Andrew Y. K. Foong,
Michael Gastegger,
José Jiménez-Luna,
Sarah Lewis,
Victor Garcia Satorras,
Bastiaan S. Veeling,
Frank Noé,
Regina Barzilay,
Tommi S. Jaakkola
Abstract:
Protein design often begins with the knowledge of a desired function from a motif which motif-scaffolding aims to construct a functional protein around. Recently, generative models have achieved breakthrough success in designing scaffolds for a range of motifs. However, generated scaffolds tend to lack structural diversity, which can hinder success in wet-lab validation. In this work, we extend Fr…
▽ More
Protein design often begins with the knowledge of a desired function from a motif which motif-scaffolding aims to construct a functional protein around. Recently, generative models have achieved breakthrough success in designing scaffolds for a range of motifs. However, generated scaffolds tend to lack structural diversity, which can hinder success in wet-lab validation. In this work, we extend FrameFlow, an SE(3) flow matching model for protein backbone generation, to perform motif-scaffolding with two complementary approaches. The first is motif amortization, in which FrameFlow is trained with the motif as input using a data augmentation strategy. The second is motif guidance, which performs scaffolding using an estimate of the conditional score from FrameFlow without additional training. On a benchmark of 24 biologically meaningful motifs, we show our method achieves 2.5 times more designable and unique motif-scaffolds compared to state-of-the-art. Code: https://github.com/microsoft/protein-frame-flow
△ Less
Submitted 18 July, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
MatterGen: a generative model for inorganic materials design
Authors:
Claudio Zeni,
Robert Pinsler,
Daniel Zügner,
Andrew Fowler,
Matthew Horton,
Xiang Fu,
Sasha Shysheya,
Jonathan Crabbé,
Lixin Sun,
Jake Smith,
Bichlien Nguyen,
Hannes Schulz,
Sarah Lewis,
Chin-Wei Huang,
Ziheng Lu,
Yichi Zhou,
Han Yang,
Hongxia Hao,
Jielan Li,
Ryota Tomioka,
Tian Xie
Abstract:
The design of functional materials with desired properties is essential in driving technological advances in areas like energy storage, catalysis, and carbon capture. Generative models provide a new paradigm for materials design by directly generating entirely novel materials given desired property constraints. Despite recent progress, current generative models have low success rate in proposing s…
▽ More
The design of functional materials with desired properties is essential in driving technological advances in areas like energy storage, catalysis, and carbon capture. Generative models provide a new paradigm for materials design by directly generating entirely novel materials given desired property constraints. Despite recent progress, current generative models have low success rate in proposing stable crystals, or can only satisfy a very limited set of property constraints. Here, we present MatterGen, a model that generates stable, diverse inorganic materials across the periodic table and can further be fine-tuned to steer the generation towards a broad range of property constraints. To enable this, we introduce a new diffusion-based generative process that produces crystalline structures by gradually refining atom types, coordinates, and the periodic lattice. We further introduce adapter modules to enable fine-tuning towards any given property constraints with a labeled dataset. Compared to prior generative models, structures produced by MatterGen are more than twice as likely to be novel and stable, and more than 15 times closer to the local energy minimum. After fine-tuning, MatterGen successfully generates stable, novel materials with desired chemistry, symmetry, as well as mechanical, electronic and magnetic properties. Finally, we demonstrate multi-property materials design capabilities by proposing structures that have both high magnetic density and a chemical composition with low supply-chain risk. We believe that the quality of generated materials and the breadth of MatterGen's capabilities represent a major advancement towards creating a universal generative model for materials design.
△ Less
Submitted 29 January, 2024; v1 submitted 6 December, 2023;
originally announced December 2023.
-
MBot: A Modular Ecosystem for Scalable Robotics Education
Authors:
Peter Gaskell,
Jana Pavlasek,
Tom Gao,
Abhishek Narula,
Stanley Lewis,
Odest Chadwicke Jenkins
Abstract:
The Michigan Robotics MBot is a low-cost mobile robot platform that has been used to train over 1,400 students in autonomous navigation since 2014 at the University of Michigan and our collaborating colleges. The MBot platform was designed to meet the needs of teaching robotics at scale to match the growth of robotics as a field and an academic discipline. Transformative advancements in robot navi…
▽ More
The Michigan Robotics MBot is a low-cost mobile robot platform that has been used to train over 1,400 students in autonomous navigation since 2014 at the University of Michigan and our collaborating colleges. The MBot platform was designed to meet the needs of teaching robotics at scale to match the growth of robotics as a field and an academic discipline. Transformative advancements in robot navigation over the past decades have led to a significant demand for skilled roboticists across industry and academia. This demand has sparked a need for robotics courses in higher education, spanning all levels of undergraduate and graduate experiences. Incorporating real robot platforms into such courses and curricula is effective for conveying the unique challenges of programming embodied agents in real-world environments and sparking student interest. However, teaching with real robots remains challenging due to the cost of hardware and the development effort involved in adapting existing hardware for a new course. In this paper, we describe the design and evolution of the MBot platform, and the underlying principals of scalability and flexibility which are keys to its success.
△ Less
Submitted 1 December, 2023;
originally announced December 2023.
-
Retro-fallback: retrosynthetic planning in an uncertain world
Authors:
Austin Tripp,
Krzysztof Maziarz,
Sarah Lewis,
Marwin Segler,
José Miguel Hernández-Lobato
Abstract:
Retrosynthesis is the task of planning a series of chemical reactions to create a desired molecule from simpler, buyable molecules. While previous works have proposed algorithms to find optimal solutions for a range of metrics (e.g. shortest, lowest-cost), these works generally overlook the fact that we have imperfect knowledge of the space of possible reactions, meaning plans created by algorithm…
▽ More
Retrosynthesis is the task of planning a series of chemical reactions to create a desired molecule from simpler, buyable molecules. While previous works have proposed algorithms to find optimal solutions for a range of metrics (e.g. shortest, lowest-cost), these works generally overlook the fact that we have imperfect knowledge of the space of possible reactions, meaning plans created by algorithms may not work in a laboratory. In this paper we propose a novel formulation of retrosynthesis in terms of stochastic processes to account for this uncertainty. We then propose a novel greedy algorithm called retro-fallback which maximizes the probability that at least one synthesis plan can be executed in the lab. Using in-silico benchmarks we demonstrate that retro-fallback generally produces better sets of synthesis plans than the popular MCTS and retro* algorithms.
△ Less
Submitted 13 April, 2024; v1 submitted 13 October, 2023;
originally announced October 2023.
-
NARF22: Neural Articulated Radiance Fields for Configuration-Aware Rendering
Authors:
Stanley Lewis,
Jana Pavlasek,
Odest Chadwicke Jenkins
Abstract:
Articulated objects pose a unique challenge for robotic perception and manipulation. Their increased number of degrees-of-freedom makes tasks such as localization computationally difficult, while also making the process of real-world dataset collection unscalable. With the aim of addressing these scalability issues, we propose Neural Articulated Radiance Fields (NARF22), a pipeline which uses a fu…
▽ More
Articulated objects pose a unique challenge for robotic perception and manipulation. Their increased number of degrees-of-freedom makes tasks such as localization computationally difficult, while also making the process of real-world dataset collection unscalable. With the aim of addressing these scalability issues, we propose Neural Articulated Radiance Fields (NARF22), a pipeline which uses a fully-differentiable, configuration-parameterized Neural Radiance Field (NeRF) as a means of providing high quality renderings of articulated objects. NARF22 requires no explicit knowledge of the object structure at inference time. We propose a two-stage parts-based training mechanism which allows the object rendering models to generalize well across the configuration space even if the underlying training data has as few as one configuration represented. We demonstrate the efficacy of NARF22 by training configurable renderers on a real-world articulated tool dataset collected via a Fetch mobile manipulation robot. We show the applicability of the model to gradient-based inference methods through a configuration estimation and 6 degree-of-freedom pose refinement task. The project webpage is available at: https://progress.eecs.umich.edu/projects/narf/.
△ Less
Submitted 3 October, 2022;
originally announced October 2022.
-
Multi-level Adversarial Spatio-temporal Learning for Footstep Pressure based FoG Detection
Authors:
Kun Hu,
Shaohui Mei,
Wei Wang,
Kaylena A. Ehgoetz Martens,
Liang Wang,
Simon J. G. Lewis,
David D. Feng,
Zhiyong Wang
Abstract:
Freezing of gait (FoG) is one of the most common symptoms of Parkinson's disease, which is a neurodegenerative disorder of the central nervous system impacting millions of people around the world. To address the pressing need to improve the quality of treatment for FoG, devising a computer-aided detection and quantification tool for FoG has been increasingly important. As a non-invasive technique…
▽ More
Freezing of gait (FoG) is one of the most common symptoms of Parkinson's disease, which is a neurodegenerative disorder of the central nervous system impacting millions of people around the world. To address the pressing need to improve the quality of treatment for FoG, devising a computer-aided detection and quantification tool for FoG has been increasingly important. As a non-invasive technique for collecting motion patterns, the footstep pressure sequences obtained from pressure sensitive gait mats provide a great opportunity for evaluating FoG in the clinic and potentially in the home environment. In this study, FoG detection is formulated as a sequential modelling task and a novel deep learning architecture, namely Adversarial Spatio-temporal Network (ASTN), is proposed to learn FoG patterns across multiple levels. A novel adversarial training scheme is introduced with a multi-level subject discriminator to obtain subject-independent FoG representations, which helps to reduce the over-fitting risk due to the high inter-subject variance. As a result, robust FoG detection can be achieved for unseen subjects. The proposed scheme also sheds light on improving subject-level clinical studies from other scenarios as it can be integrated with many existing deep architectures. To the best of our knowledge, this is one of the first studies of footstep pressure-based FoG detection and the approach of utilizing ASTN is the first deep neural network architecture in pursuit of subject-independent representations. Experimental results on 393 trials collected from 21 subjects demonstrate encouraging performance of the proposed ASTN for FoG detection with an AUC 0.85.
△ Less
Submitted 22 September, 2022;
originally announced September 2022.
-
ProgressLabeller: Visual Data Stream Annotation for Training Object-Centric 3D Perception
Authors:
Xiaotong Chen,
Huijie Zhang,
Zeren Yu,
Stanley Lewis,
Odest Chadwicke Jenkins
Abstract:
Visual perception tasks often require vast amounts of labelled data, including 3D poses and image space segmentation masks. The process of creating such training data sets can prove difficult or time-intensive to scale up to efficacy for general use. Consider the task of pose estimation for rigid objects. Deep neural network based approaches have shown good performance when trained on large, publi…
▽ More
Visual perception tasks often require vast amounts of labelled data, including 3D poses and image space segmentation masks. The process of creating such training data sets can prove difficult or time-intensive to scale up to efficacy for general use. Consider the task of pose estimation for rigid objects. Deep neural network based approaches have shown good performance when trained on large, public datasets. However, adapting these networks for other novel objects, or fine-tuning existing models for different environments, requires significant time investment to generate newly labelled instances. Towards this end, we propose ProgressLabeller as a method for more efficiently generating large amounts of 6D pose training data from color images sequences for custom scenes in a scalable manner. ProgressLabeller is intended to also support transparent or translucent objects, for which the previous methods based on depth dense reconstruction will fail. We demonstrate the effectiveness of ProgressLabeller by rapidly create a dataset of over 1M samples with which we fine-tune a state-of-the-art pose estimation network in order to markedly improve the downstream robotic grasp success rates. ProgressLabeller is open-source at https://github.com/huijieZH/ProgressLabeller.
△ Less
Submitted 1 August, 2022; v1 submitted 1 March, 2022;
originally announced March 2022.
-
Survey Descent: A Multipoint Generalization of Gradient Descent for Nonsmooth Optimization
Authors:
X. Y. Han,
Adrian S. Lewis
Abstract:
For strongly convex objectives that are smooth, the classical theory of gradient descent ensures linear convergence relative to the number of gradient evaluations. An analogous nonsmooth theory is challenging. Even when the objective is smooth at every iterate, the corresponding local models are unstable and the number of cutting planes invoked by traditional remedies is difficult to bound, leadin…
▽ More
For strongly convex objectives that are smooth, the classical theory of gradient descent ensures linear convergence relative to the number of gradient evaluations. An analogous nonsmooth theory is challenging. Even when the objective is smooth at every iterate, the corresponding local models are unstable and the number of cutting planes invoked by traditional remedies is difficult to bound, leading to convergences guarantees that are sublinear relative to the cumulative number of gradient evaluations. We instead propose a multipoint generalization of the gradient descent iteration for local optimization. While designed with general objectives in mind, we are motivated by a ``max-of-smooth'' model that captures the subdifferential dimension at optimality. We prove linear convergence when the objective is itself max-of-smooth, and experiments suggest a more general phenomenon.
△ Less
Submitted 27 September, 2022; v1 submitted 30 November, 2021;
originally announced November 2021.
-
Parts-Based Articulated Object Localization in Clutter Using Belief Propagation
Authors:
Jana Pavlasek,
Stanley Lewis,
Karthik Desingh,
Odest Chadwicke Jenkins
Abstract:
Robots working in human environments must be able to perceive and act on challenging objects with articulations, such as a pile of tools. Articulated objects increase the dimensionality of the pose estimation problem, and partial observations under clutter create additional challenges. To address this problem, we present a generative-discriminative parts-based recognition and localization method f…
▽ More
Robots working in human environments must be able to perceive and act on challenging objects with articulations, such as a pile of tools. Articulated objects increase the dimensionality of the pose estimation problem, and partial observations under clutter create additional challenges. To address this problem, we present a generative-discriminative parts-based recognition and localization method for articulated objects in clutter. We formulate the problem of articulated object pose estimation as a Markov Random Field (MRF). Hidden nodes in this MRF express the pose of the object parts, and edges express the articulation constraints between parts. Localization is performed within the MRF using an efficient belief propagation method. The method is informed by both part segmentation heatmaps over the observation, generated by a neural network, and the articulation constraints between object parts. Our generative-discriminative approach allows the proposed method to function in cluttered environments by inferring the pose of occluded parts using hypotheses from the visible parts. We demonstrate the efficacy of our methods in a tabletop environment for recognizing and localizing hand tools in uncluttered and cluttered configurations.
△ Less
Submitted 6 August, 2020;
originally announced August 2020.
-
Multi-robot Dubins Coverage with Autonomous Surface Vehicles
Authors:
Nare Karapetyan,
Jason Moulton,
Jeremy S. Lewis,
Alberto Quattrini Li,
Jason M. O'Kane,
Ioannis Rekleitis
Abstract:
In large scale coverage operations, such as marine exploration or aerial monitoring, single robot approaches are not ideal, as they may take too long to cover a large area. In such scenarios, multi-robot approaches are preferable. Furthermore, several real world vehicles are non-holonomic, but can be modeled using Dubins vehicle kinematics. This paper focuses on environmental monitoring of aquatic…
▽ More
In large scale coverage operations, such as marine exploration or aerial monitoring, single robot approaches are not ideal, as they may take too long to cover a large area. In such scenarios, multi-robot approaches are preferable. Furthermore, several real world vehicles are non-holonomic, but can be modeled using Dubins vehicle kinematics. This paper focuses on environmental monitoring of aquatic environments using Autonomous Surface Vehicles (ASVs). In particular, we propose a novel approach for solving the problem of complete coverage of a known environment by a multi-robot team consisting of Dubins vehicles. It is worth noting that both multi-robot coverage and Dubins vehicle coverage are NP-complete problems. As such, we present two heuristics methods based on a variant of the traveling salesman problem -- k-TSP -- formulation and clustering algorithms that efficiently solve the problem. The proposed methods are tested both in simulations to assess their scalability and with a team of ASVs operating on a lake to ensure their applicability in real world.
△ Less
Submitted 7 August, 2018;
originally announced August 2018.