-
Fabrica: Dual-Arm Assembly of General Multi-Part Objects via Integrated Planning and Learning
Authors:
Yunsheng Tian,
Joshua Jacob,
Yijiang Huang,
Jialiang Zhao,
Edward Gu,
Pingchuan Ma,
Annan Zhang,
Farhad Javid,
Branden Romero,
Sachin Chitta,
Shinjiro Sueda,
Hui Li,
Wojciech Matusik
Abstract:
Multi-part assembly poses significant challenges for robots to execute long-horizon, contact-rich manipulation with generalization across complex geometries. We present Fabrica, a dual-arm robotic system capable of end-to-end planning and control for autonomous assembly of general multi-part objects. For planning over long horizons, we develop hierarchies of precedence, sequence, grasp, and motion…
▽ More
Multi-part assembly poses significant challenges for robots to execute long-horizon, contact-rich manipulation with generalization across complex geometries. We present Fabrica, a dual-arm robotic system capable of end-to-end planning and control for autonomous assembly of general multi-part objects. For planning over long horizons, we develop hierarchies of precedence, sequence, grasp, and motion planning with automated fixture generation, enabling general multi-step assembly on any dual-arm robots. The planner is made efficient through a parallelizable design and is optimized for downstream control stability. For contact-rich assembly steps, we propose a lightweight reinforcement learning framework that trains generalist policies across object geometries, assembly directions, and grasp poses, guided by equivariance and residual actions obtained from the plan. These policies transfer zero-shot to the real world and achieve 80% successful steps. For systematic evaluation, we propose a benchmark suite of multi-part assemblies resembling industrial and daily objects across diverse categories and geometries. By integrating efficient global planning and robust local control, we showcase the first system to achieve complete and generalizable real-world multi-part assembly without domain knowledge or human demonstrations. Project website: http://fabrica.csail.mit.edu/
△ Less
Submitted 5 June, 2025;
originally announced June 2025.
-
Directed Graph Grammars for Sequence-based Learning
Authors:
Michael Sun,
Orion Foo,
Gang Liu,
Wojciech Matusik,
Jie Chen
Abstract:
Directed acyclic graphs (DAGs) are a class of graphs commonly used in practice, with examples that include electronic circuits, Bayesian networks, and neural architectures. While many effective encoders exist for DAGs, it remains challenging to decode them in a principled manner, because the nodes of a DAG can have many different topological orders. In this work, we propose a grammar-based approac…
▽ More
Directed acyclic graphs (DAGs) are a class of graphs commonly used in practice, with examples that include electronic circuits, Bayesian networks, and neural architectures. While many effective encoders exist for DAGs, it remains challenging to decode them in a principled manner, because the nodes of a DAG can have many different topological orders. In this work, we propose a grammar-based approach to constructing a principled, compact and equivalent sequential representation of a DAG. Specifically, we view a graph as derivations over an unambiguous grammar, where the DAG corresponds to a unique sequence of production rules. Equivalently, the procedure to construct such a description can be viewed as a lossless compression of the data. Such a representation has many uses, including building a generative model for graph generation, learning a latent space for property prediction, and leveraging the sequence representational continuity for Bayesian Optimization over structured data. Code is available at https://github.com/shiningsunnyday/induction.
△ Less
Submitted 28 May, 2025;
originally announced May 2025.
-
Foundation Molecular Grammar: Multi-Modal Foundation Models Induce Interpretable Molecular Graph Languages
Authors:
Michael Sun,
Weize Yuan,
Gang Liu,
Wojciech Matusik,
Jie Chen
Abstract:
Recent data-efficient molecular generation approaches exploit graph grammars to introduce interpretability into the generative models. However, grammar learning therein relies on expert annotation or unreliable heuristics for algorithmic inference. We propose Foundation Molecular Grammar (FMG), which leverages multi-modal foundation models (MMFMs) to induce an interpretable molecular language. By…
▽ More
Recent data-efficient molecular generation approaches exploit graph grammars to introduce interpretability into the generative models. However, grammar learning therein relies on expert annotation or unreliable heuristics for algorithmic inference. We propose Foundation Molecular Grammar (FMG), which leverages multi-modal foundation models (MMFMs) to induce an interpretable molecular language. By exploiting the chemical knowledge of an MMFM, FMG renders molecules as images, describes them as text, and aligns information across modalities using prompt learning. FMG can be used as a drop-in replacement for the prior grammar learning approaches in molecular generation and property prediction. We show that FMG not only excels in synthesizability, diversity, and data efficiency but also offers built-in chemical interpretability for automated molecular discovery workflows. Code is available at https://github.com/shiningsunnyday/induction.
△ Less
Submitted 28 May, 2025;
originally announced May 2025.
-
FlashBias: Fast Computation of Attention with Bias
Authors:
Haixu Wu,
Minghao Guo,
Yuezhou Ma,
Yuanxu Sun,
Jianmin Wang,
Wojciech Matusik,
Mingsheng Long
Abstract:
Attention mechanism has emerged as a foundation module of modern deep learning models and has also empowered many milestones in various domains. Moreover, FlashAttention with IO-aware speedup resolves the efficiency issue of standard attention, further promoting its practicality. Beyond canonical attention, attention with bias also widely exists, such as relative position bias in vision and langua…
▽ More
Attention mechanism has emerged as a foundation module of modern deep learning models and has also empowered many milestones in various domains. Moreover, FlashAttention with IO-aware speedup resolves the efficiency issue of standard attention, further promoting its practicality. Beyond canonical attention, attention with bias also widely exists, such as relative position bias in vision and language models and pair representation bias in AlphaFold. In these works, prior knowledge is introduced as an additive bias term of attention weights to guide the learning process, which has been proven essential for model performance. Surprisingly, despite the common usage of attention with bias, its targeted efficiency optimization is still absent, which seriously hinders its wide applications in complex tasks. Diving into the computation of FlashAttention, we prove that its optimal efficiency is determined by the rank of the attention weight matrix. Inspired by this theoretical result, this paper presents FlashBias based on the low-rank compressed sensing theory, which can provide fast-exact computation for many widely used attention biases and a fast-accurate approximation for biases in general formalization. FlashBias can fully take advantage of the extremely optimized matrix multiplication operation in modern GPUs, achieving 1.5$\times$ speedup for AlphaFold, and over 2$\times$ speedup for attention with bias in vision and language models without loss of accuracy.
△ Less
Submitted 17 May, 2025;
originally announced May 2025.
-
AI-Enhanced Automatic Design of Efficient Underwater Gliders
Authors:
Peter Yichen Chen,
Pingchuan Ma,
Niklas Hagemann,
John Romanishin,
Wei Wang,
Daniela Rus,
Wojciech Matusik
Abstract:
The development of novel autonomous underwater gliders has been hindered by limited shape diversity, primarily due to the reliance on traditional design tools that depend heavily on manual trial and error. Building an automated design framework is challenging due to the complexities of representing glider shapes and the high computational costs associated with modeling complex solid-fluid interact…
▽ More
The development of novel autonomous underwater gliders has been hindered by limited shape diversity, primarily due to the reliance on traditional design tools that depend heavily on manual trial and error. Building an automated design framework is challenging due to the complexities of representing glider shapes and the high computational costs associated with modeling complex solid-fluid interactions. In this work, we introduce an AI-enhanced automated computational framework designed to overcome these limitations by enabling the creation of underwater robots with non-trivial hull shapes. Our approach involves an algorithm that co-optimizes both shape and control signals, utilizing a reduced-order geometry representation and a differentiable neural-network-based fluid surrogate model. This end-to-end design workflow facilitates rapid iteration and evaluation of hydrodynamic performance, leading to the discovery of optimal and complex hull shapes across various control settings. We validate our method through wind tunnel experiments and swimming pool gliding tests, demonstrating that our computationally designed gliders surpass manually designed counterparts in terms of energy efficiency. By addressing challenges in efficient shape representation and neural fluid surrogate models, our work paves the way for the development of highly efficient underwater gliders, with implications for long-range ocean exploration and environmental monitoring.
△ Less
Submitted 30 April, 2025;
originally announced May 2025.
-
Fits like a Flex-Glove: Automatic Design of Personalized FPCB-Based Tactile Sensing Gloves
Authors:
Devin Murphy,
Yichen Li,
Crystal Owens,
Layla Stanton,
Young Joong Lee,
Paul Pu Liang,
Yiyue Luo,
Antonio Torralba,
Wojciech Matusik
Abstract:
Resistive tactile sensing gloves have captured the interest of researchers spanning diverse domains, such as robotics, healthcare, and human-computer interaction. However, existing fabrication methods often require labor-intensive assembly or costly equipment, limiting accessibility. Leveraging flexible printed circuit board (FPCB) technology, we present an automated pipeline for generating resist…
▽ More
Resistive tactile sensing gloves have captured the interest of researchers spanning diverse domains, such as robotics, healthcare, and human-computer interaction. However, existing fabrication methods often require labor-intensive assembly or costly equipment, limiting accessibility. Leveraging flexible printed circuit board (FPCB) technology, we present an automated pipeline for generating resistive tactile sensing glove design files solely from a simple hand photo on legal-size paper, which can be readily supplied to commercial board houses for manufacturing. Our method enables cost-effective, accessible production at under \$130 per glove with sensor assembly times under 15 minutes. Sensor performance was characterized under varying pressure loads, and a preliminary user evaluation showcases four unique automatically manufactured designs, evaluated for their reliability and comfort.
△ Less
Submitted 8 March, 2025;
originally announced March 2025.
-
VLMaterial: Procedural Material Generation with Large Vision-Language Models
Authors:
Beichen Li,
Rundi Wu,
Armando Solar-Lezama,
Changxi Zheng,
Liang Shi,
Bernd Bickel,
Wojciech Matusik
Abstract:
Procedural materials, represented as functional node graphs, are ubiquitous in computer graphics for photorealistic material appearance design. They allow users to perform intuitive and precise editing to achieve desired visual appearances. However, creating a procedural material given an input image requires professional knowledge and significant effort. In this work, we leverage the ability to c…
▽ More
Procedural materials, represented as functional node graphs, are ubiquitous in computer graphics for photorealistic material appearance design. They allow users to perform intuitive and precise editing to achieve desired visual appearances. However, creating a procedural material given an input image requires professional knowledge and significant effort. In this work, we leverage the ability to convert procedural materials into standard Python programs and fine-tune a large pre-trained vision-language model (VLM) to generate such programs from input images. To enable effective fine-tuning, we also contribute an open-source procedural material dataset and propose to perform program-level augmentation by prompting another pre-trained large language model (LLM). Through extensive evaluation, we show that our method outperforms previous methods on both synthetic and real-world examples.
△ Less
Submitted 18 February, 2025; v1 submitted 26 January, 2025;
originally announced January 2025.
-
WiReSens Toolkit: An Open-source Platform towards Accessible Wireless Tactile Sensing
Authors:
Devin Murphy,
Junyi Zhu,
Paul Pu Liang,
Wojciech Matusik,
Yiyue Luo
Abstract:
Past research has widely explored the design and fabrication of resistive matrix-based tactile sensors as a means of creating touch-sensitive devices. However, developing portable, adaptive, and long-lasting tactile sensing systems that incorporate these sensors remains challenging for individuals having limited prior experience with them. To address this, we developed the WiReSens Toolkit, an ope…
▽ More
Past research has widely explored the design and fabrication of resistive matrix-based tactile sensors as a means of creating touch-sensitive devices. However, developing portable, adaptive, and long-lasting tactile sensing systems that incorporate these sensors remains challenging for individuals having limited prior experience with them. To address this, we developed the WiReSens Toolkit, an open-source platform for accessible wireless tactile sensing. Central to our approach is adaptive hardware for interfacing with resistive sensors and a web-based GUI that mediates access to complex functionalities for developing scalable tactile sensing systems, including 1) multi-device programming and wireless visualization across three distinct communication protocols 2) autocalibration methods for adaptive sensitivity and 3) intermittent data transmission for low-power operation. We validated the toolkit's usability through a user study with 11 novice participants, who, on average, successfully configured a tactile sensor with over 95\% accuracy in under five minutes, calibrated sensors 10x faster than baseline methods, and demonstrated enhanced tactile data sense-making.
△ Less
Submitted 24 April, 2025; v1 submitted 29 November, 2024;
originally announced December 2024.
-
Two-Stage Pretraining for Molecular Property Prediction in the Wild
Authors:
Kevin Tirta Wijaya,
Minghao Guo,
Michael Sun,
Hans-Peter Seidel,
Wojciech Matusik,
Vahid Babaei
Abstract:
Accurate property prediction is crucial for accelerating the discovery of new molecules. Although deep learning models have achieved remarkable success, their performance often relies on large amounts of labeled data that are expensive and time-consuming to obtain. Thus, there is a growing need for models that can perform well with limited experimentally-validated data. In this work, we introduce…
▽ More
Accurate property prediction is crucial for accelerating the discovery of new molecules. Although deep learning models have achieved remarkable success, their performance often relies on large amounts of labeled data that are expensive and time-consuming to obtain. Thus, there is a growing need for models that can perform well with limited experimentally-validated data. In this work, we introduce MoleVers, a versatile pretrained model designed for various types of molecular property prediction in the wild, i.e., where experimentally-validated molecular property labels are scarce. MoleVers adopts a two-stage pretraining strategy. In the first stage, the model learns molecular representations from large unlabeled datasets via masked atom prediction and dynamic denoising, a novel task enabled by a new branching encoder architecture. In the second stage, MoleVers is further pretrained using auxiliary labels obtained with inexpensive computational methods, enabling supervised learning without the need for costly experimental data. This two-stage framework allows MoleVers to learn representations that generalize effectively across various downstream datasets. We evaluate MoleVers on a new benchmark comprising 22 molecular datasets with diverse types of properties, the majority of which contain 50 or fewer training labels reflecting real-world conditions. MoleVers achieves state-of-the-art results on 20 out of the 22 datasets, and ranks second among the remaining two, highlighting its ability to bridge the gap between data-hungry models and real-world conditions where practically-useful labels are scarce.
△ Less
Submitted 5 November, 2024;
originally announced November 2024.
-
Sensor2Text: Enabling Natural Language Interactions for Daily Activity Tracking Using Wearable Sensors
Authors:
Wenqiang Chen,
Jiaxuan Cheng,
Leyao Wang,
Wei Zhao,
Wojciech Matusik
Abstract:
Visual Question-Answering, a technology that generates textual responses from an image and natural language question, has progressed significantly. Notably, it can aid in tracking and inquiring about daily activities, crucial in healthcare monitoring, especially for elderly patients or those with memory disabilities. However, video poses privacy concerns and has a limited field of view. This paper…
▽ More
Visual Question-Answering, a technology that generates textual responses from an image and natural language question, has progressed significantly. Notably, it can aid in tracking and inquiring about daily activities, crucial in healthcare monitoring, especially for elderly patients or those with memory disabilities. However, video poses privacy concerns and has a limited field of view. This paper presents Sensor2Text, a model proficient in tracking daily activities and engaging in conversations using wearable sensors. The approach outlined here tackles several challenges, including low information density in wearable sensor data, insufficiency of single wearable sensors in human activities recognition, and model's limited capacity for Question-Answering and interactive conversations. To resolve these obstacles, transfer learning and student-teacher networks are utilized to leverage knowledge from visual-language models. Additionally, an encoder-decoder neural network model is devised to jointly process language and sensor data for conversational purposes. Furthermore, Large Language Models are also utilized to enable interactive capabilities. The model showcases the ability to identify human activities and engage in Q\&A dialogues using various wearable sensor modalities. It performs comparably to or better than existing visual-language models in both captioning and conversational tasks. To our knowledge, this represents the first model capable of conversing about wearable sensor data, offering an innovative approach to daily activity tracking that addresses privacy and field-of-view limitations associated with current vision-based solutions.
△ Less
Submitted 25 October, 2024;
originally announced October 2024.
-
Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning
Authors:
Gang Liu,
Michael Sun,
Wojciech Matusik,
Meng Jiang,
Jie Chen
Abstract:
While large language models (LLMs) have integrated images, adapting them to graphs remains challenging, limiting their applications in materials and drug design. This difficulty stems from the need for coherent autoregressive generation across texts and graphs. To address this, we introduce Llamole, the first multimodal LLM capable of interleaved text and graph generation, enabling molecular inver…
▽ More
While large language models (LLMs) have integrated images, adapting them to graphs remains challenging, limiting their applications in materials and drug design. This difficulty stems from the need for coherent autoregressive generation across texts and graphs. To address this, we introduce Llamole, the first multimodal LLM capable of interleaved text and graph generation, enabling molecular inverse design with retrosynthetic planning. Llamole integrates a base LLM with the Graph Diffusion Transformer and Graph Neural Networks for multi-conditional molecular generation and reaction inference within texts, while the LLM, with enhanced molecular understanding, flexibly controls activation among the different graph modules. Additionally, Llamole integrates A* search with LLM-based cost functions for efficient retrosynthetic planning. We create benchmarking datasets and conduct extensive experiments to evaluate Llamole against in-context learning and supervised fine-tuning. Llamole significantly outperforms 14 adapted LLMs across 12 metrics for controllable molecular design and retrosynthetic planning.
△ Less
Submitted 5 October, 2024;
originally announced October 2024.
-
Learning Object Properties Using Robot Proprioception via Differentiable Robot-Object Interaction
Authors:
Peter Yichen Chen,
Chao Liu,
Pingchuan Ma,
John Eastman,
Daniela Rus,
Dylan Randle,
Yuri Ivanov,
Wojciech Matusik
Abstract:
Differentiable simulation has become a powerful tool for system identification. While prior work has focused on identifying robot properties using robot-specific data or object properties using object-specific data, our approach calibrates object properties by using information from the robot, without relying on data from the object itself. Specifically, we utilize robot joint encoder information,…
▽ More
Differentiable simulation has become a powerful tool for system identification. While prior work has focused on identifying robot properties using robot-specific data or object properties using object-specific data, our approach calibrates object properties by using information from the robot, without relying on data from the object itself. Specifically, we utilize robot joint encoder information, which is commonly available in standard robotic systems. Our key observation is that by analyzing the robot's reactions to manipulated objects, we can infer properties of those objects, such as inertia and softness. Leveraging this insight, we develop differentiable simulations of robot-object interactions to inversely identify the properties of the manipulated objects. Our approach relies solely on proprioception -- the robot's internal sensing capabilities -- and does not require external measurement tools or vision-based tracking systems. This general method is applicable to any articulated robot and requires only joint position information. We demonstrate the effectiveness of our method on a low-cost robotic platform, achieving accurate mass and elastic modulus estimations of manipulated objects with just a few seconds of computation on a laptop.
△ Less
Submitted 7 March, 2025; v1 submitted 4 October, 2024;
originally announced October 2024.
-
Procedural Synthesis of Synthesizable Molecules
Authors:
Michael Sun,
Alston Lo,
Minghao Guo,
Jie Chen,
Connor Coley,
Wojciech Matusik
Abstract:
Designing synthetically accessible molecules and recommending analogs to unsynthesizable molecules are important problems for accelerating molecular discovery. We reconceptualize both problems using ideas from program synthesis. Drawing inspiration from syntax-guided synthesis approaches, we decouple the syntactic skeleton from the semantics of a synthetic tree to create a bilevel framework for re…
▽ More
Designing synthetically accessible molecules and recommending analogs to unsynthesizable molecules are important problems for accelerating molecular discovery. We reconceptualize both problems using ideas from program synthesis. Drawing inspiration from syntax-guided synthesis approaches, we decouple the syntactic skeleton from the semantics of a synthetic tree to create a bilevel framework for reasoning about the combinatorial space of synthesis pathways. Given a molecule we aim to generate analogs for, we iteratively refine its skeletal characteristics via Markov Chain Monte Carlo simulations over the space of syntactic skeletons. Given a black-box oracle to optimize, we formulate a joint design space over syntactic templates and molecular descriptors and introduce evolutionary algorithms that optimize both syntactic and semantic dimensions synergistically. Our key insight is that once the syntactic skeleton is set, we can amortize over the search complexity of deriving the program's semantics by training policies to fully utilize the fixed horizon Markov Decision Process imposed by the syntactic template. We demonstrate performance advantages of our bilevel framework for synthesizable analog generation and synthesizable molecule design. Notably, our approach offers the user explicit control over the resources required to perform synthesis and biases the design space towards simpler solutions, making it particularly promising for autonomous synthesis platforms. Code is at https://github.com/shiningsunnyday/SynthesisNet.
△ Less
Submitted 28 February, 2025; v1 submitted 24 August, 2024;
originally announced September 2024.
-
KAN 2.0: Kolmogorov-Arnold Networks Meet Science
Authors:
Ziming Liu,
Pingchuan Ma,
Yixuan Wang,
Wojciech Matusik,
Max Tegmark
Abstract:
A major challenge of AI + Science lies in their inherent incompatibility: today's AI is primarily based on connectionism, while science depends on symbolism. To bridge the two worlds, we propose a framework to seamlessly synergize Kolmogorov-Arnold Networks (KANs) and science. The framework highlights KANs' usage for three aspects of scientific discovery: identifying relevant features, revealing m…
▽ More
A major challenge of AI + Science lies in their inherent incompatibility: today's AI is primarily based on connectionism, while science depends on symbolism. To bridge the two worlds, we propose a framework to seamlessly synergize Kolmogorov-Arnold Networks (KANs) and science. The framework highlights KANs' usage for three aspects of scientific discovery: identifying relevant features, revealing modular structures, and discovering symbolic formulas. The synergy is bidirectional: science to KAN (incorporating scientific knowledge into KANs), and KAN to science (extracting scientific insights from KANs). We highlight major new functionalities in the pykan package: (1) MultKAN: KANs with multiplication nodes. (2) kanpiler: a KAN compiler that compiles symbolic formulas into KANs. (3) tree converter: convert KANs (or any neural networks) to tree graphs. Based on these tools, we demonstrate KANs' capability to discover various types of physical laws, including conserved quantities, Lagrangians, symmetries, and constitutive laws.
△ Less
Submitted 19 August, 2024;
originally announced August 2024.
-
Lite2Relight: 3D-aware Single Image Portrait Relighting
Authors:
Pramod Rao,
Gereon Fox,
Abhimitra Meka,
Mallikarjun B R,
Fangneng Zhan,
Tim Weyrich,
Bernd Bickel,
Hanspeter Pfister,
Wojciech Matusik,
Mohamed Elgharib,
Christian Theobalt
Abstract:
Achieving photorealistic 3D view synthesis and relighting of human portraits is pivotal for advancing AR/VR applications. Existing methodologies in portrait relighting demonstrate substantial limitations in terms of generalization and 3D consistency, coupled with inaccuracies in physically realistic lighting and identity preservation. Furthermore, personalization from a single view is difficult to…
▽ More
Achieving photorealistic 3D view synthesis and relighting of human portraits is pivotal for advancing AR/VR applications. Existing methodologies in portrait relighting demonstrate substantial limitations in terms of generalization and 3D consistency, coupled with inaccuracies in physically realistic lighting and identity preservation. Furthermore, personalization from a single view is difficult to achieve and often requires multiview images during the testing phase or involves slow optimization processes.
This paper introduces Lite2Relight, a novel technique that can predict 3D consistent head poses of portraits while performing physically plausible light editing at interactive speed. Our method uniquely extends the generative capabilities and efficient volumetric representation of EG3D, leveraging a lightstage dataset to implicitly disentangle face reflectance and perform relighting under target HDRI environment maps. By utilizing a pre-trained geometry-aware encoder and a feature alignment module, we map input images into a relightable 3D space, enhancing them with a strong face geometry and reflectance prior.
Through extensive quantitative and qualitative evaluations, we show that our method outperforms the state-of-the-art methods in terms of efficacy, photorealism, and practical application. This includes producing 3D-consistent results of the full head, including hair, eyes, and expressions. Lite2Relight paves the way for large-scale adoption of photorealistic portrait editing in various domains, offering a robust, interactive solution to a previously constrained problem. Project page: https://vcai.mpi-inf.mpg.de/projects/Lite2Relight/
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
MeMo: Meaningful, Modular Controllers via Noise Injection
Authors:
Megan Tjandrasuwita,
Jie Xu,
Armando Solar-Lezama,
Wojciech Matusik
Abstract:
Robots are often built from standardized assemblies, (e.g. arms, legs, or fingers), but each robot must be trained from scratch to control all the actuators of all the parts together. In this paper we demonstrate a new approach that takes a single robot and its controller as input and produces a set of modular controllers for each of these assemblies such that when a new robot is built from the sa…
▽ More
Robots are often built from standardized assemblies, (e.g. arms, legs, or fingers), but each robot must be trained from scratch to control all the actuators of all the parts together. In this paper we demonstrate a new approach that takes a single robot and its controller as input and produces a set of modular controllers for each of these assemblies such that when a new robot is built from the same parts, its control can be quickly learned by reusing the modular controllers. We achieve this with a framework called MeMo which learns (Me)aningful, (Mo)dular controllers. Specifically, we propose a novel modularity objective to learn an appropriate division of labor among the modules. We demonstrate that this objective can be optimized simultaneously with standard behavior cloning loss via noise injection. We benchmark our framework in locomotion and grasping environments on simple to complex robot morphology transfer. We also show that the modules help in task transfer. On both structure and task transfer, MeMo achieves improved training efficiency to graph neural network and Transformer baselines.
△ Less
Submitted 11 February, 2025; v1 submitted 24 May, 2024;
originally announced July 2024.
-
Physically Compatible 3D Object Modeling from a Single Image
Authors:
Minghao Guo,
Bohan Wang,
Pingchuan Ma,
Tianyuan Zhang,
Crystal Elaine Owens,
Chuang Gan,
Joshua B. Tenenbaum,
Kaiming He,
Wojciech Matusik
Abstract:
We present a computational framework that transforms single images into 3D physical objects. The visual geometry of a physical object in an image is determined by three orthogonal attributes: mechanical properties, external forces, and rest-shape geometry. Existing single-view 3D reconstruction methods often overlook this underlying composition, presuming rigidity or neglecting external forces. Co…
▽ More
We present a computational framework that transforms single images into 3D physical objects. The visual geometry of a physical object in an image is determined by three orthogonal attributes: mechanical properties, external forces, and rest-shape geometry. Existing single-view 3D reconstruction methods often overlook this underlying composition, presuming rigidity or neglecting external forces. Consequently, the reconstructed objects fail to withstand real-world physical forces, resulting in instability or undesirable deformation -- diverging from their intended designs as depicted in the image. Our optimization framework addresses this by embedding physical compatibility into the reconstruction process. We explicitly decompose the three physical attributes and link them through static equilibrium, which serves as a hard constraint, ensuring that the optimized physical shapes exhibit desired physical behaviors. Evaluations on a dataset collected from Objaverse demonstrate that our framework consistently enhances the physical realism of 3D models over existing methods. The utility of our framework extends to practical applications in dynamic simulations and 3D printing, where adherence to physical compatibility is paramount.
△ Less
Submitted 31 December, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes
Authors:
Minghao Guo,
Bohan Wang,
Kaiming He,
Wojciech Matusik
Abstract:
We introduce TetSphere Splatting, a Lagrangian geometry representation designed for high-quality 3D shape modeling. TetSphere splatting leverages an underused yet powerful geometric primitive -- volumetric tetrahedral meshes. It represents 3D shapes by deforming a collection of tetrahedral spheres, with geometric regularizations and constraints that effectively resolve common mesh issues such as i…
▽ More
We introduce TetSphere Splatting, a Lagrangian geometry representation designed for high-quality 3D shape modeling. TetSphere splatting leverages an underused yet powerful geometric primitive -- volumetric tetrahedral meshes. It represents 3D shapes by deforming a collection of tetrahedral spheres, with geometric regularizations and constraints that effectively resolve common mesh issues such as irregular triangles, non-manifoldness, and floating artifacts. Experimental results on multi-view and single-view reconstruction highlight TetSphere splatting's superior mesh quality while maintaining competitive reconstruction accuracy compared to state-of-the-art methods. Additionally, TetSphere splatting demonstrates versatility by seamlessly integrating into generative modeling tasks, such as image-to-3D and text-to-3D generation.
△ Less
Submitted 7 April, 2025; v1 submitted 30 May, 2024;
originally announced May 2024.
-
NeuralFluid: Neural Fluidic System Design and Control with Differentiable Simulation
Authors:
Yifei Li,
Yuchen Sun,
Pingchuan Ma,
Eftychios Sifakis,
Tao Du,
Bo Zhu,
Wojciech Matusik
Abstract:
We present a novel framework to explore neural control and design of complex fluidic systems with dynamic solid boundaries. Our system features a fast differentiable Navier-Stokes solver with solid-fluid interface handling, a low-dimensional differentiable parametric geometry representation, a control-shape co-design algorithm, and gym-like simulation environments to facilitate various fluidic con…
▽ More
We present a novel framework to explore neural control and design of complex fluidic systems with dynamic solid boundaries. Our system features a fast differentiable Navier-Stokes solver with solid-fluid interface handling, a low-dimensional differentiable parametric geometry representation, a control-shape co-design algorithm, and gym-like simulation environments to facilitate various fluidic control design applications. Additionally, we present a benchmark of design, control, and learning tasks on high-fidelity, high-resolution dynamic fluid environments that pose challenges for existing differentiable fluid simulators. These tasks include designing the control of artificial hearts, identifying robotic end-effector shapes, and controlling a fluid gate. By seamlessly incorporating our differentiable fluid simulator into a learning framework, we demonstrate successful design, control, and learning results that surpass gradient-free solutions in these benchmark tasks.
△ Less
Submitted 31 October, 2024; v1 submitted 22 May, 2024;
originally announced May 2024.
-
LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
Authors:
Pingchuan Ma,
Tsun-Hsuan Wang,
Minghao Guo,
Zhiqing Sun,
Joshua B. Tenenbaum,
Daniela Rus,
Chuang Gan,
Wojciech Matusik
Abstract:
Large Language Models have recently gained significant attention in scientific discovery for their extensive knowledge and advanced reasoning capabilities. However, they encounter challenges in effectively simulating observational feedback and grounding it with language to propel advancements in physical scientific discovery. Conversely, human scientists undertake scientific discovery by formulati…
▽ More
Large Language Models have recently gained significant attention in scientific discovery for their extensive knowledge and advanced reasoning capabilities. However, they encounter challenges in effectively simulating observational feedback and grounding it with language to propel advancements in physical scientific discovery. Conversely, human scientists undertake scientific discovery by formulating hypotheses, conducting experiments, and revising theories through observational analysis. Inspired by this, we propose to enhance the knowledge-driven, abstract reasoning abilities of LLMs with the computational strength of simulations. We introduce Scientific Generative Agent (SGA), a bilevel optimization framework: LLMs act as knowledgeable and versatile thinkers, proposing scientific hypotheses and reason about discrete components, such as physics equations or molecule structures; meanwhile, simulations function as experimental platforms, providing observational feedback and optimizing via differentiability for continuous parts, such as physical parameters. We conduct extensive experiments to demonstrate our framework's efficacy in constitutive law discovery and molecular design, unveiling novel solutions that differ from conventional human expectations yet remain coherent upon analysis.
△ Less
Submitted 15 May, 2024;
originally announced May 2024.
-
Configurable Holography: Towards Display and Scene Adaptation
Authors:
Yicheng Zhan,
Liang Shi,
Wojciech Matusik,
Qi Sun,
Kaan AkÅŸit
Abstract:
Emerging learned holography approaches have enabled faster and high-quality hologram synthesis, setting a new milestone toward practical holographic displays. However, these learned models require training a dedicated model for each set of display-scene parameters. To address this shortcoming, our work introduces a highly configurable learned model structure, synthesizing 3D holograms interactivel…
▽ More
Emerging learned holography approaches have enabled faster and high-quality hologram synthesis, setting a new milestone toward practical holographic displays. However, these learned models require training a dedicated model for each set of display-scene parameters. To address this shortcoming, our work introduces a highly configurable learned model structure, synthesizing 3D holograms interactively while supporting diverse display-scene parameters. Our family of models relying on this structure can be conditioned continuously for varying novel scene parameters, including input images, propagation distances, volume depths, peak brightnesses, and novel display parameters of pixel pitches and wavelengths. Uniquely, our findings unearth a correlation between depth estimation and hologram synthesis tasks in the learning domain, leading to a learned model that unlocks accurate 3D hologram generation from 2D images across varied display-scene parameters. We validate our models by synthesizing high-quality 3D holograms in simulations and also verify our findings with two different holographic display prototypes. Moreover, our family of models can synthesize holograms with a 2x speed-up compared to the state-of-the-art learned holography approaches in the literature.
△ Less
Submitted 30 March, 2025; v1 submitted 24 March, 2024;
originally announced May 2024.
-
Replicating Human Anatomy with Vision Controlled Jetting -- A Pneumatic Musculoskeletal Hand and Forearm
Authors:
Thomas Buchner,
Stefan Weirich,
Alexander M. Kübler,
Wojciech Matusik,
Robert K. Katzschmann
Abstract:
The functional replication and actuation of complex structures inspired by nature is a longstanding goal for humanity. Creating such complex structures combining soft and rigid features and actuating them with artificial muscles would further our understanding of natural kinematic structures. We printed a biomimetic hand in a single print process comprised of a rigid skeleton, soft joint capsules,…
▽ More
The functional replication and actuation of complex structures inspired by nature is a longstanding goal for humanity. Creating such complex structures combining soft and rigid features and actuating them with artificial muscles would further our understanding of natural kinematic structures. We printed a biomimetic hand in a single print process comprised of a rigid skeleton, soft joint capsules, tendons, and printed touch sensors. We showed it's actuation using electric motors. In this work, we expand on this work by adding a forearm that is also closely modeled after the human anatomy and replacing the hand's motors with 22 independently controlled pneumatic artificial muscles (PAMs). Our thin, high-strain (up to 30.1%) PAMs match the performance of state-of-the-art artificial muscles at a lower cost. The system showcases human-like dexterity with independent finger movements, demonstrating successful grasping of various objects, ranging from a small, lightweight coin to a large can of 272g in weight. The performance evaluation, based on fingertip and grasping forces along with finger joint range of motion, highlights the system's potential.
△ Less
Submitted 29 April, 2024;
originally announced April 2024.
-
Representing Molecules as Random Walks Over Interpretable Grammars
Authors:
Michael Sun,
Minghao Guo,
Weize Yuan,
Veronika Thost,
Crystal Elaine Owens,
Aristotle Franklin Grosz,
Sharvaa Selvan,
Katelyn Zhou,
Hassan Mohiuddin,
Benjamin J Pedretti,
Zachary P Smith,
Jie Chen,
Wojciech Matusik
Abstract:
Recent research in molecular discovery has primarily been devoted to small, drug-like molecules, leaving many similarly important applications in material design without adequate technology. These applications often rely on more complex molecular structures with fewer examples that are carefully designed using known substructures. We propose a data-efficient and interpretable model for representin…
▽ More
Recent research in molecular discovery has primarily been devoted to small, drug-like molecules, leaving many similarly important applications in material design without adequate technology. These applications often rely on more complex molecular structures with fewer examples that are carefully designed using known substructures. We propose a data-efficient and interpretable model for representing and reasoning over such molecules in terms of graph grammars that explicitly describe the hierarchical design space featuring motifs to be the design basis. We present a novel representation in the form of random walks over the design space, which facilitates both molecule generation and property prediction. We demonstrate clear advantages over existing methods in terms of performance, efficiency, and synthesizability of predicted molecules, and we provide detailed insights into the method's chemical interpretability.
△ Less
Submitted 2 June, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Boundary Exploration for Bayesian Optimization With Unknown Physical Constraints
Authors:
Yunsheng Tian,
Ane Zuniga,
Xinwei Zhang,
Johannes P. Dürholt,
Payel Das,
Jie Chen,
Wojciech Matusik,
Mina Konaković Luković
Abstract:
Bayesian optimization has been successfully applied to optimize black-box functions where the number of evaluations is severely limited. However, in many real-world applications, it is hard or impossible to know in advance which designs are feasible due to some physical or system limitations. These issues lead to an even more challenging problem of optimizing an unknown function with unknown const…
▽ More
Bayesian optimization has been successfully applied to optimize black-box functions where the number of evaluations is severely limited. However, in many real-world applications, it is hard or impossible to know in advance which designs are feasible due to some physical or system limitations. These issues lead to an even more challenging problem of optimizing an unknown function with unknown constraints. In this paper, we observe that in such scenarios optimal solution typically lies on the boundary between feasible and infeasible regions of the design space, making it considerably more difficult than that with interior optima. Inspired by this observation, we propose BE-CBO, a new Bayesian optimization method that efficiently explores the boundary between feasible and infeasible designs. To identify the boundary, we learn the constraints with an ensemble of neural networks that outperform the standard Gaussian Processes for capturing complex boundaries. Our method demonstrates superior performance against state-of-the-art methods through comprehensive experiments on synthetic and real-world benchmarks. Code available at: https://github.com/yunshengtian/BE-CBO
△ Less
Submitted 21 May, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
A Universal Scaling Law for Intrinsic Fracture Energy of Networks
Authors:
Chase Hartquist,
Shu Wang,
Qiaodong Cui,
Wojciech Matusik,
Bolei Deng,
Xuanhe Zhao
Abstract:
Networks of interconnected materials permeate throughout nature, biology, and technology due to exceptional mechanical performance. Despite the importance of failure resistance in network design and utility, no existing physical model effectively links strand mechanics and connectivity to predict bulk fracture. Here, we reveal a universal scaling law that bridges these levels to predict the intrin…
▽ More
Networks of interconnected materials permeate throughout nature, biology, and technology due to exceptional mechanical performance. Despite the importance of failure resistance in network design and utility, no existing physical model effectively links strand mechanics and connectivity to predict bulk fracture. Here, we reveal a universal scaling law that bridges these levels to predict the intrinsic fracture energy of diverse networks. Simulations and experiments demonstrate its remarkable applicability to a breadth of strand constitutive behaviors, topologies, dimensionalities, and length scales. We show that local strand rupture and nonlocal energy release contribute synergistically to the measured intrinsic fracture energy in networks. These effects coordinate such that the intrinsic fracture energy scales independent of the energy to rupture a strand; it instead depends on the strand rupture force, breaking length, and connectivity. Our scaling law establishes a physical basis for understanding network fracture and a framework for fabricating tough materials from networks across multiple length scales.
△ Less
Submitted 10 January, 2024;
originally announced January 2024.
-
DiffAvatar: Simulation-Ready Garment Optimization with Differentiable Simulation
Authors:
Yifei Li,
Hsiao-yu Chen,
Egor Larionov,
Nikolaos Sarafianos,
Wojciech Matusik,
Tuur Stuyck
Abstract:
The realism of digital avatars is crucial in enabling telepresence applications with self-expression and customization. While physical simulations can produce realistic motions for clothed humans, they require high-quality garment assets with associated physical parameters for cloth simulations. However, manually creating these assets and calibrating their parameters is labor-intensive and require…
▽ More
The realism of digital avatars is crucial in enabling telepresence applications with self-expression and customization. While physical simulations can produce realistic motions for clothed humans, they require high-quality garment assets with associated physical parameters for cloth simulations. However, manually creating these assets and calibrating their parameters is labor-intensive and requires specialized expertise. Current methods focus on reconstructing geometry, but don't generate complete assets for physics-based applications. To address this gap, we propose \papername,~a novel approach that performs body and garment co-optimization using differentiable simulation. By integrating physical simulation into the optimization loop and accounting for the complex nonlinear behavior of cloth and its intricate interaction with the body, our framework recovers body and garment geometry and extracts important material parameters in a physically plausible way. Our experiments demonstrate that our approach generates realistic clothing and body shape suitable for downstream applications. We provide additional insights and results on our webpage: https://people.csail.mit.edu/liyifei/publication/diffavatar/
△ Less
Submitted 29 March, 2024; v1 submitted 20 November, 2023;
originally announced November 2023.
-
Neural Stress Fields for Reduced-order Elastoplasticity and Fracture
Authors:
Zeshun Zong,
Xuan Li,
Minchen Li,
Maurizio M. Chiaramonte,
Wojciech Matusik,
Eitan Grinspun,
Kevin Carlberg,
Chenfanfu Jiang,
Peter Yichen Chen
Abstract:
We propose a hybrid neural network and physics framework for reduced-order modeling of elastoplasticity and fracture. State-of-the-art scientific computing models like the Material Point Method (MPM) faithfully simulate large-deformation elastoplasticity and fracture mechanics. However, their long runtime and large memory consumption render them unsuitable for applications constrained by computati…
▽ More
We propose a hybrid neural network and physics framework for reduced-order modeling of elastoplasticity and fracture. State-of-the-art scientific computing models like the Material Point Method (MPM) faithfully simulate large-deformation elastoplasticity and fracture mechanics. However, their long runtime and large memory consumption render them unsuitable for applications constrained by computation time and memory usage, e.g., virtual reality. To overcome these barriers, we propose a reduced-order framework. Our key innovation is training a low-dimensional manifold for the Kirchhoff stress field via an implicit neural representation. This low-dimensional neural stress field (NSF) enables efficient evaluations of stress values and, correspondingly, internal forces at arbitrary spatial locations. In addition, we also train neural deformation and affine fields to build low-dimensional manifolds for the deformation and affine momentum fields. These neural stress, deformation, and affine fields share the same low-dimensional latent space, which uniquely embeds the high-dimensional simulation state. After training, we run new simulations by evolving in this single latent space, which drastically reduces the computation time and memory consumption. Our general continuum-mechanics-based reduced-order framework is applicable to any phenomena governed by the elastodynamics equation. To showcase the versatility of our framework, we simulate a wide range of material behaviors, including elastica, sand, metal, non-Newtonian fluids, fracture, contact, and collision. We demonstrate dimension reduction by up to 100,000X and time savings by up to 10X.
△ Less
Submitted 26 October, 2023;
originally announced October 2023.
-
Medial Skeletal Diagram: A Generalized Medial Axis Approach for 3D Shape Representation
Authors:
Minghao Guo,
Bohan Wang,
Wojciech Matusik
Abstract:
We propose the Medial Skeletal Diagram, a novel skeletal representation that tackles the prevailing issues around skeleton sparsity and reconstruction accuracy in existing skeletal representations. Our approach augments the continuous elements in the medial axis representation to effectively shift the complexity away from the discrete elements. To that end, we introduce generalized enveloping prim…
▽ More
We propose the Medial Skeletal Diagram, a novel skeletal representation that tackles the prevailing issues around skeleton sparsity and reconstruction accuracy in existing skeletal representations. Our approach augments the continuous elements in the medial axis representation to effectively shift the complexity away from the discrete elements. To that end, we introduce generalized enveloping primitives, an enhancement over the standard primitives in the medial axis, which ensure efficient coverage of intricate local features of the input shape and substantially reduce the number of discrete elements required. Moreover, we present a computational framework for constructing a medial skeletal diagram from an arbitrary closed manifold mesh. Our optimization pipeline ensures that the resulting medial skeletal diagram comprehensively covers the input shape with the fewest primitives. Additionally, each optimized primitive undergoes a post-refinement process to guarantee an accurate match with the source mesh in both geometry and tessellation. We validate our approach on a comprehensive benchmark of 100 shapes, demonstrating the sparsity of the discrete elements and superior reconstruction accuracy across a variety of cases. Finally, we exemplify the versatility of our representation in downstream applications such as shape generation, mesh decomposition, shape optimization, mesh alignment, mesh compression, and user-interactive design.
△ Less
Submitted 2 October, 2024; v1 submitted 13 October, 2023;
originally announced October 2023.
-
ASAP: Automated Sequence Planning for Complex Robotic Assembly with Physical Feasibility
Authors:
Yunsheng Tian,
Karl D. D. Willis,
Bassel Al Omari,
Jieliang Luo,
Pingchuan Ma,
Yichen Li,
Farhad Javid,
Edward Gu,
Joshua Jacob,
Shinjiro Sueda,
Hui Li,
Sachin Chitta,
Wojciech Matusik
Abstract:
The automated assembly of complex products requires a system that can automatically plan a physically feasible sequence of actions for assembling many parts together. In this paper, we present ASAP, a physics-based planning approach for automatically generating such a sequence for general-shaped assemblies. ASAP accounts for gravity to design a sequence where each sub-assembly is physically stable…
▽ More
The automated assembly of complex products requires a system that can automatically plan a physically feasible sequence of actions for assembling many parts together. In this paper, we present ASAP, a physics-based planning approach for automatically generating such a sequence for general-shaped assemblies. ASAP accounts for gravity to design a sequence where each sub-assembly is physically stable with a limited number of parts being held and a support surface. We apply efficient tree search algorithms to reduce the combinatorial complexity of determining such an assembly sequence. The search can be guided by either geometric heuristics or graph neural networks trained on data with simulation labels. Finally, we show the superior performance of ASAP at generating physically realistic assembly sequence plans on a large dataset of hundreds of complex product assemblies. We further demonstrate the applicability of ASAP on both simulation and real-world robotic setups. Project website: asap.csail.mit.edu
△ Less
Submitted 29 February, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction
Authors:
Minghao Guo,
Veronika Thost,
Samuel W Song,
Adithya Balachandran,
Payel Das,
Jie Chen,
Wojciech Matusik
Abstract:
The prediction of molecular properties is a crucial task in the field of material and drug discovery. The potential benefits of using deep learning techniques are reflected in the wealth of recent literature. Still, these techniques are faced with a common challenge in practice: Labeled data are limited by the cost of manual extraction from literature and laborious experimentation. In this work, w…
▽ More
The prediction of molecular properties is a crucial task in the field of material and drug discovery. The potential benefits of using deep learning techniques are reflected in the wealth of recent literature. Still, these techniques are faced with a common challenge in practice: Labeled data are limited by the cost of manual extraction from literature and laborious experimentation. In this work, we propose a data-efficient property predictor by utilizing a learnable hierarchical molecular grammar that can generate molecules from grammar production rules. Such a grammar induces an explicit geometry of the space of molecular graphs, which provides an informative prior on molecular structural similarity. The property prediction is performed using graph neural diffusion over the grammar-induced geometry. On both small and large datasets, our evaluation shows that this approach outperforms a wide spectrum of baselines, including supervised and pre-trained graph neural networks. We include a detailed ablation study and further analysis of our solution, showing its effectiveness in cases with extremely limited data. Code is available at https://github.com/gmh14/Geo-DEG.
△ Less
Submitted 4 September, 2023;
originally announced September 2023.
-
How Can Large Language Models Help Humans in Design and Manufacturing?
Authors:
Liane Makatura,
Michael Foshey,
Bohan Wang,
Felix HähnLein,
Pingchuan Ma,
Bolei Deng,
Megan Tjandrasuwita,
Andrew Spielberg,
Crystal Elaine Owens,
Peter Yichen Chen,
Allan Zhao,
Amy Zhu,
Wil J Norton,
Edward Gu,
Joshua Jacob,
Yifei Li,
Adriana Schulz,
Wojciech Matusik
Abstract:
The advancement of Large Language Models (LLMs), including GPT-4, provides exciting new opportunities for generative design. We investigate the application of this tool across the entire design and manufacturing workflow. Specifically, we scrutinize the utility of LLMs in tasks such as: converting a text-based prompt into a design specification, transforming a design into manufacturing instruction…
▽ More
The advancement of Large Language Models (LLMs), including GPT-4, provides exciting new opportunities for generative design. We investigate the application of this tool across the entire design and manufacturing workflow. Specifically, we scrutinize the utility of LLMs in tasks such as: converting a text-based prompt into a design specification, transforming a design into manufacturing instructions, producing a design space and design variations, computing the performance of a design, and searching for designs predicated on performance. Through a series of examples, we highlight both the benefits and the limitations of the current LLMs. By exposing these limitations, we aspire to catalyze the continued improvement and progression of these models.
△ Less
Submitted 25 July, 2023;
originally announced July 2023.
-
Ergonomic-Centric Holography: Optimizing Realism,Immersion, and Comfort for Holographic Display
Authors:
Liang Shi,
DongHun Ryu,
Wojciech Matusik
Abstract:
We introduce ergonomic-centric holography, an algorithmic framework that simultaneously optimizes for realistic incoherent defocus, unrestricted pupil movements in the eye box, and high-order diffractions for filtering-free holography. The proposed method outperforms prior algorithms on holographic display prototypes operating in unfiltered and pupil-mimicking modes, offering the potential to enha…
▽ More
We introduce ergonomic-centric holography, an algorithmic framework that simultaneously optimizes for realistic incoherent defocus, unrestricted pupil movements in the eye box, and high-order diffractions for filtering-free holography. The proposed method outperforms prior algorithms on holographic display prototypes operating in unfiltered and pupil-mimicking modes, offering the potential to enhance next-generation virtual and augmented reality experiences.
△ Less
Submitted 16 June, 2023; v1 submitted 13 June, 2023;
originally announced June 2023.
-
Learning Preconditioner for Conjugate Gradient PDE Solvers
Authors:
Yichen Li,
Peter Yichen Chen,
Tao Du,
Wojciech Matusik
Abstract:
Efficient numerical solvers for partial differential equations empower science and engineering. One of the commonly employed numerical solvers is the preconditioned conjugate gradient (PCG) algorithm which can solve large systems to a given precision level. One challenge in PCG solvers is the selection of preconditioners, as different problem-dependent systems can benefit from different preconditi…
▽ More
Efficient numerical solvers for partial differential equations empower science and engineering. One of the commonly employed numerical solvers is the preconditioned conjugate gradient (PCG) algorithm which can solve large systems to a given precision level. One challenge in PCG solvers is the selection of preconditioners, as different problem-dependent systems can benefit from different preconditioners. We present a new method to introduce \emph{inductive bias} in preconditioning conjugate gradient algorithm. Given a system matrix and a set of solution vectors arise from an underlying distribution, we train a graph neural network to obtain an approximate decomposition to the system matrix to be used as a preconditioner in the context of PCG solvers. We conduct extensive experiments to demonstrate the efficacy and generalizability of our proposed approach in solving various 2D and 3D linear second-order PDEs.
△ Less
Submitted 6 September, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Learning Neural Constitutive Laws From Motion Observations for Generalizable PDE Dynamics
Authors:
Pingchuan Ma,
Peter Yichen Chen,
Bolei Deng,
Joshua B. Tenenbaum,
Tao Du,
Chuang Gan,
Wojciech Matusik
Abstract:
We propose a hybrid neural network (NN) and PDE approach for learning generalizable PDE dynamics from motion observations. Many NN approaches learn an end-to-end model that implicitly models both the governing PDE and constitutive models (or material models). Without explicit PDE knowledge, these approaches cannot guarantee physical correctness and have limited generalizability. We argue that the…
▽ More
We propose a hybrid neural network (NN) and PDE approach for learning generalizable PDE dynamics from motion observations. Many NN approaches learn an end-to-end model that implicitly models both the governing PDE and constitutive models (or material models). Without explicit PDE knowledge, these approaches cannot guarantee physical correctness and have limited generalizability. We argue that the governing PDEs are often well-known and should be explicitly enforced rather than learned. Instead, constitutive models are particularly suitable for learning due to their data-fitting nature. To this end, we introduce a new framework termed "Neural Constitutive Laws" (NCLaw), which utilizes a network architecture that strictly guarantees standard constitutive priors, including rotation equivariance and undeformed state equilibrium. We embed this network inside a differentiable simulation and train the model by minimizing a loss function based on the difference between the simulation and the motion observation. We validate NCLaw on various large-deformation dynamical systems, ranging from solids to fluids. After training on a single motion trajectory, our method generalizes to new geometries, initial/boundary conditions, temporal ranges, and even multi-physics systems. On these extremely out-of-distribution generalization tasks, NCLaw is orders-of-magnitude more accurate than previous NN approaches. Real-world experiments demonstrate our method's ability to learn constitutive laws from videos.
△ Less
Submitted 15 June, 2023; v1 submitted 27 April, 2023;
originally announced April 2023.
-
Category-Level Multi-Part Multi-Joint 3D Shape Assembly
Authors:
Yichen Li,
Kaichun Mo,
Yueqi Duan,
He Wang,
Jiequan Zhang,
Lin Shao,
Wojciech Matusik,
Leonidas Guibas
Abstract:
Shape assembly composes complex shapes geometries by arranging simple part geometries and has wide applications in autonomous robotic assembly and CAD modeling. Existing works focus on geometry reasoning and neglect the actual physical assembly process of matching and fitting joints, which are the contact surfaces connecting different parts. In this paper, we consider contacting joints for the tas…
▽ More
Shape assembly composes complex shapes geometries by arranging simple part geometries and has wide applications in autonomous robotic assembly and CAD modeling. Existing works focus on geometry reasoning and neglect the actual physical assembly process of matching and fitting joints, which are the contact surfaces connecting different parts. In this paper, we consider contacting joints for the task of multi-part assembly. A successful joint-optimized assembly needs to satisfy the bilateral objectives of shape structure and joint alignment. We propose a hierarchical graph learning approach composed of two levels of graph representation learning. The part graph takes part geometries as input to build the desired shape structure. The joint-level graph uses part joints information and focuses on matching and aligning joints. The two kinds of information are combined to achieve the bilateral objectives. Extensive experiments demonstrate that our method outperforms previous methods, achieving better shape structure and higher joint alignment accuracy.
△ Less
Submitted 10 March, 2023;
originally announced March 2023.
-
Computational Discovery of Microstructured Composites with Optimal Stiffness-Toughness Trade-Offs
Authors:
Beichen Li,
Bolei Deng,
Wan Shou,
Tae-Hyun Oh,
Yuanming Hu,
Yiyue Luo,
Liang Shi,
Wojciech Matusik
Abstract:
The conflict between stiffness and toughness is a fundamental problem in engineering materials design. However, the systematic discovery of microstructured composites with optimal stiffness-toughness trade-offs has never been demonstrated, hindered by the discrepancies between simulation and reality and the lack of data-efficient exploration of the entire Pareto front. We introduce a generalizable…
▽ More
The conflict between stiffness and toughness is a fundamental problem in engineering materials design. However, the systematic discovery of microstructured composites with optimal stiffness-toughness trade-offs has never been demonstrated, hindered by the discrepancies between simulation and reality and the lack of data-efficient exploration of the entire Pareto front. We introduce a generalizable pipeline that integrates physical experiments, numerical simulations, and artificial neural networks to address both challenges. Without any prescribed expert knowledge of material design, our approach implements a nested-loop proposal-validation workflow to bridge the simulation-to-reality gap and discover microstructured composites that are stiff and tough with high sample efficiency. Further analysis of Pareto-optimal designs allows us to automatically identify existing toughness enhancement mechanisms, which were previously discovered through trial-and-error or biomimicry. On a broader scale, our method provides a blueprint for computational design in various research areas beyond solid mechanics, such as polymer chemistry, fluid dynamics, meteorology, and robotics.
△ Less
Submitted 3 January, 2024; v1 submitted 31 January, 2023;
originally announced February 2023.
-
Multi-color Holograms Improve Brightness in Holographic Displays
Authors:
Koray Kavaklı,
Liang Shi,
Hakan Ürey,
Wojciech Matusik,
Kaan AkÅŸit
Abstract:
Holographic displays generate Three-Dimensional (3D) images by displaying single-color holograms time-sequentially, each lit by a single-color light source. However, representing each color one by one limits brightness in holographic displays. This paper introduces a new driving scheme for realizing brighter images in holographic displays. Unlike the conventional driving scheme, our method utilize…
▽ More
Holographic displays generate Three-Dimensional (3D) images by displaying single-color holograms time-sequentially, each lit by a single-color light source. However, representing each color one by one limits brightness in holographic displays. This paper introduces a new driving scheme for realizing brighter images in holographic displays. Unlike the conventional driving scheme, our method utilizes three light sources to illuminate each displayed hologram simultaneously at various intensity levels. In this way, our method reconstructs a multiplanar three-dimensional target scene using consecutive multi-color holograms and persistence of vision. We co-optimize multi-color holograms and required intensity levels from each light source using a gradient descent-based optimizer with a combination of application-specific loss terms. We experimentally demonstrate that our method can increase the intensity levels in holographic displays up to three times, reaching a broader range and unlocking new potentials for perceptual realism in holographic displays.
△ Less
Submitted 5 October, 2023; v1 submitted 24 January, 2023;
originally announced January 2023.
-
Assemble Them All: Physics-Based Planning for Generalizable Assembly by Disassembly
Authors:
Yunsheng Tian,
Jie Xu,
Yichen Li,
Jieliang Luo,
Shinjiro Sueda,
Hui Li,
Karl D. D. Willis,
Wojciech Matusik
Abstract:
Assembly planning is the core of automating product assembly, maintenance, and recycling for modern industrial manufacturing. Despite its importance and long history of research, planning for mechanical assemblies when given the final assembled state remains a challenging problem. This is due to the complexity of dealing with arbitrary 3D shapes and the highly constrained motion required for real-…
▽ More
Assembly planning is the core of automating product assembly, maintenance, and recycling for modern industrial manufacturing. Despite its importance and long history of research, planning for mechanical assemblies when given the final assembled state remains a challenging problem. This is due to the complexity of dealing with arbitrary 3D shapes and the highly constrained motion required for real-world assemblies. In this work, we propose a novel method to efficiently plan physically plausible assembly motion and sequences for real-world assemblies. Our method leverages the assembly-by-disassembly principle and physics-based simulation to efficiently explore a reduced search space. To evaluate the generality of our method, we define a large-scale dataset consisting of thousands of physically valid industrial assemblies with a variety of assembly motions required. Our experiments on this new benchmark demonstrate we achieve a state-of-the-art success rate and the highest computational efficiency compared to other baseline algorithms. Our method also generalizes to rotational assemblies (e.g., screws and puzzles) and solves 80-part assemblies within several minutes.
△ Less
Submitted 7 November, 2022;
originally announced November 2022.
-
Fluidic Topology Optimization with an Anisotropic Mixture Model
Authors:
Yifei Li,
Tao Du,
Sangeetha Grama Srinivasan,
Kui Wu,
Bo Zhu,
Eftychios Sifakis,
Wojciech Matusik
Abstract:
Fluidic devices are crucial components in many industrial applications involving fluid mechanics. Computational design of a high-performance fluidic system faces multifaceted challenges regarding its geometric representation and physical accuracy. We present a novel topology optimization method to design fluidic devices in a Stokes flow context. Our approach is featured by its capability in accomm…
▽ More
Fluidic devices are crucial components in many industrial applications involving fluid mechanics. Computational design of a high-performance fluidic system faces multifaceted challenges regarding its geometric representation and physical accuracy. We present a novel topology optimization method to design fluidic devices in a Stokes flow context. Our approach is featured by its capability in accommodating a broad spectrum of boundary conditions at the solid-fluid interface. Our key contribution is an anisotropic and differentiable constitutive model that unifies the representation of different phases and boundary conditions in a Stokes model, enabling a topology optimization method that can synthesize novel structures with accurate boundary conditions from a background grid discretization. We demonstrate the efficacy of our approach by conducting several fluidic system design tasks with over four million design parameters.
△ Less
Submitted 24 September, 2022; v1 submitted 21 September, 2022;
originally announced September 2022.
-
RISP: Rendering-Invariant State Predictor with Differentiable Simulation and Rendering for Cross-Domain Parameter Estimation
Authors:
Pingchuan Ma,
Tao Du,
Joshua B. Tenenbaum,
Wojciech Matusik,
Chuang Gan
Abstract:
This work considers identifying parameters characterizing a physical system's dynamic motion directly from a video whose rendering configurations are inaccessible. Existing solutions require massive training data or lack generalizability to unknown rendering configurations. We propose a novel approach that marries domain randomization and differentiable rendering gradients to address this problem.…
▽ More
This work considers identifying parameters characterizing a physical system's dynamic motion directly from a video whose rendering configurations are inaccessible. Existing solutions require massive training data or lack generalizability to unknown rendering configurations. We propose a novel approach that marries domain randomization and differentiable rendering gradients to address this problem. Our core idea is to train a rendering-invariant state-prediction (RISP) network that transforms image differences into state differences independent of rendering configurations, e.g., lighting, shadows, or material reflectance. To train this predictor, we formulate a new loss on rendering variances using gradients from differentiable rendering. Moreover, we present an efficient, second-order method to compute the gradients of this loss, allowing it to be integrated seamlessly into modern deep learning frameworks. We evaluate our method in rigid-body and deformable-body simulation environments using four tasks: state estimation, system identification, imitation learning, and visuomotor control. We further demonstrate the efficacy of our approach on a real-world example: inferring the state and action sequences of a quadrotor from a video of its motion sequences. Compared with existing methods, our approach achieves significantly lower reconstruction errors and has better generalizability among unknown rendering configurations.
△ Less
Submitted 11 May, 2022;
originally announced May 2022.
-
Fast Aquatic Swimmer Optimization with Differentiable Projective Dynamics and Neural Network Hydrodynamic Models
Authors:
Elvis Nava,
John Z. Zhang,
Mike Y. Michelis,
Tao Du,
Pingchuan Ma,
Benjamin F. Grewe,
Wojciech Matusik,
Robert K. Katzschmann
Abstract:
Aquatic locomotion is a classic fluid-structure interaction (FSI) problem of interest to biologists and engineers. Solving the fully coupled FSI equations for incompressible Navier-Stokes and finite elasticity is computationally expensive. Optimizing robotic swimmer design within such a system generally involves cumbersome, gradient-free procedures on top of the already costly simulation. To addre…
▽ More
Aquatic locomotion is a classic fluid-structure interaction (FSI) problem of interest to biologists and engineers. Solving the fully coupled FSI equations for incompressible Navier-Stokes and finite elasticity is computationally expensive. Optimizing robotic swimmer design within such a system generally involves cumbersome, gradient-free procedures on top of the already costly simulation. To address this challenge we present a novel, fully differentiable hybrid approach to FSI that combines a 2D direct numerical simulation for the deformable solid structure of the swimmer and a physics-constrained neural network surrogate to capture hydrodynamic effects of the fluid. For the deformable solid simulation of the swimmer's body, we use state-of-the-art techniques from the field of computer graphics to speed up the finite-element method (FEM). For the fluid simulation, we use a U-Net architecture trained with a physics-based loss function to predict the flow field at each time step. The pressure and velocity field outputs from the neural network are sampled around the boundary of our swimmer using an immersed boundary method (IBM) to compute its swimming motion accurately and efficiently. We demonstrate the computational efficiency and differentiability of our hybrid simulator on a 2D carangiform swimmer. Due to differentiability, the simulator can be used for computational design of controls for soft bodies immersed in fluids via direct gradient-based optimization.
△ Less
Submitted 22 June, 2022; v1 submitted 30 March, 2022;
originally announced April 2022.
-
An Integrated Design Pipeline for Tactile Sensing Robotic Manipulators
Authors:
Lara Zlokapa,
Yiyue Luo,
Jie Xu,
Michael Foshey,
Kui Wu,
Pulkit Agrawal,
Wojciech Matusik
Abstract:
Traditional robotic manipulator design methods require extensive, time-consuming, and manual trial and error to produce a viable design. During this process, engineers often spend their time redesigning or reshaping components as they discover better topologies for the robotic manipulator. Tactile sensors, while useful, often complicate the design due to their bulky form factor. We propose an inte…
▽ More
Traditional robotic manipulator design methods require extensive, time-consuming, and manual trial and error to produce a viable design. During this process, engineers often spend their time redesigning or reshaping components as they discover better topologies for the robotic manipulator. Tactile sensors, while useful, often complicate the design due to their bulky form factor. We propose an integrated design pipeline to streamline the design and manufacturing of robotic manipulators with knitted, glove-like tactile sensors. The proposed pipeline allows a designer to assemble a collection of modular, open-source components by applying predefined graph grammar rules. The end result is an intuitive design paradigm that allows the creation of new virtual designs of manipulators in a matter of minutes. Our framework allows the designer to fine-tune the manipulator's shape through cage-based geometry deformation. Finally, the designer can select surfaces for adding tactile sensing. Once the manipulator design is finished, the program will automatically generate 3D printing and knitting files for manufacturing. We demonstrate the utility of this pipeline by creating four custom manipulators tested on real-world tasks: screwing in a wing screw, sorting water bottles, picking up an egg, and cutting paper with scissors.
△ Less
Submitted 14 April, 2022;
originally announced April 2022.
-
Accelerated Policy Learning with Parallel Differentiable Simulation
Authors:
Jie Xu,
Viktor Makoviychuk,
Yashraj Narang,
Fabio Ramos,
Wojciech Matusik,
Animesh Garg,
Miles Macklin
Abstract:
Deep reinforcement learning can generate complex control policies, but requires large amounts of training data to work effectively. Recent work has attempted to address this issue by leveraging differentiable simulators. However, inherent problems such as local minima and exploding/vanishing numerical gradients prevent these methods from being generally applied to control tasks with complex contac…
▽ More
Deep reinforcement learning can generate complex control policies, but requires large amounts of training data to work effectively. Recent work has attempted to address this issue by leveraging differentiable simulators. However, inherent problems such as local minima and exploding/vanishing numerical gradients prevent these methods from being generally applied to control tasks with complex contact-rich dynamics, such as humanoid locomotion in classical RL benchmarks. In this work we present a high-performance differentiable simulator and a new policy learning algorithm (SHAC) that can effectively leverage simulation gradients, even in the presence of non-smoothness. Our learning algorithm alleviates problems with local minima through a smooth critic function, avoids vanishing/exploding gradients through a truncated learning window, and allows many physical environments to be run in parallel. We evaluate our method on classical RL control tasks, and show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms. In addition, we demonstrate the scalability of our method by applying it to the challenging high-dimensional problem of muscle-actuated locomotion with a large action space, achieving a greater than 17x reduction in training time over the best-performing established RL algorithm.
△ Less
Submitted 14 April, 2022;
originally announced April 2022.
-
Data-Efficient Graph Grammar Learning for Molecular Generation
Authors:
Minghao Guo,
Veronika Thost,
Beichen Li,
Payel Das,
Jie Chen,
Wojciech Matusik
Abstract:
The problem of molecular generation has received significant attention recently. Existing methods are typically based on deep neural networks and require training on large datasets with tens of thousands of samples. In practice, however, the size of class-specific chemical datasets is usually limited (e.g., dozens of samples) due to labor-intensive experimentation and data collection. This present…
▽ More
The problem of molecular generation has received significant attention recently. Existing methods are typically based on deep neural networks and require training on large datasets with tens of thousands of samples. In practice, however, the size of class-specific chemical datasets is usually limited (e.g., dozens of samples) due to labor-intensive experimentation and data collection. This presents a considerable challenge for the deep learning generative models to comprehensively describe the molecular design space. Another major challenge is to generate only physically synthesizable molecules. This is a non-trivial task for neural network-based generative models since the relevant chemical knowledge can only be extracted and generalized from the limited training data. In this work, we propose a data-efficient generative model that can be learned from datasets with orders of magnitude smaller sizes than common benchmarks. At the heart of this method is a learnable graph grammar that generates molecules from a sequence of production rules. Without any human assistance, these production rules are automatically constructed from training data. Furthermore, additional chemical knowledge can be incorporated in the model by further grammar optimization. Our learned graph grammar yields state-of-the-art results on generating high-quality molecules for three monomer datasets that contain only ${\sim}20$ samples each. Our approach also achieves remarkable performance in a challenging polymer generation task with only $117$ training samples and is competitive against existing methods using $81$k data points. Code is available at https://github.com/gmh14/data_efficient_grammar.
△ Less
Submitted 15 March, 2022;
originally announced March 2022.
-
Closed-Loop Control of Direct Ink Writing via Reinforcement Learning
Authors:
Michal Piovarci,
Michael Foshey,
Jie Xu,
Timothy Erps,
Vahid Babaei,
Piotr Didyk,
Szymon Rusinkiewicz,
Wojciech Matusik,
Bernd Bickel
Abstract:
Enabling additive manufacturing to employ a wide range of novel, functional materials can be a major boost to this technology. However, making such materials printable requires painstaking trial-and-error by an expert operator, as they typically tend to exhibit peculiar rheological or hysteresis properties. Even in the case of successfully finding the process parameters, there is no guarantee of p…
▽ More
Enabling additive manufacturing to employ a wide range of novel, functional materials can be a major boost to this technology. However, making such materials printable requires painstaking trial-and-error by an expert operator, as they typically tend to exhibit peculiar rheological or hysteresis properties. Even in the case of successfully finding the process parameters, there is no guarantee of print-to-print consistency due to material differences between batches. These challenges make closed-loop feedback an attractive option where the process parameters are adjusted on-the-fly. There are several challenges for designing an efficient controller: the deposition parameters are complex and highly coupled, artifacts occur after long time horizons, simulating the deposition is computationally costly, and learning on hardware is intractable. In this work, we demonstrate the feasibility of learning a closed-loop control policy for additive manufacturing using reinforcement learning. We show that approximate, but efficient, numerical simulation is sufficient as long as it allows learning the behavioral patterns of deposition that translate to real-world experiences. In combination with reinforcement learning, our model can be used to discover control policies that outperform baseline controllers. Furthermore, the recovered policies have a minimal sim-to-real gap. We showcase this by applying our control policy in-vivo on a single-layer, direct ink writing printer.
△ Less
Submitted 12 September, 2022; v1 submitted 27 January, 2022;
originally announced January 2022.
-
Evolution Gym: A Large-Scale Benchmark for Evolving Soft Robots
Authors:
Jagdeep Singh Bhatia,
Holly Jackson,
Yunsheng Tian,
Jie Xu,
Wojciech Matusik
Abstract:
Both the design and control of a robot play equally important roles in its task performance. However, while optimal control is well studied in the machine learning and robotics community, less attention is placed on finding the optimal robot design. This is mainly because co-optimizing design and control in robotics is characterized as a challenging problem, and more importantly, a comprehensive e…
▽ More
Both the design and control of a robot play equally important roles in its task performance. However, while optimal control is well studied in the machine learning and robotics community, less attention is placed on finding the optimal robot design. This is mainly because co-optimizing design and control in robotics is characterized as a challenging problem, and more importantly, a comprehensive evaluation benchmark for co-optimization does not exist. In this paper, we propose Evolution Gym, the first large-scale benchmark for co-optimizing the design and control of soft robots. In our benchmark, each robot is composed of different types of voxels (e.g., soft, rigid, actuators), resulting in a modular and expressive robot design space. Our benchmark environments span a wide range of tasks, including locomotion on various types of terrains and manipulation. Furthermore, we develop several robot co-evolution algorithms by combining state-of-the-art design optimization methods and deep reinforcement learning techniques. Evaluating the algorithms on our benchmark platform, we observe robots exhibiting increasingly complex behaviors as evolution progresses, with the best evolved designs solving many of our proposed tasks. Additionally, even though robot designs are evolved autonomously from scratch without prior knowledge, they often grow to resemble existing natural creatures while outperforming hand-designed robots. Nevertheless, all tested algorithms fail to find robots that succeed in our hardest environments. This suggests that more advanced algorithms are required to explore the high-dimensional design space and evolve increasingly intelligent robots -- an area of research in which we hope Evolution Gym will accelerate progress. Our website with code, environments, documentation, and tutorials is available at http://evogym.csail.mit.edu.
△ Less
Submitted 24 January, 2022;
originally announced January 2022.
-
JoinABLe: Learning Bottom-up Assembly of Parametric CAD Joints
Authors:
Karl D. D. Willis,
Pradeep Kumar Jayaraman,
Hang Chu,
Yunsheng Tian,
Yifei Li,
Daniele Grandi,
Aditya Sanghi,
Linh Tran,
Joseph G. Lambourne,
Armando Solar-Lezama,
Wojciech Matusik
Abstract:
Physical products are often complex assemblies combining a multitude of 3D parts modeled in computer-aided design (CAD) software. CAD designers build up these assemblies by aligning individual parts to one another using constraints called joints. In this paper we introduce JoinABLe, a learning-based method that assembles parts together to form joints. JoinABLe uses the weak supervision available i…
▽ More
Physical products are often complex assemblies combining a multitude of 3D parts modeled in computer-aided design (CAD) software. CAD designers build up these assemblies by aligning individual parts to one another using constraints called joints. In this paper we introduce JoinABLe, a learning-based method that assembles parts together to form joints. JoinABLe uses the weak supervision available in standard parametric CAD files without the help of object class labels or human guidance. Our results show that by making network predictions over a graph representation of solid models we can outperform multiple baseline methods with an accuracy (79.53%) that approaches human performance (80%). Finally, to support future research we release the Fusion 360 Gallery assembly dataset, containing assemblies with rich information on joints, contact surfaces, holes, and the underlying assembly graph structure.
△ Less
Submitted 22 April, 2022; v1 submitted 24 November, 2021;
originally announced November 2021.
-
Designing Composites with Target Effective Young's Modulus using Reinforcement Learning
Authors:
Aldair E. Gongora,
Siddharth Mysore,
Beichen Li,
Wan Shou,
Wojciech Matusik,
Elise F. Morgan,
Keith A. Brown,
Emily Whiting
Abstract:
Advancements in additive manufacturing have enabled design and fabrication of materials and structures not previously realizable. In particular, the design space of composite materials and structures has vastly expanded, and the resulting size and complexity has challenged traditional design methodologies, such as brute force exploration and one factor at a time (OFAT) exploration, to find optimum…
▽ More
Advancements in additive manufacturing have enabled design and fabrication of materials and structures not previously realizable. In particular, the design space of composite materials and structures has vastly expanded, and the resulting size and complexity has challenged traditional design methodologies, such as brute force exploration and one factor at a time (OFAT) exploration, to find optimum or tailored designs. To address this challenge, supervised machine learning approaches have emerged to model the design space using curated training data; however, the selection of the training data is often determined by the user. In this work, we develop and utilize a Reinforcement learning (RL)-based framework for the design of composite structures which avoids the need for user-selected training data. For a 5 $\times$ 5 composite design space comprised of soft and compliant blocks of constituent material, we find that using this approach, the model can be trained using 2.78% of the total design space consists of $2^{25}$ design possibilities. Additionally, the developed RL-based framework is capable of finding designs at a success rate exceeding 90%. The success of this approach motivates future learning frameworks to utilize RL for the design of composites and other material systems.
△ Less
Submitted 7 October, 2021;
originally announced October 2021.
-
Sim2Real for Soft Robotic Fish via Differentiable Simulation
Authors:
John Z. Zhang,
Yu Zhang,
Pingchuan Ma,
Elvis Nava,
Tao Du,
Philip Arm,
Wojciech Matusik,
Robert K. Katzschmann
Abstract:
Accurate simulation of soft mechanisms under dynamic actuation is critical for the design of soft robots. We address this gap with our differentiable simulation tool by learning the material parameters of our soft robotic fish. On the example of a soft robotic fish, we demonstrate an experimentally-verified, fast optimization pipeline for learning the material parameters from quasi-static data via…
▽ More
Accurate simulation of soft mechanisms under dynamic actuation is critical for the design of soft robots. We address this gap with our differentiable simulation tool by learning the material parameters of our soft robotic fish. On the example of a soft robotic fish, we demonstrate an experimentally-verified, fast optimization pipeline for learning the material parameters from quasi-static data via differentiable simulation and apply it to the prediction of dynamic performance. Our method identifies physically plausible Young's moduli for various soft silicone elastomers and stiff acetal copolymers used in creation of our three different robotic fish tail designs. We show that our method is compatible with varying internal geometry of the actuators, such as the number of hollow cavities. Our framework allows high fidelity prediction of dynamic behavior for composite bi-morph bending structures in real hardware to millimeter-accuracy and within 3 percent error normalized to actuator length. We provide a differentiable and robust estimate of the thrust force using a neural network thrust predictor; this estimate allows for accurate modeling of our experimental setup measuring bollard pull. This work presents a prototypical hardware and simulation problem solved using our differentiable framework; the framework can be applied to higher dimensional parameter inference, learning control policies, and computational design due to its differentiable character.
△ Less
Submitted 8 November, 2022; v1 submitted 30 September, 2021;
originally announced September 2021.
-
Dynamic Modeling of Hand-Object Interactions via Tactile Sensing
Authors:
Qiang Zhang,
Yunzhu Li,
Yiyue Luo,
Wan Shou,
Michael Foshey,
Junchi Yan,
Joshua B. Tenenbaum,
Wojciech Matusik,
Antonio Torralba
Abstract:
Tactile sensing is critical for humans to perform everyday tasks. While significant progress has been made in analyzing object grasping from vision, it remains unclear how we can utilize tactile sensing to reason about and model the dynamics of hand-object interactions. In this work, we employ a high-resolution tactile glove to perform four different interactive activities on a diversified set of…
▽ More
Tactile sensing is critical for humans to perform everyday tasks. While significant progress has been made in analyzing object grasping from vision, it remains unclear how we can utilize tactile sensing to reason about and model the dynamics of hand-object interactions. In this work, we employ a high-resolution tactile glove to perform four different interactive activities on a diversified set of objects. We build our model on a cross-modal learning framework and generate the labels using a visual processing pipeline to supervise the tactile model, which can then be used on its own during the test time. The tactile model aims to predict the 3d locations of both the hand and the object purely from the touch data by combining a predictive model and a contrastive learning module. This framework can reason about the interaction patterns from the tactile data, hallucinate the changes in the environment, estimate the uncertainty of the prediction, and generalize to unseen objects. We also provide detailed ablation studies regarding different system designs as well as visualizations of the predicted trajectories. This work takes a step on dynamics modeling in hand-object interactions from dense tactile sensing, which opens the door for future applications in activity learning, human-computer interactions, and imitation learning for robotics.
△ Less
Submitted 9 September, 2021;
originally announced September 2021.