-
Meta 3D Gen
Authors:
Raphael Bensadoun,
Tom Monnier,
Yanir Kleiman,
Filippos Kokkinos,
Yawar Siddiqui,
Mahendra Kariya,
Omri Harosh,
Roman Shapovalov,
Benjamin Graham,
Emilien Garreau,
Animesh Karnewar,
Ang Cao,
Idan Azuri,
Iurii Makarov,
Eric-Tuan Le,
Antoine Toisoul,
David Novotny,
Oran Gafni,
Natalia Neverova,
Andrea Vedaldi
Abstract:
We introduce Meta 3D Gen (3DGen), a new state-of-the-art, fast pipeline for text-to-3D asset generation. 3DGen offers 3D asset creation with high prompt fidelity and high-quality 3D shapes and textures in under a minute. It supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications. Additionally, 3DGen supports generative retexturing of previously generated (or artist-created) 3D shapes using additional textual inputs provided by the user. 3DGen integrates key technical components, Meta 3D AssetGen and Meta 3D TextureGen, that we developed for text-to-3D and text-to-texture generation, respectively. By combining their strengths, 3DGen represents 3D objects simultaneously in three ways: in view space, in volumetric space, and in UV (or texture) space. The integration of these two techniques achieves a win rate of 68% with respect to the single-stage model. We compare 3DGen to numerous industry baselines, and show that it outperforms them in terms of prompt fidelity and visual quality for complex textual prompts, while being significantly faster.
Submitted 2 July, 2024;
originally announced July 2024.
-
Meta 3D TextureGen: Fast and Consistent Texture Generation for 3D Objects
Authors:
Raphael Bensadoun,
Yanir Kleiman,
Idan Azuri,
Omri Harosh,
Andrea Vedaldi,
Natalia Neverova,
Oran Gafni
Abstract:
The recent availability and adaptability of text-to-image models have sparked a new era in many related domains that benefit from the learned text priors as well as high-quality and fast generation capabilities, one of which is texture generation for 3D objects. Although recent texture generation methods achieve impressive results by using text-to-image networks, the combination of global consistency, quality, and speed, which is crucial for advancing texture generation to real-world applications, remains elusive. To that end, we introduce Meta 3D TextureGen: a new feedforward method composed of two sequential networks aimed at generating high-quality and globally consistent textures for arbitrary geometries of any degree of complexity in less than 20 seconds. Our method achieves state-of-the-art results in quality and speed by conditioning a text-to-image model on 3D semantics in 2D space and fusing them into a complete and high-resolution UV texture map, as demonstrated by extensive qualitative and quantitative evaluations. In addition, we introduce a texture enhancement network that is capable of up-scaling any texture by an arbitrary ratio, producing 4k pixel resolution textures.
Submitted 2 July, 2024;
originally announced July 2024.
-
Localizing Object-level Shape Variations with Text-to-Image Diffusion Models
Authors:
Or Patashnik,
Daniel Garibi,
Idan Azuri,
Hadar Averbuch-Elor,
Daniel Cohen-Or
Abstract:
Text-to-image models give rise to workflows which often begin with an exploration step, where users sift through a large collection of generated images. The global nature of the text-to-image generation process prevents users from narrowing their exploration to a particular object in the image. In this paper, we present a technique to generate a collection of images that depicts variations in the shape of a specific object, enabling an object-level shape exploration process. Creating plausible variations is challenging as it requires control over the shape of the generated object while respecting its semantics. A particular challenge when generating object variations is accurately localizing the manipulation applied over the object's shape. We introduce a prompt-mixing technique that switches between prompts along the denoising process to attain a variety of shape choices. To localize the image-space operation, we present two techniques that use the self-attention layers in conjunction with the cross-attention layers. Moreover, we show that these localization techniques are general and effective beyond the scope of generating object variations. Extensive results and comparisons demonstrate the effectiveness of our method in generating object variations, and the competence of our localization techniques.
Submitted 12 August, 2023; v1 submitted 20 March, 2023;
originally announced March 2023.
-
Generative Latent Implicit Conditional Optimization when Learning from Small Sample
Authors:
Idan Azuri,
Daphna Weinshall
Abstract:
We revisit the long-standing problem of learning from a small sample, to which end we propose a novel method called GLICO (Generative Latent Implicit Conditional Optimization). GLICO learns a mapping from the training examples to a latent space and a generator that generates images from vectors in the latent space. Unlike most recent works, which rely on access to large amounts of unlabeled data, GLICO does not require access to any additional data other than the small set of labeled points. In fact, GLICO learns to synthesize completely new samples for every class using as few as 5 or 10 examples per class, with as few as 10 such classes, without imposing any prior. GLICO is then used to augment the small training set while training a classifier on the small sample. To this end, our proposed method samples the learned latent space using spherical interpolation, and generates new examples using the trained generator. Empirical results show that the new sampled set is diverse enough, leading to improvement in image classification in comparison with the state of the art, when trained on small samples obtained from CIFAR-10, CIFAR-100, and CUB-200.
Submitted 15 December, 2020; v1 submitted 31 March, 2020;
originally announced March 2020.
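The GLICO abstract above mentions sampling the learned latent space with spherical interpolation. The sketch below illustrates the standard spherical interpolation (slerp) formula between two latent vectors. It is not taken from the paper: the function name `slerp`, the plain-list vector representation, and the linear fallback for degenerate angles are assumptions made for illustration only.

```python
import math


def slerp(z0, z1, t):
    """Spherically interpolate between vectors z0 and z1 at fraction t in [0, 1].

    Hypothetical helper for illustration; not the GLICO authors' code.
    """
    norm0 = math.sqrt(sum(x * x for x in z0))
    norm1 = math.sqrt(sum(x * x for x in z1))
    # Cosine of the angle between the vectors, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(z0, z1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < 1e-8:
        # Nearly parallel or anti-parallel vectors: the arc is ill-defined,
        # so fall back to plain linear interpolation (an assumption here).
        return [(1 - t) * a + t * b for a, b in zip(z0, z1)]
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(z0, z1)]


# Midpoint between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Unlike linear interpolation, the midpoint of two unit vectors keeps unit norm here, which is the usual motivation for slerp when latent codes are constrained to, or concentrated near, a sphere.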
-
Mechanical and Tribological Properties of Layered Materials Under High Pressure: Assessing the Importance of Many-Body Dispersion Effects
Authors:
Wengen Ouyang,
Ido Azuri,
Davide Mandelli,
Alexandre Tkatchenko,
Leeor Kronik,
Michael Urbakh,
Oded Hod
Abstract:
The importance of many-body dispersion effects in layered materials subjected to high external loads is evaluated. State-of-the-art many-body dispersion density functional theory calculations performed for graphite, hexagonal boron nitride, and their hetero-structures were used to fit the parameters of a classical registry-dependent interlayer potential. Using the latter, we performed extensive equilibrium molecular dynamics simulations and studied the mechanical response of homogeneous and heterogeneous bulk models under hydrostatic pressures up to 30 GPa. Comparison with experimental data demonstrates that the reliability of the many-body dispersion model extends deep into the sub-equilibrium regime. Friction simulations demonstrate the importance of many-body dispersion effects for the accurate description of the tribological properties of layered materials interfaces under high pressure.
Submitted 28 December, 2019; v1 submitted 12 December, 2019;
originally announced December 2019.
-
The Effect of Ionic Composition on Acoustic Phonon Speeds in Hybrid Perovskites from Brillouin Spectroscopy and Density Functional Theory
Authors:
Irina V. Kabakova,
Ido Azuri,
Zhuoying Chen,
Pabitra K. Nayak,
Henry J. Snaith,
Leeor Kronik,
Carl Paterson,
Artem A. Bakulin,
David A. Egger
Abstract:
Hybrid organic-inorganic perovskites (HOIPs) have recently emerged as highly promising solution-processable materials for photovoltaic (PV) and other optoelectronic devices. HOIPs represent a broad family of materials with properties highly tuneable by the ions that make up the perovskite structure as well as their multiple combinations. Interestingly, recent high-efficiency PV devices using HOIPs with substantially improved long-term stability have used combinations of different ionic compositions. The structural dynamics of these systems are unique for semiconducting materials and are currently argued to be central to HOIP stability and charge-transport properties. Here, we studied the impact of ionic composition on phonon speeds of HOIPs from Brillouin spectroscopy experiments and density functional theory calculations for FAPbBr$_3$, MAPbBr$_3$, MAPbCl$_3$, and the mixed halide MAPbBr$_{1.25}$Cl$_{1.75}$. Our results show that the acoustic phonon speeds can be strongly modified by ionic composition, which we explain by analysing the lead-halide sublattice in detail. The vibrational properties of HOIPs are therefore tuneable by using targeted ionic compositions in the perovskite structure. This tuning can be rationalized with non-trivial effects, for example, considering the influence of the shape and dipole moment of organic cations. This has important implications for further improvements in the stability and charge-transport properties of these systems.
Submitted 10 January, 2018;
originally announced January 2018.
-
Inter-layer Potential for Hexagonal Boron Nitride
Authors:
Itai Leven,
Ido Azuri,
Leeor Kronik,
Oded Hod
Abstract:
A new interlayer force-field for layered hexagonal boron nitride (h-BN) based structures is presented. The force-field contains three terms representing the interlayer attraction due to dispersive interactions, repulsion due to anisotropic overlaps of electron clouds, and monopolar electrostatic interactions. With appropriate parameterization, the potential is able to simultaneously capture well the binding and lateral sliding energies of planar h-BN based dimer systems as well as the interlayer telescoping and rotation of double walled boron-nitride nanotubes of different crystallographic orientations. The new potential thus enables the accurate and efficient modeling and simulation of large-scale h-BN based layered structures.
Submitted 10 October, 2013;
originally announced October 2013.