-
FruitNeRF++: A Generalized Multi-Fruit Counting Method Utilizing Contrastive Learning and Neural Radiance Fields
Authors:
Lukas Meyer,
Andrei-Timotei Ardelean,
Tim Weyrich,
Marc Stamminger
Abstract:
We introduce FruitNeRF++, a novel fruit-counting approach that combines contrastive learning with neural radiance fields to count fruits from unstructured input photographs of orchards. Our work is based on FruitNeRF, which employs a neural semantic field combined with a fruit-specific clustering approach. The requirement for adaptation for each fruit type limits the applicability of the method, a…
▽ More
We introduce FruitNeRF++, a novel fruit-counting approach that combines contrastive learning with neural radiance fields to count fruits from unstructured input photographs of orchards. Our work is based on FruitNeRF, which employs a neural semantic field combined with a fruit-specific clustering approach. The requirement for adaptation for each fruit type limits the applicability of the method, and makes it difficult to use in practice. To lift this limitation, we design a shape-agnostic multi-fruit counting framework, that complements the RGB and semantic data with instance masks predicted by a vision foundation model. The masks are used to encode the identity of each fruit as instance embeddings into a neural instance field. By volumetrically sampling the neural fields, we extract a point cloud embedded with the instance features, which can be clustered in a fruit-agnostic manner to obtain the fruit count. We evaluate our approach using a synthetic dataset containing apples, plums, lemons, pears, peaches, and mangoes, as well as a real-world benchmark apple dataset. Our results demonstrate that FruitNeRF++ is easier to control and compares favorably to other state-of-the-art methods.
△ Less
Submitted 26 May, 2025;
originally announced May 2025.
-
MAROON: A Framework for the Joint Characterization of Near-Field High-Resolution Radar and Optical Depth Imaging Techniques
Authors:
Vanessa Wirth,
Johanna Bräunig,
Martin Vossiek,
Tim Weyrich,
Marc Stamminger
Abstract:
Utilizing the complementary strengths of wavelength-specific range or depth sensors is crucial for robust computer-assisted tasks such as autonomous driving. Despite this, there is still little research done at the intersection of optical depth sensors and radars operating close range, where the target is decimeters away from the sensors. Together with a growing interest in high-resolution imaging…
▽ More
Utilizing the complementary strengths of wavelength-specific range or depth sensors is crucial for robust computer-assisted tasks such as autonomous driving. Despite this, there is still little research done at the intersection of optical depth sensors and radars operating close range, where the target is decimeters away from the sensors. Together with a growing interest in high-resolution imaging radars operating in the near field, the question arises how these sensors behave in comparison to their traditional optical counterparts.
In this work, we take on the unique challenge of jointly characterizing depth imagers from both, the optical and radio-frequency domain using a multimodal spatial calibration. We collect data from four depth imagers, with three optical sensors of varying operation principle and an imaging radar. We provide a comprehensive evaluation of their depth measurements with respect to distinct object materials, geometries, and object-to-sensor distances. Specifically, we reveal scattering effects of partially transmissive materials and investigate the response of radio-frequency signals. All object measurements will be made public in form of a multimodal dataset, called MAROON.
△ Less
Submitted 26 November, 2024; v1 submitted 1 November, 2024;
originally announced November 2024.
-
Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency
Authors:
Florian Hahlbohm,
Fabian Friederichs,
Tim Weyrich,
Linus Franke,
Moritz Kappel,
Susana Castillo,
Marc Stamminger,
Martin Eisemann,
Marcus Magnor
Abstract:
3D Gaussian Splats (3DGS) have proven a versatile rendering primitive, both for inverse rendering as well as real-time exploration of scenes. In these applications, coherence across camera frames and multiple views is crucial, be it for robust convergence of a scene reconstruction or for artifact-free fly-throughs. Recent work started mitigating artifacts that break multi-view coherence, including…
▽ More
3D Gaussian Splats (3DGS) have proven a versatile rendering primitive, both for inverse rendering as well as real-time exploration of scenes. In these applications, coherence across camera frames and multiple views is crucial, be it for robust convergence of a scene reconstruction or for artifact-free fly-throughs. Recent work started mitigating artifacts that break multi-view coherence, including popping artifacts due to inconsistent transparency sorting and perspective-correct outlines of (2D) splats. At the same time, real-time requirements forced such implementations to accept compromises in how transparency of large assemblies of 3D Gaussians is resolved, in turn breaking coherence in other ways. In our work, we aim at achieving maximum coherence, by rendering fully perspective-correct 3D Gaussians while using a high-quality approximation of accurate blending, hybrid transparency, on a per-pixel level, in order to retain real-time frame rates. Our fast and perspectively accurate approach for evaluation of 3D Gaussians does not require matrix inversions, thereby ensuring numerical stability and eliminating the need for special handling of degenerate splats, and the hybrid transparency formulation for blending maintains similar quality as fully resolved per-pixel transparencies at a fraction of the rendering costs. We further show that each of these two components can be independently integrated into Gaussian splatting systems. In combination, they achieve up to 2$\times$ higher frame rates, 2$\times$ faster optimization, and equal or better image quality with fewer rendering artifacts compared to traditional 3DGS on common benchmarks.
△ Less
Submitted 10 March, 2025; v1 submitted 10 October, 2024;
originally announced October 2024.
-
Lite2Relight: 3D-aware Single Image Portrait Relighting
Authors:
Pramod Rao,
Gereon Fox,
Abhimitra Meka,
Mallikarjun B R,
Fangneng Zhan,
Tim Weyrich,
Bernd Bickel,
Hanspeter Pfister,
Wojciech Matusik,
Mohamed Elgharib,
Christian Theobalt
Abstract:
Achieving photorealistic 3D view synthesis and relighting of human portraits is pivotal for advancing AR/VR applications. Existing methodologies in portrait relighting demonstrate substantial limitations in terms of generalization and 3D consistency, coupled with inaccuracies in physically realistic lighting and identity preservation. Furthermore, personalization from a single view is difficult to…
▽ More
Achieving photorealistic 3D view synthesis and relighting of human portraits is pivotal for advancing AR/VR applications. Existing methodologies in portrait relighting demonstrate substantial limitations in terms of generalization and 3D consistency, coupled with inaccuracies in physically realistic lighting and identity preservation. Furthermore, personalization from a single view is difficult to achieve and often requires multiview images during the testing phase or involves slow optimization processes.
This paper introduces Lite2Relight, a novel technique that can predict 3D consistent head poses of portraits while performing physically plausible light editing at interactive speed. Our method uniquely extends the generative capabilities and efficient volumetric representation of EG3D, leveraging a lightstage dataset to implicitly disentangle face reflectance and perform relighting under target HDRI environment maps. By utilizing a pre-trained geometry-aware encoder and a feature alignment module, we map input images into a relightable 3D space, enhancing them with a strong face geometry and reflectance prior.
Through extensive quantitative and qualitative evaluations, we show that our method outperforms the state-of-the-art methods in terms of efficacy, photorealism, and practical application. This includes producing 3D-consistent results of the full head, including hair, eyes, and expressions. Lite2Relight paves the way for large-scale adoption of photorealistic portrait editing in various domains, offering a robust, interactive solution to a previously constrained problem. Project page: https://vcai.mpi-inf.mpg.de/projects/Lite2Relight/
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
Blind Localization and Clustering of Anomalies in Textures
Authors:
Andrei-Timotei Ardelean,
Tim Weyrich
Abstract:
Anomaly detection and localization in images is a growing field in computer vision. In this area, a seemingly understudied problem is anomaly clustering, i.e., identifying and grouping different types of anomalies in a fully unsupervised manner. In this work, we propose a novel method for clustering anomalies in largely stationary images (textures) in a blind setting. That is, the input consists o…
▽ More
Anomaly detection and localization in images is a growing field in computer vision. In this area, a seemingly understudied problem is anomaly clustering, i.e., identifying and grouping different types of anomalies in a fully unsupervised manner. In this work, we propose a novel method for clustering anomalies in largely stationary images (textures) in a blind setting. That is, the input consists of normal and anomalous images without distinction and without labels. What contributes to the difficulty of the task is that anomalous regions are often small and may present only subtle changes in appearance, which can be easily overshadowed by the genuine variance in the texture. Moreover, each anomaly type may have a complex appearance distribution. We introduce a novel scheme for solving this task using a combination of blind anomaly localization and contrastive learning. By identifying the anomalous regions with high fidelity, we can restrict our focus to those regions of interest; then, contrastive learning is employed to increase the separability of different anomaly types and reduce the intra-class variation. Our experiments show that the proposed solution yields significantly better results compared to prior work, setting a new state of the art. Project page: https://reality.tf.fau.de/pub/ardelean2024blind.html.
△ Less
Submitted 18 April, 2024;
originally announced April 2024.
-
Automatic Spatial Calibration of Near-Field MIMO Radar With Respect to Optical Depth Sensors
Authors:
Vanessa Wirth,
Johanna Bräunig,
Danti Khouri,
Florian Gutsche,
Martin Vossiek,
Tim Weyrich,
Marc Stamminger
Abstract:
Despite an emerging interest in MIMO radar, the utilization of its complementary strengths in combination with optical depth sensors has so far been limited to far-field applications, due to the challenges that arise from mutual sensor calibration in the near field. In fact, most related approaches in the autonomous industry propose target-based calibration methods using corner reflectors that hav…
▽ More
Despite an emerging interest in MIMO radar, the utilization of its complementary strengths in combination with optical depth sensors has so far been limited to far-field applications, due to the challenges that arise from mutual sensor calibration in the near field. In fact, most related approaches in the autonomous industry propose target-based calibration methods using corner reflectors that have proven to be unsuitable for the near field. In contrast, we propose a novel, joint calibration approach for optical RGB-D sensors and MIMO radars that is designed to operate in the radar's near-field range, within decimeters from the sensors. Our pipeline consists of a bespoke calibration target, allowing for automatic target detection and localization, followed by the spatial calibration of the two sensor coordinate systems through target registration. We validate our approach using two different depth sensing technologies from the optical domain. The experiments show the efficiency and accuracy of our calibration for various target displacements, as well as its robustness of our localization in terms of signal ambiguities.
△ Less
Submitted 13 August, 2024; v1 submitted 16 March, 2024;
originally announced March 2024.
-
High-Fidelity Zero-Shot Texture Anomaly Localization Using Feature Correspondence Analysis
Authors:
Andrei-Timotei Ardelean,
Tim Weyrich
Abstract:
We propose a novel method for Zero-Shot Anomaly Localization on textures. The task refers to identifying abnormal regions in an otherwise homogeneous image. To obtain a high-fidelity localization, we leverage a bijective mapping derived from the 1-dimensional Wasserstein Distance. As opposed to using holistic distances between distributions, the proposed approach allows pinpointing the non-conform…
▽ More
We propose a novel method for Zero-Shot Anomaly Localization on textures. The task refers to identifying abnormal regions in an otherwise homogeneous image. To obtain a high-fidelity localization, we leverage a bijective mapping derived from the 1-dimensional Wasserstein Distance. As opposed to using holistic distances between distributions, the proposed approach allows pinpointing the non-conformity of a pixel in a local context with increased precision. By aggregating the contribution of the pixel to the errors of all nearby patches we obtain a reliable anomaly score estimate. We validate our solution on several datasets and obtain more than a 40% reduction in error over the previous state of the art on the MVTec AD dataset in a zero-shot setting. Also see https://reality.tf.fau.de/pub/ardelean2024highfidelity.html.
△ Less
Submitted 4 December, 2023; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Metameric Varifocal Holography
Authors:
David R. Walton,
Koray Kavaklı,
Rafael Kuffner dos Anjos,
David Swapp,
Tim Weyrich,
Hakan Urey,
Anthony Steed,
Tobias Ritschel,
Kaan Akşit
Abstract:
Computer-Generated Holography (CGH) offers the potential for genuine, high-quality three-dimensional visuals. However, fulfilling this potential remains a practical challenge due to computational complexity and visual quality issues. We propose a new CGH method that exploits gaze-contingency and perceptual graphics to accelerate the development of practical holographic display systems. Firstly, ou…
▽ More
Computer-Generated Holography (CGH) offers the potential for genuine, high-quality three-dimensional visuals. However, fulfilling this potential remains a practical challenge due to computational complexity and visual quality issues. We propose a new CGH method that exploits gaze-contingency and perceptual graphics to accelerate the development of practical holographic display systems. Firstly, our method infers the user's focal depth and generates images only at their focus plane without using any moving parts. Second, the images displayed are metamers; in the user's peripheral vision, they need only be statistically correct and blend with the fovea seamlessly. Unlike previous methods, our method prioritises and improves foveal visual quality without causing perceptually visible distortions at the periphery. To enable our method, we introduce a novel metameric loss function that robustly compares the statistics of two given images for a known gaze location. In parallel, we implement a model representing the relation between holograms and their image reconstructions. We couple our differentiable loss function and model to metameric varifocal holograms using a stochastic gradient descent solver. We evaluate our method with an actual proof-of-concept holographic display, and we show that our CGH method leads to practical and perceptually three-dimensional image reconstructions.
△ Less
Submitted 16 May, 2022; v1 submitted 5 October, 2021;
originally announced October 2021.
-
PhotoApp: Photorealistic Appearance Editing of Head Portraits
Authors:
Mallikarjun B R,
Ayush Tewari,
Abdallah Dib,
Tim Weyrich,
Bernd Bickel,
Hans-Peter Seidel,
Hanspeter Pfister,
Wojciech Matusik,
Louis Chevallier,
Mohamed Elgharib,
Christian Theobalt
Abstract:
Photorealistic editing of portraits is a challenging task as humans are very sensitive to inconsistencies in faces. We present an approach for high-quality intuitive editing of the camera viewpoint and scene illumination in a portrait image. This requires our method to capture and control the full reflectance field of the person in the image. Most editing approaches rely on supervised learning usi…
▽ More
Photorealistic editing of portraits is a challenging task as humans are very sensitive to inconsistencies in faces. We present an approach for high-quality intuitive editing of the camera viewpoint and scene illumination in a portrait image. This requires our method to capture and control the full reflectance field of the person in the image. Most editing approaches rely on supervised learning using training data captured with setups such as light and camera stages. Such datasets are expensive to acquire, not readily available and do not capture all the rich variations of in-the-wild portrait images. In addition, most supervised approaches only focus on relighting, and do not allow camera viewpoint editing. Thus, they only capture and control a subset of the reflectance field. Recently, portrait editing has been demonstrated by operating in the generative model space of StyleGAN. While such approaches do not require direct supervision, there is a significant loss of quality when compared to the supervised approaches. In this paper, we present a method which learns from limited supervised training data. The training images only include people in a fixed neutral expression with eyes closed, without much hair or background variations. Each person is captured under 150 one-light-at-a-time conditions and under 8 camera poses. Instead of training directly in the image space, we design a supervised problem which learns transformations in the latent space of StyleGAN. This combines the best of supervised learning and generative adversarial modeling. We show that the StyleGAN prior allows for generalisation to different expressions, hairstyles and backgrounds. This produces high-quality photorealistic results for in-the-wild images and significantly outperforms existing methods. Our approach can edit the illumination and pose simultaneously, and runs at interactive rates.
△ Less
Submitted 13 May, 2021; v1 submitted 13 March, 2021;
originally announced March 2021.
-
Neural BRDF Representation and Importance Sampling
Authors:
Alejandro Sztrajman,
Gilles Rainer,
Tobias Ritschel,
Tim Weyrich
Abstract:
Controlled capture of real-world material appearance yields tabulated sets of highly realistic reflectance data. In practice, however, its high memory footprint requires compressing into a representation that can be used efficiently in rendering while remaining faithful to the original. Previous works in appearance encoding often prioritised one of these requirements at the expense of the other, b…
▽ More
Controlled capture of real-world material appearance yields tabulated sets of highly realistic reflectance data. In practice, however, its high memory footprint requires compressing into a representation that can be used efficiently in rendering while remaining faithful to the original. Previous works in appearance encoding often prioritised one of these requirements at the expense of the other, by either applying high-fidelity array compression strategies not suited for efficient queries during rendering, or by fitting a compact analytic model that lacks expressiveness. We present a compact neural network-based representation of BRDF data that combines high-accuracy reconstruction with efficient practical rendering via built-in interpolation of reflectance. We encode BRDFs as lightweight networks, and propose a training scheme with adaptive angular sampling, critical for the accurate reconstruction of specular highlights. Additionally, we propose a novel approach to make our representation amenable to importance sampling: rather than inverting the trained networks, we learn to encode them in a more compact embedding that can be mapped to parameters of an analytic BRDF for which importance sampling is known. We evaluate encoding results on isotropic and anisotropic BRDFs from multiple real-world datasets, and importance sampling performance for isotropic BRDFs mapped to two different analytic models.
△ Less
Submitted 14 May, 2021; v1 submitted 11 February, 2021;
originally announced February 2021.
-
Monocular Reconstruction of Neural Face Reflectance Fields
Authors:
Mallikarjun B R.,
Ayush Tewari,
Tae-Hyun Oh,
Tim Weyrich,
Bernd Bickel,
Hans-Peter Seidel,
Hanspeter Pfister,
Wojciech Matusik,
Mohamed Elgharib,
Christian Theobalt
Abstract:
The reflectance field of a face describes the reflectance properties responsible for complex lighting effects including diffuse, specular, inter-reflection and self shadowing. Most existing methods for estimating the face reflectance from a monocular image assume faces to be diffuse with very few approaches adding a specular component. This still leaves out important perceptual aspects of reflecta…
▽ More
The reflectance field of a face describes the reflectance properties responsible for complex lighting effects including diffuse, specular, inter-reflection and self shadowing. Most existing methods for estimating the face reflectance from a monocular image assume faces to be diffuse with very few approaches adding a specular component. This still leaves out important perceptual aspects of reflectance as higher-order global illumination effects and self-shadowing are not modeled. We present a new neural representation for face reflectance where we can estimate all components of the reflectance responsible for the final appearance from a single monocular image. Instead of modeling each component of the reflectance separately using parametric models, our neural representation allows us to generate a basis set of faces in a geometric deformation-invariant space, parameterized by the input light direction, viewpoint and face geometry. We learn to reconstruct this reflectance field of a face just from a monocular image, which can be used to render the face from any viewpoint in any light condition. Our method is trained on a light-stage training dataset, which captures 300 people illuminated with 150 light conditions from 8 viewpoints. We show that our method outperforms existing monocular reflectance reconstruction methods, in terms of photorealism due to better capturing of physical premitives, such as sub-surface scattering, specularities, self-shadows and other higher-order effects.
△ Less
Submitted 24 August, 2020;
originally announced August 2020.
-
Efficient Spatially Adaptive Convolution and Correlation
Authors:
Thomas W. Mitchel,
Benedict Brown,
David Koller,
Tim Weyrich,
Szymon Rusinkiewicz,
Michael Kazhdan
Abstract:
Fast methods for convolution and correlation underlie a variety of applications in computer vision and graphics, including efficient filtering, analysis, and simulation. However, standard convolution and correlation are inherently limited to fixed filters: spatial adaptation is impossible without sacrificing efficient computation. In early work, Freeman and Adelson have shown how steerable filters…
▽ More
Fast methods for convolution and correlation underlie a variety of applications in computer vision and graphics, including efficient filtering, analysis, and simulation. However, standard convolution and correlation are inherently limited to fixed filters: spatial adaptation is impossible without sacrificing efficient computation. In early work, Freeman and Adelson have shown how steerable filters can address this limitation, providing a way for rotating the filter as it is passed over the signal. In this work, we provide a general, representation-theoretic, framework that allows for spatially varying linear transformations to be applied to the filter. This framework allows for efficient implementation of extended convolution and correlation for transformation groups such as rotation (in 2D and 3D) and scale, and provides a new interpretation for previous methods including steerable filters and the generalized Hough transform. We present applications to pattern matching, image feature description, vector field visualization, and adaptive image filtering.
△ Less
Submitted 28 July, 2020; v1 submitted 23 June, 2020;
originally announced June 2020.
-
Progressive Refinement Imaging
Authors:
Markus Kluge,
Tim Weyrich,
Andreas Kolb
Abstract:
This paper presents a novel technique for progressive online integration of uncalibrated image sequences with substantial geometric and/or photometric discrepancies into a single, geometrically and photometrically consistent image. Our approach can handle large sets of images, acquired from a nearly planar or infinitely distant scene at different resolutions in object domain and under variable loc…
▽ More
This paper presents a novel technique for progressive online integration of uncalibrated image sequences with substantial geometric and/or photometric discrepancies into a single, geometrically and photometrically consistent image. Our approach can handle large sets of images, acquired from a nearly planar or infinitely distant scene at different resolutions in object domain and under variable local or global illumination conditions. It allows for efficient user guidance as its progressive nature provides a valid and consistent reconstruction at any moment during the online refinement process. Our approach avoids global optimization techniques, as commonly used in the field of image refinement, and progressively incorporates new imagery into a dynamically extendable and memory-efficient Laplacian pyramid. Our image registration process includes a coarse homography and a local refinement stage using optical flow. Photometric consistency is achieved by retaining the photometric intensities given in a reference image, while it is being refined. Globally blurred imagery and local geometric inconsistencies due to, e.g. motion are detected and removed prior to image fusion. We demonstrate the quality and robustness of our approach using several image and video sequences, including handheld acquisition with mobile phones and zooming sequences with consumer cameras.
△ Less
Submitted 9 September, 2019; v1 submitted 9 July, 2019;
originally announced July 2019.
-
Image-based remapping of spatially-varying material appearance
Authors:
Alejandro Sztrajman,
Jaroslav Krivanek,
Alexander Wilkie,
Tim Weyrich
Abstract:
BRDF models are ubiquitous tools for the representation of material appearance. However, there is now an astonishingly large number of different models in practical use. Both a lack of BRDF model standardisation across implementations found in different renderers, as well as the often semantically different capabilities of various models, have grown to be a major hindrance to the interchange of pr…
▽ More
BRDF models are ubiquitous tools for the representation of material appearance. However, there is now an astonishingly large number of different models in practical use. Both a lack of BRDF model standardisation across implementations found in different renderers, as well as the often semantically different capabilities of various models, have grown to be a major hindrance to the interchange of production assets between different rendering systems. Current attempts to solve this problem rely on manually finding visual similarities between models, or mathematical ones between their functional shapes, which requires access to the shader implementation, usually unavailable in commercial renderers. We present a method for automatic translation of material appearance between different BRDF models, which uses an image-based metric for appearance comparison, and that delegates the interaction with the model to the renderer. We analyse the performance of the method, both with respect to robustness and visual differences of the fits for multiple combinations of BRDF models. While it is effective for individual BRDFs, the computational cost does not scale well for spatially-varying BRDFs. Therefore, we further present a parametric regression scheme that approximates the shape of the transformation function and generates a reduced representation which evaluates instantly and without further interaction with the renderer. We present respective visual comparisons of the remapped SVBRDF models for commonly used renderers and shading models, and show that our approach is able to extrapolate transformed BRDF parameters better than other complex regression schemes.
△ Less
Submitted 20 August, 2018;
originally announced August 2018.
-
Learning on the Edge: Explicit Boundary Handling in CNNs
Authors:
Carlo Innamorati,
Tobias Ritschel,
Tim Weyrich,
Niloy J. Mitra
Abstract:
Convolutional neural networks (CNNs) handle the case where filters extend beyond the image boundary using several heuristics, such as zero, repeat or mean padding. These schemes are applied in an ad-hoc fashion and, being weakly related to the image content and oblivious of the target task, result in low output quality at the boundary. In this paper, we propose a simple and effective improvement t…
▽ More
Convolutional neural networks (CNNs) handle the case where filters extend beyond the image boundary using several heuristics, such as zero, repeat or mean padding. These schemes are applied in an ad-hoc fashion and, being weakly related to the image content and oblivious of the target task, result in low output quality at the boundary. In this paper, we propose a simple and effective improvement that learns the boundary handling itself. At training-time, the network is provided with a separate set of explicit boundary filters. At testing-time, we use these filters which have learned to extrapolate features at the boundary in an optimal way for the specific task. Our extensive evaluation, over a wide range of architectural changes (variations of layers, feature channels, or both), shows how the explicit filters result in improved boundary handling. Consequently, we demonstrate an improvement of 5% to 20% across the board of typical CNN applications (colorization, de-Bayering, optical flow, and disparity estimation).
△ Less
Submitted 8 May, 2018;
originally announced May 2018.
-
Plausible Shading Decomposition For Layered Photo Retouching
Authors:
Carlo Innamorati,
Tobias Ritschel,
Tim Weyrich,
Niloy J. Mitra
Abstract:
Photographers routinely compose multiple manipulated photos of the same scene (layers) into a single image, which is better than any individual photo could be alone. Similarly, 3D artists set up rendering systems to produce layered images to contain only individual aspects of the light transport, which are composed into the final result in post-production. Regrettably, both approaches either take…
▽ More
Photographers routinely compose multiple manipulated photos of the same scene (layers) into a single image, which is better than any individual photo could be alone. Similarly, 3D artists set up rendering systems to produce layered images to contain only individual aspects of the light transport, which are composed into the final result in post-production. Regrettably, both approaches either take considerable time to capture, or remain limited to synthetic scenes. In this paper, we suggest a system to allow decomposing a single image into a plausible shading decomposition (PSD) that approximates effects such as shadow, diffuse illumination, albedo, and specular shading. This decomposition can then be manipulated in any off-the-shelf image manipulation software and recomposited back. We perform such a decomposition by learning a convolutional neural network trained using synthetic data. We demonstrate the effectiveness of our decomposition on synthetic (i.e., rendered) and real data (i.e., photographs), and use them for common photo manipulation, which are nearly impossible to perform otherwise from single images.
△ Less
Submitted 2 February, 2017; v1 submitted 23 January, 2017;
originally announced January 2017.