-
StructuReiser: A Structure-preserving Video Stylization Method
Authors:
Radim Spetlik,
David Futschik,
Daniel Sykora
Abstract:
We introduce StructuReiser, a novel video-to-video translation method that transforms input videos into stylized sequences using a set of user-provided keyframes. Unlike existing approaches, StructuReiser maintains strict adherence to the structural elements of the target video, preserving the original identity while seamlessly applying the desired stylistic transformations. This enables a level o…
▽ More
We introduce StructuReiser, a novel video-to-video translation method that transforms input videos into stylized sequences using a set of user-provided keyframes. Unlike existing approaches, StructuReiser maintains strict adherence to the structural elements of the target video, preserving the original identity while seamlessly applying the desired stylistic transformations. This enables a level of control and consistency that was previously unattainable with traditional text-driven or keyframe-based methods. Furthermore, StructuReiser supports real-time inference and custom keyframe editing, making it ideal for interactive applications and expanding the possibilities for creative expression and video manipulation.
△ Less
Submitted 7 October, 2024; v1 submitted 9 September, 2024;
originally announced September 2024.
-
Controllable Light Diffusion for Portraits
Authors:
David Futschik,
Kelvin Ritland,
James Vecore,
Sean Fanello,
Sergio Orts-Escolano,
Brian Curless,
Daniel Sýkora,
Rohit Pandey
Abstract:
We introduce light diffusion, a novel method to improve lighting in portraits, softening harsh shadows and specular highlights while preserving overall scene illumination. Inspired by professional photographers' diffusers and scrims, our method softens lighting given only a single portrait photo. Previous portrait relighting approaches focus on changing the entire lighting environment, removing sh…
▽ More
We introduce light diffusion, a novel method to improve lighting in portraits, softening harsh shadows and specular highlights while preserving overall scene illumination. Inspired by professional photographers' diffusers and scrims, our method softens lighting given only a single portrait photo. Previous portrait relighting approaches focus on changing the entire lighting environment, removing shadows (ignoring strong specular highlights), or removing shading entirely. In contrast, we propose a learning based method that allows us to control the amount of light diffusion and apply it on in-the-wild portraits. Additionally, we design a method to synthetically generate plausible external shadows with sub-surface scattering effects while conforming to the shape of the subject's face. Finally, we show how our approach can increase the robustness of higher level vision applications, such as albedo estimation, geometry estimation and semantic segmentation.
△ Less
Submitted 8 May, 2023;
originally announced May 2023.
-
Neural Neighbor Style Transfer
Authors:
Nicholas Kolkin,
Michal Kucera,
Sylvain Paris,
Daniel Sykora,
Eli Shechtman,
Greg Shakhnarovich
Abstract:
We propose Neural Neighbor Style Transfer (NNST), a pipeline that offers state-of-the-art quality, generalization, and competitive efficiency for artistic style transfer. Our approach is based on explicitly replacing neural features extracted from the content input (to be stylized) with those from a style exemplar, then synthesizing the final output based on these rearranged features. While the sp…
▽ More
We propose Neural Neighbor Style Transfer (NNST), a pipeline that offers state-of-the-art quality, generalization, and competitive efficiency for artistic style transfer. Our approach is based on explicitly replacing neural features extracted from the content input (to be stylized) with those from a style exemplar, then synthesizing the final output based on these rearranged features. While the spirit of our approach is similar to prior work, we show that our design decisions dramatically improve the final visual quality.
△ Less
Submitted 24 March, 2022;
originally announced March 2022.
-
STALP: Style Transfer with Auxiliary Limited Pairing
Authors:
David Futschik,
Michal Kučera,
Michal Lukáč,
Zhaowen Wang,
Eli Shechtman,
Daniel Sýkora
Abstract:
We present an approach to example-based stylization of images that uses a single pair of a source image and its stylized counterpart. We demonstrate how to train an image translation network that can perform real-time semantically meaningful style transfer to a set of target images with similar content as the source image. A key added value of our approach is that it considers also consistency of…
▽ More
We present an approach to example-based stylization of images that uses a single pair of a source image and its stylized counterpart. We demonstrate how to train an image translation network that can perform real-time semantically meaningful style transfer to a set of target images with similar content as the source image. A key added value of our approach is that it considers also consistency of target images during training. Although those have no stylized counterparts, we constrain the translation to keep the statistics of neural responses compatible with those extracted from the stylized source. In contrast to concurrent techniques that use a similar input, our approach better preserves important visual characteristics of the source style and can deliver temporally stable results without the need to explicitly handle temporal consistency. We demonstrate its practical utility on various applications including video stylization, style transfer to panoramas, faces, and 3D models.
△ Less
Submitted 20 October, 2021;
originally announced October 2021.
-
Real Image Inversion via Segments
Authors:
David Futschik,
Michal Lukáč,
Eli Shechtman,
Daniel Sýkora
Abstract:
In this short report, we present a simple, yet effective approach to editing real images via generative adversarial networks (GAN). Unlike previous techniques, that treat all editing tasks as an operation that affects pixel values in the entire image in our approach we cut up the image into a set of smaller segments. For those segments corresponding latent codes of a generative network can be esti…
▽ More
In this short report, we present a simple, yet effective approach to editing real images via generative adversarial networks (GAN). Unlike previous techniques, that treat all editing tasks as an operation that affects pixel values in the entire image in our approach we cut up the image into a set of smaller segments. For those segments corresponding latent codes of a generative network can be estimated with greater accuracy due to the lower number of constraints. When codes are altered by the user the content in the image is manipulated locally while the rest of it remains unaffected. Thanks to this property the final edited image better retains the original structures and thus helps to preserve natural look.
△ Less
Submitted 12 October, 2021;
originally announced October 2021.
-
Interactive Video Stylization Using Few-Shot Patch-Based Training
Authors:
Ondřej Texler,
David Futschik,
Michal Kučera,
Ondřej Jamriška,
Šárka Sochorová,
Menglei Chai,
Sergey Tulyakov,
Daniel Sýkora
Abstract:
In this paper, we present a learning-based method to the keyframe-based video stylization that allows an artist to propagate the style from a few selected keyframes to the rest of the sequence. Its key advantage is that the resulting stylization is semantically meaningful, i.e., specific parts of moving objects are stylized according to the artist's intention. In contrast to previous style transfe…
▽ More
In this paper, we present a learning-based method to the keyframe-based video stylization that allows an artist to propagate the style from a few selected keyframes to the rest of the sequence. Its key advantage is that the resulting stylization is semantically meaningful, i.e., specific parts of moving objects are stylized according to the artist's intention. In contrast to previous style transfer techniques, our approach does not require any lengthy pre-training process nor a large training dataset. We demonstrate how to train an appearance translation network from scratch using only a few stylized exemplars while implicitly preserving temporal consistency. This leads to a video stylization framework that supports real-time inference, parallel processing, and random access to an arbitrary output frame. It can also merge the content from multiple keyframes without the need to perform an explicit blending operation. We demonstrate its practical utility in various interactive scenarios, where the user paints over a selected keyframe and sees her style transferred to an existing recorded sequence or a live video stream.
△ Less
Submitted 29 April, 2020;
originally announced April 2020.
-
StyleBlit: Fast Example-Based Stylization with Local Guidance
Authors:
Daniel Sýkora,
Ondřej Jamriška,
Jingwan Lu,
Eli Shechtman
Abstract:
We present StyleBlit---an efficient example-based style transfer algorithm that can deliver high-quality stylized renderings in real-time on a single-core CPU. Our technique is especially suitable for style transfer applications that use local guidance - descriptive guiding channels containing large spatial variations. Local guidance encourages transfer of content from the source exemplar to the t…
▽ More
We present StyleBlit---an efficient example-based style transfer algorithm that can deliver high-quality stylized renderings in real-time on a single-core CPU. Our technique is especially suitable for style transfer applications that use local guidance - descriptive guiding channels containing large spatial variations. Local guidance encourages transfer of content from the source exemplar to the target image in a semantically meaningful way. Typical local guidance includes, e.g., normal values, texture coordinates or a displacement field. Contrary to previous style transfer techniques, our approach does not involve any computationally expensive optimization. We demonstrate that when local guidance is used, optimization-based techniques converge to solutions that can be well approximated by simple pixel-level operations. Inspired by this observation, we designed an algorithm that produces results visually similar to, if not better than, the state-of-the-art, and is several orders of magnitude faster. Our approach is suitable for scenarios with low computational budget such as games and mobile applications.
△ Less
Submitted 9 July, 2018;
originally announced July 2018.
-
Building Anatomically Realistic Jaw Kinematics Model from Data
Authors:
Wenwu Yang,
Nathan Marshak,
Daniel Sýkora,
Srikumar Ramalingam,
Ladislav Kavan
Abstract:
This paper considers a different aspect of anatomical face modeling: kinematic modeling of the jaw, i.e., the Temporo-Mandibular Joint (TMJ). Previous work often relies on simple models of jaw kinematics, even though the actual physiological behavior of the TMJ is quite complex, allowing not only for mouth opening, but also for some amount of sideways (lateral) and front-to-back (protrusion) motio…
▽ More
This paper considers a different aspect of anatomical face modeling: kinematic modeling of the jaw, i.e., the Temporo-Mandibular Joint (TMJ). Previous work often relies on simple models of jaw kinematics, even though the actual physiological behavior of the TMJ is quite complex, allowing not only for mouth opening, but also for some amount of sideways (lateral) and front-to-back (protrusion) motions. Fortuitously, the TMJ is the only joint whose kinematics can be accurately measured with optical methods, because the bones of the lower and upper jaw are rigidly connected to the lower and upper teeth. We construct a person-specific jaw kinematic model by asking an actor to exercise the entire range of motion of the jaw while keeping the lips open so that the teeth are at least partially visible. This performance is recorded with three calibrated cameras. We obtain highly accurate 3D models of the teeth with a standard dental scanner and use these models to reconstruct the rigid body trajectories of the teeth from the videos (markerless tracking). The relative rigid transformations samples between the lower and upper teeth are mapped to the Lie algebra of rigid body motions in order to linearize the rotational motion. Our main contribution is to fit these samples with a three-dimensional nonlinear model parameterizing the entire range of motion of the TMJ. We show that standard Principal Component Analysis (PCA) fails to capture the nonlinear trajectories of the moving mandible. However, we found these nonlinearities can be captured with a special modification of autoencoder neural networks known as Nonlinear PCA. By mapping back to the Lie group of rigid transformations, we obtain parameterization of the jaw kinematics which provides an intuitive interface allowing the animators to explore realistic jaw motions in a user-friendly way.
△ Less
Submitted 15 May, 2018;
originally announced May 2018.