Showing 1–39 of 39 results for author: Lensch, H P A

  1. arXiv:2505.06793  [pdf, ps, other]

    eess.IV cs.CV

    HistDiST: Histopathological Diffusion-based Stain Transfer

    Authors: Erik Großkopf, Valay Bundele, Mehran Hosseinzadeh, Hendrik P. A. Lensch

    Abstract: Hematoxylin and Eosin (H&E) staining is the cornerstone of histopathology but lacks molecular specificity. While Immunohistochemistry (IHC) provides molecular insights, it is costly and complex, motivating H&E-to-IHC translation as a cost-effective alternative. Existing translation methods are mainly GAN-based, often struggling with training instability and limited structural fidelity, while diffu…

    Submitted 10 May, 2025; originally announced May 2025.

    Comments: 8 pages, 4 figures

  2. arXiv:2411.07719  [pdf, other]

    cs.RO cs.CV cs.LG

    EMPERROR: A Flexible Generative Perception Error Model for Probing Self-Driving Planners

    Authors: Niklas Hanselmann, Simon Doll, Marius Cordts, Hendrik P. A. Lensch, Andreas Geiger

    Abstract: To handle the complexities of real-world traffic, learning planners for self-driving from data is a promising direction. While recent approaches have shown great progress, they typically assume a setting in which the ground-truth world state is available as input. However, when deployed, planning needs to be robust to the long-tail of errors incurred by a noisy perception system, which is often ne…

    Submitted 13 May, 2025; v1 submitted 12 November, 2024; originally announced November 2024.

    Comments: Project page: https://lasnik.github.io/emperror/

    Journal ref: IEEE Robotics and Automation Letters, vol. 10, no. 6, pp. 5807-5814, June 2025

  3. arXiv:2411.05419  [pdf, other]

    cs.CV

    POC-SLT: Partial Object Completion with SDF Latent Transformers

    Authors: Faezeh Zakeri, Raphael Braun, Lukas Ruppert, Hendrik P. A. Lensch

    Abstract: 3D geometric shape completion hinges on representation learning and a deep understanding of geometric data. Without profound insights into the three-dimensional nature of the data, this task remains unattainable. Our work addresses this challenge of 3D shape completion given partial observations by proposing a transformer operating on the latent space representing Signed Distance Fields (SDFs). In…

    Submitted 8 November, 2024; originally announced November 2024.

    ACM Class: I.4.8; I.4.5

  4. arXiv:2410.04259  [pdf, other]

    cs.CL

    Is deeper always better? Replacing linear mappings with deep learning networks in the Discriminative Lexicon Model

    Authors: Maria Heitmeier, Valeria Schmidt, Hendrik P. A. Lensch, R. Harald Baayen

    Abstract: Recently, deep learning models have increasingly been used in cognitive modelling of language. This study asks whether deep learning can help us to better understand the learning problem that needs to be solved by speakers, above and beyond linear methods. We utilise the Discriminative Lexicon Model (DLM, Baayen et al., 2019), which models comprehension and production with mappings between numeric…

    Submitted 5 October, 2024; originally announced October 2024.

    Comments: 17 pages, 6 figures

  5. arXiv:2409.20140  [pdf, other]

    cs.CV cs.GR

    RISE-SDF: a Relightable Information-Shared Signed Distance Field for Glossy Object Inverse Rendering

    Authors: Deheng Zhang, Jingyu Wang, Shaofei Wang, Marko Mihajlovic, Sergey Prokudin, Hendrik P. A. Lensch, Siyu Tang

    Abstract: In this paper, we propose a novel end-to-end relightable neural inverse rendering system that achieves high-quality reconstruction of geometry and material properties, thus enabling high-quality relighting. The cornerstone of our method is a two-stage approach for learning a better factorization of scene parameters. In the first stage, we develop a reflection-aware radiance field using a neural si…

    Submitted 10 October, 2024; v1 submitted 30 September, 2024; originally announced September 2024.

    Comments: https://dehezhang2.github.io/RISE-SDF/

  6. arXiv:2408.12282  [pdf, other]

    cs.CV cs.GR

    Subsurface Scattering for 3D Gaussian Splatting

    Authors: Jan-Niklas Dihlmann, Arjun Majumdar, Andreas Engelhardt, Raphael Braun, Hendrik P. A. Lensch

    Abstract: 3D reconstruction and relighting of objects made from scattering materials present a significant challenge due to the complex light transport beneath the surface. 3D Gaussian Splatting introduced high-quality novel view synthesis at real-time speeds. While 3D Gaussians efficiently approximate an object's surface, they fail to capture the volumetric properties of subsurface scattering. We propose a…

    Submitted 31 October, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

    Comments: Project page: https://sss.jdihlmann.com/

  7. arXiv:2406.06264  [pdf, other]

    cs.CV

    DualAD: Disentangling the Dynamic and Static World for End-to-End Driving

    Authors: Simon Doll, Niklas Hanselmann, Lukas Schneider, Richard Schulz, Marius Cordts, Markus Enzweiler, Hendrik P. A. Lensch

    Abstract: State-of-the-art approaches for autonomous driving integrate multiple sub-tasks of the overall driving task into a single pipeline that can be trained in an end-to-end fashion by passing latent representations between the different modules. In contrast to previous approaches that rely on a unified grid to represent the belief state of the scene, we propose dedicated representations to disentangle…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted at CVPR 2024; Copyright 2024 IEEE; Project Website: https://simondoll.github.io/publications/dualad

  8. arXiv:2404.15436  [pdf, other]

    cs.CV

    Iterative Cluster Harvesting for Wafer Map Defect Patterns

    Authors: Alina Pleli, Simon Baeuerle, Michel Janus, Jonas Barth, Ralf Mikut, Hendrik P. A. Lensch

    Abstract: Unsupervised clustering of wafer map defect patterns is challenging because the appearance of certain defect patterns varies significantly. This includes changing shape, location, density, and rotation of the defect area on the wafer. We present a harvesting approach, which can cluster even challenging defect patterns of wafer maps well. Our approach makes use of a well-known, three-step procedure…

    Submitted 23 April, 2024; originally announced April 2024.

  9. arXiv:2402.19186  [pdf, other]

    cs.CV cs.LG

    Disentangling representations of retinal images with generative models

    Authors: Sarah Müller, Lisa M. Koch, Hendrik P. A. Lensch, Philipp Berens

    Abstract: Retinal fundus images play a crucial role in the early detection of eye diseases. However, the impact of technical factors on these images can pose challenges for reliable AI applications in ophthalmology. For example, large fundus cohorts are often confounded by factors like camera type, bearing the risk of learning shortcuts rather than the causal relationships behind the image generation proces…

    Submitted 20 September, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  10. arXiv:2401.10171  [pdf, other]

    cs.CV cs.GR

    SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

    Authors: Andreas Engelhardt, Amit Raj, Mark Boss, Yunzhi Zhang, Abhishek Kar, Yuanzhen Li, Deqing Sun, Ricardo Martin Brualla, Jonathan T. Barron, Hendrik P. A. Lensch, Varun Jampani

    Abstract: We present SHINOBI, an end-to-end framework for the reconstruction of shape, material, and illumination from object images captured with varying lighting, pose, and background. Inverse rendering of an object based on unconstrained image collections is a long-standing challenge in computer vision and graphics and requires a joint optimization over shape, radiance, and pose. We show that an implicit…

    Submitted 29 March, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: Accepted by IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024). Updated supplementary material and acknowledgements

  11. arXiv:2311.05043  [pdf, other]

    cs.CV cs.AI cs.CL

    Zero-shot Translation of Attention Patterns in VQA Models to Natural Language

    Authors: Leonard Salewski, A. Sophia Koepke, Hendrik P. A. Lensch, Zeynep Akata

    Abstract: Converting a model's internals to text can yield human-understandable insights about the model. Inspired by the recent success of training-free approaches for image captioning, we propose ZS-A2T, a zero-shot framework that translates the transformer attention of a given model into natural language without requiring any training. We consider this in the context of Visual Question Answering (VQA). Z…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Published in GCPR 2023

  12. arXiv:2310.10543  [pdf, other]

    cs.CL cs.CV

    ViPE: Visualise Pretty-much Everything

    Authors: Hassan Shahmohammadi, Adhiraj Ghosh, Hendrik P. A. Lensch

    Abstract: Figurative and non-literal expressions are profoundly integrated in human communication. Visualising such expressions allows us to convey our creative thoughts and evoke nuanced emotions. Recent text-to-image models like Stable Diffusion, on the other hand, struggle to depict non-literal expressions. Recent works primarily deal with this issue by compiling humanly annotated datasets on a small sca…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: To be presented in EMNLP2023 Main Conference

  13. arXiv:2307.07482  [pdf, other]

    cs.CV

    Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based Tumor Classification

    Authors: Simon Holdenried-Krafft, Peter Somers, Ivonne A. Montes-Majarro, Diana Silimon, Cristina Tarín, Falko Fend, Hendrik P. A. Lensch

    Abstract: Whole slide image (WSI) assessment is a challenging and crucial step in cancer diagnosis and treatment planning. WSIs require high magnifications to facilitate sub-cellular analysis. Precise annotations for patch- or even pixel-level classifications in the context of gigapixel WSIs are tedious to acquire and require domain experts. Coarse-grained labels, on the other hand, are easily accessible, w…

    Submitted 17 November, 2023; v1 submitted 14 July, 2023; originally announced July 2023.

  14. arXiv:2306.17602  [pdf, other]

    cs.CV cs.AI cs.RO

    S.T.A.R.-Track: Latent Motion Models for End-to-End 3D Object Tracking with Adaptive Spatio-Temporal Appearance Representations

    Authors: Simon Doll, Niklas Hanselmann, Lukas Schneider, Richard Schulz, Markus Enzweiler, Hendrik P. A. Lensch

    Abstract: Following the tracking-by-attention paradigm, this paper introduces an object-centric, transformer-based framework for tracking in 3D. Traditional model-based tracking approaches incorporate the geometric effect of object- and ego motion between frames with a geometric motion model. Inspired by this, we propose S.T.A.R.-Track, which uses a novel latent motion model (LMM) to additionally adjust obj…

    Submitted 13 October, 2024; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

    Journal ref: IEEE Robotics and Automation Letters, Vol. 9, No. 2 (2024), PP 1326-1333

  15. arXiv:2209.03714  [pdf, other]

    cs.CL

    Visual Grounding of Inter-lingual Word-Embeddings

    Authors: Wafaa Mohammed, Hassan Shahmohammadi, Hendrik P. A. Lensch, R. Harald Baayen

    Abstract: Visual grounding of language aims at enriching textual representations of language with multiple sources of visual knowledge such as images and videos. Although visual grounding is an area of intense research, inter-lingual aspects of visual grounding have not received much attention. The present study investigates the inter-lingual visual grounding of word embeddings. We propose an implicit align…

    Submitted 21 November, 2022; v1 submitted 8 September, 2022; originally announced September 2022.

    Comments: Added more results; paper accepted to appear at UM-IoS workshop, EMNLP 2022

  16. arXiv:2208.09266  [pdf, other]

    cs.CV

    Diverse Video Captioning by Adaptive Spatio-temporal Attention

    Authors: Zohreh Ghaderi, Leonard Salewski, Hendrik P. A. Lensch

    Abstract: To generate proper captions for videos, the inference needs to identify relevant concepts and pay attention to the spatial relationships between them as well as to the temporal development in the clip. Our end-to-end encoder-decoder video captioning framework incorporates two transformer-based architectures, an adapted transformer for a single joint spatio-temporal video analysis as well as a self…

    Submitted 19 August, 2022; originally announced August 2022.

  17. arXiv:2206.15381  [pdf, other]

    cs.CL

    How direct is the link between words and images?

    Authors: Hassan Shahmohammadi, Maria Heitmeier, Elnaz Shafaei-Bajestan, Hendrik P. A. Lensch, Harald Baayen

    Abstract: Current word embedding models, despite their success, still suffer from their lack of grounding in the real world. In this line of research, Gunther et al. 2022 proposed a behavioral experiment to investigate the relationship between words and images. In their setup, participants were presented with a target noun and a pair of images, one chosen by their model and another chosen randomly. Participa…

    Submitted 31 October, 2023; v1 submitted 30 June, 2022; originally announced June 2022.

    Comments: Accepted in the Mental Lexicon Journal: https://benjamins.com/catalog/ml

  18. arXiv:2206.08823  [pdf, other]

    cs.CL

    Language with Vision: a Study on Grounded Word and Sentence Embeddings

    Authors: Hassan Shahmohammadi, Maria Heitmeier, Elnaz Shafaei-Bajestan, Hendrik P. A. Lensch, Harald Baayen

    Abstract: Grounding language in vision is an active field of research seeking to construct cognitively plausible word and sentence representations by incorporating perceptual knowledge from vision into text-based representations. Despite many attempts at language grounding, achieving an optimal equilibrium between textual representations of the language and our embodied experiences remains an open field. So…

    Submitted 31 October, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

  19. arXiv:2205.15768  [pdf, other]

    cs.CV cs.GR cs.LG

    SAMURAI: Shape And Material from Unconstrained Real-world Arbitrary Image collections

    Authors: Mark Boss, Andreas Engelhardt, Abhishek Kar, Yuanzhen Li, Deqing Sun, Jonathan T. Barron, Hendrik P. A. Lensch, Varun Jampani

    Abstract: Inverse rendering of an object under entirely unknown capture conditions is a fundamental challenge in computer vision and graphics. Neural approaches such as NeRF have achieved photorealistic results on novel view synthesis, but they require known camera poses. Solving this problem with unknown camera poses is highly challenging as it requires joint optimization over shape, radiance, and pose. Th…

    Submitted 31 May, 2022; originally announced May 2022.

  20. CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations

    Authors: Leonard Salewski, A. Sophia Koepke, Hendrik P. A. Lensch, Zeynep Akata

    Abstract: Providing explanations in the context of Visual Question Answering (VQA) presents a fundamental problem in machine learning. To obtain detailed insights into the process of generating natural language explanations for VQA, we introduce the large-scale CLEVR-X dataset that extends the CLEVR dataset with natural language explanations. For each image-question pair in the CLEVR dataset, CLEVR-X contai…

    Submitted 5 April, 2022; originally announced April 2022.

  21. arXiv:2110.14373  [pdf, other]

    cs.CV cs.GR cs.LG

    Neural-PIL: Neural Pre-Integrated Lighting for Reflectance Decomposition

    Authors: Mark Boss, Varun Jampani, Raphael Braun, Ce Liu, Jonathan T. Barron, Hendrik P. A. Lensch

    Abstract: Decomposing a scene into its shape, reflectance and illumination is a fundamental problem in computer vision and graphics. Neural approaches such as NeRF have achieved remarkable success in view synthesis, but do not explicitly perform decomposition and instead operate exclusively on radiance (the product of reflectance and illumination). Extensions to NeRF, such as NeRD, can perform decomposition…

    Submitted 27 October, 2021; originally announced October 2021.

    Comments: Project page: https://markboss.me/publication/2021-neural-pil/ Video: https://youtu.be/AsdAR5u3vQ8 - Accepted at NeurIPS 2021

  22. arXiv:2104.07500  [pdf, other]

    cs.CL

    Learning Zero-Shot Multifaceted Visually Grounded Word Embeddings via Multi-Task Training

    Authors: Hassan Shahmohammadi, Hendrik P. A. Lensch, R. Harald Baayen

    Abstract: Language grounding aims at linking the symbolic representation of language (e.g., words) into the rich perceptual knowledge of the outside world. The general approach is to embed both textual and visual information into a common space (the grounded space) confined by an explicit relationship between both modalities. We argue that this approach sacrifices the abstract knowledge obtained from linguis…

    Submitted 13 September, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: To be published in the 25th Conference on Computational Natural Language Learning (CoNLL 2021)

  23. arXiv:2012.03918  [pdf, other]

    cs.CV cs.GR cs.LG

    NeRD: Neural Reflectance Decomposition from Image Collections

    Authors: Mark Boss, Raphael Braun, Varun Jampani, Jonathan T. Barron, Ce Liu, Hendrik P. A. Lensch

    Abstract: Decomposing a scene into its shape, reflectance, and illumination is a challenging but important problem in computer vision and graphics. This problem is inherently more challenging when the illumination is not a single light source under laboratory conditions but is instead an unconstrained environmental illumination. Though recent work has shown that implicit representations can be used to model…

    Submitted 26 August, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

    Comments: Accepted at ICCV 2021

  24. arXiv:2009.09823  [pdf, other]

    cs.LG stat.ML

    Latent State Inference in a Spatiotemporal Generative Model

    Authors: Matthias Karlbauer, Tobias Menge, Sebastian Otte, Hendrik P. A. Lensch, Thomas Scholten, Volker Wulfmeyer, Martin V. Butz

    Abstract: Knowledge about the hidden factors that determine particular system dynamics is crucial for both explaining them and pursuing goal-directed interventions. Inferring these factors from time series data without supervision remains an open challenge. Here, we focus on spatiotemporal processes, including wave propagation and weather dynamics, for which we assume that universal causes (e.g. physics) ap…

    Submitted 15 August, 2021; v1 submitted 21 September, 2020; originally announced September 2020.

    Comments: As submitted to and accepted by the 30th International Conference on Artificial Neural Networks (ICANN)

  25. arXiv:2009.09187  [pdf, other]

    cs.LG stat.ML

    Inferring, Predicting, and Denoising Causal Wave Dynamics

    Authors: Matthias Karlbauer, Sebastian Otte, Hendrik P. A. Lensch, Thomas Scholten, Volker Wulfmeyer, Martin V. Butz

    Abstract: The novel DISTributed Artificial neural Network Architecture (DISTANA) is a generative, recurrent graph convolution neural network. It implements a grid or mesh of locally parameterizable laterally connected network modules. DISTANA is specifically designed to identify the causality behind spatially distributed, non-linear dynamical processes. We show that DISTANA is very well-suited to denoise da…

    Submitted 19 September, 2020; originally announced September 2020.

    Comments: As accepted by the 29th International Conference on Artificial Neural Networks (ICANN20)

  26. Two-shot Spatially-varying BRDF and Shape Estimation

    Authors: Mark Boss, Varun Jampani, Kihwan Kim, Hendrik P. A. Lensch, Jan Kautz

    Abstract: Capturing the shape and spatially-varying appearance (SVBRDF) of an object from images is a challenging task that has applications in both computer vision and graphics. Traditional optimization-based approaches often need a large number of images taken from multiple views in a controlled environment. Newer deep learning-based approaches require only a few input images, but the reconstruction quali…

    Submitted 1 April, 2020; originally announced April 2020.

  27. arXiv:1912.11141  [pdf, other]

    cs.LG cs.NE

    A Distributed Neural Network Architecture for Robust Non-Linear Spatio-Temporal Prediction

    Authors: Matthias Karlbauer, Sebastian Otte, Hendrik P. A. Lensch, Thomas Scholten, Volker Wulfmeyer, Martin V. Butz

    Abstract: We introduce a distributed spatio-temporal artificial neural network architecture (DISTANA). It encodes mesh nodes using recurrent, neural prediction kernels (PKs), while neural transition kernels (TKs) transfer information between neighboring PKs, together modeling and predicting spatio-temporal time series dynamics. As a consequence, DISTANA assumes that generally applicable causes, which may be…

    Submitted 23 December, 2019; originally announced December 2019.

    Comments: 8 pages, 4 figures, video on https://www.youtube.com/watch?v=4VHhHYeWTzo

  28. arXiv:1912.01059  [pdf, other]

    cs.CV cs.DB cs.DS cs.IR

    GGNN: Graph-based GPU Nearest Neighbor Search

    Authors: Fabian Groh, Lukas Ruppert, Patrick Wieschollek, Hendrik P. A. Lensch

    Abstract: Approximate nearest neighbor (ANN) search in high dimensions is an integral part of several computer vision systems and gains importance in deep learning with explicit memory representations. Since PQT, FAISS, and SONG started to leverage the massive parallelism offered by GPUs, GPU-based implementations are a crucial resource for today's state-of-the-art ANN methods. While most of these methods a…

    Submitted 7 April, 2022; v1 submitted 2 December, 2019; originally announced December 2019.

  29. arXiv:1910.05148  [pdf, other]

    cs.GR cs.CV cs.LG eess.IV

    Single Image BRDF Parameter Estimation with a Conditional Adversarial Network

    Authors: Mark Boss, Hendrik P. A. Lensch

    Abstract: Creating plausible surfaces is an essential component in achieving a high degree of realism in rendering. To relieve artists, who create these surfaces in a time-consuming, manual process, automated retrieval of the spatially-varying Bidirectional Reflectance Distribution Function (SVBRDF) from a single mobile phone image is desirable. By leveraging a deep neural network, this casual capturing met…

    Submitted 11 October, 2019; originally announced October 2019.

  30. arXiv:1803.07289  [pdf, other]

    cs.CV

    Flex-Convolution (Million-Scale Point-Cloud Learning Beyond Grid-Worlds)

    Authors: Fabian Groh, Patrick Wieschollek, Hendrik P. A. Lensch

    Abstract: Traditional convolution layers are specifically designed to exploit the natural data representation of images -- a fixed and regular grid. However, unstructured data like 3D point clouds containing irregular neighborhoods constantly breaks the grid-based data assumption. Therefore applying best-practices and design choices from 2D-image learning methods towards processing point clouds are not read…

    Submitted 15 April, 2020; v1 submitted 20 March, 2018; originally announced March 2018.

    Comments: accepted at ACCV 2018

  31. arXiv:1708.04208  [pdf, other]

    cs.CV

    Learning Blind Motion Deblurring

    Authors: Patrick Wieschollek, Michael Hirsch, Bernhard Schölkopf, Hendrik P. A. Lensch

    Abstract: As handheld video cameras are now commonplace and available in every smartphone, images and videos can be recorded almost everywhere at any time. However, taking a quick shot frequently yields a blurry result due to unwanted camera shake during recording or moving objects in the scene. Removing these artifacts from the blurry recordings is a highly ill-posed problem as neither the sharp image nor t…

    Submitted 14 August, 2017; originally announced August 2017.

    Comments: International Conference on Computer Vision (ICCV) (2017)

  32. Efficient Large-scale Approximate Nearest Neighbor Search on the GPU

    Authors: Patrick Wieschollek, Oliver Wang, Alexander Sorkine-Hornung, Hendrik P. A. Lensch

    Abstract: We present a new approach for efficient approximate nearest neighbor (ANN) search in high dimensional spaces, extending the idea of Product Quantization. We propose a two-level product and vector quantization tree that reduces the number of vector comparisons required during tree traversal. Our approach also includes a novel highly parallelizable re-ranking method for candidate vectors by efficien…

    Submitted 20 February, 2017; originally announced February 2017.

    Journal ref: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2027 - 2035 (2016)

  33. arXiv:1702.02549  [pdf, other]

    cs.CV

    Backpropagation Training for Fisher Vectors within Neural Networks

    Authors: Patrick Wieschollek, Fabian Groh, Hendrik P. A. Lensch

    Abstract: Fisher-Vectors (FV) encode higher-order statistics of a set of multiple local descriptors like SIFT features. They already show good performance in combination with shallow learning architectures on visual recognition tasks. Current methods using FV as a feature descriptor in deep architectures assume that all original input features are static. We propose a framework to jointly learn the represe…

    Submitted 8 February, 2017; originally announced February 2017.

  34. arXiv:1611.05203  [pdf, other]

    cs.CV

    Will People Like Your Image? Learning the Aesthetic Space

    Authors: Katharina Schwarz, Patrick Wieschollek, Hendrik P. A. Lensch

    Abstract: Rating how aesthetically pleasing an image appears is a highly complex matter and depends on a large number of different visual factors. Previous work has tackled the aesthetic rating problem by ranking on a 1-dimensional rating scale, e.g., incorporating handcrafted attributes. In this paper, we propose a rather general approach to automatically map aesthetic pleasingness with all its complexity…

    Submitted 4 December, 2017; v1 submitted 16 November, 2016; originally announced November 2016.

  35. arXiv:1610.05985  [pdf, other]

    cs.CV

    Learning Robust Video Synchronization without Annotations

    Authors: Patrick Wieschollek, Ido Freeman, Hendrik P. A. Lensch

    Abstract: Aligning video sequences is a fundamental yet still unsolved component for a broad range of applications in computer graphics and vision. Most classical image processing methods cannot be directly applied to related video problems due to the high amount of underlying data and their limit to small changes in appearance. We present a scalable and robust method for computing a non-linear temporal vid…

    Submitted 15 September, 2017; v1 submitted 19 October, 2016; originally announced October 2016.

    Comments: International Conference On Machine Learning And Applications (ICMLA 2017)

  36. arXiv:1609.06188  [pdf, other]

    cs.CV

    Transfer Learning for Material Classification using Convolutional Networks

    Authors: Patrick Wieschollek, Hendrik P. A. Lensch

    Abstract: Material classification in natural settings is a challenge due to complex interplay of geometry, reflectance properties, and illumination. Previous work on material classification relies strongly on hand-engineered features of visual samples. In this work we use a Convolutional Neural Network (convnet) that learns descriptive features for the specific task of material recognition. Specifically, tr…

    Submitted 20 September, 2016; originally announced September 2016.

  37. arXiv:1607.04433  [pdf, other]

    cs.CV

    End-to-End Learning for Image Burst Deblurring

    Authors: Patrick Wieschollek, Bernhard Schölkopf, Hendrik P. A. Lensch, Michael Hirsch

    Abstract: We present a neural network model approach for multi-frame blind deconvolution. The discriminative approach adopts and combines two recent techniques for image deblurring into a single neural network architecture. Our proposed hybrid-architecture combines the explicit prediction of a deconvolution filter and non-trivial averaging of Fourier coefficients in the frequency domain. In order to make fu…

    Submitted 6 September, 2016; v1 submitted 15 July, 2016; originally announced July 2016.

  38. arXiv:1605.09533  [pdf, other]

    cs.CV cs.LG cs.RO

    Robust Deep-Learning-Based Road-Prediction for Augmented Reality Navigation Systems

    Authors: Matthias Limmer, Julian Forster, Dennis Baudach, Florian Schüle, Roland Schweiger, Hendrik P. A. Lensch

    Abstract: This paper proposes an approach that predicts the road course from camera sensors leveraging deep learning techniques. Road pixels are identified by training a multi-scale convolutional neural network on a large number of full-scene-labeled night-time road images including adverse weather conditions. A framework is presented that applies the proposed approach to longer distance road course estimat…

    Submitted 31 May, 2016; originally announced May 2016.

    Comments: 8 pages, 12 figures, submitted to ITSC 2016

  39. arXiv:1604.02245  [pdf, other]

    cs.CV cs.GR

    Infrared Colorization Using Deep Convolutional Neural Networks

    Authors: Matthias Limmer, Hendrik P. A. Lensch

    Abstract: This paper proposes a method for transferring the RGB color spectrum to near-infrared (NIR) images using deep multi-scale convolutional neural networks. A direct and integrated transfer between NIR and RGB pixels is trained. The trained model does not require any user guidance or a reference image database in the recall phase to produce images with a natural appearance. To preserve the rich detail…

    Submitted 26 July, 2016; v1 submitted 8 April, 2016; originally announced April 2016.

    Comments: 8 pages, 11 figures, 1 table, submitted to ICMLA2016

    MSC Class: 82C32 (Primary); 68T45 (Secondary) ACM Class: H.5.1; I.4.8; I.5.1