Search | arXiv e-print repository

Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency

Authors: Florian Hahlbohm, Fabian Friederichs, Tim Weyrich, Linus Franke, Moritz Kappel, Susana Castillo, Marc Stamminger, Martin Eisemann, Marcus Magnor

Abstract: 3D Gaussian Splats (3DGS) have proven a versatile rendering primitive, both for inverse rendering as well as real-time exploration of scenes. In these applications, coherence across camera frames and multiple views is crucial, be it for robust convergence of a scene reconstruction or for artifact-free fly-throughs. Recent work started mitigating artifacts that break multi-view coherence, including popping artifacts due to inconsistent transparency sorting and perspective-correct outlines of (2D) splats. At the same time, real-time requirements forced such implementations to accept compromises in how transparency of large assemblies of 3D Gaussians is resolved, in turn breaking coherence in other ways. In our work, we aim at achieving maximum coherence, by rendering fully perspective-correct 3D Gaussians while using a high-quality approximation of accurate blending, hybrid transparency, on a per-pixel level, in order to retain real-time frame rates. Our fast and perspectively accurate approach for evaluation of 3D Gaussians does not require matrix inversions, thereby ensuring numerical stability and eliminating the need for special handling of degenerate splats, and the hybrid transparency formulation for blending maintains similar quality as fully resolved per-pixel transparencies at a fraction of the rendering costs. We further show that each of these two components can be independently integrated into Gaussian splatting systems.
In combination, they achieve up to 2$\times$ higher frame rates, 2$\times$ faster optimization, and equal or better image quality with fewer rendering artifacts compared to traditional 3DGS on common benchmarks.

Submitted 10 March, 2025; v1 submitted 10 October, 2024; originally announced October 2024.

Comments: Project page: https://fhahlbohm.github.io/htgs/

arXiv:2410.02319 [pdf, other]

QDGset: A Large Scale Grasping Dataset Generated with Quality-Diversity

Authors: Johann Huber, François Hélénon, Mathilde Kappel, Ignacio de Loyola Páez-Ubieta, Santiago T. Puente, Pablo Gil, Faïz Ben Amar, Stéphane Doncieux

Abstract: Recent advances in AI have led to significant results in robotic learning, but skills like grasping remain partially solved. Many recent works exploit synthetic grasping datasets to learn to grasp unknown objects. However, those datasets were generated using simple grasp sampling methods using priors. Recently, Quality-Diversity (QD) algorithms have been proven to make grasp sampling significantly more efficient. In this work, we extend QDG-6DoF, a QD framework for generating object-centric grasps, to scale up the production of synthetic grasping datasets. We propose a data augmentation method that combines the transformation of object meshes with transfer learning from previous grasping repertoires. The conducted experiments show that this approach reduces the number of required evaluations per discovered robust grasp by up to 20%. We used this approach to generate QDGset, a dataset of 6DoF grasp poses that contains about 3.5 and 4.5 times more grasps and objects, respectively, than the previous state-of-the-art. Our method allows anyone to easily generate data, eventually contributing to a large-scale collaborative dataset of synthetic grasps.

Submitted 3 October, 2024; originally announced October 2024.

Comments: 8 pages, 9 figures. Draft version

arXiv:2408.07097 [pdf, other]

Attention Please: What Transformer Models Really Learn for Process Prediction

Authors: Martin Käppel, Lars Ackermann, Stefan Jablonski, Simon Härtl

Abstract: Predictive process monitoring aims to support the execution of a process during runtime with various predictions about the further evolution of a process instance. In the last years a plethora of deep learning architectures have been established as state-of-the-art for different prediction targets, among others the transformer architecture. The transformer architecture is equipped with a powerful attention mechanism, assigning attention scores to each input part that allows to prioritize most relevant information leading to more accurate and contextual output. However, deep learning models largely represent a black box, i.e., their reasoning or decision-making process cannot be understood in detail. This paper examines whether the attention scores of a transformer based next-activity prediction model can serve as an explanation for its decision-making. We find that attention scores in next-activity prediction models can serve as explainers and exploit this fact in two proposed graph-based explanation approaches. The gained insights could inspire future work on the improvement of predictive business process models as well as enabling a neural network based mining of process models from event logs.

Submitted 12 August, 2024; originally announced August 2024.

MSC Class: 68T07; 68T01; 68U35

ACM Class: H.4.2; I.2.1; I.2.6

arXiv:2407.02331 [pdf]

doi 10.1021/acsphotonics.5c01198

Attoliter Mie Void Sensing

Authors: Serkan Arslan, Micha Kappel, Adrià Canós Valero, Thu Huong T. Tran, Julian Karst, Philipp Christ, Ulrich Hohenester, Thomas Weiss, Harald Giessen, Mario Hentschel

Abstract: Traditional nanophotonic sensing schemes utilize evanescent fields in dielectric or metallic nanoparticles, which confine far-field radiation in dispersive and lossy media. Apart from the lack of a well-defined sensing volume that can be accompanied by moderate sensitivities, these structures suffer from the generally limited access to the modal field, which is key for sensing performance. Recently, a novel strategy for dielectric nanophotonics has been demonstrated, namely, the resonant confinement of light in air. So-called Mie voids created in high-index dielectric host materials support localized resonant modes with exceptional properties. In particular, due to the confinement in air, these structures benefit from the full access to the modal field inside the void. We utilize these Mie voids for refractive index sensing in single voids with volumes down to 100 attoliters and sensitivities on the order of 400 nm per refractive index unit. Taking the signal-to-noise ratio of our measurements into account, we demonstrate detection of refractive index changes as small as $6.9 \times 10^{-4}$ in a defined volume of just 850 attoliters. The combination of our Mie void sensor platform with appropriate surface functionalization will even enable specificity to biological or other analytes of interest, as the sensing volumes are on the order of cellular signaling chemicals of single vesicles in cellular synapses.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.10078 [pdf, other]

D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video

Authors: Moritz Kappel, Florian Hahlbohm, Timon Scholz, Susana Castillo, Christian Theobalt, Martin Eisemann, Vladislav Golyanik, Marcus Magnor

Abstract: Dynamic reconstruction and spatiotemporal novel-view synthesis of non-rigidly deforming scenes recently gained increased attention. While existing work achieves impressive quality and performance on multi-view or teleporting camera setups, most methods fail to efficiently and faithfully recover motion and appearance from casual monocular captures. This paper contributes to the field by introducing a new method for dynamic novel view synthesis from monocular video, such as casual smartphone captures. Our approach represents the scene as a $\textit{dynamic neural point cloud}$, an implicit time-conditioned point distribution that encodes local geometry and appearance in separate hash-encoded neural feature grids for static and dynamic regions. By sampling a discrete point cloud from our model, we can efficiently render high-quality novel views using a fast differentiable rasterizer and neural rendering network. Similar to recent work, we leverage advances in neural scene analysis by incorporating data-driven priors like monocular depth estimation and object segmentation to resolve motion and depth ambiguities originating from the monocular captures. In addition to guiding the optimization process, we show that these priors can be exploited to explicitly initialize our scene representation to drastically improve optimization speed and final image quality.
As evidenced by our experimental evaluation, our dynamic point cloud model not only enables fast optimization and real-time frame rates for interactive applications, but also achieves competitive image quality on monocular benchmark sequences. Our code and data are available online: https://moritzkappel.github.io/projects/dnpc/.

Submitted 28 February, 2025; v1 submitted 14 June, 2024; originally announced June 2024.

Comments: 18 pages, 8 figures, 12 tables. Project page: https://moritzkappel.github.io/projects/dnpc/

arXiv:2406.01786 [pdf, other]

Recent Advances in Data-Driven Business Process Management

Authors: Lars Ackermann, Martin Käppel, Laura Marcus, Linda Moder, Sebastian Dunzer, Markus Hornsteiner, Annina Liessmann, Yorck Zisgen, Philip Empl, Lukas-Valentin Herm, Nicolas Neis, Julian Neuberger, Leo Poss, Myriam Schaschek, Sven Weinzierl, Niklas Wördehoff, Stefan Jablonski, Agnes Koschmider, Wolfgang Kratsch, Martin Matzner, Stefanie Rinderle-Ma, Maximilian Röglinger, Stefan Schönig, Axel Winkelmann

Abstract: The rapid development of cutting-edge technologies, the increasing volume of data and also the availability and processability of new types of data sources has led to a paradigm shift in data-based management and decision-making. Since business processes are at the core of organizational work, these developments heavily impact BPM as a crucial success factor for organizations. In view of this emerging potential, data-driven business process management has become a relevant and vibrant research area. Given the complexity and interdisciplinarity of the research field, this position paper therefore presents research insights regarding data-driven BPM.

Submitted 3 June, 2024; originally announced June 2024.

Comments: position paper, 34 pages, 10 figures

MSC Class: 68U35 68T07 68T07; 68U35; 68T01

ACM Class: H.4.1; I.2.1; I.2.6; I.2.7; H.2.8; K.6.1

arXiv:2403.16862 [pdf, other]

INPC: Implicit Neural Point Clouds for Radiance Field Rendering

Authors: Florian Hahlbohm, Linus Franke, Moritz Kappel, Susana Castillo, Martin Eisemann, Marc Stamminger, Marcus Magnor

Abstract: We introduce a new approach for reconstruction and novel view synthesis of unbounded real-world scenes. In contrast to previous methods using either volumetric fields, grid-based models, or discrete point cloud proxies, we propose a hybrid scene representation, which implicitly encodes the geometry in a continuous octree-based probability field and view-dependent appearance in a multi-resolution hash grid. This allows for extraction of arbitrary explicit point clouds, which can be rendered using rasterization. In doing so, we combine the benefits of both worlds and retain favorable behavior during optimization: Our novel implicit point cloud representation and differentiable bilinear rasterizer enable fast rendering while preserving the fine geometric detail captured by volumetric neural fields. Furthermore, this representation does not depend on priors like structure-from-motion point clouds. Our method achieves state-of-the-art image quality on common benchmarks. Furthermore, we achieve fast inference at interactive frame rates, and can convert our trained model into a large, explicit point cloud to further enhance performance.

Submitted 11 March, 2025; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: Project page: https://fhahlbohm.github.io/inpc/

arXiv:2403.06173 [pdf, other]

Speeding up 6-DoF Grasp Sampling with Quality-Diversity

Authors: Johann Huber, François Hélénon, Mathilde Kappel, Elie Chelly, Mahdi Khoramshahi, Faïz Ben Amar, Stéphane Doncieux

Abstract: Recent advances in AI have led to significant results in robotic learning, including natural language-conditioned planning and efficient optimization of controllers using generative models. However, the interaction data remains the bottleneck for generalization. Getting data for grasping is a critical challenge, as this skill is required to complete many manipulation tasks. Quality-Diversity (QD) algorithms optimize a set of solutions to get diverse, high-performing solutions to a given problem. This paper investigates how QD can be combined with priors to speed up the generation of diverse grasp poses in simulation compared to standard 6-DoF grasp sampling schemes. Experiments conducted on 4 grippers with 2-to-5 fingers on standard objects show that QD outperforms commonly used methods by a large margin. Further experiments show that QD optimization automatically finds some efficient priors that are usually hard coded. The deployment of generated grasps on a 2-finger gripper and an Allegro hand shows that the diversity produced maintains sim-to-real transferability. We believe these results to be a significant step toward the generation of large datasets that can lead to robust and generalizing robotic grasping policies.

Submitted 10 March, 2024; originally announced March 2024.

Comments: 7 pages, 8 figures. Preprint version

arXiv:2212.01368 [pdf, other]

Fast Non-Rigid Radiance Fields from Monocularized Data

Authors: Moritz Kappel, Vladislav Golyanik, Susana Castillo, Christian Theobalt, Marcus Magnor

Abstract: The reconstruction and novel view synthesis of dynamic scenes recently gained increased attention. As reconstruction from large-scale multi-view data involves immense memory and computational requirements, recent benchmark datasets provide collections of single monocular views per timestamp sampled from multiple (virtual) cameras. We refer to this form of inputs as "monocularized" data. Existing work shows impressive results for synthetic setups and forward-facing real-world data, but is often limited in the training speed and angular range for generating novel views. This paper addresses these limitations and proposes a new method for full 360° inward-facing novel view synthesis of non-rigidly deforming scenes. At the core of our method are: 1) An efficient deformation module that decouples the processing of spatial and temporal information for accelerated training and inference; and 2) A static module representing the canonical scene as a fast hash-encoded neural radiance field. In addition to existing synthetic monocularized data, we systematically analyze the performance on real-world inward-facing scenes using a newly recorded challenging dataset sampled from a synchronized large-scale multi-view rig. In both cases, our method is significantly faster than previous methods, converging in less than 7 minutes and achieving real-time framerates at 1K resolution, while obtaining a higher visual accuracy for generated novel views. Our source code and data are available at our project page https://graphics.tu-bs.de/publications/kappel2022fast.

Submitted 13 November, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

Comments: 18 pages, 14 figures; project page: https://graphics.tu-bs.de/publications/kappel2022fast

arXiv:2112.06705 [pdf, other]

N-SfC: Robust and Fast Shape Estimation from Caustic Images

Authors: Marc Kassubeck, Moritz Kappel, Susana Castillo, Marcus Magnor

Abstract: This paper deals with the highly challenging problem of reconstructing the shape of a refracting object from a single image of its resulting caustic. Due to the ubiquity of transparent refracting objects in everyday life, reconstruction of their shape entails a multitude of practical applications. The recent Shape from Caustics (SfC) method casts the problem as the inverse of a light propagation simulation for synthesis of the caustic image, that can be solved by a differentiable renderer. However, the inherent complexity of light transport through refracting surfaces currently limits the practicability with respect to reconstruction speed and robustness. To address these issues, we introduce Neural-Shape from Caustics (N-SfC), a learning-based extension that incorporates two components into the reconstruction pipeline: a denoising module, which alleviates the computational cost of the light transport simulation, and an optimization process based on learned gradient descent, which enables better convergence using fewer iterations. Extensive experiments demonstrate the effectiveness of our neural extensions in the scenario of quality control in 3D glass printing, where we significantly outperform the current state-of-the-art in terms of computational speed and final surface error.

Submitted 13 December, 2021; originally announced December 2021.

Comments: Project Page: https://graphics.tu-bs.de/publications/kassubeck2021n-sfc

ACM Class: I.5.4; I.5.5; I.3.8; I.4.5; I.2.6

arXiv:2104.00362 [pdf, other]

Evaluating Predictive Business Process Monitoring Approaches on Small Event Logs

Authors: Martin Käppel, Stefan Jablonski, Stefan Schönig

Abstract: Predictive business process monitoring is concerned with the prediction of how a running process instance will unfold up to its completion at runtime. Most of the proposed approaches rely on a wide number of different machine learning (ML) techniques. In the last years numerous comparative studies, reviews, and benchmarks of such approaches were published and revealed that they can be successfully applied for different prediction targets. ML techniques require a qualitatively and quantitatively sufficient data set. However, there are many situations in business process management (BPM) where only a quantitatively insufficient data set is available. The problem of insufficient data in the context of BPM is still neglected. Hence, none of the comparative studies or benchmarks investigates the performance of predictive business process monitoring techniques in environments with small data sets. In this paper an evaluation framework for comparing existing approaches with regard to their suitability for small data sets is developed and exemplarily applied to state-of-the-art approaches in predictive business process monitoring.

Submitted 20 April, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

arXiv:2012.10974 [pdf, other]

High-Fidelity Neural Human Motion Transfer from Monocular Video

Authors: Moritz Kappel, Vladislav Golyanik, Mohamed Elgharib, Jann-Ole Henningson, Hans-Peter Seidel, Susana Castillo, Christian Theobalt, Marcus Magnor

Abstract: Video-based human motion transfer creates video animations of humans following a source motion. Current methods show remarkable results for tightly-clad subjects. However, the lack of temporally consistent handling of plausible clothing dynamics, including fine and high-frequency details, significantly limits the attainable visual quality. We address these limitations for the first time in the literature and present a new framework which performs high-fidelity and temporally-consistent human motion transfer with natural pose-dependent non-rigid deformations, for several types of loose garments. In contrast to the previous techniques, we perform image generation in three subsequent stages, synthesizing human shape, structure, and appearance. Given a monocular RGB video of an actor, we train a stack of recurrent deep neural networks that generate these intermediate representations from 2D poses and their temporal derivatives. Splitting the difficult motion transfer problem into subtasks that are aware of the temporal motion context helps us to synthesize results with plausible dynamics and pose-dependent detail. It also allows artistic control of results by manipulation of individual framework stages. In the experimental results, we significantly outperform the state-of-the-art in terms of video realism. Our code and data will be made publicly available.

Submitted 20 December, 2020; originally announced December 2020.

Comments: 14 pages, 8 figures; project page: https://graphics.tu-bs.de/publications/kappel2020high-fidelity

arXiv:1011.5372 [pdf, ps, other]

The effect of Mounted Ribs on the Radiation of a Soundboard

Authors: Marcel Kappel, Markus Abel, Reimund Gerhard

Abstract: The grand piano is one of the most important instruments in western music. Its functioning and details are investigated and understood to a reasonable level, however, differences between manufacturers exist which are hard to explain. To add a new piece of understanding, we decided to investigate the effect of ribs mounted on a soundboard. Apart from pianos, this is important to a wider class of instruments which radiate from a structured surface. From scattering theory, it is well-known that a regular array of scatterers yields a band structure. By a systematic study of the latter, the effect of the ribs on the radiated spectrum is demonstrated for a specially manufactured multichord mimicking topologically a piano soundboard. To distinguish between radiated sound and sound propagated inside the board we use piezopolymers, an innovative, non-invasive technique. As a result we find a dramatic change in the spectrum allowed to propagate in the soundboard which is consequently radiated. An explanation by a simple model of coupled oscillators is given with a very nice qualitative coincidence.

Submitted 24 November, 2010; originally announced November 2010.

Comments: 10 pages, 9 figures, submitted to JASA special issue on musical acoustics

Showing 1–13 of 13 results for author: Käppel, M