Showing 1–50 of 62 results for author: Birdal, T

  1. arXiv:2506.06780  [pdf, ps, other]

    cs.CV cs.LG

    Continuous-Time SO(3) Forecasting with Savitzky-Golay Neural Controlled Differential Equations

    Authors: Lennart Bastian, Mohammad Rashed, Nassir Navab, Tolga Birdal

    Abstract: Tracking and forecasting the rotation of objects is fundamental in computer vision and robotics, yet SO(3) extrapolation remains challenging as (1) sensor observations can be noisy and sparse, (2) motion patterns can be governed by complex dynamics, and (3) application settings can demand long-term forecasting. This work proposes modeling continuous-time rotational object dynamics on SO(3) using…

    Submitted 7 June, 2025; originally announced June 2025.

    Comments: Extended abstract, presented at the CVPR Workshop on 4D Vision

  2. arXiv:2505.21251  [pdf, ps, other]

    cs.LG

    Copresheaf Topological Neural Networks: A Generalized Deep Learning Framework

    Authors: Mustafa Hajij, Lennart Bastian, Sarah Osentoski, Hardik Kabaria, John L. Davenport, Sheik Dawood, Balaji Cherukuri, Joseph G. Kocheemoolayil, Nastaran Shahmansouri, Adrian Lew, Theodore Papamarkou, Tolga Birdal

    Abstract: We introduce copresheaf topological neural networks (CTNNs), a powerful and unifying framework that encapsulates a wide spectrum of deep learning architectures, designed to operate on structured data: including images, point clouds, graphs, meshes, and topological manifolds. While deep learning has profoundly impacted domains ranging from digital assistants to autonomous systems, the principled de…

    Submitted 28 May, 2025; v1 submitted 27 May, 2025; originally announced May 2025.

  3. arXiv:2505.17677  [pdf, ps, other]

    cs.CV

    Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery

    Authors: Ming Hu, Zhengdi Yu, Feilong Tang, Kaiwen Chen, Yulong Li, Imran Razzak, Junjun He, Tolga Birdal, Kaijing Zhou, Zongyuan Ge

    Abstract: Accurate 3D reconstruction of hands and instruments is critical for vision-based analysis of ophthalmic microsurgery, yet progress has been hampered by the lack of realistic, large-scale datasets and reliable annotation tools. In this work, we introduce OphNet-3D, the first extensive RGB-D dynamic 3D reconstruction dataset for ophthalmic surgery, comprising 41 sequences from 40 surgeons and totali…

    Submitted 30 May, 2025; v1 submitted 23 May, 2025; originally announced May 2025.

  4. Efficient Training of Neural SDEs Using Stochastic Optimal Control

    Authors: Rembert Daems, Manfred Opper, Guillaume Crevecoeur, Tolga Birdal

    Abstract: We present a hierarchical, control theory inspired method for variational inference (VI) for neural stochastic differential equations (SDEs). While VI for neural SDEs is a promising avenue for uncertainty-aware reasoning in time-series, it is computationally challenging due to the iterative nature of maximizing the ELBO. In this work, we propose to decompose the control term into linear and residu…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: Published in the ESANN 2025 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges (Belgium) and online event, 23-25 April 2025

    Journal ref: ESANN 2025 : 33rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Proceedings. p.693-698

  5. arXiv:2502.07782  [pdf, ps, other]

    cs.CV

    A Flag Decomposition for Hierarchical Datasets

    Authors: Nathan Mankovich, Ignacio Santamaria, Gustau Camps-Valls, Tolga Birdal

    Abstract: Flag manifolds encode nested sequences of subspaces and serve as powerful structures for various computer vision and machine learning applications. Despite their utility in tasks such as dimensionality reduction, motion averaging, and subspace clustering, current applications are often restricted to extracting flags using common matrix decomposition methods like the singular value decomposition. H…

    Submitted 4 June, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

  6. arXiv:2502.04308  [pdf, other]

    cs.LG cs.AI cs.SI physics.soc-ph

    HOG-Diff: Higher-Order Guided Diffusion for Graph Generation

    Authors: Yiming Huang, Tolga Birdal

    Abstract: Graph generation is a critical yet challenging task as empirical analyses require a deep understanding of complex, non-Euclidean structures. Although diffusion models have recently made significant achievements in graph generation, these models typically adapt from the frameworks designed for image generation, making them ill-suited for capturing the topological properties of graphs. In this work,…

    Submitted 6 February, 2025; originally announced February 2025.

  7. arXiv:2501.04697  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Grokking at the Edge of Numerical Stability

    Authors: Lucas Prieto, Melih Barsbey, Pedro A. M. Mediano, Tolga Birdal

    Abstract: Grokking, the sudden generalization that occurs after prolonged overfitting, is a surprising phenomenon challenging our understanding of deep learning. Although significant progress has been made in understanding grokking, the reasons behind the delayed generalization and its dependence on regularization remain unclear. In this work, we argue that without regularization, grokking tasks push models…

    Submitted 19 May, 2025; v1 submitted 8 January, 2025; originally announced January 2025.

  8. arXiv:2412.12861  [pdf, ps, other]

    cs.CV

    Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera

    Authors: Zhengdi Yu, Stefanos Zafeiriou, Tolga Birdal

    Abstract: We propose Dyn-HaMR, to the best of our knowledge, the first approach to reconstruct 4D global hand motion from monocular videos recorded by dynamic cameras in the wild. Reconstructing accurate 3D hand meshes from monocular videos is a crucial task for understanding human behaviour, with significant applications in augmented and virtual reality (AR/VR). However, existing methods for monocular hand…

    Submitted 31 May, 2025; v1 submitted 17 December, 2024; originally announced December 2024.

    Comments: Project page is available at https://dyn-hamr.github.io/

  9. arXiv:2410.22311  [pdf, other]

    cs.LG math.OC

    Convex Formulations for Training Two-Layer ReLU Neural Networks

    Authors: Karthik Prakhya, Tolga Birdal, Alp Yurtsever

    Abstract: Solving non-convex, NP-hard optimization problems is crucial for training machine learning models, including neural networks. However, non-convexity often leads to black-box machine learning models with unclear inner workings. While convex formulations have been used for verifying neural network robustness, their application to training neural networks remains less explored. In response to this ch…

    Submitted 17 March, 2025; v1 submitted 29 October, 2024; originally announced October 2024.

    Comments: Published as a conference paper at ICLR 2025

  10. arXiv:2409.06683  [pdf, other]

    cs.CV

    Alignist: CAD-Informed Orientation Distribution Estimation by Fusing Shape and Correspondences

    Authors: Shishir Reddy Vutukur, Rasmus Laurvig Haugaard, Junwen Huang, Benjamin Busam, Tolga Birdal

    Abstract: Object pose distribution estimation is crucial in robotics for better path planning and handling of symmetric objects. Recent distribution estimation approaches employ contrastive learning-based approaches by maximizing the likelihood of a single pose estimate in the absence of a CAD model. We propose a pose distribution estimation method leveraging symmetry respecting correspondence distributions…

    Submitted 11 September, 2024; v1 submitted 10 September, 2024; originally announced September 2024.

    Comments: Accepted to ECCV 2024

  11. arXiv:2408.16762  [pdf, other]

    cs.CV cs.GR cs.LG

    UV-free Texture Generation with Denoising and Geodesic Heat Diffusions

    Authors: Simone Foti, Stefanos Zafeiriou, Tolga Birdal

    Abstract: Seams, distortions, wasted UV space, vertex-duplication, and varying resolution over the surface are the most prominent issues of the standard UV-based texturing of meshes. These issues are particularly acute when automatic UV-unwrapping techniques are used. For this reason, instead of generating textures in automatically generated UV-planes like most state-of-the-art methods, we propose to repres…

    Submitted 10 October, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

  12. arXiv:2407.08723  [pdf, other]

    cs.LG math.AT

    Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms

    Authors: Rayna Andreeva, Benjamin Dupuis, Rik Sarkar, Tolga Birdal, Umut Şimşekli

    Abstract: We present a novel set of rigorous and computationally efficient topology-based complexity notions that exhibit a strong correlation with the generalization gap in modern deep neural networks (DNNs). DNNs show remarkable generalization properties, yet the source of these capabilities remains elusive, defying the established statistical learning theory. Recent studies have revealed that properties…

    Submitted 14 December, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  13. NeRF-Feat: 6D Object Pose Estimation using Feature Rendering

    Authors: Shishir Reddy Vutukur, Heike Brock, Benjamin Busam, Tolga Birdal, Andreas Hutter, Slobodan Ilic

    Abstract: Object Pose Estimation is a crucial component in robotic grasping and augmented reality. Learning based approaches typically require training data from a highly accurate CAD model or labeled training data acquired using a complex setup. We address this by learning to estimate pose from weakly labeled data without a known CAD model. We propose to use a NeRF to learn object shape implicitly which is…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 3DV 2024

    Journal ref: 3DV 2024

  14. arXiv:2405.14094  [pdf, other]

    cs.LG cs.AI cs.CV math.AT stat.ML

    Attending to Topological Spaces: The Cellular Transformer

    Authors: Rubén Ballester, Pablo Hernández-García, Mathilde Papillon, Claudio Battiloro, Nina Miolane, Tolga Birdal, Carles Casacuberta, Sergio Escalera, Mustafa Hajij

    Abstract: Topological Deep Learning seeks to enhance the predictive performance of neural network models by harnessing topological structures in input data. Topological neural networks operate on spaces such as cell complexes and hypergraphs, that can be seen as generalizations of graphs. In this work, we introduce the Cellular Transformer (CT), a novel architecture that generalizes graph-based transformers…

    Submitted 26 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  15. arXiv:2403.03122  [pdf, other]

    cs.CV

    NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors

    Authors: Yannan He, Garvita Tiwari, Tolga Birdal, Jan Eric Lenssen, Gerard Pons-Moll

    Abstract: Faithfully modeling the space of articulations is a crucial task that allows recovery and generation of realistic poses, and remains a notorious challenge. To this end, we introduce Neural Riemannian Distance Fields (NRDFs), data-driven priors modeling the space of plausible articulations, represented as the zero-level-set of a neural field in a high-dimensional product-quaternion space. To train…

    Submitted 11 April, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024. Project page: https://virtualhumans.mpi-inf.mpg.de/nrdf

  16. arXiv:2403.00372  [pdf, other]

    cs.CV

    HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D Text2Shape Generation

    Authors: Zhiying Leng, Tolga Birdal, Xiaohui Liang, Federico Tombari

    Abstract: 3D shape generation from text is a fundamental task in 3D representation learning. The text-shape pairs exhibit a hierarchical structure, where a general text like "chair" covers all 3D shapes of the chair, while more detailed prompts refer to more specific shapes. Furthermore, both text and 3D shapes are inherently hierarchical structures. However, existing Text2Shape methods, such as SDFusion,…

    Submitted 30 April, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Journal ref: IEEE/CVF conference on computer vision and pattern recognition 2024

  17. arXiv:2402.08871  [pdf, other]

    cs.LG stat.ML

    Position: Topological Deep Learning is the New Frontier for Relational Learning

    Authors: Theodore Papamarkou, Tolga Birdal, Michael Bronstein, Gunnar Carlsson, Justin Curry, Yue Gao, Mustafa Hajij, Roland Kwitt, Pietro Liò, Paolo Di Lorenzo, Vasileios Maroulas, Nina Miolane, Farzana Nasrin, Karthikeyan Natesan Ramamurthy, Bastian Rieck, Simone Scardapane, Michael T. Schaub, Petar Veličković, Bei Wang, Yusu Wang, Guo-Wei Wei, Ghada Zamzmi

    Abstract: Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL is the new frontier for relational learning. TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning setting…

    Submitted 6 August, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  18. arXiv:2402.02441  [pdf, other]

    cs.LG cs.AI cs.MS stat.CO

    TopoX: A Suite of Python Packages for Machine Learning on Topological Domains

    Authors: Mustafa Hajij, Mathilde Papillon, Florian Frantzen, Jens Agerberg, Ibrahem AlJabea, Rubén Ballester, Claudio Battiloro, Guillermo Bernárdez, Tolga Birdal, Aiden Brent, Peter Chin, Sergio Escalera, Simone Fiorellino, Odin Hoff Gardaa, Gurusankar Gopalakrishnan, Devendra Govil, Josef Hoppe, Maneel Reddy Karri, Jude Khouja, Manuel Lecha, Neal Livesay, Jan Meißner, Soham Mukherjee, Alexander Nikitin, Theodore Papamarkou, et al. (18 additional authors not shown)

    Abstract: We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order…

    Submitted 8 December, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  19. arXiv:2401.04071  [pdf, other]

    cs.CV cs.LG math.DG math.OC stat.ML

    Fun with Flags: Robust Principal Directions via Flag Manifolds

    Authors: Nathan Mankovich, Gustau Camps-Valls, Tolga Birdal

    Abstract: Principal component analysis (PCA), along with its extensions to manifolds and outlier contaminated data, have been indispensable in computer vision and machine learning. In this work, we present a unifying formalism for PCA and its variants, and introduce a framework based on the flags of linear subspaces, i.e., a hierarchy of nested linear subspaces of increasing dimension, which not only allows fo…

    Submitted 4 August, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

  20. arXiv:2312.09504  [pdf, other]

    cs.LG cs.SI math.AT math.CO stat.ML

    Combinatorial Complexes: Bridging the Gap Between Cell Complexes and Hypergraphs

    Authors: Mustafa Hajij, Ghada Zamzmi, Theodore Papamarkou, Aldo Guzmán-Sáenz, Tolga Birdal, Michael T. Schaub

    Abstract: Graph-based signal processing techniques have become essential for handling data in non-Euclidean spaces. However, there is a growing awareness that these graph models might need to be expanded into 'higher-order' domains to effectively represent the complex relations found in high-dimensional data. Such higher-order domains are typically modeled either as hypergraphs, or as simplicial, cubical or…

    Submitted 14 December, 2023; originally announced December 2023.

    Journal ref: 57th Asilomar Conference on Signals, Systems, and Computers, 2023

  21. arXiv:2310.20436  [pdf, other]

    cs.CV

    SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark

    Authors: Zhengdi Yu, Shaoli Huang, Yongkang Cheng, Tolga Birdal

    Abstract: We present SignAvatars, the first large-scale, multi-prompt 3D sign language (SL) motion dataset designed to bridge the communication gap for Deaf and hard-of-hearing individuals. While there has been an exponentially growing body of research on digital communication, the majority of existing communication technologies primarily cater to spoken or written languages, instead of SL, the ess…

    Submitted 2 July, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: ECCV 2024, 14 pages; Project page available at https://signavatars.github.io/

  22. arXiv:2310.17638  [pdf, other]

    cs.LG stat.ML

    Generative Fractional Diffusion Models

    Authors: Gabriel Nobis, Maximilian Springenberg, Marco Aversa, Michael Detzel, Rembert Daems, Roderick Murray-Smith, Shinichi Nakajima, Sebastian Lapuschkin, Stefano Ermon, Tolga Birdal, Manfred Opper, Christoph Knochenhauer, Luis Oala, Wojciech Samek

    Abstract: We introduce the first continuous-time score-based generative model that leverages fractional diffusion processes for its underlying dynamics. Although diffusion models have excelled at capturing data distributions, they still suffer from various limitations such as slow convergence, mode-collapse on imbalanced data, and lack of diversity. These issues are partially linked to the use of light-tail…

    Submitted 31 October, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    ACM Class: I.2.4; F.4.1; G.3

  23. arXiv:2310.15128  [pdf, other]

    cs.CV cs.LG quant-ph

    Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients

    Authors: Maximilian Krahn, Michele Sasdelli, Fengyi Yang, Vladislav Golyanik, Juho Kannala, Tat-Jun Chin, Tolga Birdal

    Abstract: We present QP-SBGD, a novel layer-wise stochastic optimiser tailored towards training neural networks with binary weights, known as binary neural networks (BNNs), on quantum hardware. BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy. However, training them in practice remains an open challenge. Most known BNN-optimisers…

    Submitted 3 September, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Journal ref: BMVC 2024

  24. arXiv:2310.12975  [pdf, other]

    cs.LG cs.AI cs.CV stat.AP stat.ML

    Variational Inference for SDEs Driven by Fractional Noise

    Authors: Rembert Daems, Manfred Opper, Guillaume Crevecoeur, Tolga Birdal

    Abstract: We present a novel variational framework for performing inference in (neural) stochastic differential equations (SDEs) driven by Markov-approximate fractional Brownian motion (fBM). SDEs offer a versatile tool for modeling real-world continuous-time dynamic systems with inherent noise and randomness. Combining SDEs with the powerful inference capabilities of variational methods enables the learni…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 24 pages, under review

  25. arXiv:2310.12153  [pdf, other]

    cs.LG cs.AI cs.CV

    Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing

    Authors: Jan-Nico Zaech, Martin Danelljan, Tolga Birdal, Luc Van Gool

    Abstract: Adiabatic quantum computing (AQC) is a promising approach for discrete and often NP-hard optimization problems. Current AQCs allow implementing problems of research interest, which has sparked the development of quantum representations for many computer vision tasks. Despite requiring multiple measurements from the noisy AQC, current approaches only utilize the best measurement, discarding informa…

    Submitted 1 May, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted at CVPR 2024

  26. ICML 2023 Topological Deep Learning Challenge: Design and Results

    Authors: Mathilde Papillon, Mustafa Hajij, Helen Jenne, Johan Mathe, Audun Myers, Theodore Papamarkou, Tolga Birdal, Tamal Dey, Tim Doster, Tegan Emerson, Gurusankar Gopalakrishnan, Devendra Govil, Aldo Guzmán-Sáenz, Henry Kvinge, Neal Livesay, Soham Mukherjee, Shreyas N. Samaga, Karthikeyan Natesan Ramamurthy, Maneel Reddy Karri, Paul Rosen, Sophia Sanborn, Robin Walters, Jens Agerberg, Sadrodin Barikbin, Claudio Battiloro, et al. (31 additional authors not shown)

    Abstract: This paper presents the computational challenge on topological deep learning that was hosted within the ICML 2023 Workshop on Topology and Geometry in Machine Learning. The competition asked participants to provide open-source implementations of topological neural networks from the literature by contributing to the Python packages TopoNetX (data processing) and TopoModelX (deep learning). The chal…

    Submitted 18 January, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

  27. VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs

    Authors: Moayed Haji Ali, Andrew Bond, Tolga Birdal, Duygu Ceylan, Levent Karacan, Erkut Erdem, Aykut Erdem

    Abstract: We propose VidStyleODE, a spatiotemporally continuous disentangled Video representation based upon StyleGAN and Neural-ODEs. Effective traversal of the latent space learned by Generative Adversarial Networks (GANs) has been the basis for recent breakthroughs in image editing. However, the applicability of such advancements to the video domain has been hi…

    Submitted 10 March, 2025; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Project website: https://cyberiada.github.io/VidStyleODE

    Journal ref: 2023 IEEE/CVF International Conference on Computer Vision (ICCV)

  28. arXiv:2303.13501  [pdf, other]

    cs.CV cs.LG math.DG math.OC stat.ML

    Chordal Averaging on Flag Manifolds and Its Applications

    Authors: Nathan Mankovich, Tolga Birdal

    Abstract: This paper presents a new, provably-convergent algorithm for computing the flag-mean and flag-median of a set of points on a flag manifold under the chordal metric. The flag manifold is a mathematical space consisting of flags, which are sequences of nested subspaces of a vector space that increase in dimension. The flag manifold is a superset of a wide range of known matrix spaces, including Stie…

    Submitted 17 July, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: Appears at ICCV 2023

  29. arXiv:2211.02980  [pdf, other]

    cs.CV

    Disentangling Content and Motion for Text-Based Neural Video Manipulation

    Authors: Levent Karacan, Tolga Kerimoğlu, İsmail İnan, Tolga Birdal, Erkut Erdem, Aykut Erdem

    Abstract: Giving machines the ability to imagine possible new objects or scenes from linguistic descriptions and produce their realistic renderings is arguably one of the most challenging problems in computer vision. Recent advances in deep generative models have led to new approaches that give promising results towards this goal. In this paper, we introduce a new method called DiCoMoGAN for manipulating vi…

    Submitted 5 November, 2022; originally announced November 2022.

  30. arXiv:2207.06333  [pdf, other]

    cs.CV

    6D Camera Relocalization in Visually Ambiguous Extreme Environments

    Authors: Yang Zheng, Tolga Birdal, Fei Xia, Yanchao Yang, Yueqi Duan, Leonidas J. Guibas

    Abstract: We propose a novel method to reliably estimate the pose of a camera given a sequence of images acquired in extreme environments such as deep seas or extraterrestrial terrains. Data acquired under these challenging conditions are corrupted by textureless surfaces, image degradation, and presence of repetitive and highly ambiguous structures. When naively deployed, the state-of-the-art methods can f…

    Submitted 13 July, 2022; originally announced July 2022.

  31. arXiv:2206.00606  [pdf, other]

    cs.LG cs.CV cs.SI math.AT stat.ML

    Topological Deep Learning: Going Beyond Graph Data

    Authors: Mustafa Hajij, Ghada Zamzmi, Theodore Papamarkou, Nina Miolane, Aldo Guzmán-Sáenz, Karthikeyan Natesan Ramamurthy, Tolga Birdal, Tamal K. Dey, Soham Mukherjee, Shreyas N. Samaga, Neal Livesay, Robin Walters, Paul Rosen, Michael T. Schaub

    Abstract: Topological deep learning is a rapidly growing field that pertains to the development of deep learning models for data supported on topological domains such as simplicial complexes, cell complexes, and hypergraphs, which generalize many domains encountered in scientific computations. In this paper, we present a unifying deep learning framework built upon a richer data structure that includes widel…

    Submitted 19 May, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

  32. arXiv:2203.12633  [pdf, other]

    cs.CV cs.LG math.OC

    Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization

    Authors: Alp Yurtsever, Tolga Birdal, Vladislav Golyanik

    Abstract: We present a hybrid classical-quantum framework based on the Frank-Wolfe algorithm, Q-FW, for solving quadratic, linearly-constrained, binary optimization problems on quantum annealers (QA). The computational premise of quantum computers has cultivated the re-design of various existing vision problems into quantum-friendly forms. Experimental QA realizations can solve a particular non-convex probl…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: 26 pages with supplementary material

  33. arXiv:2112.09329  [pdf, other]

    cs.CV

    Point2Cyl: Reverse Engineering 3D Objects from Point Clouds to Extrusion Cylinders

    Authors: Mikaela Angelina Uy, Yen-yu Chang, Minhyuk Sung, Purvi Goel, Joseph Lambourne, Tolga Birdal, Leonidas Guibas

    Abstract: We propose Point2Cyl, a supervised network transforming a raw 3D point cloud to a set of extrusion cylinders. Reverse engineering from a raw geometry to a CAD model is an essential task to enable manipulation of the 3D data in shape editing software and thus expand their usages in many downstream applications. Particularly, the form of CAD models having a sequence of extrusion cylinders -- a 2D sk…

    Submitted 29 May, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

    Comments: CVPR 2022

  34. arXiv:2111.14762  [pdf, other]

    cs.CV cs.GR

    Riemannian Functional Map Synchronization for Probabilistic Partial Correspondence in Shape Networks

    Authors: Faria Huq, Adrish Dey, Sahra Yusuf, Dena Bazazian, Tolga Birdal, Nina Miolane

    Abstract: We consider the problem of graph-matching on a network of 3D shapes with uncertainty quantification. We assume that the pairwise shape correspondences are efficiently represented as functional maps, that match real-valued functions defined over pairs of shapes. By modeling functional maps between nearly isometric shapes as elements of the Lie group $SO(n)$, we employ synchronization…

    Submitted 3 January, 2023; v1 submitted 29 November, 2021; originally announced November 2021.

    Comments: 16 pages

  35. arXiv:2111.13171  [pdf, other]

    cs.LG cs.AI cs.CV math.GN stat.ML

    Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

    Authors: Tolga Birdal, Aaron Lou, Leonidas Guibas, Umut Şimşekli

    Abstract: Disobeying the classical wisdom of statistical learning theory, modern deep neural networks generalize well even though they typically contain millions of parameters. Recently, it has been shown that the trajectories of iterative optimization algorithms can possess fractal structures, and their generalization error can be formally linked to the complexity of such fractals. This complexity is measu…

    Submitted 25 November, 2021; originally announced November 2021.

    Comments: Appears at NeurIPS 2021

  36. Multiway Non-rigid Point Cloud Registration via Learned Functional Map Synchronization

    Authors: Jiahui Huang, Tolga Birdal, Zan Gojcic, Leonidas J. Guibas, Shi-Min Hu

    Abstract: We present SyNoRiM, a novel way to jointly register multiple non-rigid shapes by synchronizing the maps relating learned functions defined on the point clouds. Even though the ability to process non-rigid shapes is critical in various applications ranging from computer animation to 3D digitization, the literature still lacks a robust and flexible framework to match and align a collection of real,…

    Submitted 1 April, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 2022

  37. arXiv:2110.11657  [pdf, other]

    cs.CV

    Projective Manifold Gradient Layer for Deep Rotation Regression

    Authors: Jiayi Chen, Yingda Yin, Tolga Birdal, Baoquan Chen, Leonidas Guibas, He Wang

    Abstract: Regressing rotations on the SO(3) manifold using deep neural networks is an important yet unsolved problem. The gap between the Euclidean network output space and the non-Euclidean SO(3) manifold imposes a severe challenge for neural network learning in both forward and backward passes. While several works have proposed different regression-friendly rotation representations, very few works have been d…

    Submitted 29 March, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

    Comments: CVPR 2022

  38. arXiv:2105.04668  [pdf, other]

    cs.CV cs.LG

    HuMoR: 3D Human Motion Model for Robust Pose Estimation

    Authors: Davis Rempe, Tolga Birdal, Aaron Hertzmann, Jimei Yang, Srinath Sridhar, Leonidas J. Guibas

    Abstract: We introduce HuMoR: a 3D Human Motion Model for Robust Estimation of temporal pose and shape. Though substantial progress has been made in estimating 3D human motion and shape from dynamic observations, recovering plausible pose sequences in the presence of noise and occlusions remains a challenge. For this purpose, we propose an expressive generative model in the form of a conditional variational…

    Submitted 18 August, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

    Comments: ICCV 2021 camera ready

  39. arXiv:2102.08945  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Weakly Supervised Learning of Rigid 3D Scene Flow

    Authors: Zan Gojcic, Or Litany, Andreas Wieser, Leonidas J. Guibas, Tolga Birdal

    Abstract: We propose a data-driven scene flow estimation algorithm exploiting the observation that many 3D scenes can be explained by a collection of agents moving as rigid bodies. At the core of our method lies a deep architecture able to reason at the object-level by considering 3D scene flow in conjunction with other 3D tasks. This object-level abstraction enables us to relax the requirement fo…

    Submitted 17 February, 2021; originally announced February 2021.

  40. arXiv:2101.07755  [pdf, other]

    quant-ph cs.CV cs.ET cs.LG cs.RO

    Quantum Permutation Synchronization

    Authors: Tolga Birdal, Vladislav Golyanik, Christian Theobalt, Leonidas Guibas

    Abstract: We present QuantumSync, the first quantum algorithm for solving a synchronization problem in the context of computer vision. In particular, we focus on permutation synchronization which involves solving a non-convex optimization problem in discrete variables. We start by formulating synchronization into a quadratic unconstrained binary optimization problem (QUBO). While such formulation respects t…

    Submitted 26 November, 2021; v1 submitted 19 January, 2021; originally announced January 2021.

    Comments: 19 pages, 15 figures, 4 tables; web pages: https://vcai.mpi-inf.mpg.de/projects/QUANTUMSYNC/, https://quantumcomputervision.github.io/

    Journal ref: Computer Vision and Pattern Recognition (CVPR) 2021

  41. arXiv:2101.06605  [pdf, other

    cs.CV cs.LG

    MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

    Authors: Jiahui Huang, He Wang, Tolga Birdal, Minhyuk Sung, Federica Arrigoni, Shi-Min Hu, Leonidas Guibas

    Abstract: We present MultiBodySync, a novel, end-to-end trainable multi-body motion segmentation and rigid registration framework for multiple input 3D point clouds. The two non-trivial challenges posed by this multi-scan multibody setting that we investigate are: (i) guaranteeing correspondence and segmentation consistency across multiple input point clouds capturing different spatial arrangements of bodie…

    Submitted 28 March, 2021; v1 submitted 17 January, 2021; originally announced January 2021.

    Comments: Contact: huang-jh18<at>mails<dot>tsinghua<dot>edu<dot>cn

  42. arXiv:2012.11002  [pdf, other

    cs.CV

    Deep Bingham Networks: Dealing with Uncertainty and Ambiguity in Pose Estimation

    Authors: Haowen Deng, Mai Bui, Nassir Navab, Leonidas Guibas, Slobodan Ilic, Tolga Birdal

    Abstract: In this work, we introduce Deep Bingham Networks (DBN), a generic framework that can naturally handle pose-related uncertainties and ambiguities arising in almost all real life applications concerning 3D data. While existing works strive to find a single solution to the pose estimation problem, we make peace with the ambiguities causing high uncertainty around which solutions to identify as the be…

    Submitted 20 December, 2020; originally announced December 2020.

    Comments: arXiv admin note: text overlap with arXiv:2004.04807

  43. arXiv:2008.02792  [pdf, other

    cs.CV cs.LG

    CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations

    Authors: Davis Rempe, Tolga Birdal, Yongheng Zhao, Zan Gojcic, Srinath Sridhar, Leonidas J. Guibas

    Abstract: We propose CaSPR, a method to learn object-centric Canonical Spatiotemporal Point Cloud Representations of dynamically moving or evolving objects. Our goal is to enable information aggregation over time and the interrogation of object state at any spatiotemporal neighborhood in the past, observed or not. Different from previous work, CaSPR learns representations that support spacetime continuity,…

    Submitted 11 November, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: NeurIPS 2020

  44. arXiv:2004.04807  [pdf, other

    cs.CV cs.LG cs.RO eess.IV

    6D Camera Relocalization in Ambiguous Scenes via Continuous Multimodal Inference

    Authors: Mai Bui, Tolga Birdal, Haowen Deng, Shadi Albarqouni, Leonidas Guibas, Slobodan Ilic, Nassir Navab

    Abstract: We present a multimodal camera relocalization framework that captures ambiguities and uncertainties with continuous mixture models defined on the manifold of camera poses. In highly ambiguous environments, which can easily arise due to symmetries and repetitive structures in the scene, computing one plausible solution (what most state-of-the-art methods currently regress) may not be sufficient. In…

    Submitted 16 July, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: Accepted for publication at ECCV 2020. Project page under https://multimodal3dvision.github.io

  45. arXiv:2004.01228  [pdf, other

    cs.CV cs.GR cs.LG eess.IV

    Deformation-Aware 3D Model Embedding and Retrieval

    Authors: Mikaela Angelina Uy, Jingwei Huang, Minhyuk Sung, Tolga Birdal, Leonidas Guibas

    Abstract: We introduce a new problem of retrieving 3D models that are deformable to a given query shape and present a novel deep deformation-aware embedding to solve this retrieval task. 3D model retrieval is a fundamental operation for recovering a clean and complete 3D model from a noisy and partial 3D scan. However, given a finite collection of 3D shapes, even the closest model to a query may not be sati…

    Submitted 31 July, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

    Comments: Accepted for publication at ECCV 2020. Project page under https://deformscan2cad.github.io

  46. arXiv:2004.00663  [pdf, other

    cs.CV cs.GR cs.LG cs.RO stat.ML

    Synchronizing Probability Measures on Rotations via Optimal Transport

    Authors: Tolga Birdal, Michael Arbel, Umut Şimşekli, Leonidas Guibas

    Abstract: We introduce a new paradigm, $\textit{measure synchronization}$, for synchronizing graphs with measure-valued edges. We formulate this problem as maximization of the cycle-consistency in the space of probability measures over relative rotations. In particular, we aim at estimating marginal distributions of absolute orientations by synchronizing the $\textit{conditional}$ ones, which are defined on…

    Submitted 1 April, 2020; originally announced April 2020.

    Comments: Accepted for publication at CVPR 2020, includes supplementary material. Project website: https://github.com/SynchInVision/probsync

  47. arXiv:2002.02506  [pdf, other

    cs.CV cs.CG

    Continuous Geodesic Convolutions for Learning on 3D Shapes

    Authors: Zhangsihao Yang, Or Litany, Tolga Birdal, Srinath Sridhar, Leonidas Guibas

    Abstract: The majority of descriptor-based methods for geometric processing of non-rigid shape rely on hand-crafted descriptors. Recently, learning-based techniques have been shown effective, achieving state-of-the-art results in a variety of tasks. Yet, even though these methods can in principle work directly on raw data, most methods still rely on hand-crafted descriptors at the input layer. In this work,…

    Submitted 6 February, 2020; originally announced February 2020.

  48. From Planes to Corners: Multi-Purpose Primitive Detection in Unorganized 3D Point Clouds

    Authors: Christiane Sommer, Yumin Sun, Leonidas Guibas, Daniel Cremers, Tolga Birdal

    Abstract: We propose a new method for segmentation-free joint estimation of orthogonal planes, their intersection lines, relationship graph and corners lying at the intersection of three orthogonal planes. Such unified scene exploration under orthogonality allows for multitudes of applications such as semantic plane detection or local and global scan alignment, which in turn can aid robot localization or gr…

    Submitted 24 April, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

    Comments: Accepted to IEEE Robotics and Automation Letters 2020 | Video: https://youtu.be/nHWJrA6RcB0 | Code: https://github.com/c-sommer/orthogonal-planes

    Journal ref: IEEE Robotics and Automation Letters 5(2) 2020, 1764-1771

  49. arXiv:2001.05119  [pdf, other

    cs.CV cs.LG

    Learning multiview 3D point cloud registration

    Authors: Zan Gojcic, Caifa Zhou, Jan D. Wegner, Leonidas J. Guibas, Tolga Birdal

    Abstract: We present a novel, end-to-end learnable, multiview 3D point cloud registration algorithm. Registration of multiple scans typically follows a two-stage pipeline: the initial pairwise alignment and the globally consistent refinement. The former is often ambiguous due to the low overlap of neighboring point clouds, symmetries and repetitive scene parts. Therefore, the latter global refinement aims a…

    Submitted 31 March, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: CVPR2020 - Camera Ready

  50. arXiv:1912.12098  [pdf, other

    cs.LG cs.CV cs.GR cs.RO stat.ML

    Quaternion Equivariant Capsule Networks for 3D Point Clouds

    Authors: Yongheng Zhao, Tolga Birdal, Jan Eric Lenssen, Emanuele Menegatti, Leonidas Guibas, Federico Tombari

    Abstract: We present a 3D capsule module for processing point clouds that is equivariant to 3D rotations and translations, as well as invariant to permutations of the input points. The operator receives a sparse set of local reference frames, computed from an input point cloud and establishes end-to-end transformation equivariance through a novel dynamic routing procedure on quaternions. Further, we theoret…

    Submitted 23 August, 2020; v1 submitted 27 December, 2019; originally announced December 2019.

    Comments: Oral Presentation at ECCV 2020. Find our video under: https://youtu.be/LHh56snwhTA. We release our sources at: http://tolgabirdal.github.io/qecnetworks