Showing 1–15 of 15 results for author: Örnek, E P

  1. arXiv:2405.00915

    cs.CV cs.AI cs.LG

    EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion

    Authors: Guangyao Zhai, Evin Pınar Örnek, Dave Zhenyu Chen, Ruotong Liao, Yan Di, Nassir Navab, Federico Tombari, Benjamin Busam

    Abstract: We present EchoScene, an interactive and controllable generative model that generates 3D indoor scenes on scene graphs. EchoScene leverages a dual-branch diffusion model that dynamically adapts to scene graphs. Existing methods struggle to handle scene graphs due to varying numbers of nodes, multiple edge combinations, and manipulator-induced node-edge operations. EchoScene overcomes this by assoc…

    Submitted 27 February, 2025; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: Nectar Track at 3DV 2025

  2. arXiv:2311.18809

    cs.CV cs.RO

    FoundPose: Unseen Object Pose Estimation with Foundation Features

    Authors: Evin Pınar Örnek, Yann Labbé, Bugra Tekin, Lingni Ma, Cem Keskin, Christian Forster, Tomas Hodan

    Abstract: We propose FoundPose, a model-based method for 6D pose estimation of unseen objects from a single RGB image. The method can quickly onboard new objects using their 3D models without requiring any object- or task-specific training. In contrast, existing methods typically pre-train on large-scale, task-specific datasets in order to generalize to new objects and to bridge the image-to-model domain ga…

    Submitted 19 July, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

  3. arXiv:2305.16283

    cs.CV

    CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion

    Authors: Guangyao Zhai, Evin Pınar Örnek, Shun-Cheng Wu, Yan Di, Federico Tombari, Nassir Navab, Benjamin Busam

    Abstract: Controllable scene synthesis aims to create interactive environments for various industrial use cases. Scene graphs provide a highly suitable interface to facilitate these applications by abstracting the scene context in a compact manner. Existing methods, reliant on retrieval from extensive databases or pre-trained shape embeddings, often overlook scene-object and object-object relationships, lea…

    Submitted 30 December, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 camera-ready

  4. arXiv:2212.11922

    cs.CV

    SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor Environments

    Authors: Evin Pınar Örnek, Aravindhan K Krishnan, Shreekant Gayaka, Cheng-Hao Kuo, Arnie Sen, Nassir Navab, Federico Tombari

    Abstract: Object instance segmentation is a key challenge for indoor robots navigating cluttered environments with many small objects. Limitations in 3D sensing capabilities often make it difficult to detect every possible object. While deep learning approaches may be effective for this problem, manually annotating 3D data for supervised learning is time-consuming. In this work, we explore zero-shot instanc…

    Submitted 25 May, 2023; v1 submitted 22 December, 2022; originally announced December 2022.

    Comments: Accepted in Robotics and Automation Letters April 2023

  5. arXiv:2212.01381

    cs.CV

    LatentSwap3D: Semantic Edits on 3D Image GANs

    Authors: Enis Simsar, Alessio Tonioni, Evin Pınar Örnek, Federico Tombari

    Abstract: 3D GANs have the ability to generate latent codes for entire 3D volumes rather than only 2D images. These models offer desirable features like high-quality geometry and multi-view consistency, but, unlike their 2D counterparts, complex semantic image editing tasks for 3D GANs have only been partially explored. To address this problem, we propose LatentSwap3D, a semantic edit approach based on late…

    Submitted 4 September, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: The paper has been accepted by ICCV'23 AI3DCC

  6. arXiv:2203.11937

    cs.CV

    4D-OR: Semantic Scene Graphs for OR Domain Modeling

    Authors: Ege Özsoy, Evin Pınar Örnek, Ulrich Eck, Tobias Czempiel, Federico Tombari, Nassir Navab

    Abstract: Surgical procedures are conducted in highly complex operating rooms (OR), comprising different actors, devices, and interactions. To date, only medically trained human experts are capable of understanding all the links and interactions in such a demanding environment. This paper aims to bring the community one step closer to automated, holistic and semantic understanding and modeling of OR domain.…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: 11 pages, 3 figures, 3 tables

  7. arXiv:2203.08122

    cs.CV

    From 2D to 3D: Re-thinking Benchmarking of Monocular Depth Prediction

    Authors: Evin Pınar Örnek, Shristi Mudgal, Johanna Wald, Yida Wang, Nassir Navab, Federico Tombari

    Abstract: There have been numerous recently proposed methods for monocular depth prediction (MDP) coupled with the equally rapid evolution of benchmarking tools. However, we argue that MDP is currently witnessing benchmark over-fitting and relying on metrics that are only partially helpful to gauge the usefulness of the predictions for 3D applications. This limits the design and development of novel methods…

    Submitted 15 March, 2022; originally announced March 2022.

  8. arXiv:2112.01521

    cs.CV cs.LG

    Object-aware Monocular Depth Prediction with Instance Convolutions

    Authors: Enis Simsar, Evin Pınar Örnek, Fabian Manhardt, Helisa Dhamo, Nassir Navab, Federico Tombari

    Abstract: With the advent of deep learning, estimating depth from a single RGB image has recently received a lot of attention, being capable of empowering many different applications ranging from path planning for robotics to computational cinematography. Nevertheless, while the depth maps are in their entirety fairly reliable, the estimates around object discontinuities are still far from satisfactory. Thi…

    Submitted 24 February, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

  9. arXiv:2111.14673

    cs.CV

    3D Compositional Zero-shot Learning with DeCompositional Consensus

    Authors: Muhammad Ferjad Naeem, Evin Pınar Örnek, Yongqin Xian, Luc Van Gool, Federico Tombari

    Abstract: Parts represent a basic unit of geometric and semantic similarity across different objects. We argue that part knowledge should be composable beyond the observed object classes. Towards this, we present 3D Compositional Zero-shot Learning as a problem of part generalization from seen to unseen object classes for semantic segmentation. We provide a structured study through benchmarking the task wit…

    Submitted 15 April, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

  10. arXiv:2106.15309

    cs.CV

    Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical Procedures

    Authors: Ege Özsoy, Evin Pınar Örnek, Ulrich Eck, Federico Tombari, Nassir Navab

    Abstract: From a computer science viewpoint, a surgical domain model needs to be a conceptual one incorporating both behavior and data. It should therefore model actors, devices, tools, their complex interactions and data flow. To capture and model these, we take advantage of the latest computer vision methodologies for generating 3D scene graphs from camera views. We then introduce the Multimodal Semantic…

    Submitted 9 June, 2021; originally announced June 2021.

  11. Co-Planar Parametrization for Stereo-SLAM and Visual-Inertial Odometry

    Authors: Xin Li, Yanyan Li, Evin Pınar Örnek, Jinlong Lin, Federico Tombari

    Abstract: This work proposes a novel SLAM framework for stereo and visual inertial odometry estimation. It builds an efficient and robust parametrization of co-planar points and lines which leverages specific geometric constraints to improve camera pose optimization in terms of both efficiency and accuracy. The pipeline consists of extracting 2D po…

    Submitted 26 September, 2020; originally announced September 2020.

  12. arXiv:2002.02265

    cs.CV cs.CL cs.LG stat.ML

    Zero-Shot Activity Recognition with Videos

    Authors: Evin Pinar Ornek

    Abstract: In this paper, we examined the zero-shot activity recognition task with the usage of videos. We introduce an auto-encoder based model to construct a multimodal joint embedding space between the visual and textual manifolds. On the visual side, we used activity videos and a state-of-the-art 3D convolutional action recognition network to extract the features. On the textual side, we worked with GloV…

    Submitted 22 January, 2020; originally announced February 2020.

    Comments: This is a research report done during master's studies

  13. arXiv:1911.13218

    cs.LG eess.IV

    ModelHub.AI: Dissemination Platform for Deep Learning Models

    Authors: Ahmed Hosny, Michael Schwier, Christoph Berger, Evin P Örnek, Mehmet Turan, Phi V Tran, Leon Weninger, Fabian Isensee, Klaus H Maier-Hein, Richard McKinley, Michael T Lu, Udo Hoffmann, Bjoern Menze, Spyridon Bakas, Andriy Fedorov, Hugo JWL Aerts

    Abstract: Recent advances in artificial intelligence research have led to a profusion of studies that apply deep learning to problems in image analysis and natural language processing among others. Additionally, the availability of open-source computational frameworks has lowered the barriers to implementing state-of-the-art methods across multiple domains. Albeit leading to major performance breakthroughs…

    Submitted 26 November, 2019; originally announced November 2019.

  14. arXiv:1803.01048

    cs.RO

    Magnetic-Visual Sensor Fusion-based Dense 3D Reconstruction and Localization for Endoscopic Capsule Robots

    Authors: Mehmet Turan, Yasin Almalioglu, Evin Pinar Ornek, Helder Araujo, Mehmet Fatih Yanik, Metin Sitti

    Abstract: Reliable and real-time 3D reconstruction and localization functionality is a crucial prerequisite for the navigation of actively controlled capsule endoscopic robots as an emerging, minimally invasive diagnostic and therapeutic technology for use in the gastrointestinal (GI) tract. In this study, we propose a fully dense, non-rigidly deformable, strictly real-time, intraoperative map fusion approa…

    Submitted 2 March, 2018; originally announced March 2018.

    Comments: submitted to IROS 2018

  15. arXiv:1803.01047

    cs.RO

    Unsupervised Odometry and Depth Learning for Endoscopic Capsule Robots

    Authors: Mehmet Turan, Evin Pinar Ornek, Nail Ibrahimli, Can Giracoglu, Yasin Almalioglu, Mehmet Fatih Yanik, Metin Sitti

    Abstract: In the last decade, many medical companies and research groups have tried to convert passive capsule endoscopes as an emerging and minimally invasive diagnostic technology into actively steerable endoscopic capsule robots which will provide more intuitive disease detection, targeted drug delivery and biopsy-like operations in the gastrointestinal (GI) tract. In this study, we introduce a fully unsu…

    Submitted 2 March, 2018; originally announced March 2018.

    Comments: submitted to IROS 2018