
Showing 1–12 of 12 results for author: Dalva, Y

  1. arXiv:2505.23758  [pdf, ps, other]

    cs.CV

    LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers

    Authors: Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag

    Abstract: We introduce LoRAShop, the first framework for multi-concept image editing with LoRA models. LoRAShop builds on a key observation about the feature interaction patterns inside Flux-style diffusion transformers: concept-specific transformer features activate spatially coherent regions early in the denoising process. We harness this observation to derive a disentangled latent mask for each concept i…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: Project Webpage: https://lorashop.github.io/

  2. arXiv:2412.09614  [pdf, other]

    cs.CV cs.CL

    Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG

    Authors: Kavana Venkatesh, Yusuf Dalva, Ismini Lourentzou, Pinar Yanardag

    Abstract: We introduce a novel approach to enhance the capabilities of text-to-image models by incorporating a graph-based RAG. Our system dynamically retrieves detailed character information and relational data from the knowledge graph, enabling the generation of visually accurate and contextually rich images. This capability significantly improves upon the limitations of existing T2I models, which often s…

    Submitted 12 December, 2024; originally announced December 2024.

    Comments: Project Page: https://context-canvas.github.io/

  3. arXiv:2412.09611  [pdf, other]

    cs.CV

    FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers

    Authors: Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag

    Abstract: Rectified flow models have emerged as a dominant approach in image generation, showcasing impressive capabilities in high-quality image synthesis. However, despite their effectiveness in visual generation, rectified flow models often struggle with disentangled editing of images. This limitation prevents the ability to perform precise, attribute-specific modifications without affecting unrelated as…

    Submitted 12 December, 2024; originally announced December 2024.

    Comments: Project Page: https://fluxspace.github.io

  4. arXiv:2412.04460  [pdf, other]

    cs.CV

    LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors

    Authors: Yusuf Dalva, Yijun Li, Qing Liu, Nanxuan Zhao, Jianming Zhang, Zhe Lin, Pinar Yanardag

    Abstract: Large-scale diffusion models have achieved remarkable success in generating high-quality images from textual descriptions, gaining popularity across various applications. However, the generation of layered content, such as transparent images with foreground and background layers, remains an under-explored area. Layered content generation is crucial for creative workflows in fields like graphic des…

    Submitted 5 December, 2024; originally announced December 2024.

    Comments: Project page: https://layerfusion.github.io

  5. arXiv:2406.00457  [pdf, other]

    cs.CV

    The Curious Case of End Token: A Zero-Shot Disentangled Image Editing using CLIP

    Authors: Hidir Yesiltepe, Yusuf Dalva, Pinar Yanardag

    Abstract: Diffusion models have become prominent in creating high-quality images. However, unlike GAN models celebrated for their ability to edit images in a disentangled manner, diffusion-based text-to-image models struggle to achieve the same level of precise attribute manipulation without compromising image coherence. In this paper, CLIP which is often used in popular text-to-image diffusion models such…

    Submitted 1 June, 2024; originally announced June 2024.

  6. arXiv:2403.19645  [pdf, other]

    cs.CV

    GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models

    Authors: Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag

    Abstract: The rapid advancement in image generation models has predominantly been driven by diffusion models, which have demonstrated unparalleled success in generating high-fidelity, diverse images from textual prompts. Despite their success, diffusion models encounter substantial challenges in the domain of image editing, particularly in executing disentangled edits: changes that target specific attributes…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Project page: https://gantastic.github.io

  7. arXiv:2312.05390  [pdf, other]

    cs.CV

    NoiseCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions in Diffusion Models

    Authors: Yusuf Dalva, Pinar Yanardag

    Abstract: Generative models have been very popular in recent years for their image generation capabilities. GAN-based models are highly regarded for their disentangled latent space, which is a key feature contributing to their success in controlled image editing. On the other hand, diffusion models have emerged as powerful tools for generating high-quality images. However, the latent space of diffusion…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Project page: https://noiseclr.github.io/

  8. arXiv:2303.03471  [pdf, other]

    cs.CV

    Refining 3D Human Texture Estimation from a Single Image

    Authors: Said Fahri Altindis, Adil Meric, Yusuf Dalva, Ugur Gudukbay, Aysegul Dundar

    Abstract: Estimating 3D human texture from a single image is essential in graphics and vision. It requires learning a mapping function from input images of humans with diverse poses into the parametric (UV) space and reasonably hallucinating invisible parts. To achieve a high-quality 3D human texture estimation, we propose a framework that adaptively samples the input by a deformable convolution where offse…

    Submitted 6 March, 2023; originally announced March 2023.

  9. Image-to-Image Translation with Disentangled Latent Vectors for Face Editing

    Authors: Yusuf Dalva, Hamza Pehlivan, Cansu Moran, Öykü Irmak Hatipoğlu, Ayşegül Dündar

    Abstract: We propose an image-to-image translation framework for facial attribute editing with disentangled interpretable latent directions. The facial attribute editing task faces the challenges of targeted attribute editing with controllable strength and disentanglement in the representations of attributes to preserve the other attributes during edits. For this goal, inspired by the latent space factorization…

    Submitted 28 March, 2025; v1 submitted 11 January, 2023; originally announced January 2023.

    Comments: See https://yusufdalva.github.io/vecgan for the project webpage. arXiv admin note: substantial text overlap with arXiv:2207.03411

    Journal ref: IEEE Trans. Pattern Anal. Mach. Intell. 45 (2023) 14777-14788

  10. arXiv:2212.14359  [pdf, other]

    cs.CV

    StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN

    Authors: Hamza Pehlivan, Yusuf Dalva, Aysegul Dundar

    Abstract: We present a novel image inversion framework and a training pipeline to achieve high-fidelity image inversion with high-quality attribute editing. Inverting real images into StyleGAN's latent space is an extensively studied problem, yet the trade-off between the image reconstruction fidelity and image editing quality remains an open challenge. The low-rate latent spaces are limited in their expres…

    Submitted 29 December, 2022; originally announced December 2022.

  11. arXiv:2207.03411  [pdf, other]

    cs.CV cs.AI cs.LG

    VecGAN: Image-to-Image Translation with Interpretable Latent Directions

    Authors: Yusuf Dalva, Said Fahri Altindis, Aysegul Dundar

    Abstract: We propose VecGAN, an image-to-image translation framework for facial attribute editing with interpretable latent directions. The facial attribute editing task faces the challenges of precise attribute editing with controllable strength and preservation of the other attributes of an image. For this goal, we design the attribute editing by latent space factorization and for each attribute, we learn a l…

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: ECCV 2022

  12. Benchmarking the Robustness of Instance Segmentation Models

    Authors: Yusuf Dalva, Hamza Pehlivan, Said Fahri Altindis, Aysegul Dundar

    Abstract: This paper presents a comprehensive evaluation of instance segmentation models with respect to real-world image corruptions as well as out-of-domain image collections, e.g. images captured by a different set-up than the training dataset. The out-of-domain image evaluation shows the generalization capability of models, an essential aspect of real-world applications and an extensively studied topic…

    Submitted 28 March, 2025; v1 submitted 2 September, 2021; originally announced September 2021.

    Journal ref: IEEE Trans. Neural. Netw. Learn. Syst. 2024 Dec;35(12):17021-17035