Showing 1–50 of 77 results for author: Keuper, M

Searching in archive cs.
  1. arXiv:2505.18015  [pdf, ps, other]

    cs.CV cs.LG

    SemSegBench & DetecBench: Benchmarking Reliability and Generalization Beyond Classification

    Authors: Shashank Agnihotri, David Schader, Jonas Jakubassa, Nico Sharei, Simon Kral, Mehmet Ege Kaçar, Ruben Weber, Margret Keuper

    Abstract: Reliability and generalization in deep learning are predominantly studied in the context of image classification. Yet, real-world applications in safety-critical domains involve a broader set of semantic tasks, such as semantic segmentation and object detection, which come with a diverse set of dedicated model architectures. To facilitate research towards robust model design in segmentation and de…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: First seven listed authors have equal contribution. GitHub: https://github.com/shashankskagnihotri/benchmarking_reliability_generalization. arXiv admin note: text overlap with arXiv:2505.05091

  2. arXiv:2505.12803  [pdf, ps, other]

    cs.CV cs.LG

    Informed Mixing -- Improving Open Set Recognition via Attribution-based Augmentation

    Authors: Jiawen Xu, Odej Kao, Margret Keuper

    Abstract: Open set recognition (OSR) is devised to address the problem of detecting novel classes during model inference. Even in recent vision models, this remains an open issue which is receiving increasing attention. Thereby, a crucial challenge is to learn features that are relevant for unseen categories from given data, for which these features might not be discriminative. To facilitate this process an…

    Submitted 19 May, 2025; originally announced May 2025.

  3. arXiv:2505.11314  [pdf, ps, other]

    cs.CV cs.CL

    CROC: Evaluating and Training T2I Metrics with Pseudo- and Human-Labeled Contrastive Robustness Checks

    Authors: Christoph Leiter, Yuki M. Asano, Margret Keuper, Steffen Eger

    Abstract: The assessment of evaluation metrics (meta-evaluation) is crucial for determining the suitability of existing metrics in text-to-image (T2I) generation tasks. Human-based meta-evaluation is costly and time-intensive, and automated alternatives are scarce. We address this gap and propose CROC: a scalable framework for automated Contrastive Robustness Checks that systematically probes and quantifies…

    Submitted 16 May, 2025; originally announced May 2025.

    Comments: preprint

  4. arXiv:2505.09368  [pdf, ps, other]

    cs.CV cs.LG

    RobustSpring: Benchmarking Robustness to Image Corruptions for Optical Flow, Scene Flow and Stereo

    Authors: Jenny Schmalfuss, Victor Oei, Lukas Mehl, Madlen Bartsch, Shashank Agnihotri, Margret Keuper, Andrés Bruhn

    Abstract: Standard benchmarks for optical flow, scene flow, and stereo vision algorithms generally focus on model accuracy rather than robustness to image corruptions like noise or rain. Hence, the resilience of models to such real-world perturbations is largely unquantified. To address this, we present RobustSpring, a comprehensive dataset and benchmark for evaluating robustness to image corruptions for op…

    Submitted 14 May, 2025; originally announced May 2025.

  5. arXiv:2505.05091  [pdf, ps, other]

    cs.CV cs.LG

    DispBench: Benchmarking Disparity Estimation to Synthetic Corruptions

    Authors: Shashank Agnihotri, Amaan Ansari, Annika Dackermann, Fabian Rösch, Margret Keuper

    Abstract: Deep learning (DL) has surpassed human performance on standard benchmarks, driving its widespread adoption in computer vision tasks. One such task is disparity estimation, estimating the disparity between matching pixels in stereo image pairs, which is crucial for safety-critical applications like medical surgeries and autonomous navigation. However, DL-based disparity estimation methods are highl…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: Accepted at CVPR 2025 Workshop on Synthetic Data for Computer Vision

  6. arXiv:2505.04835  [pdf, ps, other]

    cs.CV

    Are Synthetic Corruptions A Reliable Proxy For Real-World Corruptions?

    Authors: Shashank Agnihotri, David Schader, Nico Sharei, Mehmet Ege Kaçar, Margret Keuper

    Abstract: Deep learning (DL) models are widely used in real-world applications but remain vulnerable to distribution shifts, especially due to weather and lighting changes. Collecting diverse real-world data for testing the robustness of DL models is resource-intensive, making synthetic corruptions an attractive alternative for robustness testing. However, are synthetic corruptions a reliable proxy for real…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: Accepted at CVPR 2025 Workshop on Synthetic Data for Computer Vision

  7. arXiv:2505.03569  [pdf, other]

    cs.CV

    Corner Cases: How Size and Position of Objects Challenge ImageNet-Trained Models

    Authors: Mishal Fatima, Steffen Jung, Margret Keuper

    Abstract: Backgrounds in images play a major role in contributing to spurious correlations among different data points. Owing to aesthetic preferences of humans capturing the images, datasets can exhibit positional (location of the object within a given frame) and size (region-of-interest to image ratio) biases for different classes. In this paper, we show that these biases can impact how much a model relie…

    Submitted 6 May, 2025; originally announced May 2025.

  8. arXiv:2504.18510  [pdf, other]

    cs.CV

    Examining the Impact of Optical Aberrations to Image Classification and Object Detection Models

    Authors: Patrick Müller, Alexander Braun, Margret Keuper

    Abstract: Deep neural networks (DNNs) have proven to be successful in various computer vision applications such that models even infer in safety-critical situations. Therefore, vision models have to behave in a robust way to disturbances such as noise or blur. While seminal benchmarks exist to evaluate model robustness to diverse corruptions, blur is often approximated in an overly simplistic way to model d…

    Submitted 25 April, 2025; originally announced April 2025.

    Comments: v1.0

  9. arXiv:2503.11509  [pdf, other]

    cs.CL cs.CV

    TikZero: Zero-Shot Text-Guided Graphics Program Synthesis

    Authors: Jonas Belouadi, Eddy Ilg, Margret Keuper, Hideki Tanaka, Masao Utiyama, Raj Dabre, Steffen Eger, Simone Paolo Ponzetto

    Abstract: With the rise of generative AI, synthesizing figures from text captions becomes a compelling application. However, achieving high geometric precision and editability requires representing figures as graphics programs in languages like TikZ, and aligned training data (i.e., graphics programs with captions) remains scarce. Meanwhile, large amounts of unaligned graphics programs and captioned raster…

    Submitted 19 March, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

    Comments: Project page: https://github.com/potamides/DeTikZify

  10. arXiv:2503.09361  [pdf, other]

    cs.CV cs.SI

    Deep Learning for Climate Action: Computer Vision Analysis of Visual Narratives on X

    Authors: Katharina Prasse, Marcel Kleinmann, Inken Adam, Kerstin Beckersjuergen, Andreas Edte, Jona Frroku, Timotheus Gumpp, Steffen Jung, Isaac Bravo, Stefanie Walter, Margret Keuper

    Abstract: Climate change is one of the most pressing challenges of the 21st century, sparking widespread discourse across social media platforms. Activists, policymakers, and researchers seek to understand public sentiment and narratives while access to social media data has become increasingly restricted in the post-API era. In this study, we analyze a dataset of climate change-related tweets from X (forme…

    Submitted 12 March, 2025; originally announced March 2025.

  11. arXiv:2502.15798  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    MaxSup: Overcoming Representation Collapse in Label Smoothing

    Authors: Yuxuan Zhou, Heng Li, Zhi-Qi Cheng, Xudong Yan, Yifei Dong, Mario Fritz, Margret Keuper

    Abstract: Label Smoothing (LS) is widely adopted to reduce overconfidence in neural network predictions and improve generalization. Despite these benefits, recent studies reveal two critical issues with LS. First, LS induces overconfidence in misclassified samples. Second, it compacts feature representations into overly tight clusters, diluting intra-class diversity, although the precise cause of this pheno…

    Submitted 2 June, 2025; v1 submitted 18 February, 2025; originally announced February 2025.

    Comments: 24 pages, 15 tables, 5 figures. Preliminary work under review. Do not distribute

  12. arXiv:2412.11576  [pdf, other]

    cs.CV

    DCBM: Data-Efficient Visual Concept Bottleneck Models

    Authors: Katharina Prasse, Patrick Knab, Sascha Marton, Christian Bartelt, Margret Keuper

    Abstract: Concept Bottleneck Models (CBMs) enhance the interpretability of neural networks by basing predictions on human-understandable concepts. However, current CBMs typically rely on concept sets extracted from large language models or extensive image corpora, limiting their effectiveness in data-sparse scenarios. We propose Data-efficient CBMs (DCBMs), which reduce the need for large sample sizes durin…

    Submitted 4 February, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

  13. arXiv:2412.01296  [pdf, other]

    cs.CV

    I Spy With My Little Eye: A Minimum Cost Multicut Investigation of Dataset Frames

    Authors: Katharina Prasse, Isaac Bravo, Stefanie Walter, Margret Keuper

    Abstract: Visual framing analysis is a key method in social sciences for determining common themes and concepts in a given discourse. To reduce manual effort, image clustering can significantly speed up the annotation process. In this work, we phrase the clustering task as a Minimum Cost Multicut Problem [MP]. Solutions to the MP have been shown to provide clusterings that maximize the posterior probability…

    Submitted 2 December, 2024; originally announced December 2024.

    Comments: WACV25 applications track

  14. arXiv:2411.19853  [pdf, other]

    cs.LG cs.CV

    Towards Class-wise Robustness Analysis

    Authors: Tejaswini Medi, Julia Grabinski, Margret Keuper

    Abstract: While being very successful in solving many downstream tasks, the application of deep neural networks is limited in real-life scenarios because of their susceptibility to domain shifts such as common corruptions, and adversarial attacks. The existence of adversarial examples and data corruption significantly reduces the performance of deep classification models. Researchers have made strides in de…

    Submitted 13 March, 2025; v1 submitted 29 November, 2024; originally announced November 2024.

  15. arXiv:2411.19037  [pdf, other]

    cs.CV

    3D-WAG: Hierarchical Wavelet-Guided Autoregressive Generation for High-Fidelity 3D Shapes

    Authors: Tejaswini Medi, Arianna Rampini, Pradyumna Reddy, Pradeep Kumar Jayaraman, Margret Keuper

    Abstract: Autoregressive (AR) models have achieved remarkable success in natural language and image generation, but their application to 3D shape modeling remains largely unexplored. Unlike diffusion models, AR models enable more efficient and controllable generation with faster inference times, making them especially suitable for data-intensive domains. Traditional 3D generative models using AR approaches…

    Submitted 28 November, 2024; originally announced November 2024.

  16. arXiv:2410.23142  [pdf, other]

    cs.LG cs.CV

    FAIR-TAT: Improving Model Fairness Using Targeted Adversarial Training

    Authors: Tejaswini Medi, Steffen Jung, Margret Keuper

    Abstract: Deep neural networks are susceptible to adversarial attacks and common corruptions, which undermine their robustness. In order to enhance model resilience against such challenges, Adversarial Training (AT) has emerged as a prominent solution. Nevertheless, adversarial robustness is often attained at the expense of model fairness during AT, i.e., disparity in class-wise robustness of the model. Whi…

    Submitted 20 January, 2025; v1 submitted 30 October, 2024; originally announced October 2024.

  17. arXiv:2410.14470  [pdf, other]

    cs.CV cs.AI cs.LG

    How Do Training Methods Influence the Utilization of Vision Models?

    Authors: Paul Gavrikov, Shashank Agnihotri, Margret Keuper, Janis Keuper

    Abstract: Not all learnable parameters (e.g., weights) contribute equally to a neural network's decision function. In fact, entire layers' parameters can sometimes be reset to random values with little to no impact on the model's decisions. We revisit earlier studies that examined how architecture and task complexity influence this phenomenon and ask: is this phenomenon also affected by how we train the mod…

    Submitted 18 October, 2024; originally announced October 2024.

    Comments: Accepted at the Interpretable AI: Past, Present and Future Workshop at NeurIPS 2024

  18. arXiv:2408.13586  [pdf, other]

    cs.CL cs.AI

    Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation

    Authors: Yuxuan Zhou, Margret Keuper, Mario Fritz

    Abstract: Sampling-based decoding strategies have been widely adopted for Large Language Models (LLMs) in numerous applications, targeting a balance between diversity and quality via temperature tuning and tail truncation. Considering the strong dependency of the candidate next tokens on different prefixes, recent studies propose to adaptively truncate the tail of LLMs' predicted distribution. Although impr…

    Submitted 7 January, 2025; v1 submitted 24 August, 2024; originally announced August 2024.

  19. arXiv:2407.03482  [pdf, other]

    cs.CV cs.AI cs.LG

    Domain-Aware Fine-Tuning of Foundation Models

    Authors: Ugur Ali Kaplan, Margret Keuper, Anna Khoreva, Dan Zhang, Yumeng Li

    Abstract: Foundation models (FMs) have revolutionized computer vision, enabling effective learning across different domains. However, their performance under domain shift is yet underexplored. This paper investigates the zero-shot domain adaptation potential of FMs by comparing different backbone architectures and introducing novel domain-aware components that leverage domain related textual embeddings. We…

    Submitted 10 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted at ICML 2024 Workshop on Foundation Models in the Wild

  20. arXiv:2406.07435  [pdf, other]

    cs.CV cs.LG eess.IV

    Beware of Aliases -- Signal Preservation is Crucial for Robust Image Restoration

    Authors: Shashank Agnihotri, Julia Grabinski, Janis Keuper, Margret Keuper

    Abstract: Image restoration networks are usually comprised of an encoder and a decoder, responsible for aggregating image content from noisy, distorted data and to restore clean, undistorted images, respectively. Data aggregation as well as high-resolution image generation both usually come at the risk of involving aliases, i.e., standard architectures put their ability to reconstruct the model input in jeop…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Tags: Adversarial attack, image restoration, image deblurring, frequency sampling

  21. arXiv:2406.01189  [pdf, other]

    cs.LG cs.AI

    MultiMax: Sparse and Multi-Modal Attention Learning

    Authors: Yuxuan Zhou, Mario Fritz, Margret Keuper

    Abstract: SoftMax is a ubiquitous ingredient of modern machine learning algorithms. It maps an input vector onto a probability simplex and reweights the input by concentrating the probability mass at large entries. Yet, as a smooth approximation to the Argmax function, a significant amount of probability mass is distributed to other, residual entries, leading to poor interpretability and noise. Although spa…

    Submitted 8 January, 2025; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024

  22. arXiv:2403.13501  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis

    Authors: Yumeng Li, William Beluch, Margret Keuper, Dan Zhang, Anna Khoreva

    Abstract: Despite tremendous progress in the field of text-to-video (T2V) synthesis, open-sourced T2V diffusion models struggle to generate longer videos with dynamically varying and evolving content. They tend to synthesize quasi-static videos, ignoring the necessary visual change-over-time implied in the text prompt. At the same time, scaling these models to enable longer, more dynamic video synthesis oft…

    Submitted 18 March, 2025; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted at ICLR 2025. Code: https://github.com/boschresearch/VSTAR and project page: https://yumengli007.github.io/VSTAR

  23. arXiv:2403.09193  [pdf, other]

    cs.CV cs.AI cs.LG q-bio.NC

    Can We Talk Models Into Seeing the World Differently?

    Authors: Paul Gavrikov, Jovita Lukasik, Steffen Jung, Robert Geirhos, M. Jehanzeb Mirza, Margret Keuper, Janis Keuper

    Abstract: Unlike traditional vision-only models, vision language models (VLMs) offer an intuitive way to access visual content through language prompting by combining a large language model (LLM) with a vision encoder. However, both the LLM and the vision encoder come with their own set of biases, cue preferences, and shortcuts, which have been rigorously studied in uni-modal models. A timely question is ho…

    Submitted 5 March, 2025; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted at ICLR 2025

  24. arXiv:2401.08815  [pdf, other]

    cs.CV cs.AI cs.LG

    Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive

    Authors: Yumeng Li, Margret Keuper, Dan Zhang, Anna Khoreva

    Abstract: Despite the recent advances in large-scale diffusion models, little progress has been made on the layout-to-image (L2I) synthesis task. Current L2I models either suffer from poor editability via text or weak alignment between the generated image and the input layout. This limits their usability in practice. To mitigate this, we propose to integrate adversarial supervision into the conventional tra…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted at ICLR 2024. Project page: https://yumengli007.github.io/ALDM/ and code: https://github.com/boschresearch/ALDM

  25. arXiv:2311.17524  [pdf, other]

    cs.CV

    Improving Feature Stability during Upsampling -- Spectral Artifacts and the Importance of Spatial Context

    Authors: Shashank Agnihotri, Julia Grabinski, Margret Keuper

    Abstract: Pixel-wise predictions are required in a wide variety of tasks such as image restoration, image segmentation, or disparity estimation. Common models involve several stages of data resampling, in which the resolution of feature maps is first reduced to aggregate information and then increased to generate a high-resolution output. Previous works have shown that resampling operations are subject to a…

    Submitted 12 July, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted at ECCV 2024

  26. arXiv:2308.15499  [pdf, other]

    cs.CV

    Classification robustness to common optical aberrations

    Authors: Patrick Müller, Alexander Braun, Margret Keuper

    Abstract: Computer vision using deep neural networks (DNNs) has brought about seminal changes in people's lives. Applications range from automotive, face recognition in the security industry, to industrial process monitoring. In some cases, DNNs infer even in safety-critical situations. Therefore, for practical applications, DNNs have to behave in a robust way to disturbances such as noise, pixelation, or b…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: ICCVW2023

  27. Local Spherical Harmonics Improve Skeleton-Based Hand Action Recognition

    Authors: Katharina Prasse, Steffen Jung, Yuxuan Zhou, Margret Keuper

    Abstract: Hand action recognition is essential. Communication, human-robot interactions, and gesture control are dependent on it. Skeleton-based action recognition traditionally includes hands, which belong to the classes which remain challenging to correctly recognize to date. We propose a method specifically designed for hand action recognition which uses relative angular embeddings and local Spherical Ha…

    Submitted 14 November, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

  28. arXiv:2307.13856  [pdf, other]

    cs.CV cs.LG eess.IV

    On the unreasonable vulnerability of transformers for image restoration -- and an easy fix

    Authors: Shashank Agnihotri, Kanchana Vaishnavi Gandikota, Julia Grabinski, Paramanand Chandramouli, Margret Keuper

    Abstract: Following their success in visual recognition tasks, Vision Transformers (ViTs) are being increasingly employed for image restoration. As a few recent works claim that ViTs for image classification also have better robustness properties, we investigate whether the improved adversarial robustness of ViTs extends to image restoration. We consider the recently proposed Restormer model, as well as NAFN…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: Tags: Robustness, adversarial attacks, image deblurring, image restoration, NAFNet, Baseline, Restormer, adversarial training

  29. arXiv:2307.10864  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Divide & Bind Your Attention for Improved Generative Semantic Nursing

    Authors: Yumeng Li, Margret Keuper, Dan Zhang, Anna Khoreva

    Abstract: Emerging large-scale text-to-image generative models, e.g., Stable Diffusion (SD), have exhibited overwhelming results with high fidelity. Despite the magnificent progress, current state-of-the-art models still struggle to generate images fully adhering to the input prompt. Prior work, Attend & Excite, has introduced the concept of Generative Semantic Nursing (GSN), aiming to optimize cross-attent…

    Submitted 14 July, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted at BMVC 2023 as Oral. Code: https://github.com/boschresearch/Divide-and-Bind and project page: https://sites.google.com/view/divide-and-bind

  30. arXiv:2307.10001  [pdf, other]

    cs.CV

    As large as it gets: Learning infinitely large Filters via Neural Implicit Functions in the Fourier Domain

    Authors: Julia Grabinski, Janis Keuper, Margret Keuper

    Abstract: Recent work in neural networks for image classification has seen a strong tendency towards increasing the spatial context. Whether achieved through large convolution kernels or self-attention, models scale poorly with the increased spatial context, such that the improved model accuracy often comes at significant costs. In this paper, we propose a module for studying the effective filter size of co…

    Submitted 15 May, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: accepted at TMLR 05/24

  31. arXiv:2307.09804  [pdf, other]

    cs.CV eess.IV

    Fix your downsampling ASAP! Be natively more robust via Aliasing and Spectral Artifact free Pooling

    Authors: Julia Grabinski, Janis Keuper, Margret Keuper

    Abstract: Convolutional neural networks encode images through a sequence of convolutions, normalizations and non-linearities as well as downsampling operations into potentially strong semantic embeddings. Yet, previous work showed that even slight mistakes during sampling, leading to aliasing, can be directly attributed to the networks' lack in robustness. To address such issues and facilitate simpler and f…

    Submitted 19 July, 2023; originally announced July 2023.

  32. arXiv:2307.09365  [pdf, other]

    cs.LG cs.CV

    An Evaluation of Zero-Cost Proxies -- from Neural Architecture Performance to Model Robustness

    Authors: Jovita Lukasik, Michael Moeller, Margret Keuper

    Abstract: Zero-cost proxies are nowadays frequently studied and used to search for neural architectures. They show an impressive ability to predict the performance of architectures by making use of their untrained weights. These techniques allow for immense search speed-ups. So far the joint search for well-performing and robust architectures has received much less attention in the field of NAS. Therefore,…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: Accepted at DAGM GCPR 2023

  33. arXiv:2307.00648  [pdf, other]

    cs.CV cs.AI cs.LG

    Intra- & Extra-Source Exemplar-Based Style Synthesis for Improved Domain Generalization

    Authors: Yumeng Li, Dan Zhang, Margret Keuper, Anna Khoreva

    Abstract: The generalization with respect to domain shifts, as they frequently appear in applications such as autonomous driving, is one of the remaining big challenges for deep learning models. Therefore, we propose an exemplar-based style synthesis pipeline to improve domain generalization in semantic segmentation. Our method is based on a novel masked noise encoder for StyleGAN2 inversion. The model lear…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: An extended version of the accepted WACV paper arXiv:2210.10175

  34. arXiv:2306.06712  [pdf, other]

    cs.LG cs.CV

    Neural Architecture Design and Robustness: A Dataset

    Authors: Steffen Jung, Jovita Lukasik, Margret Keuper

    Abstract: Deep learning models have proven to be successful in a wide range of machine learning tasks. Yet, they are often highly sensitive to perturbations on the input data which can lead to incorrect decisions with high confidence, hampering their deployment for practical use-cases. Thus, finding architectures that are (more) robust against perturbations has received much attention in recent years. Just…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: ICLR 2023; project page: http://robustness.vision/

  35. arXiv:2306.06684  [pdf, other]

    cs.CV

    Happy People -- Image Synthesis as Black-Box Optimization Problem in the Discrete Latent Space of Deep Generative Models

    Authors: Steffen Jung, Jan Christian Schwedhelm, Claudia Schillings, Margret Keuper

    Abstract: In recent years, optimization in the learned latent space of deep generative models has been successfully applied to black-box optimization problems such as drug design, image generation or neural architecture search. Existing models thereby leverage the ability of neural models to learn the data distribution from a limited amount of samples such that new samples from the distribution can be drawn…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: CVPR 2023 workshop: Generative Models for Computer Vision

  36. arXiv:2304.14736  [pdf, other]

    cs.CV

    Differentiable Sensor Layouts for End-to-End Learning of Task-Specific Camera Parameters

    Authors: Hendrik Sommerhoff, Shashank Agnihotri, Mohamed Saleh, Michael Moeller, Margret Keuper, Andreas Kolb

    Abstract: The success of deep learning is frequently described as the ability to train all parameters of a network on a specific application in an end-to-end fashion. Yet, several design choices on the camera level, including the pixel layout of the sensor, are considered as pre-defined and fixed, and high resolution, regular pixel layouts are considered to be the most generic ones in computer vision and gr…

    Submitted 28 April, 2023; originally announced April 2023.

  37. arXiv:2303.12669  [pdf, other]

    cs.CV cs.AI cs.LG

    An Extended Study of Human-like Behavior under Adversarial Training

    Authors: Paul Gavrikov, Janis Keuper, Margret Keuper

    Abstract: Neural networks have a number of shortcomings. Amongst the severest ones is the sensitivity to distribution shifts which allows models to be easily fooled into wrong predictions by small perturbations to inputs that are often imperceivable to humans and do not have to carry semantic meaning. Adversarial training poses a partial solution to address this issue by training models on worst-case pertur…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: 6 pages, accepted at the CVPR 2023 Workshop "The 3rd Workshop of Adversarial Machine Learning on Computer Vision: Art of Robustness"

  38. arXiv:2303.11235  [pdf, other]

    cs.CV

    FullFormer: Generating Shapes Inside Shapes

    Authors: Tejaswini Medi, Jawad Tayyub, Muhammad Sarmad, Frank Lindseth, Margret Keuper

    Abstract: Implicit generative models have been widely employed to model 3D data and have recently proven to be successful in encoding and generating high-quality 3D shapes. This work builds upon these models and alleviates current limitations by presenting the first implicit generative model that facilitates the generation of complex 3D shapes with rich internal geometric details. To achieve this, our model…

    Submitted 20 March, 2023; originally announced March 2023.

  39. arXiv:2302.02213  [pdf, other]

    cs.CV

    CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks

    Authors: Shashank Agnihotri, Steffen Jung, Margret Keuper

    Abstract: While neural networks allow highly accurate predictions in many tasks, their lack of robustness towards even slight input perturbations often hampers their deployment. Adversarial attacks such as the seminal projected gradient descent (PGD) offer an effective means to evaluate a model's robustness and dedicated solutions have been proposed for attacks on semantic segmentation or optical flow estim…

    Submitted 5 July, 2024; v1 submitted 4 February, 2023; originally announced February 2023.

    Comments: Accepted at 41st International Conference on Machine Learning (ICML), 2024

  40. arXiv:2212.06776  [pdf, other]

    cs.CV cs.CR

    Unfolding Local Growth Rate Estimates for (Almost) Perfect Adversarial Detection

    Authors: Peter Lorenz, Margret Keuper, Janis Keuper

    Abstract: Convolutional neural networks (CNN) define the state-of-the-art solution on many perceptual tasks. However, current CNN approaches largely remain vulnerable against adversarial perturbations of the input that have been crafted specifically to fool the system while being quasi-imperceptible to the human eye. In recent years, various approaches have been proposed to defend CNNs against such attacks,…

    Submitted 1 March, 2024; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: accepted at VISAPP23

  41. arXiv:2211.09590  [pdf, other]

    cs.CV

    Hypergraph Transformer for Skeleton-based Action Recognition

    Authors: Yuxuan Zhou, Zhi-Qi Cheng, Chao Li, Yanwen Fang, Yifeng Geng, Xuansong Xie, Margret Keuper

    Abstract: Skeleton-based action recognition aims to recognize human actions given human joint coordinates with skeletal interconnections. By defining a graph with joints as vertices and their natural connections as edges, previous works successfully adopted Graph Convolutional networks (GCNs) to model joint co-occurrences and achieved superior performance. More recently, a limitation of GCNs is identified,…

    Submitted 21 March, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    ACM Class: I.2.10

  42. arXiv:2210.10175  [pdf, other]

    cs.CV cs.AI cs.LG

    Intra-Source Style Augmentation for Improved Domain Generalization

    Authors: Yumeng Li, Dan Zhang, Margret Keuper, Anna Khoreva

    Abstract: The generalization with respect to domain shifts, as they frequently appear in applications such as autonomous driving, is one of the remaining big challenges for deep learning models. Therefore, we propose an intra-source style augmentation (ISSA) method to improve domain generalization in semantic segmentation. Our method is based on a novel masked noise encoder for StyleGAN2 inversion. The mode…

    Submitted 29 May, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: Accepted at WACV 2023. Code is available at https://github.com/boschresearch/ISSA

  43. arXiv:2210.05938  [pdf, other]

    cs.CV

    Robust Models are less Over-Confident

    Authors: Julia Grabinski, Paul Gavrikov, Janis Keuper, Margret Keuper

    Abstract: Despite the success of convolutional neural networks (CNNs) in many academic benchmarks for computer vision tasks, their application in the real-world is still facing fundamental challenges. One of these open problems is the inherent lack of robustness, unveiled by the striking effectiveness of adversarial attacks. Current attack methods are able to manipulate the network's prediction by adding sp…

    Submitted 6 December, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: accepted at NeurIPS 2022

  44. arXiv:2206.07662  [pdf, other]

    cs.CV

    SP-ViT: Learning 2D Spatial Priors for Vision Transformers

    Authors: Yuxuan Zhou, Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Lei Zhang, Margret Keuper, Xiansheng Hua

    Abstract: Recently, transformers have shown great potential in image classification and established state-of-the-art results on the ImageNet benchmark. However, compared to CNNs, transformers converge slowly and are prone to overfitting in low-data regimes due to the lack of spatial inductive biases. Such spatial inductive biases can be especially beneficial since the 2D structure of an input image is not w…

    Submitted 15 June, 2022; originally announced June 2022.

    ACM Class: I.4

  45. arXiv:2204.01366  [pdf, other]

    cs.LG cs.CV

    Learning to solve Minimum Cost Multicuts efficiently using Edge-Weighted Graph Convolutional Neural Networks

    Authors: Steffen Jung, Margret Keuper

    Abstract: The minimum cost multicut problem is the NP-hard/APX-hard combinatorial optimization problem of partitioning a real-valued edge-weighted graph such as to minimize the total cost of the partition. While graph convolutional neural networks (GNN) have proven to be promising in the context of combinatorial optimization, most of them are only tailored to or tested on positive-valued edge weights, i.e.…

    Submitted 4 April, 2022; originally announced April 2022.

  46. arXiv:2204.00491  [pdf, other]

    cs.CV eess.IV

    FrequencyLowCut Pooling -- Plug & Play against Catastrophic Overfitting

    Authors: Julia Grabinski, Steffen Jung, Janis Keuper, Margret Keuper

    Abstract: Over the last years, Convolutional Neural Networks (CNNs) have been the dominating neural architecture in a wide range of computer vision tasks. From an image and signal processing point of view, this success might be a bit surprising as the inherent spatial pyramid design of most CNNs is apparently violating basic signal processing laws, i.e. Sampling Theorem in their down-sampling operations. Ho…

    Submitted 20 September, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

    Comments: accepted at ECCV 2022

  47. arXiv:2203.08734  [pdf, other]

    cs.LG cs.CV

    Learning Where To Look -- Generative NAS is Surprisingly Efficient

    Authors: Jovita Lukasik, Steffen Jung, Margret Keuper

    Abstract: The efficient, automated search for well-performing neural architectures (NAS) has drawn increasing attention in the recent past. Thereby, the predominant research objective is to reduce the necessity of costly evaluations of neural architectures while efficiently exploring large search spaces. To this aim, surrogate models embed architectures in a latent space and predict their performance, while…

    Submitted 1 August, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: Accepted to European Conference on Computer Vision 2022

  48. arXiv:2112.05416  [pdf, other]

    cs.CV

    Optimizing Edge Detection for Image Segmentation with Multicut Penalties

    Authors: Steffen Jung, Sebastian Ziegler, Amirhossein Kardoost, Margret Keuper

    Abstract: The Minimum Cost Multicut Problem (MP) is a popular way for obtaining a graph decomposition by optimizing binary edge labels over edge costs. While the formulation of a MP from independently estimated costs per edge is highly flexible and intuitive, solving the MP is NP-hard and time-expensive. As a remedy, recent work proposed to predict edge probabilities with awareness to potential conflicts by…

    Submitted 10 December, 2021; originally announced December 2021.

  49. arXiv:2112.01601  [pdf, other]

    cs.CV cs.CR

    Is RobustBench/AutoAttack a suitable Benchmark for Adversarial Robustness?

    Authors: Peter Lorenz, Dominik Strassel, Margret Keuper, Janis Keuper

    Abstract: Recently, RobustBench (Croce et al. 2020) has become a widely recognized benchmark for the adversarial robustness of image classification networks. In its most commonly reported sub-task, RobustBench evaluates and ranks the adversarial robustness of trained neural networks on CIFAR10 under AutoAttack (Croce and Hein 2020b) with l-inf perturbations limited to eps = 8/255. With leading scores of the…

    Submitted 20 February, 2024; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: AAAI-22 AdvML Workshop

  50. arXiv:2111.08785  [pdf, ps, other]

    cs.CV cs.CR

    Detecting AutoAttack Perturbations in the Frequency Domain

    Authors: Peter Lorenz, Paula Harder, Dominik Strassel, Margret Keuper, Janis Keuper

    Abstract: Recently, adversarial attacks on image classification networks by the AutoAttack (Croce and Hein, 2020b) framework have drawn a lot of attention. While AutoAttack has shown a very high attack success rate, most defense approaches are focusing on network hardening and robustness enhancements, like adversarial training. This way, the currently best-reported method can withstand about 66% of adversar…

    Submitted 20 February, 2024; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: accepted at ICML 2021 workshop for robustness