
Showing 1–7 of 7 results for author: Nauen, T C

  1. arXiv:2503.09399

    cs.CV cs.AI cs.LG

    ForAug: Recombining Foregrounds and Backgrounds to Improve Vision Transformer Training with Bias Mitigation

    Authors: Tobias Christian Nauen, Brian Moser, Federico Raue, Stanislav Frolov, Andreas Dengel

    Abstract: Transformers, particularly Vision Transformers (ViTs), have achieved state-of-the-art performance in large-scale image classification. However, they often require large amounts of data and can exhibit biases that limit their robustness and generalizability. This paper introduces ForAug, a novel data augmentation scheme that addresses these challenges and explicitly includes inductive biases, which…

    Submitted 12 March, 2025; originally announced March 2025.

    MSC Class: 68T45 ACM Class: I.2.10; I.2.6; I.4.6

  2. arXiv:2411.12115

    cs.CV cs.AI cs.LG

    Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning

    Authors: Brian B. Moser, Federico Raue, Tobias C. Nauen, Stanislav Frolov, Andreas Dengel

    Abstract: Dataset distillation has gained significant interest in recent years, yet existing approaches typically distill from the entire dataset, potentially including non-beneficial samples. We introduce a novel "Prune First, Distill After" framework that systematically prunes datasets via loss-based sampling prior to distillation. By leveraging pruning before classical distillation techniques and generat…

    Submitted 18 November, 2024; originally announced November 2024.

  3. arXiv:2411.12073

    cs.CV cs.AI cs.LG

    Just Leaf It: Accelerating Diffusion Classifiers with Hierarchical Class Pruning

    Authors: Arundhati S. Shanbhag, Brian B. Moser, Tobias C. Nauen, Stanislav Frolov, Federico Raue, Andreas Dengel

    Abstract: Diffusion models, celebrated for their generative capabilities, have recently demonstrated surprising effectiveness in image classification tasks by using Bayes' theorem. Yet, current diffusion classifiers must evaluate every label candidate for each input, creating high computational costs that impede their use in large-scale applications. To address this limitation, we propose a Hierarchical Dif…

    Submitted 7 March, 2025; v1 submitted 18 November, 2024; originally announced November 2024.

  4. arXiv:2411.12072

    cs.CV cs.AI cs.LG cs.MM

    Zoomed In, Diffused Out: Towards Local Degradation-Aware Multi-Diffusion for Extreme Image Super-Resolution

    Authors: Brian B. Moser, Stanislav Frolov, Tobias C. Nauen, Federico Raue, Andreas Dengel

    Abstract: Large-scale, pre-trained Text-to-Image (T2I) diffusion models have gained significant popularity in image generation tasks and have shown unexpected potential in image Super-Resolution (SR). However, most existing T2I diffusion models are trained with a resolution limit of 512x512, making scaling beyond this resolution an unresolved but necessary challenge for image SR. In this work, we introduce…

    Submitted 18 November, 2024; originally announced November 2024.

  5. arXiv:2411.10231

    cs.CV cs.AI cs.LG cs.MM

    A Low-Resolution Image is Worth 1x1 Words: Enabling Fine Image Super-Resolution with Transformers and TaylorShift

    Authors: Sanath Budakegowdanadoddi Nagaraju, Brian Bernhard Moser, Tobias Christian Nauen, Stanislav Frolov, Federico Raue, Andreas Dengel

    Abstract: Transformer-based Super-Resolution (SR) models have recently advanced image reconstruction quality, yet challenges remain due to computational complexity and an over-reliance on large patch sizes, which constrain fine-grained detail enhancement. In this work, we propose TaylorIR to address these limitations by utilizing a patch size of 1x1, enabling pixel-level processing in any transformer-based…

    Submitted 15 November, 2024; originally announced November 2024.

  6. TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax

    Authors: Tobias Christian Nauen, Sebastian Palacio, Andreas Dengel

    Abstract: The quadratic complexity of the attention mechanism represents one of the biggest hurdles for processing long sequences using Transformers. Current methods, relying on sparse representations or stateful recurrence, sacrifice token-to-token interactions, which ultimately leads to compromises in performance. This paper introduces TaylorShift, a novel reformulation of the Taylor softmax that enables…

    Submitted 17 July, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    MSC Class: 68T07 ACM Class: I.5.1; I.2.10; I.2.7

  7. arXiv:2308.09372

    cs.CV cs.AI cs.LG

    Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers

    Authors: Tobias Christian Nauen, Sebastian Palacio, Federico Raue, Andreas Dengel

    Abstract: Self-attention in Transformers comes with a high computational cost because of their quadratic computational complexity, but their effectiveness in addressing problems in language and vision has sparked extensive research aimed at enhancing their efficiency. However, diverse experimental conditions, spanning multiple input domains, prevent a fair comparison based solely on reported results, posing…

    Submitted 24 February, 2025; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: v3: new models, analysis of scaling behaviors; v4: WACV 2025 camera ready version, appendix added

    MSC Class: 68T07 ACM Class: I.4.0; I.2.10; I.5.1