Showing 1–50 of 74 results for author: Kampffmeyer, M

  1. arXiv:2506.11764  [pdf, ps, other]

    cs.CV eess.IV

    DiffFuSR: Super-Resolution of all Sentinel-2 Multispectral Bands using Diffusion Models

    Authors: Muhammad Sarmad, Arnt-Børre Salberg, Michael Kampffmeyer

    Abstract: This paper presents DiffFuSR, a modular pipeline for super-resolving all 12 spectral bands of Sentinel-2 Level-2A imagery to a unified ground sampling distance (GSD) of 2.5 meters. The pipeline comprises two stages: (i) a diffusion-based super-resolution (SR) model trained on high-resolution RGB imagery from the NAIP and WorldStrat datasets, harmonized to simulate Sentinel-2 characteristics; and (…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: preprint under review

  2. arXiv:2505.21813  [pdf, other]

    cs.LG stat.ML

    Optimizing Data Augmentation through Bayesian Model Selection

    Authors: Madi Matymov, Ba-Hien Tran, Michael Kampffmeyer, Markus Heinonen, Maurizio Filippone

    Abstract: Data Augmentation (DA) has become an essential tool to improve robustness and generalization of modern machine learning. However, when deciding on DA strategies it is critical to choose parameters carefully, and this can be a daunting task which is traditionally left to trial-and-error or expensive optimization based on validation performance. In this paper, we counter these limitations by proposi…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: 26 pages, 3 figures

    MSC Class: 62F15; 68T07 (Primary) 62M45; 62C10; 65C60 (Secondary)

  3. arXiv:2505.01134  [pdf, other]

    cs.LG

    Aggregation of Dependent Expert Distributions in Multimodal Variational Autoencoders

    Authors: Rogelio A Mancisidor, Robert Jenssen, Shujian Yu, Michael Kampffmeyer

    Abstract: Multimodal learning with variational autoencoders (VAEs) requires estimating joint distributions to evaluate the evidence lower bound (ELBO). Current methods, the product and mixture of experts, aggregate single-modality distributions assuming independence for simplicity, which is an overoptimistic assumption. This research introduces a novel methodology for aggregating single-modality distributio…

    Submitted 2 May, 2025; originally announced May 2025.

  4. arXiv:2502.17860  [pdf, other]

    cs.CV

    UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting

    Authors: Haoyuan Li, Yanpeng Zhou, Tao Tang, Jifei Song, Yihan Zeng, Michael Kampffmeyer, Hang Xu, Xiaodan Liang

    Abstract: Recent advancements in multi-modal 3D pre-training methods have shown promising efficacy in learning joint representations of text, images, and point clouds. However, adopting point clouds as 3D representation fails to fully capture the intricacies of the 3D world and exhibits a noticeable gap between the discrete points and the dense 2D pixels of images. To tackle this issue, we propose UniGS, in…

    Submitted 27 February, 2025; v1 submitted 25 February, 2025; originally announced February 2025.

    Comments: ICLR 2025; Corrected citation of Uni3D

  5. arXiv:2412.08513  [pdf, other]

    cs.LG cs.AI

    REPEAT: Improving Uncertainty Estimation in Representation Learning Explainability

    Authors: Kristoffer K. Wickstrøm, Thea Brüsch, Michael C. Kampffmeyer, Robert Jenssen

    Abstract: Incorporating uncertainty is crucial to provide trustworthy explanations of deep learning models. Recent works have demonstrated how uncertainty modeling can be particularly important in the unsupervised field of representation learning explainable artificial intelligence (R-XAI). Current R-XAI methods provide uncertainty by measuring variability in the importance score. However, they fail to prov…

    Submitted 11 December, 2024; originally announced December 2024.

    Comments: Accepted at AAAI 2025. Code available at: https://github.com/Wickstrom/REPEAT

  6. arXiv:2410.10790  [pdf, other]

    cs.CV

    Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes

    Authors: Jianqi Chen, Panwen Hu, Xiaojun Chang, Zhenwei Shi, Michael Kampffmeyer, Xiaodan Liang

    Abstract: Recent advancements in human motion synthesis have focused on specific types of motions, such as human-scene interaction, locomotion or human-human interaction; however, there is a lack of a unified system capable of generating a diverse combination of motion types. In response, we introduce Sitcom-Crafter, a comprehensive and extendable system for human motion generation in 3D space, which can be…

    Submitted 13 February, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: Accepted by ICLR 2025. Project Page: https://windvchen.github.io/Sitcom-Crafter

  7. arXiv:2406.01494  [pdf, other]

    cs.CV cs.LG stat.ML

    Robust Classification by Coupling Data Mollification with Label Smoothing

    Authors: Markus Heinonen, Ba-Hien Tran, Michael Kampffmeyer, Maurizio Filippone

    Abstract: Introducing training-time augmentations is a key technique to enhance generalization and prepare deep neural networks against test-time corruptions. Inspired by the success of generative diffusion models, we propose a novel approach of coupling data mollification, in the form of image noising and blurring, with label smoothing to align predicted label confidences with image degradation. The method…

    Submitted 1 May, 2025; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: AISTATS 2025. Code: https://github.com/markusheinonen/supervised-mollification

  8. arXiv:2405.00448  [pdf, other]

    cs.CV

    MMTryon: Multi-Modal Multi-Reference Control for High-Quality Fashion Generation

    Authors: Xujie Zhang, Ente Lin, Xiu Li, Yuxuan Luo, Michael Kampffmeyer, Xin Dong, Xiaodan Liang

    Abstract: This paper introduces MMTryon, a multi-modal multi-reference VIrtual Try-ON (VITON) framework, which can generate high-quality compositional try-on results by taking a text instruction and multiple garment images as inputs. Our MMTryon addresses three problems overlooked in prior literature: 1) Support of multiple try-on items. Existing methods are commonly designed for single-item try-on tasks (e…

    Submitted 20 November, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  9. arXiv:2403.13870  [pdf, other]

    cs.CV cs.LG

    ExMap: Leveraging Explainability Heatmaps for Unsupervised Group Robustness to Spurious Correlations

    Authors: Rwiddhi Chakraborty, Adrian Sletten, Michael Kampffmeyer

    Abstract: Group robustness strategies aim to mitigate learned biases in deep learning models that arise from spurious correlations present in their training datasets. However, most existing methods rely on the access to the label distribution of the groups, which is time-consuming and expensive to obtain. As a result, unsupervised group robustness strategies are sought. Based on the insight that a trained m…

    Submitted 20 March, 2024; originally announced March 2024.

  10. arXiv:2312.07822  [pdf, other]

    cs.LG cs.AI

    Prototypical Self-Explainable Models Without Re-training

    Authors: Srishti Gautam, Ahcene Boubekki, Marina M. C. Höhne, Michael C. Kampffmeyer

    Abstract: Explainable AI (XAI) has unfolded in two distinct research directions with, on the one hand, post-hoc methods that explain the predictions of a pre-trained black-box model and, on the other hand, self-explainable models (SEMs) which are trained directly to provide explanations alongside their predictions. While the latter is preferred in safety-critical scenarios, post-hoc approaches have received…

    Submitted 4 June, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  11. Better, Not Just More: Data-Centric Machine Learning for Earth Observation

    Authors: Ribana Roscher, Marc Rußwurm, Caroline Gevaert, Michael Kampffmeyer, Jefersson A. dos Santos, Maria Vakalopoulou, Ronny Hänsch, Stine Hansen, Keiller Nogueira, Jonathan Prexl, Devis Tuia

    Abstract: Recent developments and research in modern machine learning have led to substantial improvements in the geospatial field. Although numerous deep learning architectures and models have been proposed, the majority of them have been solely developed on benchmark datasets that lack strong real-world relevance. Furthermore, the performance of many methods has already saturated on these datasets. We arg…

    Submitted 13 March, 2025; v1 submitted 8 December, 2023; originally announced December 2023.

    Journal ref: IEEE Geoscience and Remote Sensing Magazine, vol. 12, no. 4, pp. 335-355, Dec. 2024

  12. arXiv:2312.03667  [pdf, other]

    cs.CV

    WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual Try-on

    Authors: Xujie Zhang, Xiu Li, Michael Kampffmeyer, Xin Dong, Zhenyu Xie, Feida Zhu, Haoye Dong, Xiaodan Liang

    Abstract: Image-based Virtual Try-On (VITON) aims to transfer an in-shop garment image onto a target person. While existing methods focus on warping the garment to fit the body pose, they often overlook the synthesis quality around the garment-skin boundary and realistic effects like wrinkles and shadows on the warped garments. These limitations greatly reduce the realism of the generated results and hinder…

    Submitted 6 December, 2023; originally announced December 2023.

  13. View it like a radiologist: Shifted windows for deep learning augmentation of CT images

    Authors: Eirik A. Østmo, Kristoffer K. Wickstrøm, Keyur Radiya, Michael C. Kampffmeyer, Robert Jenssen

    Abstract: Deep learning has the potential to revolutionize medical practice by automating and performing important tasks like detecting and delineating the size and locations of cancers in medical images. However, most deep learning models rely on augmentation techniques that treat medical images as natural images. For contrast-enhanced Computed Tomography (CT) images in particular, the signals producing th…

    Submitted 25 November, 2023; originally announced November 2023.

    Comments: 6 pages, 3 figures, accepted to MLSP 2023

    Journal ref: 2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP), 1-6

  14. arXiv:2308.11206  [pdf, other]

    cs.CV

    DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment

    Authors: Xujie Zhang, Binbin Yang, Michael C. Kampffmeyer, Wenqing Zhang, Shiyue Zhang, Guansong Lu, Liang Lin, Hang Xu, Xiaodan Liang

    Abstract: Cross-modal garment synthesis and manipulation will significantly benefit the way fashion designers generate garments and modify their designs via flexible linguistic interfaces. Current approaches follow the general text-to-image paradigm and mine cross-modal relations via simple cross-attention modules, neglecting the structural correspondence between visual and textual representations in the fas…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  15. arXiv:2308.10334  [pdf, other]

    cs.CV

    Coordinate Transformer: Achieving Single-stage Multi-person Mesh Recovery from Videos

    Authors: Haoyuan Li, Haoye Dong, Hanchao Jia, Dong Huang, Michael C. Kampffmeyer, Liang Lin, Xiaodan Liang

    Abstract: Multi-person 3D mesh recovery from videos is a critical first step towards automatic perception of group behavior in virtual reality, physical therapy and beyond. However, existing approaches rely on multi-stage paradigms, where the person detection and tracking stages are performed in a multi-person setting, while temporal dynamics are only modeled for one person at a time. Consequently, their pe…

    Submitted 20 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  16. arXiv:2307.00543  [pdf, other]

    cs.LG cs.AI cs.CR cs.GT

    Defending Against Poisoning Attacks in Federated Learning with Blockchain

    Authors: Nanqing Dong, Zhipeng Wang, Jiahao Sun, Michael Kampffmeyer, William Knottenbelt, Eric Xing

    Abstract: In the era of deep learning, federated learning (FL) presents a promising approach that allows multi-institutional data owners, or clients, to collaboratively train machine learning models without compromising data privacy. However, most existing FL approaches rely on a centralized server for global model aggregation, leading to a single point of failure. This makes the system vulnerable to malici…

    Submitted 12 March, 2024; v1 submitted 2 July, 2023; originally announced July 2023.

    Comments: Accepted by IEEE Transactions on Artificial Intelligence

  17. arXiv:2306.11103  [pdf, other]

    cs.CV eess.IV

    Forest Parameter Prediction by Multiobjective Deep Learning of Regression Models Trained with Pseudo-Target Imputation

    Authors: Sara Björk, Stian N. Anfinsen, Michael Kampffmeyer, Erik Næsset, Terje Gobakken, Lennart Noordermeer

    Abstract: In prediction of forest parameters with data from remote sensing (RS), regression models have traditionally been trained on a small sample of ground reference data. This paper proposes to impute this sample of true prediction targets with data from an existing RS-based prediction map that we consider as pseudo-targets. This substantially increases the amount of target training data and leverages t…

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: Submitted to IEEE Transactions on Geoscience and Remote Sensing

  18. arXiv:2303.09877  [pdf, other]

    stat.ML cs.CV cs.LG

    On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering

    Authors: Daniel J. Trosten, Sigurd Løkse, Robert Jenssen, Michael C. Kampffmeyer

    Abstract: Self-supervised learning is a central component in recent approaches to deep multi-view clustering (MVC). However, we find large variations in the development of self-supervision-based methods for deep MVC, potentially slowing the progress of the field. To address this, we present DeepMVC, a unified framework for deep MVC that includes many recent methods as instances. We leverage our framework to…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: CVPR 2023. Code available at https://github.com/DanielTrosten/DeepMVC

  19. arXiv:2303.09352  [pdf, other]

    cs.CV

    Hubs and Hyperspheres: Reducing Hubness and Improving Transductive Few-shot Learning with Hyperspherical Embeddings

    Authors: Daniel J. Trosten, Rwiddhi Chakraborty, Sigurd Løkse, Kristoffer Knutsen Wickstrøm, Robert Jenssen, Michael C. Kampffmeyer

    Abstract: Distance-based classification is frequently used in transductive few-shot learning (FSL). However, due to the high-dimensionality of image representations, FSL classifiers are prone to suffer from the hubness problem, where a few points (hubs) occur frequently in multiple nearest neighbour lists of other points. Hubness negatively impacts distance-based classification when hubs from one class appe…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  20. Self-Supervised Few-Shot Learning for Ischemic Stroke Lesion Segmentation

    Authors: Luca Tomasetti, Stine Hansen, Mahdieh Khanmohammadi, Kjersti Engan, Liv Jorunn Høllesli, Kathinka Dæhli Kurz, Michael Kampffmeyer

    Abstract: Precise ischemic lesion segmentation plays an essential role in improving diagnosis and treatment planning for ischemic stroke, one of the prevalent diseases with the highest mortality rate. While numerous deep neural network approaches have recently been proposed to tackle this problem, these methods require large amounts of annotated regions during training, which can be impractical in the medic…

    Submitted 16 March, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

  21. arXiv:2211.14052  [pdf, other]

    cs.CV

    Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning

    Authors: Zaiyu Huang, Hanhui Li, Zhenyu Xie, Michael Kampffmeyer, Qingling Cai, Xiaodan Liang

    Abstract: In this paper, we target image-based person-to-person virtual try-on in the presence of diverse poses and large viewpoint variations. Existing methods are restricted in this setting as they estimate garment warping flows mainly based on 2D poses and appearance, which omits the geometric prior of the 3D human body shape. Moreover, current garment warping methods are confined to localized regions, w…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  22. arXiv:2210.08151  [pdf, other]

    cs.LG

    ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model

    Authors: Srishti Gautam, Ahcene Boubekki, Stine Hansen, Suaiba Amina Salahuddin, Robert Jenssen, Marina MC Höhne, Michael Kampffmeyer

    Abstract: The need for interpretable models has fostered the development of self-explainable classifiers. Prior approaches are either based on multi-stage optimization schemes, impacting the predictive performance of the model, or produce explanations that are not transparent, trustworthy or do not capture the diversity of the data. To address these shortcomings, we propose ProtoVAE, a variational autoencod…

    Submitted 14 October, 2022; originally announced October 2022.

  23. ARMANI: Part-level Garment-Text Alignment for Unified Cross-Modal Fashion Design

    Authors: Xujie Zhang, Yu Sha, Michael C. Kampffmeyer, Zhenyu Xie, Zequn Jie, Chengwen Huang, Jianqing Peng, Xiaodan Liang

    Abstract: Cross-modal fashion image synthesis has emerged as one of the most promising directions in the generation domain due to the vast untapped potential of incorporating multiple modalities and the wide range of fashion image applications. To facilitate accurate generation, cross-modal synthesis methods typically rely on Contrastive Language-Image Pre-training (CLIP) to align textual and garment inform…

    Submitted 10 August, 2022; originally announced August 2022.

    Comments: Accepted by ACMMM22

  24. arXiv:2207.13475  [pdf, other]

    cs.CV

    PASTA-GAN++: A Versatile Framework for High-Resolution Unpaired Virtual Try-on

    Authors: Zhenyu Xie, Zaiyu Huang, Fuwei Zhao, Haoye Dong, Michael Kampffmeyer, Xin Dong, Feida Zhu, Xiaodan Liang

    Abstract: Image-based virtual try-on is one of the most promising applications of human-centric image generation due to its tremendous real-world potential. In this work, we take a step forwards to explore versatile virtual try-on solutions, which we argue should possess three main properties, namely, they should support unsupervised training, arbitrary garment categories, and controllable garment editing.…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2111.10544

  25. arXiv:2207.04812  [pdf, other]

    cs.CV stat.ML

    A clinically motivated self-supervised approach for content-based image retrieval of CT liver images

    Authors: Kristoffer Knutsen Wickstrøm, Eirik Agnalt Østmo, Keyur Radiya, Karl Øyvind Mikalsen, Michael Christian Kampffmeyer, Robert Jenssen

    Abstract: Deep learning-based approaches for content-based image retrieval (CBIR) of CT liver images are an active field of research, but suffer from some critical limitations. First, they are heavily reliant on labeled data, which can be challenging and costly to acquire. Second, they lack transparency and explainability, which limits the trustworthiness of deep CBIR systems. We address these limitations b…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: Code: https://github.com/Wickstrom/clinical-self-supervised-CBIR-ct-liver

  26. arXiv:2206.15353  [pdf, other]

    cs.CV cs.LG

    Learning Underrepresented Classes from Decentralized Partially Labeled Medical Images

    Authors: Nanqing Dong, Michael Kampffmeyer, Irina Voiculescu

    Abstract: Using decentralized data for federated training is one promising emerging research direction for alleviating data scarcity in the medical domain. However, in contrast to large-scale fully labeled data commonly seen in general object recognition tasks, the local medical datasets are more likely to only have images annotated for a subset of classes of interest due to high annotation costs. In this p…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: Accepted by MICCAI 2022

  27. arXiv:2205.08864  [pdf, ps, other]

    stat.ML cs.LG math.ST

    The Kernelized Taylor Diagram

    Authors: Kristoffer Wickstrøm, J. Emmanuel Johnson, Sigurd Løkse, Gustau Camps-Valls, Karl Øyvind Mikalsen, Michael Kampffmeyer, Robert Jenssen

    Abstract: This paper presents the kernelized Taylor diagram, a graphical framework for visualizing similarities between data populations. The kernelized Taylor diagram builds on the widely used Taylor diagram, which is used to visualize similarities between populations. However, the Taylor diagram has several limitations such as not capturing non-linear relationships and sensitivity to outliers. To address…

    Submitted 18 May, 2022; originally announced May 2022.

    Comments: Accepted at the Norwegian Artificial Intelligence Symposium 2022. Code available at: https://github.com/Wickstrom/KernelizedTaylorDiagram

  28. arXiv:2203.16205  [pdf, other]

    physics.chem-ph cs.LG

    Automatic Identification of Chemical Moieties

    Authors: Jonas Lederer, Michael Gastegger, Kristof T. Schütt, Michael Kampffmeyer, Klaus-Robert Müller, Oliver T. Unke

    Abstract: In recent years, the prediction of quantum mechanical observables with machine learning methods has become increasingly popular. Message-passing neural networks (MPNNs) solve this task by constructing atomic representations, from which the properties of interest are predicted. Here, we introduce a method to automatically identify chemical moieties (molecular building blocks) from such representati…

    Submitted 27 April, 2023; v1 submitted 30 March, 2022; originally announced March 2022.

  29. Mixing Up Contrastive Learning: Self-Supervised Representation Learning for Time Series

    Authors: Kristoffer Wickstrøm, Michael Kampffmeyer, Karl Øyvind Mikalsen, Robert Jenssen

    Abstract: The lack of labeled data is a key challenge for learning useful representation from time series data. However, an unsupervised representation framework that is capable of producing high quality representations could be of great value. It is key to enabling transfer learning, which is especially beneficial for medical applications, where there is an abundance of data but labeling is costly and time…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: Published in Journal of Pattern Recognition Letters: https://www.sciencedirect.com/science/article/pii/S0167865522000502. Code available at: https://github.com/Wickstrom/MixupContrastiveLearning

  30. Anomaly Detection-Inspired Few-Shot Medical Image Segmentation Through Self-Supervision With Supervoxels

    Authors: Stine Hansen, Srishti Gautam, Robert Jenssen, Michael Kampffmeyer

    Abstract: Recent work has shown that label-efficient few-shot learning through self-supervision can achieve promising medical image segmentation results. However, few-shot segmentation models typically rely on prototype representations of the semantic classes, resulting in a loss of local information that can degrade performance. This is particularly problematic for the typically large and highly heterogene…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: Accepted in Medical Image Analysis

  31. arXiv:2201.03559  [pdf, other]

    eess.IV cs.CV cs.LG

    Demonstrating The Risk of Imbalanced Datasets in Chest X-ray Image-based Diagnostics by Prototypical Relevance Propagation

    Authors: Srishti Gautam, Marina M. -C. Höhne, Stine Hansen, Robert Jenssen, Michael Kampffmeyer

    Abstract: The recent trend of integrating multi-source Chest X-Ray datasets to improve automated diagnostics raises concerns that models learn to exploit source-specific correlations to improve performance by recognizing the source domain of an image rather than the medical pathology. We hypothesize that this effect is enforced by and leverages label-imbalance across the source domains, i.e., prevalence of a…

    Submitted 10 January, 2022; originally announced January 2022.

    Comments: To appear in ISBI 2022

  32. arXiv:2112.10161  [pdf, other]

    stat.ML cs.LG

    RELAX: Representation Learning Explainability

    Authors: Kristoffer K. Wickstrøm, Daniel J. Trosten, Sigurd Løkse, Ahcène Boubekki, Karl Øyvind Mikalsen, Michael C. Kampffmeyer, Robert Jenssen

    Abstract: Despite the significant improvements that representation learning via self-supervision has led to when learning from unlabeled data, no methods exist that explain what influences the learned representation. We address this need through our proposed approach, RELAX, which is the first approach for attribution-based explanations of representations. Our approach can also model the uncertainty in its…

    Submitted 21 February, 2022; v1 submitted 19 December, 2021; originally announced December 2021.

  33. arXiv:2111.10544  [pdf, other]

    cs.CV

    Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN

    Authors: Zhenyu Xie, Zaiyu Huang, Fuwei Zhao, Haoye Dong, Michael Kampffmeyer, Xiaodan Liang

    Abstract: Image-based virtual try-on is one of the most promising applications of human-centric image generation due to its tremendous real-world potential. Yet, as most try-on approaches fit in-shop garments onto a target person, they require the laborious and restrictive construction of a paired training dataset, severely limiting their scalability. While a few recent works attempt to transfer garments di…

    Submitted 20 November, 2021; originally announced November 2021.

    Comments: 12 pages, 8 figures, 35th Conference on Neural Information Processing Systems

  34. Multi-modal land cover mapping of remote sensing images using pyramid attention and gated fusion networks

    Authors: Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

    Abstract: Multi-modality data is becoming readily available in remote sensing (RS) and can provide complementary information about the Earth's surface. Effective fusion of multi-modal information is thus important for various applications in RS, but also very challenging due to large domain differences, noise, and redundancies. There is a lack of effective and scalable fusion techniques for bridging multipl…

    Submitted 6 November, 2021; originally announced November 2021.

    Comments: 24 pages, 11 figures, submitted to IJRS

  35. arXiv:2110.04616  [pdf, other]

    cs.LG stat.ML

    Discriminative Multimodal Learning via Conditional Priors in Generative Models

    Authors: Rogelio A. Mancisidor, Michael Kampffmeyer, Kjersti Aas, Robert Jenssen

    Abstract: Deep generative models with latent variables have been used lately to learn joint representations and generative processes from multi-modal data. These two learning mechanisms can, however, conflict with each other and representations can fail to embed information on the data modalities. This research studies the realistic scenario in which all modalities and class labels are available for model t…

    Submitted 21 January, 2023; v1 submitted 9 October, 2021; originally announced October 2021.

  36. arXiv:2109.04275  [pdf, other]

    cs.CV cs.MM

    M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining

    Authors: Xiao Dong, Xunlin Zhan, Yangxin Wu, Yunchao Wei, Michael C. Kampffmeyer, Xiaoyong Wei, Minlong Lu, Yaowei Wang, Xiaodan Liang

    Abstract: Despite the potential of multi-modal pre-training to learn highly discriminative feature representations from complementary data modalities, current progress is being slowed by the lack of large-scale modality-diverse datasets. By leveraging the natural suitability of E-commerce, where different modalities capture complementary semantic information, we contribute a large-scale multi-modal pre-trai…

    Submitted 2 April, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

    Comments: CVPR 2022

  37. arXiv:2108.12204  [pdf, other]

    cs.LG

    This looks more like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation

    Authors: Srishti Gautam, Marina M. -C. Höhne, Stine Hansen, Robert Jenssen, Michael Kampffmeyer

    Abstract: Current machine learning models have shown high efficiency in solving a wide variety of real-world problems. However, their black box character poses a major challenge for the understanding and traceability of the underlying decision-making strategies. As a remedy, many post-hoc explanation and self-explanatory methods have been developed to interpret the models' behavior. These methods, in additi…

    Submitted 27 August, 2021; originally announced August 2021.

  38. arXiv:2108.05126  [pdf, other]

    cs.CV

    M3D-VTON: A Monocular-to-3D Virtual Try-On Network

    Authors: Fuwei Zhao, Zhenyu Xie, Michael Kampffmeyer, Haoye Dong, Songfang Han, Tianxiang Zheng, Tao Zhang, Xiaodan Liang

    Abstract: Virtual 3D try-on can provide an intuitive and realistic view for online shopping and has a huge potential commercial value. However, existing 3D virtual try-on methods mainly rely on annotated 3D human shapes and garment templates, which hinders their applications in practical scenarios. 2D virtual try-on approaches provide a faster alternative to manipulate clothed humans, but lack the rich and…

    Submitted 11 August, 2021; originally announced August 2021.

    Comments: Accepted at ICCV 2021

  39. arXiv:2108.00386  [pdf, other]

    cs.CV

    WAS-VTON: Warping Architecture Search for Virtual Try-on Network

    Authors: Zhenyu Xie, Xujie Zhang, Fuwei Zhao, Haoye Dong, Michael C. Kampffmeyer, Haonan Yan, Xiaodan Liang

    Abstract: Despite recent progress on image-based virtual try-on, current methods are constrained by shared warping networks and thus fail to synthesize natural try-on results when faced with clothing categories that require different warping operations. In this paper, we address this problem by finding clothing category-specific warping networks for the virtual try-on task via Neural Architecture Search (NAS…

    Submitted 1 August, 2021; originally announced August 2021.

  40. arXiv:2105.09580  [pdf, other]

    cs.LG quant-ph stat.ML

    Negational Symmetry of Quantum Neural Networks for Binary Pattern Classification

    Authors: Nanqing Dong, Michael Kampffmeyer, Irina Voiculescu, Eric Xing

    Abstract: Entanglement is a physical phenomenon, which has fueled recent successes of quantum algorithms. Although quantum neural networks (QNNs) have shown promising results in solving simple machine learning tasks recently, for the time being, the effect of entanglement in QNNs and the behavior of QNNs in binary pattern classification are still underexplored. In this work, we provide some theoretical insi…

    Submitted 25 April, 2022; v1 submitted 20 May, 2021; originally announced May 2021.

    Comments: Accepted by Pattern Recognition

  41. arXiv:2103.07738  [pdf, other]

    cs.CV cs.LG

    Reconsidering Representation Alignment for Multi-view Clustering

    Authors: Daniel J. Trosten, Sigurd Løkse, Robert Jenssen, Michael Kampffmeyer

    Abstract: Aligning distributions of view representations is a core component of today's state-of-the-art models for deep multi-view clustering. However, we identify several drawbacks with naïvely aligning representation distributions. We demonstrate that these drawbacks both lead to less separable clusters in the representation space, and inhibit the model's ability to prioritize views. Based on these obser…

    Submitted 13 March, 2021; originally announced March 2021.

    Comments: To appear in CVPR 2021. Code available at https://github.com/DanielTrosten/mvc

  42. arXiv:2012.03740  [pdf, other]

    stat.ML cs.LG

    Joint Optimization of an Autoencoder for Clustering and Embedding

    Authors: Ahcène Boubekki, Michael Kampffmeyer, Robert Jenssen, Ulf Brefeld

    Abstract: Deep embedded clustering has become a dominating approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between the training of a deep autoencoder and a k-means clustering of the autoencoder's embedding. The diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In…

    Submitted 1 May, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

  43. arXiv:2011.14164  [pdf, other]

    cs.CV cs.LG eess.IV

    Towards Robust Partially Supervised Multi-Structure Medical Image Segmentation on Small-Scale Data

    Authors: Nanqing Dong, Michael Kampffmeyer, Xiaodan Liang, Min Xu, Irina Voiculescu, Eric P. Xing

    Abstract: The data-driven nature of deep learning (DL) models for semantic segmentation requires a large number of pixel-level annotations. However, large-scale and fully labeled medical datasets are often unavailable for practical tasks. Recently, partially supervised methods have been proposed to utilize images with incomplete labels in the medical domain. To bridge the methodological gaps in partially su…

    Submitted 26 October, 2021; v1 submitted 28 November, 2020; originally announced November 2020.

    Comments: Accepted by Applied Soft Computing

  44. Uncertainty-Aware Deep Ensembles for Reliable and Explainable Predictions of Clinical Time Series

    Authors: Kristoffer Wickstrøm, Karl Øyvind Mikalsen, Michael Kampffmeyer, Arthur Revhaug, Robert Jenssen

    Abstract: Deep learning-based support systems have demonstrated encouraging results in numerous clinical applications involving the processing of time series data. While such systems often are very accurate, they have no inherent mechanism for explaining what influenced the predictions, which is critical for clinical tasks. However, existing explainability techniques lack an important component for trustwor…

    Submitted 16 October, 2020; originally announced October 2020.

    Comments: 11 pages, 9 figures, code at https://github.com/Wickstrom/TimeSeriesXAI

  45. arXiv:2009.01599  [pdf, other]

    cs.CV

    SCG-Net: Self-Constructing Graph Neural Networks for Semantic Segmentation

    Authors: Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

    Abstract: Capturing global contextual representations by exploiting long-range pixel-pixel dependencies has been shown to improve semantic segmentation performance. However, how to do this efficiently is an open question, as current approaches of utilising attention schemes or very deep models to increase the model's field of view result in complex models with large memory consumption. Inspired by recent work on…

    Submitted 3 January, 2021; v1 submitted 3 September, 2020; originally announced September 2020.

    Comments: 11 pages, 5 figs. code will be open soon

  46. arXiv:2004.10327  [pdf, other]

    cs.CV

    Multi-view Self-Constructing Graph Convolutional Networks with Adaptive Class Weighting Loss for Semantic Segmentation

    Authors: Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

    Abstract: We propose a novel architecture called the Multi-view Self-Constructing Graph Convolutional Networks (MSCG-Net) for semantic segmentation. Building on the recently proposed Self-Constructing Graph (SCG) module, which makes use of learnable latent variables to self-construct the underlying graphs directly from the input features without relying on manually built prior knowledge graphs, we leverage…

    Submitted 21 April, 2020; originally announced April 2020.

    Comments: 7-page, MSCG-Net, CVPRW-2020

    Report number: 2004.10327

  47. arXiv:2004.09754  [pdf, other]

    cs.CV cs.LG eess.IV

    The 1st Agriculture-Vision Challenge: Methods and Results

    Authors: Mang Tik Chiu, Xingqian Xu, Kai Wang, Jennifer Hobbs, Naira Hovakimyan, Thomas S. Huang, Honghui Shi, Yunchao Wei, Zilong Huang, Alexander Schwing, Robert Brunner, Ivan Dozier, Wyatt Dozier, Karen Ghandilyan, David Wilson, Hyunseong Park, Junhee Kim, Sungho Kim, Qinghui Liu, Michael C. Kampffmeyer, Robert Jenssen, Arnt B. Salberg, Alexandre Barbosa, Rodrigo Trevisan, Bingchen Zhao, et al. (17 additional authors not shown)

    Abstract: The first Agriculture-Vision Challenge aims to encourage research in developing novel and effective algorithms for agricultural pattern recognition from aerial images, especially for the semantic segmentation task associated with our challenge dataset. Around 57 participating teams from various countries compete to achieve state-of-the-art in aerial agriculture semantic segmentation. The Agricultu…

    Submitted 23 April, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

    Comments: CVPR 2020 Workshop

  48. arXiv:2004.07011  [pdf, other]

    cs.CV

    Code-Aligned Autoencoders for Unsupervised Change Detection in Multimodal Remote Sensing Images

    Authors: Luigi T. Luppino, Mads A. Hansen, Michael Kampffmeyer, Filippo M. Bianchi, Gabriele Moser, Robert Jenssen, Stian N. Anfinsen

    Abstract: Image translation with convolutional autoencoders has recently been used as an approach to multimodal change detection in bitemporal satellite images. A main challenge is the alignment of the code spaces by reducing the contribution of change pixels to the learning of the translation function. Many existing approaches train the networks by exploiting supervised information of the change areas, whi…

    Submitted 15 April, 2020; originally announced April 2020.

  49. arXiv:2003.06932  [pdf, other]

    cs.CV

    Self-Constructing Graph Convolutional Networks for Semantic Labeling

    Authors: Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

    Abstract: Graph Neural Networks (GNNs) have received increasing attention in many fields. However, due to the lack of prior graphs, their use for semantic labeling has been limited. Here, we propose a novel architecture called the Self-Constructing Graph (SCG), which makes use of learnable latent variables to generate embeddings and to self-construct the underlying graphs directly from the input features wi…

    Submitted 23 April, 2020; v1 submitted 15 March, 2020; originally announced March 2020.

    Comments: IGARSS-2020, code at: github.com/samleoqh/MSCG-Net

  50. Dense Dilated Convolutions Merging Network for Land Cover Classification

    Authors: Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg

    Abstract: Land cover classification of remote sensing images is a challenging task due to limited amounts of annotated data, highly imbalanced classes, frequent incorrect pixel-level annotations, and an inherent complexity in the semantic segmentation task. In this article, we propose a novel architecture called the dense dilated convolutions' merging network (DDCM-Net) to address this task. The proposed DD…

    Submitted 9 March, 2020; originally announced March 2020.

    Comments: Semantic Segmentation, 12 pages, TGRS-2020 early access in IEEE Transactions on Geoscience and Remote Sensing. 2020, Code available at https://github.com/samleoqh/DDCM-Semantic-Segmentation-PyTorch