Showing 1–30 of 30 results for author: Parisot, S

  1. arXiv:2507.15724  [pdf, ps, other]

    cs.CV

    A Practical Investigation of Spatially-Controlled Image Generation with Transformers

    Authors: Guoxuan Xia, Harleen Hanspal, Petru-Daniel Tudosiu, Shifeng Zhang, Sarah Parisot

    Abstract: Enabling image generation models to be spatially controlled is an important area of research, empowering users to better generate images according to their own fine-grained specifications via e.g. edge maps, poses. Although this task has seen impressive improvements in recent times, a focus on rapidly producing stronger models has come at the cost of detailed and fair scientific comparison. Differ…

    Submitted 21 July, 2025; originally announced July 2025.

    Comments: preprint

  2. arXiv:2506.17967  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    Adapting Vision-Language Models for Evaluating World Models

    Authors: Mariya Hendriksen, Tabish Rashid, David Bignell, Raluca Georgescu, Abdelhak Lemkhenter, Katja Hofmann, Sam Devlin, Sarah Parisot

    Abstract: World models -- generative models that simulate environment dynamics conditioned on past observations and actions -- are gaining prominence in planning, simulation, and embodied AI. However, evaluating their rollouts remains a fundamental challenge, requiring fine-grained, temporally grounded assessment of action alignment and semantic consistency -- capabilities not captured by existing metrics.…

    Submitted 22 June, 2025; originally announced June 2025.

  3. arXiv:2503.22517  [pdf, other]

    cs.CL cs.AI cs.CV

    Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities

    Authors: Raman Dutt, Harleen Hanspal, Guoxuan Xia, Petru-Daniel Tudosiu, Alexander Black, Yongxin Yang, Steven McDonagh, Sarah Parisot

    Abstract: In this work, we undertake the challenge of augmenting the existing generative capabilities of pre-trained text-only large language models (LLMs) with multi-modal generation capability while satisfying two core constraints: C1 preserving the original language generative capabilities with negligible performance degradation, and C2 adhering to a small parameter budget to learn the ne…

    Submitted 1 April, 2025; v1 submitted 28 March, 2025; originally announced March 2025.

  4. arXiv:2411.10913  [pdf, other]

    cs.CV cs.LG

    Generating Compositional Scenes via Text-to-image RGBA Instance Generation

    Authors: Alessandro Fontanella, Petru-Daniel Tudosiu, Yongxin Yang, Shifeng Zhang, Sarah Parisot

    Abstract: Text-to-image diffusion generative models can generate high quality images at the cost of tedious prompt engineering. Controllability can be improved by introducing layout conditioning, however existing methods lack layout editing ability and fine-grained control over object attributes. The concept of multi-layer generation holds great potential to address these limitations, however generating ima…

    Submitted 16 November, 2024; originally announced November 2024.

    Comments: NeurIPS 2024

  5. arXiv:2410.05058  [pdf, other]

    cs.CV

    Improving Object Detection via Local-global Contrastive Learning

    Authors: Danai Triantafyllidou, Sarah Parisot, Ales Leonardis, Steven McDonagh

    Abstract: Visual domain gaps often impact object detection performance. Image-to-image translation can mitigate this effect, where contrastive approaches enable learning of the image-to-image mapping under unsupervised regimes. However, existing methods often fail to handle content-rich scenes with multiple object instances, which manifests in unsatisfactory detection performance. Sensitivity to such instan…

    Submitted 25 October, 2024; v1 submitted 7 October, 2024; originally announced October 2024.

    Comments: BMVC 2024 - Project page: https://local-global-detection.github.io

  6. arXiv:2404.02790  [pdf, other]

    cs.CV

    MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation

    Authors: Petru-Daniel Tudosiu, Yongxin Yang, Shifeng Zhang, Fei Chen, Steven McDonagh, Gerasimos Lampouras, Ignacio Iacobacci, Sarah Parisot

    Abstract: Text-to-image generation has achieved astonishing results, yet precise spatial controllability and prompt fidelity remain highly challenging. This limitation is typically addressed through cumbersome prompt engineering, scene layout conditioning, or image editing techniques which often require hand drawn masks. Nonetheless, pre-existing works struggle to take advantage of the natural instance-leve…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 - Project page: https://MuLAn-dataset.github.io/

  7. arXiv:2311.16882  [pdf, other]

    cs.CV cs.CL cs.LG

    Optimisation-Based Multi-Modal Semantic Image Editing

    Authors: Bowen Li, Yongxin Yang, Steven McDonagh, Shifeng Zhang, Petru-Daniel Tudosiu, Sarah Parisot

    Abstract: Image editing affords increased control over the aesthetics and content of generated images. Pre-existing works focus predominantly on text-based instructions to achieve desired image modifications, which limit edit precision and accuracy. In this work, we propose an inference-time editing optimisation, designed to extend beyond textual edits to accommodate multiple editing instruction types (e.g.…

    Submitted 28 November, 2023; originally announced November 2023.

  8. arXiv:2304.01830  [pdf, other]

    cs.CV

    Learning to Name Classes for Vision and Language Models

    Authors: Sarah Parisot, Yongxin Yang, Steven McDonagh

    Abstract: Large scale vision and language models can achieve impressive zero-shot recognition performance by mapping class specific text queries to image content. Two distinct challenges that remain however, are high sensitivity to the choice of handcrafted class names that define queries, and the difficulty of adaptation to new, smaller datasets. Towards addressing these problems, we propose to leverage av…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023

  9. arXiv:2211.05215  [pdf, other]

    cs.CV

    Content-Diverse Comparisons improve IQA

    Authors: William Thong, Jose Costa Pereira, Sarah Parisot, Ales Leonardis, Steven McDonagh

    Abstract: Image quality assessment (IQA) forms a natural and often straightforward undertaking for humans, yet effective automation of the task remains highly challenging. Recent metrics from the deep learning community commonly compare image pairs during training to improve upon traditional metrics such as PSNR or SSIM. However, current comparisons ignore the fact that image content affects quality assessm…

    Submitted 9 November, 2022; originally announced November 2022.

    Comments: Accepted at British Machine Vision Conference (BMVC) 2022

  10. arXiv:2210.03482  [pdf, other]

    cs.CV cs.LG

    CLAD: A realistic Continual Learning benchmark for Autonomous Driving

    Authors: Eli Verwimp, Kuo Yang, Sarah Parisot, Hong Lanqing, Steven McDonagh, Eduardo Pérez-Pellitero, Matthias De Lange, Tinne Tuytelaars

    Abstract: In this paper we describe the design and the ideas motivating a new Continual Learning benchmark for Autonomous Driving (CLAD), that focuses on the problems of object classification and object detection. The benchmark utilises SODA10M, a recently released large-scale dataset that concerns autonomous driving related problems. First, we review and discuss existing continual learning benchmarks, how…

    Submitted 7 October, 2022; originally announced October 2022.

  11. arXiv:2204.01407  [pdf, other]

    cs.CV cs.LG

    Re-examining Distillation For Continual Object Detection

    Authors: Eli Verwimp, Kuo Yang, Sarah Parisot, Hong Lanqing, Steven McDonagh, Eduardo Pérez-Pellitero, Matthias De Lange, Tinne Tuytelaars

    Abstract: Training models continually to detect and classify objects, from new classes and new domains, remains an open problem. In this work, we conduct a thorough analysis of why and how object detection models forget catastrophically. We focus on distillation-based approaches in two-stage networks; the most-common strategy employed in contemporary continual object detection work. Distillation aims to tran…

    Submitted 7 October, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

    Comments: Accepted at BMVC '22

  12. arXiv:2112.06741  [pdf, other]

    cs.CV cs.LG

    Long-tail Recognition via Compositional Knowledge Transfer

    Authors: Sarah Parisot, Pedro M. Esperanca, Steven McDonagh, Tamas J. Madarasz, Yongxin Yang, Zhenguo Li

    Abstract: In this work, we introduce a novel strategy for long-tail recognition that addresses the tail classes' few-shot problem via training-free knowledge transfer. Our objective is to transfer knowledge acquired from information-rich common classes to semantically similar, and yet data-hungry, rare classes in order to obtain stronger tail class representations. We leverage the fact that class prototypes…

    Submitted 12 April, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: Accepted to CVPR 2022

  13. arXiv:2106.06440  [pdf, other]

    cs.CV cs.LG

    Learning Compositional Shape Priors for Few-Shot 3D Reconstruction

    Authors: Mateusz Michalkiewicz, Stavros Tsogkas, Sarah Parisot, Mahsa Baktashmotlagh, Anders Eriksson, Eugene Belilovsky

    Abstract: The impressive performance of deep convolutional neural networks in single-view 3D reconstruction suggests that these models perform non-trivial reasoning about the 3D structure of the output space. Recent work has challenged this belief, showing that, on standard benchmarks, complex encoder-decoder architectures perform similarly to nearest-neighbor baselines or simple linear decoder models that…

    Submitted 16 June, 2021; v1 submitted 11 June, 2021; originally announced June 2021.

    Comments: 13 pages, 12 figures. arXiv admin note: substantial text overlap with arXiv:2004.06302

  14. arXiv:2010.02041  [pdf, other]

    cs.CV cs.LG eess.IV

    Probabilistic 3D surface reconstruction from sparse MRI information

    Authors: Katarína Tóthová, Sarah Parisot, Matthew Lee, Esther Puyol-Antón, Andrew King, Marc Pollefeys, Ender Konukoglu

    Abstract: Surface reconstruction from magnetic resonance (MR) imaging data is indispensable in medical image analysis and clinical research. A reliable and effective reconstruction tool should: be fast in prediction of accurate well localised and high resolution models, evaluate prediction uncertainty, work with as little input data as possible. Current deep learning state of the art (SOTA) 3D reconstructio…

    Submitted 5 October, 2020; originally announced October 2020.

    Comments: MICCAI 2020

  15. arXiv:2008.09694  [pdf, other]

    cs.CV eess.IV

    Many-shot from Low-shot: Learning to Annotate using Mixed Supervision for Object Detection

    Authors: Carlo Biffi, Steven McDonagh, Philip Torr, Ales Leonardis, Sarah Parisot

    Abstract: Object detection has witnessed significant progress by relying on large, manually annotated datasets. Annotating such datasets is highly time consuming and expensive, which motivates the development of weakly supervised and few-shot object detection methods. However, these methods largely underperform with respect to their strongly supervised counterpart, as weak training signals often resu…

    Submitted 26 August, 2020; v1 submitted 21 August, 2020; originally announced August 2020.

    Comments: Accepted at ECCV 2020. Camera-ready version and Appendices

  16. arXiv:2004.06302  [pdf, other]

    cs.CV cs.LG

    Few-Shot Single-View 3-D Object Reconstruction with Compositional Priors

    Authors: Mateusz Michalkiewicz, Sarah Parisot, Stavros Tsogkas, Mahsa Baktashmotlagh, Anders Eriksson, Eugene Belilovsky

    Abstract: The impressive performance of deep convolutional neural networks in single-view 3D reconstruction suggests that these models perform non-trivial reasoning about the 3D structure of the output space. However, recent work has challenged this belief, showing that complex encoder-decoder architectures perform similarly to nearest-neighbor baselines or simple linear decoder models that exploit large am…

    Submitted 2 May, 2020; v1 submitted 14 April, 2020; originally announced April 2020.

  17. arXiv:2003.13985  [pdf, other]

    cs.CV

    DeepLPF: Deep Local Parametric Filters for Image Enhancement

    Authors: Sean Moran, Pierre Marza, Steven McDonagh, Sarah Parisot, Gregory Slabaugh

    Abstract: Digital artists often improve the aesthetic quality of digital photographs through manual retouching. Beyond global adjustments, professional image editing programs provide local adjustment tools operating on specific parts of an image. Options include parametric (graduated, radial filters) and unconstrained brush tools. These highly expressive tools enable a diverse set of local image enhancement…

    Submitted 31 March, 2020; originally announced March 2020.

    Comments: Accepted for publication at CVPR2020

  18. arXiv:2003.13296  [pdf, other]

    cs.CV cs.CR

    Unsupervised Model Personalization while Preserving Privacy and Scalability: An Open Problem

    Authors: Matthias De Lange, Xu Jia, Sarah Parisot, Ales Leonardis, Gregory Slabaugh, Tinne Tuytelaars

    Abstract: This work investigates the task of unsupervised model personalization, adapted to continually evolving, unlabeled local user images. We consider the practical scenario where a high capacity server interacts with a myriad of resource-limited edge devices, imposing strong requirements on scalability and local data privacy. We aim to address this challenge within the continual learning paradigm and p…

    Submitted 30 March, 2020; originally announced March 2020.

    Comments: CVPR 2020

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020

  19. arXiv:2002.12896  [pdf, other]

    cs.CV

    A Multi-Hypothesis Approach to Color Constancy

    Authors: Daniel Hernandez-Juarez, Sarah Parisot, Benjamin Busam, Ales Leonardis, Gregory Slabaugh, Steven McDonagh

    Abstract: Contemporary approaches frame the color constancy problem as learning camera specific illuminant mappings. While high accuracy can be achieved on camera specific data, these models depend on camera spectral sensitivity and typically exhibit poor generalisation to new devices. Additionally, regression methods produce point estimates that do not explicitly account for potential ambiguities among pla…

    Submitted 2 March, 2020; v1 submitted 28 February, 2020; originally announced February 2020.

    Comments: Accepted for publication at CVPR2020

  20. arXiv:2002.07421  [pdf, other]

    cs.CV

    EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement

    Authors: Linpu Fang, Hang Xu, Zhili Liu, Sarah Parisot, Zhenguo Li

    Abstract: Object detectors trained on fully-annotated data currently yield state of the art performance but require expensive manual annotations. On the other hand, weakly-supervised detectors have much lower performance and cannot be used reliably in a realistic setting. In this paper, we study the hybrid-supervised object detection problem, aiming to train a high quality detector with only a limited amoun…

    Submitted 18 February, 2020; originally announced February 2020.

    Comments: Accepted by AAAI20

  21. A continual learning survey: Defying forgetting in classification tasks

    Authors: Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Ales Leonardis, Gregory Slabaugh, Tinne Tuytelaars

    Abstract: Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts thi…

    Submitted 16 April, 2021; v1 submitted 18 September, 2019; originally announced September 2019.

    Comments: Accepted TPAMI paper, including Appendix, code publicly available

  22. arXiv:1811.11788  [pdf, other]

    cs.CV stat.ML

    Formulating Camera-Adaptive Color Constancy as a Few-shot Meta-Learning Problem

    Authors: Steven McDonagh, Sarah Parisot, Fengwei Zhou, Xing Zhang, Ales Leonardis, Zhenguo Li, Gregory Slabaugh

    Abstract: Digital camera pipelines employ color constancy methods to estimate an unknown scene illuminant, in order to re-illuminate images as if they were acquired under an achromatic light source. Fully-supervised learning approaches exhibit state-of-the-art estimation accuracy with camera-specific labelled training imagery. Resulting models typically suffer from domain gaps and fail to generalise across…

    Submitted 3 April, 2019; v1 submitted 28 November, 2018; originally announced November 2018.

    Comments: First two authors contributed equally

  23. arXiv:1807.11272  [pdf, other]

    cs.CV cs.AI cs.LG

    Uncertainty Quantification in CNN-Based Surface Prediction Using Shape Priors

    Authors: Katarína Tóthová, Sarah Parisot, Matthew C. H. Lee, Esther Puyol-Antón, Lisa M. Koch, Andrew P. King, Ender Konukoglu, Marc Pollefeys

    Abstract: Surface reconstruction is a vital tool in a wide range of areas of medical image analysis and clinical research. Despite the fact that many methods have proposed solutions to the reconstruction problem, most, due to their deterministic nature, do not directly address the issue of quantifying uncertainty associated with their predictions. We remedy this by proposing a novel probabilistic deep learn…

    Submitted 30 July, 2018; originally announced July 2018.

    Comments: Accepted to ShapeMI MICCAI 2018: Workshop on Shape in Medical Imaging

  24. arXiv:1806.07243  [pdf, other]

    cs.CV

    Learning Conditioned Graph Structures for Interpretable Visual Question Answering

    Authors: Will Norcliffe-Brown, Efstathios Vafeias, Sarah Parisot

    Abstract: Visual Question answering is a challenging problem requiring a combination of concepts from Computer Vision and Natural Language Processing. Most existing approaches use a two streams strategy, computing image and question features that are consequently merged using a variety of techniques. Nonetheless, very few rely on higher level image representations, which can capture semantic and spatial rel…

    Submitted 1 November, 2018; v1 submitted 19 June, 2018; originally announced June 2018.

    Comments: NIPS 2018 (13 pages, 7 figures)

  25. Disease Prediction using Graph Convolutional Networks: Application to Autism Spectrum Disorder and Alzheimer's Disease

    Authors: Sarah Parisot, Sofia Ira Ktena, Enzo Ferrante, Matthew Lee, Ricardo Guerrero, Ben Glocker, Daniel Rueckert

    Abstract: Graphs are widely used as a natural framework that captures interactions between individual elements represented as nodes in a graph. In medical applications, specifically, nodes can represent individuals within a potentially large population (patients or healthy controls) accompanied by a set of features, while the graph edges incorporate associations between subjects in an intuitive manner. This…

    Submitted 5 June, 2018; originally announced June 2018.

    Comments: in Press at Medical Image Analysis, MICCAI 2017 Special Issue

  26. arXiv:1703.10062  [pdf, other]

    q-bio.NC cs.NE

    Exploring Heritability of Functional Brain Networks with Inexact Graph Matching

    Authors: Sofia Ira Ktena, Salim Arslan, Sarah Parisot, Daniel Rueckert

    Abstract: Data-driven brain parcellations aim to provide a more accurate representation of an individual's functional connectivity, since they are able to capture individual variability that arises due to development or disease. This renders comparisons between the emerging brain connectivity networks more challenging, since correspondences between their elements are not preserved. Unveiling these correspon…

    Submitted 29 March, 2017; originally announced March 2017.

    Comments: accepted at ISBI 2017: International Symposium on Biomedical Imaging, Apr 2017, Melbourne, Australia

  27. arXiv:1703.03020  [pdf, ps, other]

    stat.ML cs.LG

    Spectral Graph Convolutions for Population-based Disease Prediction

    Authors: Sarah Parisot, Sofia Ira Ktena, Enzo Ferrante, Matthew Lee, Ricardo Guerrerro Moreno, Ben Glocker, Daniel Rueckert

    Abstract: Exploiting the wealth of imaging and non-imaging information for disease prediction tasks requires models capable of representing, at the same time, individual features as well as data associations between subjects from potentially large populations. Graphs provide a natural framework for such tasks, yet previous graph-based approaches focus on pairwise similarities without modelling the subjects'…

    Submitted 21 June, 2017; v1 submitted 8 March, 2017; originally announced March 2017.

    Comments: International Conference on Medical Image Computing and Computer-Assisted Interventions (MICCAI) 2017

  28. arXiv:1703.02161  [pdf, other]

    cs.CV cs.LG

    Distance Metric Learning using Graph Convolutional Networks: Application to Functional Brain Networks

    Authors: Sofia Ira Ktena, Sarah Parisot, Enzo Ferrante, Martin Rajchl, Matthew Lee, Ben Glocker, Daniel Rueckert

    Abstract: Evaluating similarity between graphs is of major importance in several computer vision and pattern recognition problems, where graph representations are often used to model objects or interactions between elements. The choice of a distance or similarity metric is, however, not trivial and can be highly dependent on the application at hand. In this work, we propose a novel metric learning method to…

    Submitted 14 June, 2017; v1 submitted 6 March, 2017; originally announced March 2017.

    Comments: International Conference on Medical Image Computing and Computer-Assisted Interventions (MICCAI) 2017

  29. arXiv:1611.04783  [pdf, other]

    q-bio.NC cs.NE

    Comparison of Brain Networks with Unknown Correspondences

    Authors: Sofia Ira Ktena, Sarah Parisot, Jonathan Passerat-Palmbach, Daniel Rueckert

    Abstract: Graph theory has drawn a lot of attention in the field of Neuroscience during the last decade, mainly due to the abundance of tools that it provides to explore the interactions of elements in a complex network like the brain. The local and global organization of a brain network can shed light on mechanisms of complex cognitive functions, while disruptions within the network can be linked to neurod…

    Submitted 15 November, 2016; originally announced November 2016.

    Comments: Presented at The MICCAI-BACON 16 Workshop (https://arxiv.boxedpaper.com/abs/1611.03363)

    Report number: BACON/2016/03

  30. arXiv:1611.03363  [other]

    cs.NE q-bio.NC

    Proceedings of the Workshop on Brain Analysis using COnnectivity Networks - BACON 2016

    Authors: Sarah Parisot, Jonathan Passerat-Palmbach, Markus D. Schirmer, Boris Gutman

    Abstract: Understanding brain connectivity in a network-theoretic context has shown much promise in recent years. This type of analysis identifies brain organisational principles, bringing a new perspective to neuroscience. At the same time, large public databases of connectomic data are now available. However, connectome analysis is still an emerging field and there is a crucial need for robust computation…

    Submitted 24 November, 2016; v1 submitted 10 November, 2016; originally announced November 2016.