Search | arXiv e-print repository

Strategic priorities for transformative progress in advancing biology with proteomics and artificial intelligence

Authors: Yingying Sun, Jun A, Zhiwei Liu, Rui Sun, Liujia Qian, Samuel H. Payne, Wout Bittremieux, Markus Ralser, Chen Li, Yi Chen, Zhen Dong, Yasset Perez-Riverol, Asif Khan, Chris Sander, Ruedi Aebersold, Juan Antonio Vizcaíno, Jonathan R Krieger, Jianhua Yao, Han Wen, Linfeng Zhang, Yunping Zhu, Yue Xuan, Benjamin Boyang Sun, Liang Qiao, Henning Hermjakob , et al. (37 additional authors not shown)

Abstract: Artificial intelligence (AI) is transforming scientific research, including proteomics. Advances in mass spectrometry (MS)-based proteomics data quality, diversity, and scale, combined with groundbreaking AI techniques, are unlocking new challenges and opportunities in biological discovery. Here, we highlight key areas where AI is driving innovation, from data analysis to new biological insights.… ▽ More Artificial intelligence (AI) is transforming scientific research, including proteomics. Advances in mass spectrometry (MS)-based proteomics data quality, diversity, and scale, combined with groundbreaking AI techniques, are unlocking new challenges and opportunities in biological discovery. Here, we highlight key areas where AI is driving innovation, from data analysis to new biological insights. These include developing an AI-friendly ecosystem for proteomics data generation, sharing, and analysis; improving peptide and protein identification and quantification; characterizing protein-protein interactions and protein complexes; advancing spatial and perturbation proteomics; integrating multi-omics data; and ultimately enabling AI-empowered virtual cells. △ Less

Submitted 21 February, 2025; originally announced February 2025.

Comments: 28 pages, 2 figures, perspective in AI proteomics

arXiv:2405.11977 [pdf, other]

doi 10.1007/978-3-031-73480-9_2

GuidedRec: Guiding Ill-Posed Unsupervised Volumetric Recovery

Authors: Alexandre Cafaro, Amaury Leroy, Guillaume Beldjoudi, Pauline Maury, Charlotte Robert, Eric Deutsch, Vincent Grégoire, Vincent Lepetit, Nikos Paragios

Abstract: We introduce a novel unsupervised approach to reconstructing a 3D volume from only two planar projections that exploits a previous\-ly-captured 3D volume of the patient. Such volume is readily available in many important medical procedures and previous methods already used such a volume. Earlier methods that work by deforming this volume to match the projections typically fail when the number of p… ▽ More We introduce a novel unsupervised approach to reconstructing a 3D volume from only two planar projections that exploits a previous\-ly-captured 3D volume of the patient. Such volume is readily available in many important medical procedures and previous methods already used such a volume. Earlier methods that work by deforming this volume to match the projections typically fail when the number of projections is very low as the alignment becomes underconstrained. We show how to use a generative model of the volume structures to constrain the deformation and obtain a correct estimate. Moreover, our method is not bounded to a specific sensor calibration and can be applied to new calibrations without retraining. We evaluate our approach on a challenging dataset and show it outperforms state-of-the-art methods. As a result, our method could be used in treatment scenarios such as surgery and radiotherapy while drastically reducing patient radiation exposure. △ Less

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2208.12847 [pdf, other]

Region-guided CycleGANs for Stain Transfer in Whole Slide Images

Authors: Joseph Boyd, Irène Villa, Marie-Christine Mathieu, Eric Deutsch, Nikos Paragios, Maria Vakalopoulou, Stergios Christodoulidis

Abstract: In whole slide imaging, commonly used staining techniques based on hematoxylin and eosin (H&E) and immunohistochemistry (IHC) stains accentuate different aspects of the tissue landscape. In the case of detecting metastases, IHC provides a distinct readout that is readily interpretable by pathologists. IHC, however, is a more expensive approach and not available at all medical centers. Virtually ge… ▽ More In whole slide imaging, commonly used staining techniques based on hematoxylin and eosin (H&E) and immunohistochemistry (IHC) stains accentuate different aspects of the tissue landscape. In the case of detecting metastases, IHC provides a distinct readout that is readily interpretable by pathologists. IHC, however, is a more expensive approach and not available at all medical centers. Virtually generating IHC images from H&E using deep neural networks thus becomes an attractive alternative. Deep generative models such as CycleGANs learn a semantically-consistent mapping between two image domains, while emulating the textural properties of each domain. They are therefore a suitable choice for stain transfer applications. However, they remain fully unsupervised, and possess no mechanism for enforcing biological consistency in stain transfer. In this paper, we propose an extension to CycleGANs in the form of a region of interest discriminator. This allows the CycleGAN to learn from unpaired datasets where, in addition, there is a partial annotation of objects for which one wishes to enforce consistency. We present a use case on whole slide images, where an IHC stain provides an experimentally generated signal for metastatic cells. We demonstrate the superiority of our approach over prior art in stain transfer on histopathology tiles over two datasets. Our code and model are available at https://github.com/jcboyd/miccai2022-roigan. △ Less

Submitted 26 August, 2022; originally announced August 2022.

arXiv:2111.12123 [pdf, other]

MICS : Multi-steps, Inverse Consistency and Symmetric deep learning registration network

Authors: Théo Estienne, Maria Vakalopoulou, Enzo Battistella, Theophraste Henry, Marvin Lerousseau, Amaury Leroy, Nikos Paragios, Eric Deutsch

Abstract: Deformable registration consists of finding the best dense correspondence between two different images. Many algorithms have been published, but the clinical application was made difficult by the high calculation time needed to solve the optimisation problem. Deep learning overtook this limitation by taking advantage of GPU calculation and the learning process. However, many deep learning methods… ▽ More Deformable registration consists of finding the best dense correspondence between two different images. Many algorithms have been published, but the clinical application was made difficult by the high calculation time needed to solve the optimisation problem. Deep learning overtook this limitation by taking advantage of GPU calculation and the learning process. However, many deep learning methods do not take into account desirable properties respected by classical algorithms. In this paper, we present MICS, a novel deep learning algorithm for medical imaging registration. As registration is an ill-posed problem, we focused our algorithm on the respect of different properties: inverse consistency, symmetry and orientation conservation. We also combined our algorithm with a multi-step strategy to refine and improve the deformation grid. While many approaches applied registration to brain MRI, we explored a more challenging body localisation: abdominal CT. Finally, we evaluated our method on a dataset used during the Learn2Reg challenge, allowing a fair comparison with published methods. △ Less

Submitted 23 November, 2021; originally announced November 2021.

Comments: In submission

arXiv:2109.03299 [pdf, other]

Self-Supervised Representation Learning using Visual Field Expansion on Digital Pathology

Authors: Joseph Boyd, Mykola Liashuha, Eric Deutsch, Nikos Paragios, Stergios Christodoulidis, Maria Vakalopoulou

Abstract: The examination of histopathology images is considered to be the gold standard for the diagnosis and stratification of cancer patients. A key challenge in the analysis of such images is their size, which can run into the gigapixels and can require tedious screening by clinicians. With the recent advances in computational medicine, automatic tools have been proposed to assist clinicians in their ev… ▽ More The examination of histopathology images is considered to be the gold standard for the diagnosis and stratification of cancer patients. A key challenge in the analysis of such images is their size, which can run into the gigapixels and can require tedious screening by clinicians. With the recent advances in computational medicine, automatic tools have been proposed to assist clinicians in their everyday practice. Such tools typically process these large images by slicing them into tiles that can then be encoded and utilized for different clinical models. In this study, we propose a novel generative framework that can learn powerful representations for such tiles by learning to plausibly expand their visual field. In particular, we developed a progressively grown generative model with the objective of visual field expansion. Thus trained, our model learns to generate different tissue types with fine details, while simultaneously learning powerful representations that can be used for different clinical endpoints, all in a self-supervised way. To evaluate the performance of our model, we conducted classification experiments on CAMELYON17 and CRC benchmark datasets, comparing favorably to other self-supervised and pre-trained strategies that are commonly used in digital pathology. Our code is available at https://github.com/jcboyd/cdpath21-gan. △ Less

Submitted 7 September, 2021; originally announced September 2021.

arXiv:2107.11238 [pdf, other]

Exploring Deep Registration Latent Spaces

Authors: Théo Estienne, Maria Vakalopoulou, Stergios Christodoulidis, Enzo Battistella, Théophraste Henry, Marvin Lerousseau, Amaury Leroy, Guillaume Chassagnon, Marie-Pierre Revel, Nikos Paragios, Eric Deutsch

Abstract: Explainability of deep neural networks is one of the most challenging and interesting problems in the field. In this study, we investigate the topic focusing on the interpretability of deep learning-based registration methods. In particular, with the appropriate model architecture and using a simple linear projection, we decompose the encoding space, generating a new basis, and we empirically show… ▽ More Explainability of deep neural networks is one of the most challenging and interesting problems in the field. In this study, we investigate the topic focusing on the interpretability of deep learning-based registration methods. In particular, with the appropriate model architecture and using a simple linear projection, we decompose the encoding space, generating a new basis, and we empirically show that this basis captures various decomposed anatomically aware geometrical transformations. We perform experiments using two different datasets focusing on lungs and hippocampus MRI. We show that such an approach can decompose the highly convoluted latent spaces of registration pipelines in an orthogonal space with several interesting properties. We hope that this work could shed some light on a better understanding of deep learning-based registration methods. △ Less

Submitted 23 July, 2021; originally announced July 2021.

Comments: 13 pages, 5 figures + 3 figures in supplementary materials Accepted to DART 2021 workshop

arXiv:2105.04269 [pdf, other]

Weakly supervised pan-cancer segmentation tool

Authors: Marvin Lerousseau, Marion Classe, Enzo Battistella, Théo Estienne, Théophraste Henry, Amaury Leroy, Roger Sun, Maria Vakalopoulou, Jean-Yves Scoazec, Eric Deutsch, Nikos Paragios

Abstract: The vast majority of semantic segmentation approaches rely on pixel-level annotations that are tedious and time consuming to obtain and suffer from significant inter and intra-expert variability. To address these issues, recent approaches have leveraged categorical annotations at the slide-level, that in general suffer from robustness and generalization. In this paper, we propose a novel weakly su… ▽ More The vast majority of semantic segmentation approaches rely on pixel-level annotations that are tedious and time consuming to obtain and suffer from significant inter and intra-expert variability. To address these issues, recent approaches have leveraged categorical annotations at the slide-level, that in general suffer from robustness and generalization. In this paper, we propose a novel weakly supervised multi-instance learning approach that deciphers quantitative slide-level annotations which are fast to obtain and regularly present in clinical routine. The extreme potentials of the proposed approach are demonstrated for tumor segmentation of solid cancer subtypes. The proposed approach achieves superior performance in out-of-distribution, out-of-location, and out-of-domain testing sets. △ Less

Submitted 10 May, 2021; originally announced May 2021.

arXiv:2105.02726 [pdf, other]

SparseConvMIL: Sparse Convolutional Context-Aware Multiple Instance Learning for Whole Slide Image Classification

Authors: Marvin Lerousseau, Maria Vakalopoulou, Eric Deutsch, Nikos Paragios

Abstract: Multiple instance learning (MIL) is the preferred approach for whole slide image classification. However, most MIL approaches do not exploit the interdependencies of tiles extracted from a whole slide image, which could provide valuable cues for classification. This paper presents a novel MIL approach that exploits the spatial relationship of tiles for classifying whole slide images. To do so, a s… ▽ More Multiple instance learning (MIL) is the preferred approach for whole slide image classification. However, most MIL approaches do not exploit the interdependencies of tiles extracted from a whole slide image, which could provide valuable cues for classification. This paper presents a novel MIL approach that exploits the spatial relationship of tiles for classifying whole slide images. To do so, a sparse map is built from tiles embeddings, and is then classified by a sparse-input CNN. It obtained state-of-the-art performance over popular MIL approaches on the classification of cancer subtype involving 10000 whole slide images. Our results suggest that the proposed approach might (i) improve the representation learning of instances and (ii) exploit the context of instance embeddings to enhance the classification performance. The code of this work is open-source at {github censored for review}. △ Less

Submitted 25 August, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

arXiv:2102.07713 [pdf, other]

Cancer Gene Profiling through Unsupervised Discovery

Authors: Enzo Battistella, Maria Vakalopoulou, Roger Sun, Théo Estienne, Marvin Lerousseau, Sergey Nikolaev, Emilie Alvarez Andres, Alexandre Carré, Stéphane Niyoteka, Charlotte Robert, Nikos Paragios, Eric Deutsch

Abstract: Precision medicine is a paradigm shift in healthcare relying heavily on genomics data. However, the complexity of biological interactions, the large number of genes as well as the lack of comparisons on the analysis of data, remain a tremendous bottleneck regarding clinical adoption. In this paper, we introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarke… ▽ More Precision medicine is a paradigm shift in healthcare relying heavily on genomics data. However, the complexity of biological interactions, the large number of genes as well as the lack of comparisons on the analysis of data, remain a tremendous bottleneck regarding clinical adoption. In this paper, we introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers. Our method is based on the LP-Stability algorithm, a high dimensional center-based unsupervised clustering algorithm, that offers modularity as concerns metric functions and scalability, while being able to automatically determine the best number of clusters. Our evaluation includes both mathematical and biological criteria. The recovered signature is applied to a variety of biological tasks, including screening of biological pathways and functions, and characterization relevance on tumor types and subtypes. Quantitative comparisons among different distance metrics, commonly used clustering methods and a referential gene signature used in the literature, confirm state of the art performance of our approach. In particular, our signature, that is based on 27 genes, reports at least $30$ times better mathematical significance (average Dunn's Index) and 25% better biological significance (average Enrichment in Protein-Protein Interaction) than those produced by other referential clustering methods. Finally, our signature reports promising results on distinguishing immune inflammatory and immune desert tumors, while reporting a high balanced accuracy of 92% on tumor types classification and averaged balanced accuracy of 68% on tumor subtypes classification, which represents, respectively 7% and 9% higher performance compared to the referential signature. △ Less

Submitted 11 February, 2021; originally announced February 2021.

arXiv:2011.01045 [pdf, other]

Brain tumor segmentation with self-ensembled, deeply-supervised 3D U-net neural networks: a BraTS 2020 challenge solution

Authors: Theophraste Henry, Alexandre Carre, Marvin Lerousseau, Theo Estienne, Charlotte Robert, Nikos Paragios, Eric Deutsch

Abstract: Brain tumor segmentation is a critical task for patient's disease management. In order to automate and standardize this task, we trained multiple U-net like neural networks, mainly with deep supervision and stochastic weight averaging, on the Multimodal Brain Tumor Segmentation Challenge (BraTS) 2020 training dataset. Two independent ensembles of models from two different training pipelines were t… ▽ More Brain tumor segmentation is a critical task for patient's disease management. In order to automate and standardize this task, we trained multiple U-net like neural networks, mainly with deep supervision and stochastic weight averaging, on the Multimodal Brain Tumor Segmentation Challenge (BraTS) 2020 training dataset. Two independent ensembles of models from two different training pipelines were trained, and each produced a brain tumor segmentation map. These two labelmaps per patient were then merged, taking into account the performance of each ensemble for specific tumor subregions. Our performance on the online validation dataset with test time augmentation were as follows: Dice of 0.81, 0.91 and 0.85; Hausdorff (95%) of 20.6, 4,3, 5.7 mm for the enhancing tumor, whole tumor and tumor core, respectively. Similarly, our solution achieved a Dice of 0.79, 0.89 and 0.84, as well as Hausdorff (95%) of 20.4, 6.7 and 19.5mm on the final test dataset, ranking us among the top ten teams. More complicated training schemes and neural network architectures were investigated without significant performance gain at the cost of greatly increased training time. Overall, our approach yielded good and balanced performance for each tumor subregion. Our solution is open sourced at https://github.com/lescientifik/open_brats2020. △ Less

Submitted 27 November, 2020; v1 submitted 30 October, 2020; originally announced November 2020.

Comments: BraTS 2020 proceedings (LNCS) paper

arXiv:2010.10897 [pdf, other]

doi 10.1007/978-3-030-71827-5_11

Deep learning based registration using spatial gradients and noisy segmentation labels

Authors: Théo Estienne, Maria Vakalopoulou, Enzo Battistella, Alexandre Carré, Théophraste Henry, Marvin Lerousseau, Charlotte Robert, Nikos Paragios, Eric Deutsch

Abstract: Image registration is one of the most challenging problems in medical image analysis. In the recent years, deep learning based approaches became quite popular, providing fast and performing registration strategies. In this short paper, we summarise our work presented on Learn2Reg challenge 2020. The main contributions of our work rely on (i) a symmetric formulation, predicting the transformations… ▽ More Image registration is one of the most challenging problems in medical image analysis. In the recent years, deep learning based approaches became quite popular, providing fast and performing registration strategies. In this short paper, we summarise our work presented on Learn2Reg challenge 2020. The main contributions of our work rely on (i) a symmetric formulation, predicting the transformations from source to target and from target to source simultaneously, enforcing the trained representations to be similar and (ii) integration of variety of publicly available datasets used both for pretraining and for augmenting segmentation labels. Our method reports a mean dice of $0.64$ for task 3 and $0.85$ for task 4 on the test sets, taking third place on the challenge. Our code and models are publicly available at https://github.com/TheoEst/abdominal_registration and \https://github.com/TheoEst/hippocampus_registration. △ Less

Submitted 9 April, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

Comments: 6 pages, 3 figures. Updated version after review modifications. Published to Segmentation, Classification, and Registration of Multi-modality Medical Imaging Data. MICCAI 2020. Lecture Notes in Computer Science, vol 12587

Journal ref: In: Shusharina N., Heinrich M.P., Huang R. (eds) Segmentation, Classification, and Registration of Multi-modality Medical Imaging Data. MICCAI 2020. Lecture Notes in Computer Science, vol 12587. Springer, Cham

arXiv:2004.12852 [pdf, other]

AI-Driven CT-based quantification, staging and short-term outcome prediction of COVID-19 pneumonia

Authors: Guillaume Chassagnon, Maria Vakalopoulou, Enzo Battistella, Stergios Christodoulidis, Trieu-Nghi Hoang-Thi, Severine Dangeard, Eric Deutsch, Fabrice Andre, Enora Guillo, Nara Halm, Stefany El Hajj, Florian Bompard, Sophie Neveu, Chahinez Hani, Ines Saab, Alienor Campredon, Hasmik Koulakian, Souhail Bennani, Gael Freche, Aurelien Lombard, Laure Fournier, Hippolyte Monnier, Teodor Grand, Jules Gregory, Antoine Khalil , et al. (6 additional authors not shown)

Abstract: Chest computed tomography (CT) is widely used for the management of Coronavirus disease 2019 (COVID-19) pneumonia because of its availability and rapidity. The standard of reference for confirming COVID-19 relies on microbiological tests but these tests might not be available in an emergency setting and their results are not immediately available, contrary to CT. In addition to its role for early… ▽ More Chest computed tomography (CT) is widely used for the management of Coronavirus disease 2019 (COVID-19) pneumonia because of its availability and rapidity. The standard of reference for confirming COVID-19 relies on microbiological tests but these tests might not be available in an emergency setting and their results are not immediately available, contrary to CT. In addition to its role for early diagnosis, CT has a prognostic role by allowing visually evaluating the extent of COVID-19 lung abnormalities. The objective of this study is to address prediction of short-term outcomes, especially need for mechanical ventilation. In this multi-centric study, we propose an end-to-end artificial intelligence solution for automatic quantification and prognosis assessment by combining automatic CT delineation of lung disease meeting performance of experts and data-driven identification of biomarkers for its prognosis. AI-driven combination of variables with CT-based biomarkers offers perspectives for optimal patient management given the shortage of intensive care beds and ventilators. △ Less

Submitted 20 April, 2020; originally announced April 2020.

arXiv:2004.05024 [pdf, other]

Weakly supervised multiple instance learning histopathological tumor segmentation

Authors: Marvin Lerousseau, Maria Vakalopoulou, Marion Classe, Julien Adam, Enzo Battistella, Alexandre Carré, Théo Estienne, Théophraste Henry, Eric Deutsch, Nikos Paragios

Abstract: Histopathological image segmentation is a challenging and important topic in medical imaging with tremendous potential impact in clinical practice. State of the art methods rely on hand-crafted annotations which hinder clinical translation since histology suffers from significant variations between cancer phenotypes. In this paper, we propose a weakly supervised framework for whole slide imaging s… ▽ More Histopathological image segmentation is a challenging and important topic in medical imaging with tremendous potential impact in clinical practice. State of the art methods rely on hand-crafted annotations which hinder clinical translation since histology suffers from significant variations between cancer phenotypes. In this paper, we propose a weakly supervised framework for whole slide imaging segmentation that relies on standard clinical annotations, available in most medical systems. In particular, we exploit a multiple instance learning scheme for training models. The proposed framework has been evaluated on multi-locations and multi-centric public data from The Cancer Genome Atlas and the PatchCamelyon dataset. Promising results when compared with experts' annotations demonstrate the potentials of the presented approach. The complete framework, including $6481$ generated tumor maps and data processing, is available at https://github.com/marvinler/tcga_segmentation. △ Less

Submitted 11 May, 2021; v1 submitted 10 April, 2020; originally announced April 2020.

Comments: Accepted MICCAI 2020; added code + results url; 10 pages, 3 figures

arXiv:1811.02629 [pdf, other]

Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

Authors: Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, Marcel Prastawa, Esther Alberts, Jana Lipkova, John Freymann, Justin Kirby, Michel Bilello, Hassan Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Benedikt Wiestler, Rivka Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko , et al. (402 additional authors not shown)

Abstract: Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles dissem… ▽ More Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset. △ Less

Submitted 23 April, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

Comments: The International Multimodal Brain Tumor Segmentation (BraTS) Challenge

Showing 1–14 of 14 results for author: Deutsch, E