-
On the importance of local and global feature learning for automated measurable residual disease detection in flow cytometry data
Authors:
Lisa Weijler,
Michael Reiter,
Pedro Hermosilla,
Margarita Maurer-Granofszky,
Michael Dworzak
Abstract:
This paper evaluates various deep learning methods for measurable residual disease (MRD) detection in flow cytometry (FCM) data, addressing questions regarding the benefits of modeling long-range dependencies, methods of obtaining global information, and the importance of learning local features. Based on our findings, we propose two adaptations to the current state-of-the-art (SOTA) model. Our co…
▽ More
This paper evaluates various deep learning methods for measurable residual disease (MRD) detection in flow cytometry (FCM) data, addressing questions regarding the benefits of modeling long-range dependencies, methods of obtaining global information, and the importance of learning local features. Based on our findings, we propose two adaptations to the current state-of-the-art (SOTA) model. Our contributions include an enhanced SOTA model, demonstrating superior performance on publicly available datasets and improved generalization across laboratories, as well as valuable insights for the FCM community, guiding future DL architecture designs for FCM data analysis. The code is available at \url{https://github.com/lisaweijler/flowNetworks}.
△ Less
Submitted 23 November, 2024;
originally announced November 2024.
-
Automated Immunophenotyping Assessment for Diagnosing Childhood Acute Leukemia using Set-Transformers
Authors:
Elpiniki Maria Lygizou,
Michael Reiter,
Margarita Maurer-Granofszky,
Michael Dworzak,
Radu Grosu
Abstract:
Acute Leukemia is the most common hematologic malignancy in children and adolescents. A key methodology in the diagnostic evaluation of this malignancy is immunophenotyping based on Multiparameter Flow Cytometry (FCM). However, this approach is manual, and thus time-consuming and subjective. To alleviate this situation, we propose in this paper the FCM-Former, a machine learning, self-attention ba…
▽ More
Acute Leukemia is the most common hematologic malignancy in children and adolescents. A key methodology in the diagnostic evaluation of this malignancy is immunophenotyping based on Multiparameter Flow Cytometry (FCM). However, this approach is manual, and thus time-consuming and subjective. To alleviate this situation, we propose in this paper the FCM-Former, a machine learning, self-attention based FCM-diagnostic tool, automating the immunophenotyping assessment in Childhood Acute Leukemia. The FCM-Former is trained in a supervised manner, by directly using flow cytometric data. Our FCM-Former achieves an accuracy of 96.5% assigning lineage to each sample among 960 cases of either acute B-cell, T-cell lymphoblastic, and acute myeloid leukemia (B-ALL, T-ALL, AML). To the best of our knowledge, the FCM-Former is the first work that automates the immunophenotyping assessment with FCM data in diagnosing pediatric Acute Leukemia.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
FATE: Feature-Agnostic Transformer-based Encoder for learning generalized embedding spaces in flow cytometry data
Authors:
Lisa Weijler,
Florian Kowarsch,
Michael Reiter,
Pedro Hermosilla,
Margarita Maurer-Granofszky,
Michael Dworzak
Abstract:
While model architectures and training strategies have become more generic and flexible with respect to different data modalities over the past years, a persistent limitation lies in the assumption of fixed quantities and arrangements of input features. This limitation becomes particularly relevant in scenarios where the attributes captured during data acquisition vary across different samples. In…
▽ More
While model architectures and training strategies have become more generic and flexible with respect to different data modalities over the past years, a persistent limitation lies in the assumption of fixed quantities and arrangements of input features. This limitation becomes particularly relevant in scenarios where the attributes captured during data acquisition vary across different samples. In this work, we aim at effectively leveraging data with varying features, without the need to constrain the input space to the intersection of potential feature sets or to expand it to their union. We propose a novel architecture that can directly process data without the necessity of aligned feature modalities by learning a general embedding space that captures the relationship between features across data samples with varying sets of features. This is achieved via a set-transformer architecture augmented by feature-encoder layers, thereby enabling the learning of a shared latent feature space from data originating from heterogeneous feature spaces. The advantages of the model are demonstrated for automatic cancer cell detection in acute myeloid leukemia in flow cytometry data, where the features measured during acquisition often vary between samples. Our proposed architecture's capacity to operate seamlessly across incongruent feature spaces is particularly relevant in this context, where data scarcity arises from the low prevalence of the disease. The code is available for research purposes at https://github.com/lisaweijler/FATE.
△ Less
Submitted 6 November, 2023;
originally announced November 2023.
-
Explainable Techniques for Analyzing Flow Cytometry Cell Transformers
Authors:
Florian Kowarsch,
Lisa Weijler,
FLorian Kleber,
Matthias Wödlinger,
Michael Reiter,
Margarita Maurer-Granofszky,
Michael Dworzak
Abstract:
Explainability for Deep Learning Models is especially important for clinical applications, where decisions of automated systems have far-reaching consequences.
While various post-hoc explainable methods, such as attention visualization and saliency maps, already exist for common data modalities, including natural language and images, little work has been done to adapt them to the modality of Flo…
▽ More
Explainability for Deep Learning Models is especially important for clinical applications, where decisions of automated systems have far-reaching consequences.
While various post-hoc explainable methods, such as attention visualization and saliency maps, already exist for common data modalities, including natural language and images, little work has been done to adapt them to the modality of Flow CytoMetry (FCM) data.
In this work, we evaluate the usage of a transformer architecture called ReluFormer that ease attention visualization as well as we propose a gradient- and an attention-based visualization technique tailored for FCM. We qualitatively evaluate the visualization techniques for cell classification and polygon regression on pediatric Acute Lymphoblastic Leukemia (ALL) FCM samples. The results outline the model's decision process and demonstrate how to utilize the proposed techniques to inspect the trained model. The gradient-based visualization not only identifies cells that are most significant for a particular prediction but also indicates the directions in the FCM feature space in which changes have the most impact on the prediction. The attention visualization provides insights on the transformer's decision process when handling FCM data. We show that different attention heads specialize by attending to different biologically meaningful sub-populations in the data, even though the model retrieved solely supervised binary classification signals during training.
△ Less
Submitted 26 July, 2023;
originally announced July 2023.
-
Automated Identification of Cell Populations in Flow Cytometry Data with Transformers
Authors:
Matthias Wödlinger,
Michael Reiter,
Lisa Weijler,
Margarita Maurer-Granofszky,
Angela Schumich,
Elisa O. Sajaroff,
Stefanie Groeneveld-Krentz,
Jorge G. Rossi,
Leonid Karawajew,
Richard Ratei,
Michael Dworzak
Abstract:
Acute Lymphoblastic Leukemia (ALL) is the most frequent hematologic malignancy in children and adolescents. A strong prognostic factor in ALL is given by the Minimal Residual Disease (MRD), which is a measure for the number of leukemic cells persistent in a patient. Manual MRD assessment from Multiparameter Flow Cytometry (FCM) data after treatment is time-consuming and subjective. In this work, w…
▽ More
Acute Lymphoblastic Leukemia (ALL) is the most frequent hematologic malignancy in children and adolescents. A strong prognostic factor in ALL is given by the Minimal Residual Disease (MRD), which is a measure for the number of leukemic cells persistent in a patient. Manual MRD assessment from Multiparameter Flow Cytometry (FCM) data after treatment is time-consuming and subjective. In this work, we present an automated method to compute the MRD value directly from FCM data. We present a novel neural network approach based on the transformer architecture that learns to directly identify blast cells in a sample. We train our method in a supervised manner and evaluate it on publicly available ALL FCM data from three different clinical centers. Our method reaches a median F1 score of ~0.94 when evaluated on 519 B-ALL samples and shows better results than existing methods on 4 different datasets
△ Less
Submitted 7 March, 2022; v1 submitted 23 August, 2021;
originally announced August 2021.