-
Genomics Data Analysis via Spectral Shape and Topology
Authors:
Erik J. Amézquita,
Farzana Nasrin,
Kathleen M. Storey,
Masato Yoshizawa
Abstract:
Mapper, a topological algorithm, is frequently used as an exploratory tool to build a graphical representation of data. This representation can help reveal the intrinsic shape of high-dimensional genomic data and retain information that may be lost using standard dimension-reduction algorithms. We propose a novel workflow to process and analyze RNA-seq data from tumor and healthy subjects that integrates Mapper and differential gene expression. Specifically, we show that a Gaussian mixture approximation method can be used to produce graphical structures that successfully separate tumor and healthy subjects and that split the tumor subjects into two subgroups. A further analysis using DESeq2, a popular tool for the detection of differentially expressed genes, shows that these two tumor subgroups exhibit distinct gene regulation, suggesting two discrete paths for forming lung cancer, which could not be highlighted by other popular clustering methods, including t-SNE. Although Mapper shows promise in analyzing high-dimensional data, the existing literature offers few tools for statistically analyzing Mapper graphical structures. In this paper, we develop a scoring method using heat kernel signatures that provides an empirical setting for statistical inferences such as hypothesis testing, sensitivity analysis, and correlation analysis.
Submitted 2 November, 2022;
originally announced November 2022.
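For readers unfamiliar with heat kernel signatures, the following is a minimal sketch of the standard definition on a small graph: the signature of vertex v at time t is the diagonal entry of exp(-tL), where L is the combinatorial graph Laplacian. It is not the scoring method of the paper above, whose details the abstract does not give. The function names, the toy path graph, and the dependency-free Taylor-series matrix exponential are all assumptions made for illustration; in practice an eigendecomposition from a numerical library would normally be used instead.

```python
def laplacian(n, edges):
    """Combinatorial Laplacian L = D - A of an undirected, unweighted graph."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L


def matmul(A, B):
    """Dense product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def heat_kernel(L, t, squarings=10, terms=12):
    """Approximate exp(-t L) by scaling and squaring a truncated Taylor series."""
    n = len(L)
    scale = t / (2 ** squarings)
    A = [[-scale * L[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes A^k / k!
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        # undo the scaling: exp(A)^(2^squarings) = exp(-t L)
        result = matmul(result, result)
    return result


def heat_kernel_signature(n, edges, t):
    """Return the diagonal of exp(-t L), one value per vertex."""
    H = heat_kernel(laplacian(n, edges), t)
    return [H[v][v] for v in range(n)]


# Toy example (hypothetical data): a path graph 0 - 1 - 2.
print(heat_kernel_signature(3, [(0, 1), (1, 2)], t=1.0))
```

On the path graph the two end vertices receive the same signature and the middle vertex a smaller one, since heat leaves a better-connected vertex faster. Small t emphasizes local structure and large t global structure, which is why signatures are usually compared across several time scales.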
-
Designing experimental conditions to use the Lotka-Volterra model to infer tumor cell line interaction types
Authors:
Heyrim Cho,
Allison L. Lewis,
Kathleen M. Storey,
Helen M. Byrne
Abstract:
The Lotka-Volterra model is widely used to model interactions between two species. Here, we generate synthetic data mimicking competitive, mutualistic and antagonistic interactions between two tumor cell lines, and then use the Lotka-Volterra (LV) model to infer the interaction type. Structural identifiability of the Lotka-Volterra model is confirmed, and practical identifiability is assessed for three experimental designs: (a) use of a single data set, with a mixture of both cell lines observed over time, (b) a sequential design where growth rates and carrying capacities are estimated using data from experiments in which each cell line is grown in isolation, and then interaction parameters are estimated from an experiment involving a mixture of both cell lines, and (c) a parallel experimental design where all model parameters are fitted to data from two mixtures simultaneously. In addition to assessing each design for practical identifiability, we investigate how the predictive power of the model (i.e., its ability to fit data for initial ratios other than those to which it was calibrated) is affected by the choice of experimental design. The parallel calibration procedure is found to be optimal and is further tested on in silico data generated from a spatially-resolved cellular automaton (CA) model, which accounts for oxygen consumption and allows for variation in the intensity level of the interaction between the two cell lines. We use this study to highlight the care that must be taken when interpreting parameter estimates for the spatially-averaged Lotka-Volterra model when it is calibrated against data produced by the spatially-resolved cellular automaton model, since baseline competition for space and resources in the CA model may contribute to a discrepancy between the type of interaction used to generate the CA data and the type of interaction inferred by the LV model.
Submitted 17 September, 2022;
originally announced September 2022.
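The abstract above refers to growth rates, carrying capacities, and interaction parameters without stating the equations. As a rough sketch, assuming the common two-species form dN1/dt = r1 N1 (1 - (N1 + a12 N2)/K1) and its mirror image for N2, the interaction type can be read from the signs of the fitted interaction coefficients. The parameter names, the sign convention, the classification rule, and the numerical values are assumptions for illustration, not the authors' implementation.

```python
def lv_rhs(n1, n2, p):
    """Right-hand side of the assumed two-species Lotka-Volterra system."""
    dn1 = p["r1"] * n1 * (1.0 - (n1 + p["a12"] * n2) / p["K1"])
    dn2 = p["r2"] * n2 * (1.0 - (n2 + p["a21"] * n1) / p["K2"])
    return dn1, dn2


def simulate(n1, n2, p, t_end=100.0, dt=0.01):
    """Integrate with classical fourth-order Runge-Kutta; return the final state."""
    for _ in range(int(round(t_end / dt))):
        k1 = lv_rhs(n1, n2, p)
        k2 = lv_rhs(n1 + 0.5 * dt * k1[0], n2 + 0.5 * dt * k1[1], p)
        k3 = lv_rhs(n1 + 0.5 * dt * k2[0], n2 + 0.5 * dt * k2[1], p)
        k4 = lv_rhs(n1 + dt * k3[0], n2 + dt * k3[1], p)
        n1 += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        n2 += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return n1, n2


def interaction_type(a12, a21):
    """Classify by sign (assumed convention: a positive coefficient inhibits)."""
    if a12 > 0 and a21 > 0:
        return "competitive"    # each line suppresses the other
    if a12 < 0 and a21 < 0:
        return "mutualistic"    # each line promotes the other
    if a12 * a21 < 0:
        return "antagonistic"   # one line benefits, the other is harmed
    return "neutral"            # at least one coefficient is exactly zero


# Hypothetical parameter values chosen only to exercise the sketch.
params = {"r1": 0.5, "r2": 0.5, "K1": 1.0, "K2": 1.0, "a12": 0.5, "a21": 0.5}
print(interaction_type(params["a12"], params["a21"]), simulate(0.1, 0.1, params))
```

With both coefficients set to zero the system reduces to two independent logistic curves, which corresponds to the isolated-growth experiments of the sequential design (b), where only the growth rates and carrying capacities can be estimated.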
-
Utilizing gradient approximations to optimize data selection protocols for tumor growth model calibration
Authors:
Allison L. Lewis,
Kathleen M. Storey,
Heyrim Cho,
Anna C. Zittle
Abstract:
The use of mathematical models to make predictions about tumor growth and response to treatment has become increasingly prevalent in the clinical setting. The level of complexity within these models ranges broadly, and the calibration of more complex models correspondingly requires more detailed clinical data. This raises questions about how much data should be collected and when, in order to minimize the total amount of data used and the time until a model can be calibrated accurately. To address these questions, we propose a Bayesian information-theoretic procedure, using a gradient-based score function to determine the optimal data collection times for model calibration. The novel score function introduced in this work eliminates the need for a weight parameter used in a previous study's score function, while still yielding accurate and efficient model calibration using even fewer scans on a sample set of synthetic data, simulating tumors of varying levels of radiosensitivity. We also conduct a robust analysis of the calibration accuracy and certainty, using both error and uncertainty metrics. Unlike the error analysis of the previous study, the inclusion of uncertainty analysis in this work, as a means for deciding when the algorithm can be terminated, provides a more realistic option for clinical decision-making, since it does not rely on data that will be collected later in time.
Submitted 25 December, 2021;
originally announced December 2021.
-
Effective dose fractionation schemes of radiotherapy for prostate cancer
Authors:
Jose Alvarez,
Kathleen M. Storey,
Pavitra Kannan,
Heyrim Cho
Abstract:
Radiation therapy remains one of the main cancer treatment modalities and a highly cost-effective single-modality treatment in cancer care. Typical regimens for fractionated external beam radiotherapy comprise a constant dose administered on weekdays, and no radiation on weekends. However, every patient has a tumor with distinct properties depending on intra-tumor heterogeneity, aggressiveness, and interactive properties with other cells that may make it more resistant or sensitive to radiation treatment. Accordingly, the concept of personalized cancer treatment is emerging to tailor each patient's treatment to the unique properties of the tumor. In this paper, we examine adaptive radiation treatment strategies for heterogeneous tumors using a dynamical system model that consists of radiation-resistant and parental cell populations with unique interactive properties. We study different adaptive dosage strategies for PC3 and DU145 prostate cancer cell lines. We show that stronger doses of radiation given at longer time intervals, while keeping the overall dosage the same, reduce final tumor volume by more than half in PC3 cell lines, but by only five percent in DU145 cell lines. In addition, we tested an adaptive dosing schedule by administering a stronger dosage on Friday to compensate for the treatment-off period during the weekend, which was effective in decreasing the final tumor volume of both cell lines. This result creates interesting possibilities for new radiotherapy strategies at clinics that cannot provide treatment on weekends. Finally, we propose a dosage plan incorporating our findings.
Submitted 25 December, 2021;
originally announced December 2021.
-
Bayesian information-theoretic calibration of patient-specific radiotherapy sensitivity parameters for informing effective scanning protocols in cancer
Authors:
Heyrim Cho,
Allison L. Lewis,
Kathleen M. Storey
Abstract:
With new advancements in technology, it is now possible to collect data for a variety of different metrics describing tumor growth, including tumor volume, composition, and vascularity, among others. For any proposed model of tumor growth and treatment, we observe large variability among individual patients' parameter values, particularly those relating to treatment response; thus, exploiting these various metrics for model calibration can help to infer such patient-specific parameters both accurately and early, so that treatment protocols can be adjusted mid-course for maximum efficacy. However, taking measurements can be costly and invasive, limiting clinicians to a sparse collection schedule. As such, determining the optimal times and metrics for which to collect data in order to best inform proper treatment protocols could be of great assistance to clinicians. In this investigation, we employ a Bayesian information-theoretic calibration protocol for experimental design in order to identify the optimal times at which to collect data for informing treatment parameters. Within this procedure, data collection times are chosen sequentially to maximize the reduction in parameter uncertainty with each added measurement, ensuring that a budget of $n$ high-fidelity experimental measurements results in maximum information gain about the low-fidelity model parameter values. In addition to investigating the optimal temporal pattern for data collection, we also develop a framework for deciding which metrics should be utilized at each data collection point. We illustrate this framework with a variety of toy examples, each utilizing a radiotherapy treatment regimen. For each scenario, we analyze the dependence of the predictive power of the low-fidelity model upon the measurement budget.
Submitted 5 September, 2020;
originally announced September 2020.