-
Towards Quantum Tensor Decomposition in Biomedical Applications
Authors:
Myson Burch,
Jiasen Zhang,
Gideon Idumah,
Hakan Doga,
Richard Lartey,
Lamis Yehia,
Mingrui Yang,
Murat Yildirim,
Mihriban Karaayvaz,
Omar Shehab,
Weihong Guo,
Ying Ni,
Laxmi Parida,
Xiaojuan Li,
Aritra Bose
Abstract:
Tensor decomposition has emerged as a powerful framework for feature extraction in multi-modal biomedical data. In this review, we present a comprehensive analysis of tensor decomposition methods, including Tucker, CANDECOMP/PARAFAC, and spiked tensor decomposition, and their diverse applications across biomedical domains such as imaging, multi-omics, and spatial transcriptomics. To systematically investigate the literature, we applied a topic modeling-based approach that identifies and groups distinct thematic sub-areas in biomedicine where tensor decomposition has been used, thereby revealing key trends and research directions. We evaluated challenges related to the scalability of latent spaces and to obtaining the optimal rank of the tensor, which often hinder the extraction of meaningful features from increasingly large and complex datasets. Additionally, we discuss recent advances in quantum algorithms for tensor decomposition, exploring how quantum computing can be leveraged to address these challenges. Our study includes a preliminary resource estimation analysis for quantum computing platforms and examines the feasibility of implementing quantum-enhanced tensor decomposition methods on near-term quantum devices. Collectively, this review not only synthesizes current applications and challenges of tensor decomposition in biomedical analyses but also outlines promising quantum computing strategies to enhance its impact on deriving actionable insights from complex biomedical data.
Submitted 19 February, 2025; v1 submitted 18 February, 2025;
originally announced February 2025.
-
Automatic detection of Mild Cognitive Impairment using high-dimensional acoustic features in spontaneous speech
Authors:
Cong Zhang,
Wenxing Guo,
Hongsheng Dai
Abstract:
This study addresses the TAUKADIAL challenge, focusing on the classification of speech from people with Mild Cognitive Impairment (MCI) and neurotypical controls. We conducted three experiments comparing five machine-learning methods: Random Forests, Sparse Logistic Regression, k-Nearest Neighbors, Sparse Support Vector Machine, and Decision Tree, utilizing 1076 acoustic features automatically extracted using openSMILE. In Experiment 1, the entire dataset was used to train a language-agnostic model. Experiment 2 introduced a language detection step, leading to separate model training for each language. Experiment 3 further enhanced the language-agnostic model from Experiment 1, with a specific focus on evaluating the robustness of the models using out-of-sample test data. Across all three experiments, results consistently favored models capable of handling high-dimensional data, such as Random Forest and Sparse Logistic Regression, in classifying speech from MCI and controls.
Submitted 29 August, 2024;
originally announced August 2024.
-
Node-Aligned Graph-to-Graph (NAG2G): Elevating Template-Free Deep Learning Approaches in Single-Step Retrosynthesis
Authors:
Lin Yao,
Wentao Guo,
Zhen Wang,
Shang Xiang,
Wentan Liu,
Guolin Ke
Abstract:
Single-step retrosynthesis (SSR) in organic chemistry is increasingly benefiting from deep learning (DL) techniques in computer-aided synthesis design. While template-free DL models are flexible and promising for retrosynthesis prediction, they often ignore vital 2D molecular information and struggle with atom alignment for node generation, resulting in lower performance compared to template-based and semi-template-based methods. To address these issues, we introduce Node-Aligned Graph-to-Graph (NAG2G), a transformer-based template-free DL model. NAG2G combines 2D molecular graphs and 3D conformations to retain comprehensive molecular details, and it incorporates product-reactant atom mapping through node alignment, which determines the order in which the graph is output node by node in an auto-regressive manner. Through rigorous benchmarking and detailed case studies, we have demonstrated that NAG2G stands out with its remarkable predictive accuracy on the expansive datasets of USPTO-50k and USPTO-FULL. Moreover, the model's practical utility is underscored by its successful prediction of synthesis pathways for multiple drug candidate molecules. This demonstrates not only NAG2G's robustness but also its potential to revolutionize the prediction of complex chemical synthesis processes for future synthetic route design tasks.
Submitted 25 March, 2024; v1 submitted 27 September, 2023;
originally announced September 2023.
-
Distinguishing representational geometries with controversial stimuli: Bayesian experimental design and its application to face dissimilarity judgments
Authors:
Tal Golan,
Wenxuan Guo,
Heiko H. Schütt,
Nikolaus Kriegeskorte
Abstract:
Comparing representations of complex stimuli in neural network layers to human brain representations or behavioral judgments can guide model development. However, even qualitatively distinct neural network models often predict similar representational geometries of typical stimulus sets. We propose a Bayesian experimental design approach to synthesizing stimulus sets for adjudicating among representational models efficiently. We apply our method to discriminate among candidate neural network models of behavioral face dissimilarity judgments. Our results indicate that a neural network trained to invert a 3D-face-model graphics renderer is more human-aligned than the same architecture trained on identification, classification, or autoencoding. Our proposed stimulus synthesis objective is generally applicable to designing experiments to be analyzed by representational similarity analysis for model comparison.
Submitted 27 November, 2022;
originally announced November 2022.
-
Active Learning with Point Supervision for Cost-Effective Panicle Detection in Cereal Crops
Authors:
Akshay L Chandra,
Sai Vikas Desai,
Vineeth N Balasubramanian,
Seishi Ninomiya,
Wei Guo
Abstract:
Panicle density of cereal crops such as wheat and sorghum is one of the main components for plant breeders and agronomists in understanding the yield of their crops. To phenotype the panicle density effectively, researchers agree there is a significant need for computer vision-based object detection techniques. In recent times especially, research in deep learning-based object detection has shown promising results in various agricultural studies. However, training such systems usually requires a lot of bounding-box labeled data. Since crops vary with both environmental and genetic conditions, acquiring large labeled image datasets for each crop is expensive and time-consuming. Thus, to catalyze the widespread usage of automatic object detection for crop phenotyping, a cost-effective method to develop such automated systems is essential. We propose a point-supervision-based active learning approach for panicle detection in cereal crops. In our approach, the model constantly interacts with a human annotator by iteratively querying the labels for only the most informative images, as opposed to all images in a dataset. Our query method is specifically designed for cereal crops, which usually tend to have panicles with low variance in appearance. Our method reduces labeling costs by intelligently leveraging low-cost weak labels (object centers) for picking the most informative images for which strong labels (bounding boxes) are required. We show promising results on two publicly available cereal crop datasets, Sorghum and Wheat. On Sorghum, 6 variants of our proposed method outperform the best baseline method with more than 55% savings in labeling time. Similarly, on Wheat, 3 variants of our proposed method outperform the best baseline method with more than 50% savings in labeling time.
Submitted 17 April, 2020; v1 submitted 3 October, 2019;
originally announced October 2019.
-
Automatic estimation of heading date of paddy rice using deep learning
Authors:
Sai Vikas Desai,
Vineeth N Balasubramanian,
Tokihiro Fukatsu,
Seishi Ninomiya,
Wei Guo
Abstract:
Accurate estimation of the heading date of paddy rice greatly helps breeders to understand the adaptability of different crop varieties in a given location. The heading date also plays a vital role in determining grain yield for research experiments. Visual examination of the crop is laborious and time-consuming. Therefore, quick and precise estimation of the heading date of paddy rice is essential. In this work, we propose a simple pipeline to detect regions containing flowering panicles from ground-level RGB images of paddy rice. Given a fixed region size for an image, the number of regions containing flowering panicles is directly proportional to the number of flowering panicles present. Consequently, we use the flowering panicle region counts to estimate the heading date of the crop. The method is based on image classification using Convolutional Neural Networks (CNNs). We evaluated the performance of our algorithm on five time series image sequences of three different varieties of rice crops. Compared to the previous work on this dataset, the accuracy and general versatility of the method have been improved, and the heading date has been estimated with a mean absolute error of less than 1 day.
Submitted 19 June, 2019;
originally announced June 2019.
-
Sequential Bayesian Detection of Spike Activities from Fluorescence Observations
Authors:
Zhuangkun Wei,
Bin Li,
Weisi Guo,
Wenxiu Hu,
Chenglin Zhao
Abstract:
Extracting and detecting spike activities from fluorescence observations is an important step in understanding how neuron systems work. The main challenge is that ambient noise combined with dynamic baseline fluctuation often contaminates the observations, thereby degrading the reliability of spike detection. The problem may be even worse in the face of the nonlinear biological process, the coupling interactions between spikes and baseline, and the unknown critical parameters of an underlying physiological model, where erroneous parameter estimates affect the detection of spikes and cause further error propagation. In this paper, we propose a random finite set (RFS) based Bayesian approach. The dynamic behaviors of the spike sequence, the fluctuating baseline, and the unknown parameters are formulated as one RFS. This RFS state is capable of distinguishing the hidden active and silent states induced by spike and non-spike activities respectively, thereby negating the interaction role played by spikes and other factors. Then, premised on the RFS states, a Bayesian inference scheme is designed to simultaneously estimate the model parameters, baseline, and crucial spike activities. Our results demonstrate that the proposed scheme can gain an extra 12% detection accuracy in comparison with the state-of-the-art MLSpike method.
Submitted 31 January, 2019;
originally announced January 2019.
-
DeepMetabolism: A Deep Learning System to Predict Phenotype from Genome Sequencing
Authors:
Weihua Guo,
You Xu,
Xueyang Feng
Abstract:
Life science is entering a new era of petabyte-level sequencing data. Converting such big data to biological insights represents a huge challenge for computational analysis. To this end, we developed DeepMetabolism, a biology-guided deep learning system to predict cell phenotypes from transcriptomics data. By integrating unsupervised pre-training with supervised training, DeepMetabolism is able to predict phenotypes with high accuracy (PCC>0.92), high speed (<30 min for >100 GB data using a single GPU), and high robustness (tolerating up to 75% noise). We envision that DeepMetabolism will bridge the gap between genotype and phenotype and serve as a springboard for applications in synthetic biology and precision medicine.
Submitted 8 May, 2017;
originally announced May 2017.
-
Do ROS really slow down aging in C. elegans?
Authors:
Yaguang Ren,
Sixi Chen,
Mengmeng Ma,
Congjie Zhang,
Kejie Wang,
Feng Li,
Wenxuan Guo,
Jiatao Huang,
Chao Zhang
Abstract:
The view that ROS slow down aging is gaining popularity. Here we propose that aging is slowed down by secondary responses rather than by ROS.
Submitted 26 July, 2017; v1 submitted 20 April, 2017;
originally announced April 2017.
-
Data-Driven Prediction of CRISPR-Based Transcription Regulation for Programmable Control of Metabolic Flux
Authors:
Jiayuan Sheng,
Weihua Guo,
Christine Ash,
Brendan Freitas,
Mitchell Paoletti,
Xueyang Feng
Abstract:
Multiplex and multi-directional control of metabolic pathways is crucial for metabolic engineering to improve product yield of fuels, chemicals, and pharmaceuticals. To achieve this goal, artificial transcriptional regulators such as CRISPR-based transcription regulators have been developed to specifically activate or repress genes of interest. Here, we found that by deploying guide RNAs to target DNA sites at different locations of genetic cassettes, we could use just one synthetic CRISPR-based transcriptional regulator to simultaneously activate and repress gene expression. Using the pairwise datasets of guide RNAs and gene expression, we developed a data-driven predictive model to rationally design this system for fine-tuning the expression of target genes. We demonstrated that this system could achieve programmable control of metabolic fluxes when using yeast to produce versatile chemicals. We anticipate that this master CRISPR-based transcription regulator will be a valuable addition to the synthetic biology toolkit for metabolic engineering, speeding up the design-build-test cycle in industrial biomanufacturing as well as generating new biological insights on the fates of eukaryotic cells.
Submitted 10 April, 2017;
originally announced April 2017.
-
Evolutionary Dynamics for Persistent Cooperation in Structured Populations
Authors:
Yan Li,
Xinsheng Liu,
Jens Christian Claussen,
Wanlin Guo
Abstract:
The emergence and maintenance of cooperative behavior is a fascinating topic in evolutionary biology and social science. The public goods game (PGG) is a paradigm for exploring cooperative behavior. In PGG, the total resulting payoff is divided equally among all participants. This feature leads to the dominance of defection unless the public good is substantially magnified by a multiplication factor. Much effort has been made to explain the evolution of cooperative strategies, including a recent model in which only a portion of the total benefit is shared by all the players, introducing a new strategy named persistent cooperation. A persistent cooperator is a contributor who is willing to pay a second cost to retrieve the remaining portion of the payoff contributed by themselves. In a previous study, this model was analyzed in the framework of well-mixed populations. This paper focuses on persistent cooperation in lattice-structured populations. The evolutionary dynamics of structured populations consisting of three types of competing players (pure cooperators, defectors, and persistent cooperators) are revealed by theoretical analysis and numerical simulations. In particular, approximate expressions for the fixation probabilities of strategies are derived on one-dimensional lattices. The phase diagrams of stationary states, the evolution of frequencies, and the spatial patterns of strategies are illustrated on both one-dimensional and square lattices by simulations. Our results are consistent with the general observation that, at least in most situations, a structured population facilitates the evolution of cooperation. Specifically, we find that the existence of persistent cooperators greatly suppresses the spreading of defectors, and does so under more relaxed conditions in structured populations than in well-mixed populations.
Submitted 19 May, 2015;
originally announced May 2015.
-
Advantageous punishers in nature
Authors:
Xinsheng Liu,
Wanlin Guo
Abstract:
The evolution and maintenance of cooperation has fascinated researchers for several decades. Recently, theoretical models and experimental evidence have shown that costly punishment may facilitate cooperation in human societies, but may not be used by winners. The puzzle of how costly punishment behaviour evolves can be solved under voluntary participation. Could punishers emerge if participation is compulsory? Is punishment inevitably a selfish behaviour or an altruistic behaviour? The motivations behind punishment are still an enigma. Based on public goods interactions, we present a model in which just a certain portion of the public good is divided equally among all members. The other portion is distributed to contributors who pay a second cost. Contributors who are willing to pay a second cost can be costly (and then altruistic) punishers, but they can also flourish or dominate the population; in this case we may call them "advantageous punishers". We argue that most successful cooperators and punishers in nature are advantageous punishers, and that costly punishment mostly happens in humans. This indicates a universal survival rule: contributing more and gaining more. Our models show theoretically that the original motivation behind punishment is to retrieve the deserved payoff from one's own contributions, a selfish incentive.
Submitted 15 June, 2010;
originally announced June 2010.