-
BrainStratify: Coarse-to-Fine Disentanglement of Intracranial Neural Dynamics
Authors:
Hui Zheng,
Hai-Teng Wang,
Yi-Tao Jing,
Pei-Yang Lin,
Han-Qing Zhao,
Wei Chen,
Peng-Hu Wei,
Yong-Zhi Shan,
Guo-Guang Zhao,
Yun-Zhe Liu
Abstract:
Decoding speech directly from neural activity is a central goal in brain-computer interface (BCI) research. In recent years, exciting advances have been made through the growing use of intracranial field potential recordings, such as stereo-ElectroEncephaloGraphy (sEEG) and ElectroCorticoGraphy (ECoG). These neural signals capture rich population-level activity but present key challenges: (i) task-relevant neural signals are sparsely distributed across sEEG electrodes, and (ii) they are often entangled with task-irrelevant neural signals in both sEEG and ECoG. To address these challenges, we introduce a unified Coarse-to-Fine neural disentanglement framework, BrainStratify, which includes (i) identifying functional groups through spatial-context-guided temporal-spatial modeling, and (ii) disentangling distinct neural dynamics within the target functional group using Decoupled Product Quantization (DPQ). We evaluate BrainStratify on two open-source sEEG datasets and one (epidural) ECoG dataset, spanning tasks like vocal production and speech perception. Extensive experiments show that BrainStratify, as a unified framework for decoding speech from intracranial neural signals, significantly outperforms previous decoding methods. Overall, by combining data-driven stratification with neuroscience-inspired modularity, BrainStratify offers a robust and interpretable solution for speech decoding from intracranial recordings.
Submitted 26 May, 2025;
originally announced May 2025.
-
TransCDR: a deep learning model for enhancing the generalizability of cancer drug response prediction through transfer learning and multimodal data fusion for drug representation
Authors:
Xiaoqiong Xia,
Chaoyu Zhu,
Yuqi Shan,
Fan Zhong,
Lei Liu
Abstract:
Accurate and robust drug response prediction is of utmost importance in precision medicine. Although many models have been developed to utilize the representations of drugs and cancer cell lines for predicting cancer drug responses (CDR), their performances can be improved by addressing issues such as insufficient data modality, suboptimal fusion algorithms, and poor generalizability for novel drugs or cell lines. We introduce TransCDR, which uses transfer learning to learn drug representations and fuses multi-modality features of drugs and cell lines by a self-attention mechanism, to predict the IC50 values or sensitive states of drugs on cell lines. We are the first to systematically evaluate the generalization of the CDR prediction model to novel (i.e., never-before-seen) compound scaffolds and cell line clusters. TransCDR shows better generalizability than 8 state-of-the-art models. TransCDR outperforms its 5 variants that train drug encoders (i.e., RNN and AttentiveFP) from scratch under various scenarios. The most critical contributors among multiple drug notations and omics profiles are Extended Connectivity Fingerprint and genetic mutation. Additionally, the attention-based fusion module further enhances the predictive performance of TransCDR. TransCDR, trained on the GDSC dataset, demonstrates strong predictive performance on the external testing set CCLE. It is also utilized to predict missing CDRs on GDSC. Moreover, we investigate the biological mechanisms underlying drug response by classifying 7,675 patients from TCGA into drug-sensitive or drug-resistant groups, followed by a Gene Set Enrichment Analysis. TransCDR emerges as a potent tool with significant potential in drug response prediction. The source code and data can be accessed at https://github.com/XiaoqiongXia/TransCDR.
Submitted 17 November, 2023;
originally announced November 2023.
-
GeneSupport Maximum Gene-Support Tree Approach to Species Phylogeny Inference
Authors:
Yunfeng Shan,
Xiu-Qing Li
Abstract:
Summary: GeneSupport implements a genome-scale algorithm, the Maximum Gene-Support Tree, to estimate a species tree from gene trees based on multilocus sequences. It provides a new option for inferring a species tree from multiple genes. It is incorporated into the popular phylogenetic package PHYLIP with the same usage and user interface. It is compatible with phylogenetic methods such as maximum parsimony, maximum likelihood, Bayesian inference, and neighbour-joining, which are first used to reconstruct single-gene trees with a variety of phylogenetic inference programs.
Submitted 10 October, 2009;
originally announced October 2009.
-
Genome-scale approach proofs that the lungfish-coelacanth sister group is the closest living relative of tetrapods with Bayesian method under coalescence model
Authors:
Yunfeng Shan,
Robin Gras
Abstract:
This paper has been withdrawn by the author(s), due to a modification in Eqn. 1.
Submitted 22 October, 2009; v1 submitted 10 October, 2009;
originally announced October 2009.
-
The hypothesis that coelacanth is the closest living relative of tetrapods was rejected based on three genome-scale approaches
Authors:
Yunfeng Shan,
Xiu-Qing Li,
Robin Gras
Abstract:
Since the discovery of this living fossil in 1938, the coelacanth (Latimeria chalumnae) has generally been considered to be the closest living relative of the land vertebrates, and this is still the prevailing opinion in most general biology textbooks. However, the origin of tetrapods has been the subject of intense debate for decades. Three principal hypotheses (lungfish-tetrapod, coelacanth-tetrapod, or lungfish-coelacanth sister group) have been proposed. We used the maximum gene-support tree approach to analyze 43 nuclear genes encoding amino acid residues, and compared the results with those of the concatenation and majority-rule tree approaches. The results inferred with three common phylogenetic methods and three genome-scale approaches consistently rejected the hypothesis that the coelacanth is the closest living relative of tetrapods.
Submitted 10 October, 2009;
originally announced October 2009.
-
Genome-wide EST data mining approaches to resolving incongruence of molecular phylogenies
Authors:
Yunfeng Shan
Abstract:
Thirty-six single genes of six plants yielded 18 unique trees under maximum parsimony. Such incongruence is an important issue, and how to reconstruct the congruent tree is still one of the greatest challenges in molecular phylogenetics. To resolve this problem, a genome-wide EST data mining approach was systematically investigated by retrieving a large amount of EST data for 144 shared genes of six green plants from GenBank. The results show that the concatenated alignments approach overcame incongruence among single-gene phylogenies and successfully reconstructed the congruent tree of six species with 100% jackknife support across each branch when 144 genes were used. Jackknife support for correct branches increased linearly with the number of genes, but support for wrong branches also increased linearly. A minimum of 30 genes was required to infer the congruent tree. This approach may provide potential power in resolving conflicts among phylogenies.
Keywords: Genome-wide; Data mining; EST; Phylogeny; Congruent tree; Jackknife support; Plants.
Submitted 17 January, 2007; v1 submitted 3 September, 2006;
originally announced September 2006.
-
Maximum-frequency gene tree: a simplified genome-scale approach to overcoming incongruence in molecular phylogenies
Authors:
Yunfeng Shan,
Xiu-Qing Li
Abstract:
Genomes and genes diversify during evolution; however, it is unclear to what extent genes still retain the relationship among species. Model species for molecular phylogenetic studies include yeasts and viruses whose genomes have been sequenced, as well as plants for which fossil-supported true phylogenetic trees are available. In this study, we generated single gene trees of seven yeast species as well as single gene trees of nine baculovirus species using all the orthologous genes among the species compared. Homologous genes among seven known plants were used for validation of the finding. Four algorithms (maximum parsimony, minimum evolution, maximum likelihood, and neighbor-joining) were used. Trees were reconstructed before and after weighting the DNA and protein sequence lengths among genes. Rarely can a single gene generate the "true tree" under all four algorithms. However, the most frequent gene tree, termed "maximum gene-support tree" (MGS tree, or WMGS tree for the weighted one), in yeasts, baculoviruses, or plants was consistently found to be the "true tree" among the species. The results provide insights into the overall degree of divergence of orthologous genes of the genomes analyzed and suggest the following: 1) The true tree relationship among the species studied is still maintained by the largest group of orthologous genes; 2) There are usually more orthologous genes with higher similarities between genetically closer species than between genetically more distant ones; and 3) The maximum gene-support tree reflects the phylogenetic relationship among the species compared.
Keywords: genome, gene evolution, molecular phylogeny, true tree
Submitted 7 June, 2008; v1 submitted 2 September, 2006;
originally announced September 2006.
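The abstract above describes the maximum gene-support tree as the most frequent topology among the single-gene trees, with a weighted variant. The following is a minimal Python sketch of that counting step only, not the authors' implementation. It assumes each gene tree is already available as a rooted tree written as nested tuples of taxon names, and that the weighted variant simply sums a per-gene weight such as sequence length. The function names `canonical` and `max_gene_support_tree` are hypothetical.

```python
from collections import defaultdict


def canonical(tree):
    """Return an order-independent form of a rooted tree.

    A tree is either a taxon name (str) or a tuple of subtrees.
    Children are sorted so that ((A, B), C) and (C, (B, A)) compare equal.
    """
    if isinstance(tree, str):
        return tree
    # Sort by repr so that strings and tuples can be ordered together.
    return tuple(sorted((canonical(child) for child in tree), key=repr))


def max_gene_support_tree(gene_trees, weights=None):
    """Pick the topology supported by the most genes.

    With weights=None every gene counts once (unweighted case).
    Passing per-gene weights (assumed here to be sequence lengths)
    gives a weighted vote instead.
    """
    if weights is None:
        weights = [1.0] * len(gene_trees)
    support = defaultdict(float)
    for tree, weight in zip(gene_trees, weights):
        support[canonical(tree)] += weight
    best = max(support, key=support.get)
    return best, support[best]


# Three hypothetical gene trees over taxa A, B, C; two share a topology.
gene_trees = [(("A", "B"), "C"), ("C", ("B", "A")), (("A", "C"), "B")]
mgs_tree, mgs_support = max_gene_support_tree(gene_trees)
wmgs_tree, wmgs_support = max_gene_support_tree(gene_trees, weights=[1, 1, 5])
```

In this toy input the unweighted vote favours the (A, B) grouping, while a large weight on the third gene shifts the weighted vote to the (A, C) grouping. Real gene trees would usually be unrooted and read from Newick files, which needs a proper topology comparison that this sketch does not attempt.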