-
Fluctuations in DNA Packing Density Drive the Spatial Segregation between Euchromatin and Heterochromatin
Authors:
Luming Meng,
Boping Liu,
Qiong Luo
Abstract:
In the crowded eukaryotic nucleus, euchromatin and heterochromatin segregate into distinct compartments, a phenomenon often attributed to homotypic interactions mediated by liquid-liquid phase separation of chromatin-associated proteins. Here, we revisit genome compartmentalization by examining the role of in vivo DNA packing density fluctuations driven by ATP-dependent chromatin remodelers. Leveraging DNA accessibility data, we develop a polymer-based model that captures these fluctuations and successfully reproduces genome-wide compartment patterns observed in Hi-C data, without invoking homotypic interactions. Further analysis reveals that density fluctuations in a crowded nuclear environment elevate the system energy, while euchromatin-heterochromatin segregation facilitates energy dissipation, offering a thermodynamic advantage for spontaneous compartment formation. These findings suggest that euchromatin-heterochromatin segregation may arise through a non-equilibrium, self-organizing process, providing new insights into genome organization.
Submitted 25 May, 2025;
originally announced May 2025.
-
On the Within-class Variation Issue in Alzheimer's Disease Detection
Authors:
Jiawen Kang,
Dongrui Han,
Lingwei Meng,
Jingyan Zhou,
Jinchao Li,
Xixin Wu,
Helen Meng
Abstract:
Alzheimer's Disease (AD) detection employs machine learning classification models to distinguish between individuals with AD and those without. Unlike conventional classification tasks, we identify within-class variation as a critical challenge in AD detection: individuals with AD exhibit a spectrum of cognitive impairments. Therefore, simplistic binary AD classification may overlook two crucial aspects: within-class heterogeneity and instance-level imbalance. In this work, we found that a sample score estimator can generate sample-specific soft scores that align with cognitive scores. We subsequently propose two simple yet effective methods, Soft Target Distillation (SoTD) and Instance-level Re-balancing (InRe), targeting the two problems respectively. Using the ADReSS and CU-MARVEL corpora, we demonstrated and analyzed the advantages of the proposed approaches in detection performance. These findings provide insights for developing robust and reliable AD detection models.
Submitted 28 May, 2025; v1 submitted 21 September, 2024;
originally announced September 2024.
-
Large Language Model-based FMRI Encoding of Language Functions for Subjects with Neurocognitive Disorder
Authors:
Yuejiao Wang,
Xianmin Gong,
Lingwei Meng,
Xixin Wu,
Helen Meng
Abstract:
Functional magnetic resonance imaging (fMRI) is essential for developing encoding models that identify functional changes in language-related brain areas of individuals with Neurocognitive Disorders (NCD). While large language model (LLM)-based fMRI encoding has shown promise, existing studies predominantly focus on healthy, young adults, overlooking older NCD populations and cognitive level correlations. This paper explores language-related functional changes in older NCD adults using LLM-based fMRI encoding and brain scores, addressing current limitations. We analyze the correlation between brain scores and cognitive scores at both whole-brain and language-related ROI levels. Our findings reveal that higher cognitive abilities correspond to better brain scores, with correlations peaking in the middle temporal gyrus. This study highlights the potential of fMRI encoding models and brain scores for detecting early functional changes in NCD patients.
Submitted 14 July, 2024;
originally announced July 2024.
-
2D and 3D CT Radiomic Features Performance Comparison in Characterization of Gastric Cancer: A Multi-center Study
Authors:
Lingwei Meng,
Di Dong,
Xin Chen,
Mengjie Fang,
Rongpin Wang,
Jing Li,
Zaiyi Liu,
Jie Tian
Abstract:
Objective: Radiomics, an emerging tool for medical image analysis, has potential for precisely characterizing gastric cancer (GC). Whether to use one-slice 2D annotation or whole-volume 3D annotation remains a long-standing debate, especially for heterogeneous GC. We comprehensively compared the representation and discrimination capacity of 2D and 3D radiomic features regarding GC, via three tasks.
Methods: A total of 539 GC patients from four centers were retrospectively enrolled and divided into training and validation cohorts. Radiomic features were extracted separately from the 2D and 3D regions of interest (ROIs) annotated by radiologists. Feature selection and model construction procedures were customized for each combination of the two modalities (2D or 3D) and the three tasks. Subsequently, six machine learning models (Model_2D^LNM, Model_3D^LNM; Model_2D^LVI, Model_3D^LVI; Model_2D^pT, Model_3D^pT) were derived and evaluated to reflect the modalities' performance in characterizing GC. Furthermore, we performed an auxiliary experiment to assess the modalities' performance under different resampling spacings.
Results: For the three tasks, the resulting areas under the curve (AUCs) were: Model_2D^LNM's 0.712 (95% confidence interval, 0.613-0.811), Model_3D^LNM's 0.680 (0.584-0.775); Model_2D^LVI's 0.677 (0.595-0.761), Model_3D^LVI's 0.615 (0.528-0.703); Model_2D^pT's 0.840 (0.779-0.901), Model_3D^pT's 0.813 (0.747-0.879). Moreover, the auxiliary experiment indicated that Models_2D are statistically more advantageous than Models_3D with different resampling spacings.
Conclusion: Models constructed with 2D radiomic features showed performance comparable to that of models constructed with 3D features in characterizing GC.
Significance: Our work indicated that time-saving 2D annotation would be the better choice in GC, and provided a related reference for further radiomics-based research.
Submitted 29 October, 2022;
originally announced October 2022.
-
Outcome-guided Sparse K-means for Disease Subtype Discovery via Integrating Phenotypic Data with High-dimensional Transcriptomic Data
Authors:
Lingsong Meng,
Dorina Avram,
George Tseng,
Zhiguang Huo
Abstract:
The discovery of disease subtypes is an essential step for developing precision medicine, and disease subtyping via omics data has become a popular approach. While promising, subtypes obtained from existing approaches are not necessarily associated with clinical outcomes. With rich clinical data available alongside omics data in modern epidemiology cohorts, it is urgent to develop an outcome-guided clustering algorithm that fully integrates the phenotypic data with the high-dimensional omics data. Hence, we extended a sparse K-means method to an outcome-guided sparse K-means (GuidedSparseKmeans) method. A unified objective function was proposed, comprising (i) weighted K-means to perform sample clustering; (ii) lasso regularization to perform gene selection from the high-dimensional omics data; and (iii) incorporation of a phenotypic variable from the clinical dataset to facilitate biologically meaningful clustering results. By iteratively optimizing the objective function, we simultaneously obtain phenotype-related sample clustering results and gene selection results. We demonstrated the superior performance of GuidedSparseKmeans by comparing it with existing clustering methods in simulations and in applications to high-dimensional transcriptomic data of breast cancer and Alzheimer's disease. Our algorithm has been implemented in an R package, which is publicly available on GitHub (https://github.com/LingsongMeng/GuidedSparseKmeans).
Submitted 27 February, 2022; v1 submitted 17 March, 2021;
originally announced March 2021.
-
Pairwise versus multiple network alignment
Authors:
Vipin Vijayan,
Shawn Gu,
Eric Krebs,
Lei Meng,
Tijana Milenkovic
Abstract:
Biological network alignment (NA) aims to identify similar regions between molecular networks of different species. NA can be local or global. Following the recent trend in the NA field, we focus on global NA, which can be pairwise (PNA) or multiple (MNA). PNA produces aligned node pairs between two networks. MNA produces aligned node clusters between more than two networks. Recently, the focus has shifted from PNA to MNA, because MNA captures conserved regions between more networks than PNA (and MNA is thus considered to be more insightful), though at higher computational complexity. The issue is that, due to the different outputs of PNA and MNA, a PNA method is only compared to other PNA methods, and an MNA method is only compared to other MNA methods. PNA must be compared against MNA to evaluate whether MNA's higher complexity is justified by its higher accuracy. We introduce a framework that allows for this. We evaluate eight prominent PNA and MNA methods, on synthetic and real-world biological networks, using topological and functional alignment quality measures. We compare PNA against MNA in both a pairwise (native to PNA) and multiple (native to MNA) manner. PNA is expected to perform better under the pairwise evaluation framework. Indeed, this is what we find. MNA is expected to perform better under the multiple evaluation framework. Shockingly, we find that this does not always hold; PNA is often better than MNA in this framework, depending on the choice of evaluation test.
Submitted 25 April, 2020; v1 submitted 13 September, 2017;
originally announced September 2017.
-
IGLOO: Integrating global and local biological network alignment
Authors:
Lei Meng,
Joseph Crawford,
Aaron Striegel,
Tijana Milenkovic
Abstract:
Analogous to genomic sequence alignment, biological network alignment (NA) aims to find regions of similarity between molecular networks (rather than sequences) of different species. NA can be either local (LNA) or global (GNA). LNA aims to identify highly conserved common subnetworks, which are typically small, while GNA aims to identify large common subnetworks, which are typically suboptimally conserved. We recently showed that LNA and GNA yield complementary results: LNA has high functional but low topological alignment quality, while GNA has high topological but low functional alignment quality. Thus, we propose IGLOO, a new approach that integrates GNA and LNA in the hope of reconciling the two. We evaluate IGLOO against state-of-the-art LNA (NetworkBLAST, NetAligner, AlignNemo, and AlignMCL) and GNA (GHOST, NETAL, GEDEVO, MAGNA++, WAVE, and L-GRAAL) methods. We show that IGLOO allows for a better trade-off between topological and functional alignment quality than the existing LNA and GNA methods considered in our study.
Submitted 5 June, 2016; v1 submitted 20 April, 2016;
originally announced April 2016.
-
Local versus Global Biological Network Alignment
Authors:
Lei Meng,
Aaron Striegel,
Tijana Milenkovic
Abstract:
Network alignment (NA) aims to find regions of similarity between molecular networks of different species. There exist two NA categories: local (LNA) and global (GNA). LNA finds small highly conserved network regions and produces a many-to-many node mapping. GNA finds large conserved regions and produces a one-to-one node mapping. Given the different outputs of LNA and GNA, when a new NA method is proposed, it is compared against existing methods from the same category. However, both NA categories have the same goal: to allow for transferring functional knowledge from well- to poorly-studied species between conserved network regions. So, which one to choose, LNA or GNA? To answer this, we introduce the first systematic evaluation of the two NA categories.
We introduce new measures of alignment quality that allow for fair comparison of the different LNA and GNA outputs, as such measures do not exist. We provide user-friendly software for efficient alignment evaluation that implements the new and existing measures. We evaluate prominent LNA and GNA methods on synthetic and real-world biological networks. We study the effect on alignment quality of using different interaction types and confidence levels. We find that the superiority of one NA category over the other is context-dependent. Further, when we contrast LNA and GNA in the application of learning novel protein functional knowledge, the two produce very different predictions, indicating their complementarity. Our results and software provide guidelines for future NA method development and evaluation.
Submitted 28 September, 2015;
originally announced September 2015.