-
Knowledge-guided Contextual Gene Set Analysis Using Large Language Models
Authors:
Zhizheng Wang,
Chi-Ping Day,
Chih-Hsuan Wei,
Qiao Jin,
Robert Leaman,
Yifan Yang,
Shubo Tian,
Aodong Qiu,
Yin Fang,
Qingqing Zhu,
Xinghua Lu,
Zhiyong Lu
Abstract:
Gene set analysis (GSA) is a foundational approach for interpreting genomic data of diseases by linking genes to biological processes. However, conventional GSA methods overlook the clinical context of the analyses, often generating long lists of enriched pathways with redundant, nonspecific, or irrelevant results. Interpreting these requires extensive, ad-hoc manual effort, reducing both reliability and reproducibility. To address this limitation, we introduce cGSA, a novel AI-driven framework that enhances GSA by incorporating context-aware pathway prioritization. cGSA integrates gene cluster detection, enrichment analysis, and large language models to identify pathways that are not only statistically significant but also biologically meaningful. Benchmarking on 102 manually curated gene sets across 19 diseases and ten disease-related biological mechanisms shows that cGSA outperforms baseline methods by over 30%, with expert validation confirming its increased precision and interpretability. Two independent case studies in melanoma and breast cancer further demonstrate its potential to uncover context-specific insights and support targeted hypothesis generation.
Submitted 4 June, 2025;
originally announced June 2025.
-
7 Tesla multimodal MRI dataset of ex-vivo human brain
Authors:
Qinfeng Zhu,
Sihui Li,
Zuozhen Cao,
Yao Shen,
Haoan Xu,
Guojun Xu,
Haotian Li,
Keqing Zhu,
Zhiyong Zhao,
Jing Zhang,
Dan Wu
Abstract:
Ex-vivo MRI offers invaluable insights into the complexity of the human brain, enabling high-resolution anatomical delineation and integration with histopathology, and thus contributes to both basic and clinical studies on normal and pathological brains. However, ex-vivo MRI is challenging in sample preparation, acquisition, and data analysis, and existing ex-vivo MRI datasets are often limited to a single imaging modality and lack ethnic diversity. In our study, we aimed to address these limitations by constructing a comprehensive multimodal MRI database acquired from six ex-vivo Chinese human brains. This database includes structural MRI, high-angular resolution diffusion MRI, quantitative susceptibility mapping, and quantitative T1 and T2 maps, which enable multifaceted depiction of brain microstructure and connectivity. Furthermore, we generated population-averaged multimodal templates and segmentation labels to facilitate analysis of ex-vivo brain MRI. This public database offers a collection of high-resolution and multi-parametric ex-vivo human brain MRI and fills the gap left by the lack of Asian brain samples in existing databases.
Submitted 6 December, 2024;
originally announced December 2024.
-
SUICA: Learning Super-high Dimensional Sparse Implicit Neural Representations for Spatial Transcriptomics
Authors:
Qingtian Zhu,
Yumin Zheng,
Yuling Sang,
Yifan Zhan,
Ziyan Zhu,
Jun Ding,
Yinqiang Zheng
Abstract:
Spatial Transcriptomics (ST) is a method that captures gene expression profiles aligned with spatial coordinates. The discrete spatial distribution and the super-high dimensional sequencing results make ST data challenging to model effectively. In this paper, we model ST in a continuous and compact manner with the proposed tool, SUICA, empowered by the great approximation capability of Implicit Neural Representations (INRs), which can enhance both the spatial density and the gene expression. Concretely, within the proposed SUICA, we incorporate a graph-augmented Autoencoder to effectively model the context information of the unstructured spots and provide informative, structure-aware embeddings for spatial mapping. We also tackle the extremely skewed distribution in a regression-by-classification fashion and enforce classification-based loss functions for the optimization of SUICA. In extensive experiments on a wide range of common ST platforms under varying degradations, SUICA outperforms both conventional INR variants and SOTA methods regarding numerical fidelity, statistical correlation, and bio-conservation. The prediction by SUICA also showcases amplified gene signatures that enrich the bio-conservation of the raw data and benefit subsequent analysis. The code is available at https://github.com/Szym29/SUICA.
Submitted 7 May, 2025; v1 submitted 2 December, 2024;
originally announced December 2024.
-
Theta and/or alpha? Neural oscillational substrates for dynamic inter-brain synchrony during mother-child cooperation
Authors:
Jiayang Xu,
Yamin Li,
Ruxin Su,
Saishuang Wu,
Chengcheng Wu,
Haiwa Wang,
Qi Zhu,
Yue Fang,
Fan Jiang,
Shanbao Tong,
Yunting Zhang,
Xiaoli Guo
Abstract:
Mother-child interaction is a highly dynamic process neurally characterized by inter-brain synchrony (IBS) at θ and/or α rhythms. However, their establishment, dynamic changes, and roles in mother-child interactions remain unknown. Through dynamic analysis of dual-EEG from 40 mother-child dyads during turn-taking cooperation, we uncover that θ-IBS and α-IBS alternated with interactive behaviors, with EEG frequency-shift as a prerequisite for IBS transitions. When mothers attempt to track their children's attention and/or predict their intentions, they will adjust their EEG frequencies to align with their children's θ oscillations, leading to a higher occurrence of the θ-IBS state. Conversely, the α-IBS state, accompanied by the EEG frequency-shift to the α range, is more prominent during mother-led interactions. Further exploratory analysis reveals greater presence and stability of the θ-IBS state during cooperative than non-cooperative conditions, particularly in dyads with stronger emotional attachments and more frequent interactions in their daily lives. Our findings shed light on the neural oscillational substrates underlying the IBS dynamics during mother-child interactions.
Submitted 30 October, 2024; v1 submitted 17 October, 2024;
originally announced October 2024.
-
End-to-End Reaction Field Energy Modeling via Deep Learning based Voxel-to-voxel Transform
Authors:
Yongxian Wu,
Qiang Zhu,
Ray Luo
Abstract:
In computational biochemistry and biophysics, understanding the role of electrostatic interactions is crucial for elucidating the structure, dynamics, and function of biomolecules. The Poisson-Boltzmann (PB) equation is a foundational tool for modeling these interactions by describing the electrostatic potential in and around charged molecules. However, solving the PB equation presents significant computational challenges due to the complexity of biomolecular surfaces and the need to account for mobile ions. While traditional numerical methods for solving the PB equation are accurate, they are computationally expensive and scale poorly with increasing system size. To address these challenges, we introduce PBNeF, a novel machine learning approach inspired by recent advancements in neural network-based partial differential equation solvers. Our method formulates the input and boundary electrostatic conditions of the PB equation into a learnable voxel representation, enabling the use of a neural field transformer to predict the PB solution and, subsequently, the reaction field potential energy. Extensive experiments demonstrate that PBNeF achieves over a 100-fold speedup compared to traditional PB solvers, while maintaining accuracy comparable to the Generalized Born (GB) model.
Submitted 4 October, 2024;
originally announced October 2024.
-
Neural Dynamics of Delayed Feedback in Robot Teleoperation: Insights from fNIRS Analysis
Authors:
Tianyu Zhou,
Yang Ye,
Qi Zhu,
William Vann,
Jing Du
Abstract:
As robot teleoperation increasingly becomes integral in executing tasks in distant, hazardous, or inaccessible environments, the challenge of operational delays remains a significant obstacle. These delays are inherent in signal transmission and processing and can adversely affect the operator's performance, particularly in tasks requiring precision and timeliness. While current research has made strides in mitigating these delays through advanced control strategies and training methods, a crucial gap persists in understanding the neurofunctional impacts of these delays and the efficacy of countermeasures from a cognitive perspective. Our study narrows this gap by leveraging functional Near-Infrared Spectroscopy (fNIRS) to examine the neurofunctional implications of simulated haptic feedback on cognitive activity and motor coordination under delayed conditions. In a human-subject experiment (N=41), we manipulated sensory feedback to observe its influence on the responses of various brain regions of interest (ROIs) during teleoperation tasks. The fNIRS data provided a detailed assessment of cerebral activity, particularly in ROIs implicated in time perception and the execution of precise movements. Our results reveal that certain conditions, which provided immediate simulated haptic feedback, significantly optimized neural functions related to time perception and motor coordination, and improved motor performance. These findings provide empirical evidence about the neurofunctional basis of the enhanced motor performance with simulated synthetic force feedback in the presence of teleoperation delays.
Submitted 14 November, 2023;
originally announced November 2023.
-
May the Force be with You: Unified Force-Centric Pre-Training for 3D Molecular Conformations
Authors:
Rui Feng,
Qi Zhu,
Huan Tran,
Binghong Chen,
Aubrey Toland,
Rampi Ramprasad,
Chao Zhang
Abstract:
Recent works have shown the promise of learning pre-trained models for 3D molecular representation. However, existing pre-training models focus predominantly on equilibrium data and largely overlook off-equilibrium conformations. It is challenging to extend these methods to off-equilibrium data because their training objective relies on the assumption that conformations are local energy minima. We address this gap by proposing a force-centric pre-training model for 3D molecular conformations covering both equilibrium and off-equilibrium data. For off-equilibrium data, our model learns directly from their atomic forces. For equilibrium data, we introduce zero-force regularization and force-based denoising techniques to approximate near-equilibrium forces. We obtain a unified pre-trained model for 3D molecular representation with over 15 million diverse conformations. Experiments show that, with our pre-training objective, we increase force accuracy by around 3 times compared to the un-pre-trained Equivariant Transformer model. By incorporating regularizations on equilibrium data, we solve the problem of unstable MD simulations in vanilla Equivariant Transformers, achieving state-of-the-art simulation performance with 2.45 times faster inference than NequIP. As a powerful molecular encoder, our pre-trained model achieves performance on par with the state of the art on property prediction tasks.
Submitted 23 August, 2023;
originally announced August 2023.
-
Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health
Authors:
Shubo Tian,
Qiao Jin,
Lana Yeganova,
Po-Ting Lai,
Qingqing Zhu,
Xiuying Chen,
Yifan Yang,
Qingyu Chen,
Won Kim,
Donald C. Comeau,
Rezarta Islamaj,
Aadit Kapoor,
Xin Gao,
Zhiyong Lu
Abstract:
ChatGPT has drawn considerable attention from both the general public and domain experts with its remarkable text generation capabilities. This has subsequently led to the emergence of diverse applications in the field of biomedicine and health. In this work, we examine the diverse applications of large language models (LLMs), such as ChatGPT, in biomedicine and health. Specifically, we explore the areas of biomedical information retrieval, question answering, medical text summarization, information extraction, and medical education, and investigate whether LLMs possess the transformative power to revolutionize these tasks or whether the distinct complexities of the biomedical domain present unique challenges. Following an extensive literature survey, we find that significant advances have been made in the field of text generation tasks, surpassing the previous state-of-the-art methods. For other applications, the advances have been modest. Overall, LLMs have not yet revolutionized biomedicine, but recent rapid progress indicates that such methods hold great potential to provide valuable means for accelerating discovery and improving health. We also find that the use of LLMs, like ChatGPT, in the fields of biomedicine and health entails various risks and challenges, including fabricated information in their generated responses, as well as legal and privacy concerns associated with sensitive patient data. We believe this survey can provide a comprehensive and timely overview to biomedical researchers and healthcare practitioners on the opportunities and challenges associated with using ChatGPT and other LLMs for transforming biomedicine and health.
Submitted 16 October, 2023; v1 submitted 15 June, 2023;
originally announced June 2023.
-
Inter-brain substrates of role switching during mother-child interaction
Authors:
Yamin Li,
Saishuang Wu,
Jiayang Xu,
Haiwa Wang,
Qi Zhu,
Wen Shi,
Yue Fang,
Fan Jiang,
Shanbao Tong,
Yunting Zhang,
Xiaoli Guo
Abstract:
Mother-child interaction is highly dynamic and reciprocal. Switching roles in these back-and-forth interactions serves as a crucial feature of reciprocal behaviors while the underlying neural entrainment is still not well-studied. Here, we designed a role-controlled cooperative task with dual EEG recording to study how differently two brains interact when mothers and children hold different roles. When children were actors and mothers were observers, mother-child inter-brain synchrony emerged within the theta oscillations and the frontal lobe, which highly correlated with children's attachment to their mothers. When their roles were reversed, this synchrony was shifted to the alpha oscillations and the central area and associated with mothers' perception of their relationship with their children. The results suggested an observer-actor neural alignment within the actor's oscillations, which was modulated by the actor-toward-observer emotional bonding. Our findings contribute to the understanding of how inter-brain synchrony is established and dynamically changed during mother-child reciprocal interaction.
Submitted 8 March, 2023;
originally announced March 2023.
-
Therapeutic algebra of immunomodulatory drug responses at single-cell resolution
Authors:
Jialong Jiang,
Sisi Chen,
Tiffany Tsou,
Christopher S. McGinnis,
Tahmineh Khazaei,
Qin Zhu,
Jong H. Park,
Paul Rivaud,
Inna-Marie Strazhnik,
Eric D. Chow,
David A. Sivak,
Zev J. Gartner,
Matt Thomson
Abstract:
Therapeutic modulation of immune states is central to the treatment of human disease. However, how drugs and drug combinations impact the diverse cell types in the human immune system remains poorly understood at the transcriptome scale. Here, we apply single-cell mRNA-seq to profile the response of human immune cells to 502 immunomodulatory drugs alone and in combination. We develop a unified mathematical model that quantitatively describes the transcriptome-scale response of myeloid and lymphoid cell types to individual drugs and drug combinations through a single inferred regulatory network. The mathematical model reveals how drug combinations generate novel macrophage and T-cell states by recruiting combinations of gene expression programs through both additive and non-additive drug interactions. A simplified drug response algebra allows us to predict the continuous modulation of immune cell populations between activated, resting, and hyper-inhibited states through combinatorial drug dose titrations. Our results suggest that transcriptome-scale mathematical models could enable the design of therapeutic strategies for programming the human immune system using combinations of therapeutics.
Submitted 22 August, 2022;
originally announced August 2022.
-
Minimal Specifications for Non-Human Primate MRI: Challenges in Standardizing and Harmonizing Data Collection
Authors:
Joonas A. Autio,
Qi Zhu,
Xiaolian Li,
Matthew F. Glasser,
Caspar M. Schwiedrzik,
Damien A. Fair,
Jan Zimmermann,
Essa Yacoub,
Ravi S. Menon,
David C. Van Essen,
Takuya Hayashi,
Brian Russ,
Wim Vanduffel
Abstract:
Recent methodological advances in MRI have enabled substantial growth in neuroimaging studies of non-human primates (NHPs), while open data-sharing through the PRIME-DE initiative has increased the availability of NHP MRI data and the need for robust multi-subject multi-center analyses. Streamlined acquisition and analysis protocols would accelerate and improve these efforts. However, consensus on minimal standards for data acquisition protocols and analysis pipelines for NHP imaging remains to be established, particularly for multi-center studies. Here, we draw parallels between NHP and human neuroimaging and provide minimal guidelines for harmonizing and standardizing data acquisition. We advocate robust translation of widely used open-access toolkits that are well established for analyzing human data. We also encourage the use of validated, automated pre-processing tools for analyzing NHP data sets. These guidelines aim to refine methodological and analytical strategies for small and large-scale NHP neuroimaging data. This will improve reproducibility of results, and accelerate the convergence between NHP and human neuroimaging strategies which will ultimately benefit fundamental and translational brain science.
Submitted 8 October, 2020;
originally announced October 2020.
-
Hybrid Mortality Prediction using Multiple Source Systems
Authors:
Isaac Mativo,
Yelena Yesha,
Michael Grasso,
Tim Oates,
Qian Zhu
Abstract:
The use of artificial intelligence in clinical care to improve decision support systems is increasing. This is not surprising since, by its very nature, the practice of medicine consists of making decisions based on observations from different systems both inside and outside the human body. In this paper, we combine three general systems (ICU, diabetes, and comorbidities) and use them to make patient clinical predictions. We use an artificial intelligence approach to show that we can improve mortality prediction of hospitalized diabetic patients. We do this by utilizing a machine learning approach to select clinical input features that are more likely to predict mortality. We then use these features to create a hybrid mortality prediction model and compare our results to non-artificial intelligence models. For simplicity, we limit our input features to patient comorbidities and features derived from a well-known mortality measure, the Sequential Organ Failure Assessment (SOFA).
Submitted 17 April, 2019;
originally announced May 2019.
-
Flexible Metal Oxide/Graphene Oxide Hybrid Neuromorphic Devices on Flexible Conducting Graphene Substrates
Authors:
Chang Jin Wan,
Wei Wang,
Li Qiang Zhu,
Yang Hui Liu,
Ping Feng,
Zhao Ping Liu,
Yi Shi,
Qing Wan
Abstract:
Flexible metal oxide/graphene oxide hybrid multi-gate neuron transistors were fabricated on flexible graphene substrates. Dendritic integrations in both spatial and temporal modes were successfully emulated, and spatiotemporal correlated logics were obtained. A proof-of-principle visual system model for emulating the lobula giant motion detector neuron was investigated. Our results are of great interest for flexible neuromorphic cognitive systems.
Submitted 7 March, 2016;
originally announced September 2016.
-
LAGE: A Java Framework to reconstruct Gene Regulatory Networks from Large-Scale Continues Expression Data
Authors:
Yang Lu,
Mengying Wang,
Kenny Q. Zhu,
Bo Yuan
Abstract:
LAGE is a systematic framework developed in Java. The motivation of LAGE is to provide a scalable and parallel solution for reconstructing Gene Regulatory Networks (GRNs) from continuous gene expression data for a very large number of genes. The basic idea of our framework follows the philosophy of divide-and-conquer. Specifically, LAGE recursively partitions genes into multiple overlapping communities of much smaller size, learns the intra-community GRNs separately, and then merges them. In addition, the complete information on the overlapping communities is produced as a byproduct, which could be used to mine meaningful functional modules in biological networks.
Submitted 9 November, 2012;
originally announced November 2012.
-
Semantic Inference using Chemogenomics Data for Drug Discovery
Authors:
Qian Zhu,
Yuyin Sun,
Sashikiran Challa,
Ying Ding,
Michael S. Lajiness,
David J. Wild
Abstract:
Background: Semantic Web Technology (SWT) makes it possible to integrate and search the large volume of life science datasets in the public domain, as demonstrated by well-known linked data projects such as LODD, Bio2RDF, and Chem2Bio2RDF. Integration of these sets creates large networks of information. We have previously described a tool called WENDI for aggregating information pertaining to new chemical compounds, effectively creating evidence paths relating the compounds to genes, diseases and so on. In this paper we examine the utility of automatically inferring new compound-disease associations (and thus new links in the network) based on semantically marked-up versions of these evidence paths, rule-sets and inference engines.
Results: Through the implementation of a semantic inference algorithm, rule set, Semantic Web methods (RDF, OWL and SPARQL) and new interfaces, we have created a new tool called Chemogenomic Explorer that uses networks of ontologically annotated RDF statements along with deductive reasoning tools to infer new associations between the query structure and genes and diseases from WENDI results. The tool then permits interactive clustering and filtering of these evidence paths.
Conclusions: We present a new aggregate approach to inferring links between chemical compounds and diseases using semantic inference. This approach allows multiple evidence paths between compounds and diseases to be identified using a rule-set and semantically annotated data, and for these evidence paths to be clustered to show overall evidence linking the compound to a disease. We believe this is a powerful approach, because it allows compound-disease relationships to be ranked by the amount of evidence supporting them.
Submitted 23 June, 2011;
originally announced June 2011.
-
Chem2Bio2RDF: A Linked Open Data Portal for Chemical Biology
Authors:
Bin Chen,
David J Wild,
Qian Zhu,
Ying Ding,
Xiao Dong,
Madhuvanthi Sankaranarayanan,
Huijun Wang,
Yuyin Sun
Abstract:
The Chem2Bio2RDF portal is a Linked Open Data (LOD) portal for systems chemical biology that aims to facilitate drug discovery. It converts around 25 different datasets on genes, compounds, drugs, pathways, side effects, diseases, and MEDLINE/PubMed documents into RDF triples and links them to other LOD bubbles, such as Bio2RDF, LODD and DBPedia. The portal is based on the D2R server and provides a SPARQL endpoint, but adds a few unique features, such as an RDF faceted browser, a user-friendly SPARQL query generator, a MEDLINE/PubMed cross-validation service, and a Cytoscape visualization plugin. Three use cases demonstrate the functionality and usability of this portal.
Submitted 21 December, 2010;
originally announced December 2010.