Skip to main content
Cornell University
We gratefully acknowledge support from the Simons Foundation, member institutions, and all contributors. Donate
arxiv logo > cs > arXiv:1511.00736

Help | Advanced Search

arXiv logo
Cornell University Logo

quick links

  • Login
  • Help Pages
  • About

Computer Science > Machine Learning

arXiv:1511.00736 (cs)
[Submitted on 2 Nov 2015 (v1), last revised 25 Jan 2016 (this version, v2)]

Title:ProtNN: Fast and Accurate Nearest Neighbor Protein Function Prediction based on Graph Embedding in Structural and Topological Space

Authors:Wajdi Dhifli, Abdoulaye Baniré Diallo
View a PDF of the paper titled ProtNN: Fast and Accurate Nearest Neighbor Protein Function Prediction based on Graph Embedding in Structural and Topological Space, by Wajdi Dhifli and 1 other authors
View PDF
Abstract:Studying the function of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has increasingly become extremely large. Still, the determination of the function of a protein structure remains a difficult, costly, and time consuming task. The difficulties are often due to the essential role of spatial and topological structures in the determination of protein functions in living cells. In this paper, we propose ProtNN, a novel approach for protein function prediction. Given an unannotated protein structure and a set of annotated proteins, ProtNN finds the nearest neighbor annotated structures based on protein-graph pairwise similarities. Given a query protein, ProtNN finds the nearest neighbor reference proteins based on a graph representation model and a pairwise similarity between vector embedding of both query and reference protein-graphs in structural and topological spaces. ProtNN assigns to the query protein the function with the highest number of votes across the set of k nearest neighbor reference proteins, where k is a user-defined parameter. Experimental evaluation demonstrates that ProtNN is able to accurately classify several datasets in an extremely fast runtime compared to state-of-the-art approaches. We further show that ProtNN is able to scale up to a whole PDB dataset in a single-process mode with no parallelization, with a gain of thousands order of magnitude of runtime compared to state-of-the-art approaches.
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI)
Cite as: arXiv:1511.00736 [cs.LG]
  (or arXiv:1511.00736v2 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.1511.00736
arXiv-issued DOI via DataCite
Journal reference: BMC BioData Mining, 9:30, 2016
Related DOI: https://doi.org/10.1186/s13040-016-0108-2
DOI(s) linking to related resources

Submission history

From: Wajdi Dhifli [view email]
[v1] Mon, 2 Nov 2015 23:02:48 UTC (971 KB)
[v2] Mon, 25 Jan 2016 01:55:45 UTC (1,369 KB)
Full-text links:

Access Paper:

    View a PDF of the paper titled ProtNN: Fast and Accurate Nearest Neighbor Protein Function Prediction based on Graph Embedding in Structural and Topological Space, by Wajdi Dhifli and 1 other authors
  • View PDF
  • TeX Source
  • Other Formats
view license
Current browse context:
cs.LG
< prev   |   next >
new | recent | 2015-11
Change to browse by:
cs
cs.SI

References & Citations

  • NASA ADS
  • Google Scholar
  • Semantic Scholar

DBLP - CS Bibliography

listing | bibtex
Wajdi Dhifli
Abdoulaye Baniré Diallo
export BibTeX citation Loading...

BibTeX formatted citation

×
Data provided by:

Bookmark

BibSonomy logo Reddit logo

Bibliographic and Citation Tools

Bibliographic Explorer (What is the Explorer?)
Connected Papers (What is Connected Papers?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)

Code, Data and Media Associated with this Article

alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)

Demos

Replicate (What is Replicate?)
Hugging Face Spaces (What is Spaces?)
TXYZ.AI (What is TXYZ.AI?)

Recommenders and Search Tools

Influence Flower (What are Influence Flowers?)
CORE Recommender (What is CORE?)
IArxiv Recommender (What is IArxiv?)
  • Author
  • Venue
  • Institution
  • Topic

arXivLabs: experimental projects with community collaborators

arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.

Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.

Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.

Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)
  • About
  • Help
  • contact arXivClick here to contact arXiv Contact
  • subscribe to arXiv mailingsClick here to subscribe Subscribe
  • Copyright
  • Privacy Policy
  • Web Accessibility Assistance
  • arXiv Operational Status
    Get status notifications via email or slack