-
Machine Learning for Scent: Learning Generalizable Perceptual Representations of Small Molecules
Authors:
Benjamin Sanchez-Lengeling,
Jennifer N. Wei,
Brian K. Lee,
Richard C. Gerkin,
Alán Aspuru-Guzik,
Alexander B. Wiltschko
Abstract:
Predicting the relationship between a molecule's structure and its odor remains a difficult, decades-old task. This problem, termed quantitative structure-odor relationship (QSOR) modeling, is an important challenge in chemistry, impacting human nutrition, manufacture of synthetic fragrance, the environment, and sensory neuroscience. We propose the use of graph neural networks for QSOR, and show they significantly out-perform prior methods on a novel data set labeled by olfactory experts. Additional analysis shows that the learned embeddings from graph neural networks capture a meaningful odor space representation of the underlying relationship between structure and odor, as demonstrated by strong performance on two challenging transfer learning tasks. Machine learning has already had a large impact on the senses of sight and sound. Based on these early results with graph neural networks for molecular properties, we hope machine learning can eventually do for olfaction what it has already done for vision and hearing.
Submitted 25 October, 2019; v1 submitted 23 October, 2019;
originally announced October 2019.
-
Requirements for storing electrophysiology data
Authors:
Jeff Teeters,
Jan Benda,
Andrew Davison,
Stephen Eglen,
Richard C. Gerkin,
Jeffrey Grethe,
Jan Grewe,
Kenneth Harris,
Christian Kellner,
Yann Le Franc,
Roman Mouček,
Dimiter Prodanov,
Robert Pröpper,
Hyrum L. Sessions,
Leslie Smith,
Andrey Sobolev,
Friedrich Sommer,
Adrian Stoewer,
Thomas Wachtler,
Barry Wark
Abstract:
The purpose of this document is to specify the basic data types required for storing electrophysiology and optical imaging data to facilitate computer-based neuroscience studies and data sharing. These requirements are being developed within a working group of the Electrophysiology Task Force in the International Neuroinformatics Coordinating Facility (INCF) Program on Standards for Data Sharing. While this document describes the requirements of the standard independent of the actual storage technology, the Task Force has recommended basing a standard on HDF5. This is in line with a number of groups who are already using HDF5 to store electrophysiology data, although currently without being based on a standard.
Submitted 3 June, 2016; v1 submitted 24 May, 2016;
originally announced May 2016.
-
Unit Testing, Model Validation, and Biological Simulation
Authors:
Gopal P. Sarma,
Travis W. Jacobs,
Mark D. Watts,
Vahid Ghayoomi,
Richard C. Gerkin,
Stephen D. Larson
Abstract:
The growth of the software industry has gone hand in hand with the development of tools and cultural practices for ensuring the reliability of complex pieces of software. These tools and practices are now acknowledged to be essential to the management of modern software. As computational models and methods have become increasingly common in the biological sciences, it is important to examine how these practices can accelerate biological software development and improve research quality. In this article, we give a focused case study of our experience with the practices of unit testing and test-driven development in OpenWorm, an open-science project aimed at modeling Caenorhabditis elegans. We identify and discuss the challenges of incorporating test-driven development into a heterogeneous, data-driven project, as well as the role of model validation tests, a category of tests unique to software which expresses scientific models.
Submitted 5 March, 2017; v1 submitted 19 August, 2015;
originally announced August 2015.