-
Quantum critical electro-optic and piezo-electric nonlinearities
Authors:
Christopher P. Anderson,
Giovanni Scuri,
Aaron Chan,
Sungjun Eun,
Alexander D. White,
Geun Ho Ahn,
Christine Jilly,
Amir Safavi-Naeini,
Kasper Van Gasse,
Lu Li,
Jelena Vučković
Abstract:
Electro-optics, the tuning of optical properties of materials with electric fields, is key to a multitude of quantum and classical photonics applications. However, a major obstacle preventing many emerging use cases is inefficient modulation in cryogenic environments, as traditional tuning mechanisms degrade at low temperatures. Guided by the connection between phase transitions and nonlinearity, we identify the quantum paraelectric perovskite SrTiO$_3$ (STO) as the strongest cryogenic electro-optic photonic material. As a result of the unique quantum paraelectric phase of STO, we demonstrate a dynamically tunable linear Pockels coefficient ($r_{33}$) exceeding 500 pm/V at $T=5$ K, and study its full temperature and bias dependence. We also measure an enhanced piezo-electric coefficient ($d_{33}$) above 90 pC/N. Both of these coefficients exceed all previously reported values for cryogenic materials, including lithium niobate ($r_{33}\approx24$ pm/V) and barium titanate ($r_{42}\approx170$ pm/V). Furthermore, by tuning STO towards \textit{quantum criticality} with oxygen isotope substitution we more than double the optical and piezo-electric nonlinearities, demonstrating a linear Pockels coefficient above 1100 pm/V. Our results probe the link between quantum phase transitions, dielectric susceptibility, and optical nonlinearities, unlocking opportunities in cryogenic optical and mechanical systems, and provide a framework for discovering new nonlinear materials.
Submitted 25 February, 2025; v1 submitted 20 February, 2025;
originally announced February 2025.
-
MDCrow: Automating Molecular Dynamics Workflows with Large Language Models
Authors:
Quintina Campbell,
Sam Cox,
Jorge Medina,
Brittany Watterson,
Andrew D. White
Abstract:
Molecular dynamics (MD) simulations are essential for understanding biomolecular systems but remain challenging to automate. Recent advances in large language models (LLM) have demonstrated success in automating complex scientific tasks using LLM-based agents. In this paper, we introduce MDCrow, an agentic LLM assistant capable of automating MD workflows. MDCrow uses chain-of-thought over 40 expert-designed tools for handling and processing files, setting up simulations, analyzing the simulation outputs, and retrieving relevant information from literature and databases. We assess MDCrow's performance across 25 tasks of varying required subtasks and difficulty, and we evaluate the agent's robustness to both difficulty and prompt style. \texttt{gpt-4o} is able to complete complex tasks with low variance, followed closely by \texttt{llama3-405b}, a compelling open-source model. While prompt style does not influence the best models' performance, it has significant effects on smaller models.
Submitted 13 February, 2025;
originally announced February 2025.
-
PLUMED Tutorials: a collaborative, community-driven learning ecosystem
Authors:
Gareth A. Tribello,
Massimiliano Bonomi,
Giovanni Bussi,
Carlo Camilloni,
Blake I. Armstrong,
Andrea Arsiccio,
Simone Aureli,
Federico Ballabio,
Mattia Bernetti,
Luigi Bonati,
Samuel G. H. Brookes,
Z. Faidon Brotzakis,
Riccardo Capelli,
Michele Ceriotti,
Kam-Tung Chan,
Pilar Cossio,
Siva Dasetty,
Davide Donadio,
Bernd Ensing,
Andrew L. Ferguson,
Guillaume Fraux,
Julian D. Gale,
Francesco Luigi Gervasio,
Toni Giorgino,
Nicholas S. M. Herringer, et al. (38 additional authors not shown)
Abstract:
In computational physics, chemistry, and biology, the implementation of new techniques in a shared and open source software lowers barriers to entry and promotes rapid scientific progress. However, effectively training new software users presents several challenges. Common methods like direct knowledge transfer and in-person workshops are limited in reach and comprehensiveness. Furthermore, while the COVID-19 pandemic highlighted the benefits of online training, traditional online tutorials can quickly become outdated and may not cover all the software's functionalities. To address these issues, here we introduce ``PLUMED Tutorials'', a collaborative model for developing, sharing, and updating online tutorials. This initiative utilizes repository management and continuous integration to ensure compatibility with software updates. Moreover, the tutorials are interconnected to form a structured learning path and are enriched with automatic annotations to provide broader context. This paper illustrates the development, features, and advantages of PLUMED Tutorials, aiming to foster an open community for creating and sharing educational resources.
Submitted 29 November, 2024;
originally announced December 2024.
-
Language agents achieve superhuman synthesis of scientific knowledge
Authors:
Michael D. Skarlinski,
Sam Cox,
Jon M. Laurent,
James D. Braza,
Michaela Hinks,
Michael J. Hammerling,
Manvitha Ponnapati,
Samuel G. Rodriques,
Andrew D. White
Abstract:
Language models are known to hallucinate incorrect information, and it is unclear if they are sufficiently accurate and reliable for use in scientific research. We developed a rigorous human-AI comparison methodology to evaluate language model agents on real-world literature search tasks covering information retrieval, summarization, and contradiction detection tasks. We show that PaperQA2, a frontier language model agent optimized for improved factuality, matches or exceeds subject matter expert performance on three realistic literature research tasks without any restrictions on humans (i.e., full access to internet, search tools, and time). PaperQA2 writes cited, Wikipedia-style summaries of scientific topics that are significantly more accurate than existing, human-written Wikipedia articles. We also introduce a hard benchmark for scientific literature research called LitQA2 that guided design of PaperQA2, leading to it exceeding human performance. Finally, we apply PaperQA2 to identify contradictions within the scientific literature, an important scientific task that is challenging for humans. PaperQA2 identifies 2.34 +/- 1.99 contradictions per paper in a random subset of biology papers, of which 70% are validated by human experts. These results demonstrate that language model agents are now capable of exceeding domain experts across meaningful tasks on scientific literature.
Submitted 26 September, 2024; v1 submitted 10 September, 2024;
originally announced September 2024.
-
Rigorous Bound on the Violation of Dynamic Reciprocity Induced by Four-Wave Mixing
Authors:
Alexander D. White,
Rahul Trivedi
Abstract:
Dynamic reciprocity imposes stringent performance constraints on nonlinear optical devices such as isolators and circulators. The seminal result by Shi et al. establishes that nonlinear optical devices relying on the intensity-dependent refractive index obey dynamic reciprocity for small signals with spectrally distinct fields. However, it has also been recognized that it is possible to violate dynamic reciprocity by exploiting frequency mixing processes. In this paper, we establish a rigorous upper bound on this violation that is independent of device geometry. We demonstrate that this bound captures the parameter scalings of realizable physical systems, and that under some conditions dynamic reciprocity violation can grow unbounded to achieve arbitrary nonlinear isolation. These results provide an analytically robust version of dynamic reciprocity, as well as theoretical guidance for the development of power efficient nonlinear optical isolators and circulators.
Submitted 22 August, 2024;
originally announced August 2024.
-
Slow molecular beams from a cryogenic buffer gas source
Authors:
A. D. White,
S. Popa,
J. Mellado-Munoz,
N. J. Fitch,
B. E. Sauer,
J. Lim,
M. R. Tarbutt
Abstract:
We study the properties of a cryogenic buffer gas source that uses a low temperature two-stage buffer gas cell to produce very slow beams of ytterbium monofluoride molecules. The molecules are produced by laser ablation inside the cell and extracted into a beam by a flow of cold helium. We measure the flux and velocity distribution of the beam as a function of ablation energy, helium flow rate, cell temperature, and the size of the gap between the first and second stages of the cell. We also compare the velocity distributions from one-stage and two-stage cells. The one-stage cell emits a beam with a speed of about 82 m s$^{-1}$ and a translational temperature of 0.63 K. The slowest beams are obtained using the two-stage cell at the lowest achievable cell temperature of 1.8 K. This beam has a peak velocity of 56 m s$^{-1}$ and a flux of $9 \times 10^9$ ground state molecules per steradian per pulse, with a substantial fraction at speeds below 40 m s$^{-1}$. These slow molecules can be decelerated further by radiation pressure slowing and then captured in a magneto-optical trap.
Submitted 17 November, 2024; v1 submitted 3 August, 2024;
originally announced August 2024.
-
A Review of Large Language Models and Autonomous Agents in Chemistry
Authors:
Mayk Caldas Ramos,
Christopher J. Collison,
Andrew D. White
Abstract:
Large language models (LLMs) have emerged as powerful tools in chemistry, significantly impacting molecule design, property prediction, and synthesis optimization. This review highlights LLM capabilities in these domains and their potential to accelerate scientific discovery through automation. We also review LLM-based autonomous agents: LLMs with a broader set of tools to interact with their surrounding environment. These agents perform diverse tasks such as paper scraping, interfacing with automated laboratories, and synthesis planning. As agents are an emerging topic, we extend the scope of our review of agents beyond chemistry and discuss them across scientific domains. This review covers the recent history, current capabilities, and design of LLMs and autonomous agents, addressing specific challenges, opportunities, and future directions in chemistry. Key challenges include data quality and integration, model interpretability, and the need for standard benchmarks, while future directions point towards more sophisticated multi-modal agents and enhanced collaboration between agents and experimental methods. Due to the quick pace of this field, a repository has been built to keep track of the latest studies: https://github.com/ur-whitelab/LLMs-in-science.
Submitted 14 November, 2024; v1 submitted 26 June, 2024;
originally announced July 2024.
-
Unified laser stabilization and isolation on a silicon chip
Authors:
Alexander D. White,
Geun Ho Ahn,
Richard Luhtaru,
Joel Guo,
Theodore J. Morin,
Abhi Saxena,
Lin Chang,
Arka Majumdar,
Kasper Van Gasse,
John E. Bowers,
Jelena Vučković
Abstract:
Rapid progress in photonics has led to an explosion of integrated devices that promise to deliver the same performance as table-top technology at the nanoscale; heralding the next generation of optical communications, sensing and metrology, and quantum technologies. However, the challenge of co-integrating the multiple components of high-performance laser systems has left application of these nanoscale devices thwarted by bulky laser sources that are orders of magnitude larger than the devices themselves. Here we show that the two main ingredients for high-performance lasers -- noise reduction and isolation -- currently requiring serial combination of incompatible technologies, can be sourced simultaneously from a single, passive, CMOS-compatible nanophotonic device. To do this, we take advantage of both the long photon lifetime and the nonreciprocal Kerr nonlinearity of a high quality factor silicon nitride ring resonator to self-injection lock a semiconductor laser chip while also providing isolation. Additionally, we identify a previously unappreciated power regime limitation of current on-chip laser architectures which our system overcomes. Using our device, which we term a unified laser stabilizer, we demonstrate an on-chip integrated laser system with built-in isolation and noise reduction that operates with turnkey reliability. This approach departs from efforts to directly miniaturize and integrate traditional laser system components and serves to bridge the gap to fully integrated optical technologies.
Submitted 24 May, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
Titanium:Sapphire-on-insulator for broadband tunable lasers and high-power amplifiers on chip
Authors:
Joshua Yang,
Kasper Van Gasse,
Daniil M. Lukin,
Melissa A. Guidry,
Geun Ho Ahn,
Alexander D. White,
Jelena Vučković
Abstract:
Titanium:Sapphire (Ti:Sa) lasers have been essential for advancing fundamental research and technological applications. Ti:Sa lasers are unmatched in bandwidth and tuning range, yet their use is severely restricted due to their large size, cost, and need for high optical pump powers. Here, we demonstrate a monocrystalline Ti:Sa-on-insulator (Ti:SaOI) photonics platform which enables dramatic miniaturization, cost-reduction, and scalability of Ti:Sa technology. First, through fabrication of low-loss whispering gallery mode resonators, we realize a Ti:Sa laser operating with an ultra-low lasing threshold of 290 $μ$W. Then, through orders-of-magnitude improvement in mode confinement in Ti:SaOI waveguides, we realize the first integrated solid-state (i.e., non-semiconductor) optical amplifier operating below 1 $μ$m, with an ultra-wide bandwidth of 700 - 950 nm and peak gain of 64 dB/cm. We demonstrate unprecedented 17 dB distortion-free amplification of picosecond pulses to up to 2.3 nJ pulse energy, corresponding to a peak power of 1.0 kW. Finally, we demonstrate the first tunable integrated Ti:Sa laser, featuring narrow linewidths and a 24.7 THz tuning range, which, for the first time, can be pumped with low-cost, miniature, off-the-shelf green laser diodes. This opens doors to new modalities of Ti:Sa lasers (now occupying a footprint less than 0.15 mm$^2$), such as massively-scalable Ti:Sa laser array systems for a variety of applications. As a proof-of-concept demonstration, we employ a Ti:SaOI laser array as the sole optical control for a cavity quantum electrodynamics experiment with artificial atoms in silicon carbide. This work is a key step towards the democratization of Ti:Sa technology through a three orders-of-magnitude reduction in cost and footprint, as well as the introduction of solid-state broadband amplification of sub-micron wavelength light.
Submitted 30 November, 2023;
originally announced December 2023.
-
An inverse-designed nanophotonic interface for excitons in atomically thin materials
Authors:
Ryan J. Gelly,
Alexander D. White,
Giovanni Scuri,
Xing Liao,
Geun Ho Ahn,
Bingchen Deng,
Kenji Watanabe,
Takashi Taniguchi,
Jelena Vučković,
Hongkun Park
Abstract:
Efficient nanophotonic devices are essential for applications in quantum networking, optical information processing, sensing, and nonlinear optics. Extensive research efforts have focused on integrating two-dimensional (2D) materials into photonic structures, but this integration is often limited by size and material quality. Here, we use hexagonal boron nitride (hBN), a benchmark choice for encapsulating atomically thin materials, as a waveguiding layer while simultaneously improving the optical quality of the embedded films. When combined with photonic inverse design, it becomes a complete nanophotonic platform to interface with optically active 2D materials. Grating couplers and low-loss waveguides provide optical interfacing and routing, tunable cavities provide a large exciton-photon coupling to transition metal dichalcogenides (TMD) monolayers through Purcell enhancement, and metasurfaces enable the efficient detection of TMD dark excitons. This work paves the way for advanced 2D-material nanophotonic structures for classical and quantum nonlinear optics.
Submitted 25 August, 2023;
originally announced August 2023.
-
Predicting small molecules solubilities on endpoint devices using deep ensemble neural networks
Authors:
Mayk Caldas Ramos,
Andrew D. White
Abstract:
Aqueous solubility is a valuable yet challenging property to predict. Computing solubility using first-principles methods requires accounting for the competing effects of entropy and enthalpy, resulting in long computations for relatively poor accuracy. Data-driven approaches, such as deep learning, offer improved accuracy and computational efficiency but typically lack uncertainty quantification. Additionally, ease of use remains a concern for any computational technique, resulting in the sustained popularity of group-based contribution methods. In this work, we addressed these problems with a deep learning model with predictive uncertainty that runs on a static website (without a server). This approach moves computing needs onto the website visitor without requiring installation, removing the need to pay for and maintain servers. Our model achieves satisfactory results in solubility prediction. Furthermore, we demonstrate how to create molecular property prediction models that balance uncertainty and ease of use. The code is available at https://github.com/ur-whitelab/mol.dev, and the model is usable at https://mol.dev.
Submitted 7 March, 2024; v1 submitted 11 July, 2023;
originally announced July 2023.
-
14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon
Authors:
Kevin Maik Jablonka,
Qianxiang Ai,
Alexander Al-Feghali,
Shruti Badhwar,
Joshua D. Bocarsly,
Andres M Bran,
Stefan Bringuier,
L. Catherine Brinson,
Kamal Choudhary,
Defne Circi,
Sam Cox,
Wibe A. de Jong,
Matthew L. Evans,
Nicolas Gastellu,
Jerome Genzling,
María Victoria Gil,
Ankur K. Gupta,
Zhi Hong,
Alishba Imran,
Sabine Kruschwitz,
Anne Labarre,
Jakub Lála,
Tao Liu,
Steven Ma,
Sauradeep Majumdar, et al. (28 additional authors not shown)
Abstract:
Large-language models (LLMs) such as GPT-4 caught the interest of many scientists. Recent studies suggested that these models could be useful in chemistry and materials science. To explore these possibilities, we organized a hackathon.
This article chronicles the projects built as part of this hackathon. Participants employed LLMs for various applications, including predicting properties of molecules and materials, designing novel interfaces for tools, extracting knowledge from unstructured data, and developing new educational applications.
The diverse topics and the fact that working prototypes could be generated in less than two days highlight that LLMs will profoundly impact the future of our fields. The rich collection of ideas and projects also indicates that the applications of LLMs are not limited to materials science and chemistry but offer potential benefits to a wide range of scientific disciplines.
Submitted 14 July, 2023; v1 submitted 9 June, 2023;
originally announced June 2023.
-
Active Learning in Symbolic Regression with Physical Constraints
Authors:
Jorge Medina,
Andrew D. White
Abstract:
Evolutionary symbolic regression (SR) fits a symbolic equation to data, which gives a concise interpretable model. We explore using SR as a method to propose which data to gather in an active learning setting with physical constraints. SR with active learning proposes which experiments to do next. Active learning is done with query by committee, where the Pareto frontier of equations is the committee. The physical constraints improve proposed equations in very low data settings. These approaches reduce the data required for SR and achieve state-of-the-art results in the data required to rediscover known equations.
Submitted 9 August, 2024; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Tunable vector beam decoder by inverse design for high-dimensional quantum key distribution with 3D polarized spatial modes
Authors:
Eileen Otte,
Alexander D. White,
Nicholas A. Güsken,
Jelena Vučković,
Mark L. Brongersma
Abstract:
Spatial modes of light have become highly attractive to increase the dimension and, thereby, security and information capacity in quantum key distribution (QKD). So far, only transverse electric field components have been considered, while longitudinal polarization components have remained neglected. Here, we present an approach to include all three spatial dimensions of electric field oscillation in QKD by implementing our tunable, on-a-chip vector beam decoder (VBD). This inversely designed device pioneers the "preparation" and "measurement" of three-dimensionally polarized mutually unbiased basis states for high-dimensional (HD) QKD and paves the way for the integration of HD QKD with spatial modes in multifunctional on-a-chip photonics platforms.
Submitted 25 April, 2023; v1 submitted 24 April, 2023;
originally announced April 2023.
-
Censoring chemical data to mitigate dual use risk
Authors:
Quintina L. Campbell,
Jonathan Herington,
Andrew D. White
Abstract:
The dual use of machine learning applications, where models can be used for both beneficial and malicious purposes, presents a significant challenge. This has recently become a particular concern in chemistry, where chemical datasets containing sensitive labels (e.g. toxicological information) could be used to develop predictive models that identify novel toxins or chemical warfare agents. To mitigate dual use risks, we propose a model-agnostic method of selectively noising datasets while preserving the utility of the data for training deep neural networks in a beneficial region. We evaluate the effectiveness of the proposed method across least squares, a multilayer perceptron, and a graph neural network. Our findings show selectively noised datasets can induce model variance and bias in predictions for sensitive labels with control, suggesting the safe sharing of datasets containing sensitive information is feasible. We also find omitting sensitive data often increases model variance sufficiently to mitigate dual use. This work is proposed as a foundation for future research on enabling more secure and collaborative data sharing practices and safer machine learning applications in chemistry.
Submitted 20 April, 2023;
originally announced April 2023.
-
Bloom filters for molecules
Authors:
Jorge Medina,
Andrew D White
Abstract:
Ultra-large chemical libraries are reaching 10s to 100s of billions of molecules. A challenge for these libraries is to efficiently check if a proposed molecule is present. Here we propose and study Bloom filters for testing if a molecule is present in a set using either string or fingerprint representations. Bloom filters are small enough to hold billions of molecules in just a few GB of memory and check membership in sub milliseconds. We found string representations can have a false positive rate below 1% and require significantly less storage than using fingerprints. Canonical SMILES with Bloom filters with the simple FNV hashing function provide fast and accurate membership tests with small memory requirements. We provide a general implementation and specific filters for detecting if a molecule is purchasable, patented, or a natural product according to existing databases at https://github.com/whitead/molbloom
Submitted 11 April, 2023;
originally announced April 2023.
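The membership test described in the abstract above can be illustrated with a minimal sketch. This is not the molbloom implementation: the 64-bit FNV-1a variant, the double-hashing scheme for deriving several bit positions, the sizing formulas, and all class and function names here are assumptions chosen for illustration. Input strings are assumed to be canonical SMILES already; no canonicalization is performed.

```python
import math

# 64-bit FNV-1a constants (offset basis and prime).
FNV_OFFSET = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes, seed: int = FNV_OFFSET) -> int:
    """Hash bytes with 64-bit FNV-1a, optionally starting from a custom seed."""
    h = seed
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK64
    return h


class BloomFilter:
    """Illustrative Bloom filter over strings (hypothetical helper class)."""

    def __init__(self, n_items: int, fp_rate: float = 0.01):
        # Standard sizing: m bits and k hash functions for a target
        # false positive rate at the expected number of items.
        self.m = max(8, math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _indices(self, item: str):
        # Double hashing: combine two FNV-1a hashes to get k bit positions.
        data = item.encode("utf-8")
        h1 = fnv1a_64(data)
        h2 = fnv1a_64(data, seed=h1) | 1  # force odd so the stride is never zero
        for i in range(self.k):
            yield (h1 + i * h2) % self.m

    def add(self, item: str) -> None:
        for idx in self._indices(item):
            self.bits[idx >> 3] |= 1 << (idx & 7)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[idx >> 3] & (1 << (idx & 7)) for idx in self._indices(item))


# Usage: store a few SMILES strings, then test membership.
library = BloomFilter(n_items=1000, fp_rate=0.01)
for smiles in ["CCO", "c1ccccc1", "CC(=O)O"]:
    library.add(smiles)
print("CCO" in library)
```

A Bloom filter never reports a stored item as absent, so the check above succeeds for every added string; a string that was never added may still be reported as present, with a probability governed by the target false positive rate.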
-
ChemCrow: Augmenting large-language models with chemistry tools
Authors:
Andres M Bran,
Sam Cox,
Oliver Schilter,
Carlo Baldassari,
Andrew D White,
Philippe Schwaller
Abstract:
Over the last decades, excellent computational chemistry tools have been developed. Integrating them into a single platform with enhanced accessibility could help them reach their full potential by overcoming steep learning curves. Recently, large-language models (LLMs) have shown strong performance in tasks across domains, but struggle with chemistry-related problems. Moreover, these models lack access to external knowledge sources, limiting their usefulness in scientific applications. In this study, we introduce ChemCrow, an LLM chemistry agent designed to accomplish tasks across organic synthesis, drug discovery, and materials design. By integrating 18 expert-designed tools, ChemCrow augments the LLM performance in chemistry, and new capabilities emerge. Our agent autonomously planned and executed the syntheses of an insect repellent, three organocatalysts, and guided the discovery of a novel chromophore. Our evaluation, including both LLM and expert assessments, demonstrates ChemCrow's effectiveness in automating a diverse set of chemical tasks. Surprisingly, we find that GPT-4 as an evaluator cannot distinguish between clearly wrong GPT-4 completions and ChemCrow's performance. Our work not only aids expert chemists and lowers barriers for non-experts, but also fosters scientific advancement by bridging the gap between experimental and computational chemistry.
Submitted 2 October, 2023; v1 submitted 11 April, 2023;
originally announced April 2023.
-
Bayesian Optimization of Catalysis With In-Context Learning
Authors:
Mayk Caldas Ramos,
Shane S. Michtavy,
Marc D. Porosoff,
Andrew D. White
Abstract:
Large language models (LLMs) can perform accurate classification with zero or few examples through in-context learning. We extend this capability to regression with uncertainty estimation using frozen LLMs (e.g., GPT-3.5, Gemini), enabling Bayesian optimization (BO) in natural language without explicit model training or feature engineering. We apply this to materials discovery by representing experimental catalyst synthesis and testing procedures as natural language prompts. A key challenge in materials discovery is the need to characterize suboptimal candidates, which slows progress. While BO is effective for navigating large design spaces, standard surrogate models like Gaussian processes assume smoothness and continuity, an assumption that fails in highly non-linear domains such as heterogeneous catalysis. Our task-agnostic BO workflow overcomes this by operating directly in language space, producing interpretable and actionable predictions without requiring structural or electronic descriptors. On benchmarks like aqueous solubility and oxidative coupling of methane (OCM), BO-ICL matches or outperforms Gaussian processes. In live experiments on the reverse water-gas shift (RWGS) reaction, BO-ICL identifies near-optimal multi-metallic catalysts within six iterations from a pool of 3,700 candidates. Our method redefines materials representation and accelerates discovery, with broad applications across catalysis, materials science, and AI. Code: https://github.com/ur-whitelab/BO-ICL.
Submitted 14 May, 2025; v1 submitted 11 April, 2023;
originally announced April 2023.
-
Recent advances in the Self-Referencing Embedding Strings (SELFIES) library
Authors:
Alston Lo,
Robert Pollice,
AkshatKumar Nigam,
Andrew D. White,
Mario Krenn,
Alán Aspuru-Guzik
Abstract:
String-based molecular representations play a crucial role in cheminformatics applications, and with the growing success of deep learning in chemistry, have been readily adopted into machine learning pipelines. However, traditional string-based representations such as SMILES are often prone to syntactic and semantic errors when produced by generative models. To address these problems, a novel representation, SELF-referencIng Embedded Strings (SELFIES), was proposed that is inherently 100% robust, alongside an accompanying open-source implementation. Since then, we have generalized SELFIES to support a wider range of molecules and semantic constraints and streamlined its underlying grammar. We have implemented this updated representation in subsequent versions of the selfies library, where we have also made major advances with respect to design, efficiency, and supported features. Hence, we present the current status of the selfies library (version 2.1.1) in this manuscript.
Submitted 7 February, 2023;
originally announced February 2023.
-
Platform-agnostic waveguide integration of high-speed photodetectors with evaporated tellurium thin films
Authors:
Geun Ho Ahn,
Alexander D. White,
Hyungjin Kim,
Naoki Higashitarumizu,
Felix M. Mayor,
Jason F. Herrmann,
Wentao Jiang,
Kevin K. S. Multani,
Amir H. Safavi-Naeini,
Ali Javey,
Jelena Vučković
Abstract:
Many attractive photonics platforms still lack integrated photodetectors due to inherent material incompatibilities and lack of process scalability, preventing their widespread deployment. Here we address the problem of scalably integrating photodetectors in a photonic platform-independent manner. Using a thermal evaporation and deposition technique developed for nanoelectronics, we show that tellurium (Te), a quasi-2D semi-conductive element, can be evaporated at low temperature directly onto photonic chips to form air-stable, high-responsivity, high-speed, ultrawide-band photodetectors. We demonstrate detection at visible, telecom, and mid-infrared wavelengths, a bandwidth of more than 40 GHz, and platform-independent scalable integration with photonic structures in silicon, silicon nitride and lithium niobate.
Submitted 8 September, 2022;
originally announced September 2022.
-
Integrated Passive Nonlinear Optical Isolators
Authors:
Alexander D. White,
Geun Ho Ahn,
Kasper Van Gasse,
Ki Youl Yang,
Lin Chang,
John E. Bowers,
Jelena Vučković
Abstract:
Fiber and bulk-optical isolators are widely used to stabilize laser cavities by preventing unwanted feedback. However, their integrated counterparts have been slow to be adopted. While several strategies for on-chip optical isolation have been realized, these rely on either integration of magneto-optic materials or high-frequency modulation with acousto-optic or electro-optic modulators. Here, we demonstrate an integrated approach for passively isolating a continuous wave laser using the intrinsically non-reciprocal Kerr nonlinearity in ring resonators. Using silicon nitride as a model platform, we achieve single ring isolation of 17-23 dB with 1.8-5.5 dB insertion loss, and a cascaded ring isolation of 35 dB with 5 dB insertion loss. Employing these devices, we demonstrate hybrid integration and isolation with a semiconductor laser chip.
Submitted 13 June, 2022; v1 submitted 2 June, 2022;
originally announced June 2022.
-
Physics is the New Data
Authors:
Sergei V. Kalinin,
Maxim Ziatdinov,
Bobby G. Sumpter,
Andrew D. White
Abstract:
The rapid development of machine learning (ML) methods has fundamentally affected numerous applications ranging from computer vision, biology, and medicine to accounting and text analytics. Until now, it was the availability of large and often labeled data sets that enabled significant breakthroughs. However, the adoption of these methods in classical physical disciplines has been relatively slow, a tendency that can be traced to the intrinsic differences between correlative approaches of purely data-based ML and the causal hypothesis-driven nature of physical sciences. Furthermore, anomalous behaviors of classical ML necessitate addressing issues such as explainability and fairness of ML. We also note that the sequence in which deep learning became mainstream in different scientific disciplines (starting from medicine and biology, then theoretical chemistry, and only after that, physics) is rooted in the progressively more complex level of descriptors, constraints, and causal structures available for incorporation in ML architectures. Here we put forth that over the next decade, physics will become the new data, and this will continue the transition from the dot-coms and scientific computing concepts of the 1990s to the big data of 2000-2010 to the deep learning of 2010-2020 to physics-enabled scientific ML.
Submitted 11 April, 2022;
originally announced April 2022.
-
Symmetric Molecular Dynamics
Authors:
Sam Cox,
Andrew D. White
Abstract:
We derive a formulation of molecular dynamics that generates only symmetric configurations. We implement it for all 2D planar and 3D space groups. An atlas of 2D Lennard-Jones crystals under all planar groups is created with symmetric molecular dynamics.
Submitted 17 June, 2022; v1 submitted 3 April, 2022;
originally announced April 2022.
-
Gradient-Based Optimization of Optical Vortex Beam Emitters
Authors:
Alexander D. White,
Logan Su,
Daniel I. Shahar,
Ki Youl Yang,
Geun Ho Ahn,
Jinhie Skarda,
Siddharth Ramachandran,
Jelena Vučković
Abstract:
Vortex beams are stable solutions of Maxwell's equations that carry phase singularities and orbital angular momentum, unique properties that give rise to many applications in the basic sciences, optical communications, and quantum technologies. Scalable integration and fabrication of vortex beam emitters will allow these applications to flourish and enable new applications not possible with traditional optics. Here we present a general framework to generate integrated vortex beam emitters using photonic inverse design. We experimentally demonstrate generation of vortex beams with angular momentum spanning $-3\hbar$ to $3\hbar$. We show the generality of this design procedure by designing a vortex beam multiplexer capable of exciting a custom vortex beam fiber. Finally, we produce foundry-fabricated beam emitters with wide bandwidths and high efficiencies that take advantage of a multi-layer heterogeneous integration.
Submitted 18 February, 2022;
originally announced February 2022.
-
Inferring Spatial Source of Disease Outbreaks using Maximum Entropy
Authors:
Mehrad Ansari,
David Soriano-Paños,
Gourab Ghoshal,
Andrew D. White
Abstract:
Mathematical modeling of disease outbreaks can infer the future trajectory of an epidemic, which can inform policy decisions. Another task is inferring the origin of a disease, which is relatively difficult with current mathematical models. Such frameworks -- across varying levels of complexity -- are typically sensitive to input data on epidemic parameters, case counts and mortality rates, which are generally noisy and incomplete. To alleviate these limitations, we propose a maximum entropy framework that fits epidemiological models, provides calibrated infection origin probabilities, and is robust to noise due to a prior belief model. Maximum entropy is agnostic to the parameters or model structure used and allows for flexible use when faced with sparse data conditions and incomplete knowledge in the dynamical phase of disease spread, providing for more reliable modeling at early stages of outbreaks. We evaluate the performance of our model by predicting future disease trajectories in synthetic graph networks and the real mobility network of New York state. In addition, unlike existing approaches, we demonstrate that the method can be used to infer the origin of the outbreak with accurate confidence. Indeed, despite the prevalent belief that the feasibility of contact tracing is limited to the initial stages of an outbreak, we report the possibility of reconstructing early disease dynamics, including the epidemic seed, at advanced stages.
Submitted 7 October, 2021;
originally announced October 2021.
-
Augmenting On-Chip Microresonator through Photonic Inverse Design
Authors:
Geun Ho Ahn,
Ki Youl Yang,
Rahul Trivedi,
Alexander D. White,
Logan Su,
Jinhie Skarda,
Jelena Vučković
Abstract:
Recent advances in the design and fabrication of on-chip optical microresonators have greatly expanded their applications in photonics, enabling metrology, communications, and on-chip lasers. Designs for these applications require fine control of dispersion, bandwidth and high optical quality factors. Co-engineering these figures of merit remains a significant technological challenge due to design strategies being largely limited to analytical tuning of cross-sectional geometry. Here, we show that photonic inverse design facilitates and expands the functionality of on-chip microresonators, theoretically and experimentally demonstrating flexible dispersion engineering, quality factors beyond 2 million on the silicon-on-insulator platform with single-mode operation, and selective wavelength-band operation.
Submitted 15 September, 2021;
originally announced September 2021.
-
Natural Language Processing Models That Automate Programming Will Transform Chemistry Research and Teaching
Authors:
Glen M. Hocky,
Andrew D. White
Abstract:
Natural language processing models have emerged that can generate usable software and automate a number of programming tasks with high fidelity. These tools have yet to have an impact on the chemistry community. Yet, our initial testing demonstrates that this form of Artificial Intelligence is poised to transform chemistry and chemical engineering research. Here, we review developments that brought us to this point, examine applications in chemistry, and give our perspective on how this may fundamentally alter research and teaching.
Submitted 2 February, 2022; v1 submitted 30 August, 2021;
originally announced August 2021.
-
Iterative Symbolic Regression for Learning Transport Equations
Authors:
Mehrad Ansari,
Heta A. Gandhi,
David G. Foster,
Andrew D. White
Abstract:
Computational fluid dynamics (CFD) analysis is widely used in engineering. Although CFD calculations are accurate, the computational cost associated with complex systems makes it difficult to obtain empirical equations between system variables. Here we combine active learning (AL) and symbolic regression (SR) to get a symbolic equation for system variables from CFD simulations. Gaussian process regression-based AL allows for automated selection of variables by selecting the most instructive points from the available range of possible parameters. The results from these experiments are then passed to SR to find empirical symbolic equations for CFD models. This approach is scalable and applicable for any desired number of CFD design parameters. To demonstrate the effectiveness, we use this method with two model systems. We recover an empirical equation for the pressure drop in a bent pipe and a new equation for predicting backflow in a heart valve under aortic insufficiency.
Submitted 16 March, 2022; v1 submitted 6 August, 2021;
originally announced August 2021.
-
Inverse-designed multi-dimensional silicon photonic transmitters
Authors:
Ki Youl Yang,
Alexander D. White,
Farshid Ashtiani,
Chinmay Shirpurkar,
Srinivas V. Pericherla,
Lin Chang,
Hao Song,
Kaiheng Zou,
Huibin Zhou,
Kai Pang,
Joshua Yang,
Melissa A. Guidry,
Daniil M. Lukin,
Han Hao,
Lawrence Trask,
Geun Ho Ahn,
Andy Netherton,
Travis C. Briles,
Jordan R. Stone,
Lior Rechtman,
Jeffery S. Stone,
Kasper Van Gasse,
Jinhie L. Skarda,
Logan Su,
Dries Vercruysse,
et al. (11 additional authors not shown)
Abstract:
Modern microelectronic processors have migrated towards parallel computing architectures with many-core processors. However, such expansion comes with diminishing returns exacted by the high cost of data movement between individual processors. The use of optical interconnects has burgeoned as a promising technology that can address the limits of this data transfer. While recent pushes to enhance optical communication have focused on developing wavelength-division multiplexing technology, this approach will eventually saturate the usable bandwidth, and new dimensions of data transfer will be paramount to fulfill the ever-growing need for speed. Here we demonstrate an integrated intra- and inter-chip multi-dimensional communication scheme enabled by photonic inverse design. Using inverse-designed mode-division multiplexers, we combine wavelength- and mode-multiplexing and send massively parallel data through nano-photonic waveguides and optical fibres. Crucially, as we take advantage of an orthogonal optical basis, our approach is inherently scalable to a multiplicative enhancement over the current state of the art.
Submitted 10 October, 2021; v1 submitted 25 March, 2021;
originally announced March 2021.
-
Spectrally reconfigurable quantum emitters enabled by optimized fast modulation
Authors:
Daniil M. Lukin,
Alexander D. White,
Rahul Trivedi,
Melissa A. Guidry,
Naoya Morioka,
Charles Babin,
Öney O. Soykal,
Jawad Ul Hassan,
Nguyen Tien Son,
Takeshi Ohshima,
Praful K. Vasireddy,
Mamdouh H. Nasr,
Shuo Sun,
Jean-Phillipe W. MacLean,
Constantin Dory,
Emilio A. Nanni,
Jörg Wrachtrup,
Florian Kaiser,
Jelena Vučković
Abstract:
The ability to shape photon emission facilitates strong photon-mediated interactions between disparate physical systems, thereby enabling applications in quantum information processing, simulation and communication. Spectral control in solid state platforms such as color centers, rare earth ions, and quantum dots is particularly attractive for realizing such applications on-chip. Here we propose the use of frequency-modulated optical transitions for spectral engineering of single photon emission. Using a scattering-matrix formalism, we find that a two-level system, when modulated faster than its optical lifetime, can be treated as a single-photon source with a widely reconfigurable photon spectrum that is amenable to standard numerical optimization techniques. To enable the experimental demonstration of this spectral control scheme, we investigate the Stark tuning properties of the silicon vacancy in silicon carbide, a color center with promise for optical quantum information processing technologies. We find that the silicon vacancy possesses excellent spectral stability and tuning characteristics, allowing us to probe its fast modulation regime, observe the theoretically-predicted two-photon correlations, and demonstrate spectral engineering. Our results suggest that frequency modulation is a powerful technique for the generation of new light states with unprecedented control over the spectral and temporal properties of single photons.
Submitted 27 July, 2020; v1 submitted 27 March, 2020;
originally announced March 2020.
-
Recent Advances in Maximum Entropy Biasing Techniques for Molecular Dynamics
Authors:
Dilnoza B. Amirkulova,
Andrew D. White
Abstract:
This review describes recent advances by the authors and others on the topic of incorporating experimental data into molecular simulations through maximum entropy methods. Methods which incorporate experimental data improve accuracy in molecular simulation by minimally modifying the thermodynamic ensemble. This is especially important where force fields are approximate, such as when employing coarse-grain models, or where high accuracy is required, such as when attempting to mimic a multiscale self-assembly process. The authors review here the experiment directed simulation (EDS) and experiment directed metadynamics (EDM) methods that allow matching averages and distributions in simulations, respectively. Important system-specific considerations are discussed such as using enhanced sampling simultaneously, the role of pressure, treating uncertainty, and implementations of these methods. Recent examples of EDS and EDM are reviewed including applications to ab initio molecular dynamics of water, incorporating environmental fluctuations inside of a macromolecular protein complex, improving RNA force fields, and the combination of enhanced sampling with minimal biasing to model peptides.
Submitted 6 February, 2019;
originally announced February 2019.
-
Combining Enhanced Sampling with Experiment Directed Simulation of the GYG peptide
Authors:
Dilnoza B Amirkulova,
Andrew D White
Abstract:
Experiment directed simulation is a technique to minimally bias molecular dynamics simulations to match experimentally observed results. The method improves accuracy but does not address the sampling problem of molecular dynamics simulations of large systems. This work combines experiment directed simulation with both the parallel-tempering and parallel-tempering well-tempered ensemble replica-exchange methods to enhance sampling of experiment directed simulations. These methods are demonstrated on the GYG tripeptide in explicit water. The collective variables biased by experiment directed simulation are chemical shifts, where the set-points are determined by NMR experiments. The results show that it is possible to enhance sampling with either parallel-tempering or parallel-tempering well-tempered ensemble in the experiment directed simulation method. This combination of methods provides a novel approach for both accurately and exhaustively simulating biological systems.
Submitted 13 April, 2018;
originally announced April 2018.
-
Encoding and Selecting Coarse-Grain Mapping Operators with Hierarchical Graphs
Authors:
Maghesree Chakraborty,
Chenliang Xu,
Andrew D. White
Abstract:
Coarse grain (CG) molecular dynamics (MD) can simulate systems inaccessible to fine grain (FG) MD simulations. A CG simulation decreases the degrees of freedom by mapping atoms from an FG representation into agglomerate CG particles. The FG to CG mapping is not unique. Research into systematic selection of these mappings is challenging due to their combinatorial growth with respect to the number of atoms in a molecule. Here we present a method of reducing the total count of mappings by imposing molecular topology and symmetry constraints. The count reduction is illustrated by considering all mappings for nearly 49,889 molecules. The resulting number of mapping operators is still large, so we introduce hierarchical graphs which encode multiple CG mapping operators. The encoding method is demonstrated for methanol and a 14-mer peptide. This encoding provides a foundation to perform automated mapping selection.
Submitted 13 April, 2018;
originally announced April 2018.
-
Improved Ab Initio Molecular Dynamics by Minimal Biasing with Experimental Data
Authors:
Andrew D. White,
Chris Knight,
Glen M. Hocky,
Gregory A. Voth
Abstract:
Accounting for electrons and nuclei simultaneously is a powerful capability of ab initio molecular dynamics (AIMD). However, AIMD is often unable to accurately reproduce properties of systems such as water due to inaccuracies in the underlying electronic density functionals. This shortcoming is often addressed by added empirical corrections and/or increasing the simulation temperature. We present here a maximum-entropy approach to directly incorporate limited experimental data via a minimal bias. Biased AIMD simulations of water and an excess proton in water are shown to give significantly improved properties both for observables which were biased to match experimental data and for unbiased observables. This approach also yields new physical insight into inaccuracies in the underlying density functional theory as utilized in the unbiased AIMD.
Submitted 24 January, 2017; v1 submitted 1 July, 2016;
originally announced July 2016.