Search | arXiv e-print repository

Resolution-vs.-Accuracy Dilemma in Machine Learning Modeling of Electronic Excitation Spectra

Authors: Prakriti Kayastha, Sabyasachi Chakraborty, Raghunathan Ramakrishnan

Abstract: In this study, we explore the potential of machine learning for modeling molecular electronic spectral intensities as a continuous function in a given wavelength range. Since presently available chemical space datasets provide excitation energies and corresponding oscillator strengths for only a few valence transitions, here, we present a new dataset -- \bigqm -- with 12,880 molecules containing up to 7 CONF atoms and report ground state and excited state properties. A publicly accessible web-based data-mining platform is presented to facilitate on-the-fly screening of several molecular properties including harmonic vibrational and electronic spectra. We present all singlet electronic transitions from the ground state calculated using the time-dependent density functional theory framework with the $ω$B97XD exchange-correlation functional and a diffuse-function augmented basis set. The resulting spectra predominantly span the X-ray to deep-UV region (10--120 nm). To compare the target spectra with predictions based on small basis sets, we bin spectral intensities and show good agreement is obtained only at the expense of the resolution. Compared to this, machine learning models with latest structural representations trained directly using $<10 \%$ of the target data recover the spectra of the remaining molecules with better accuracies at a desirable $<1$ nm wavelength resolution.

Submitted 31 July, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

Comments: Major update: Dimensionless error metric is compared with MAE
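
The binning comparison described in the abstract above can be illustrated with a toy sketch. It is not the authors' procedure: the function names, the stick spectra, and the L1 error measure are all hypothetical, chosen only to show how widening the wavelength bins can hide peak-position errors at the cost of resolution.

```python
def bin_spectrum(lines, lo_nm, hi_nm, width_nm):
    """Sum oscillator strengths into equal-width wavelength bins on [lo_nm, hi_nm)."""
    n_bins = int(round((hi_nm - lo_nm) / width_nm))
    bins = [0.0] * n_bins
    for wavelength_nm, strength in lines:
        if lo_nm <= wavelength_nm < hi_nm:
            # Clamp the index to guard against floating-point edge cases.
            idx = min(int((wavelength_nm - lo_nm) / width_nm), n_bins - 1)
            bins[idx] += strength
    return bins


def l1_distance(a, b):
    """Sum of absolute bin-wise differences between two binned spectra."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical stick spectra as (wavelength in nm, oscillator strength) pairs.
# The "approximate" spectrum has the same intensities but shifted peaks.
target_lines = [(52.0, 0.4), (88.0, 0.1)]
approx_lines = [(55.0, 0.4), (86.0, 0.1)]

# Fine bins (1 nm) expose the peak shifts; coarse bins (10 nm) absorb them.
fine_error = l1_distance(
    bin_spectrum(target_lines, 10.0, 120.0, 1.0),
    bin_spectrum(approx_lines, 10.0, 120.0, 1.0),
)
coarse_error = l1_distance(
    bin_spectrum(target_lines, 10.0, 120.0, 10.0),
    bin_spectrum(approx_lines, 10.0, 120.0, 10.0),
)
```

In this toy case the two spectra agree exactly once the bins are 10 nm wide, while at 1 nm every shifted peak counts as an error.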

arXiv:2110.05414 [pdf, other]

doi 10.1063/5.0076787

Data-Driven Modeling of S0 -> S1 Excitation Energy in the BODIPY Chemical Space: High-Throughput Computation, Quantum Machine Learning, and Inverse Design

Authors: Amit Gupta, Sabyasachi Chakraborty, Debashree Ghosh, Raghunathan Ramakrishnan

Abstract: Derivatives of BODIPY are popular fluorophores due to their synthetic feasibility, structural rigidity, high quantum yield, and tunable spectroscopic properties. While the characteristic absorption maximum of BODIPY is at 2.5 eV, combinations of functional groups and substitution sites can shift the peak position by +/- 1 eV. Time-dependent long-range corrected hybrid density functional methods can model the lowest excitation energies offering a semi-quantitative precision of +/- 0.3 eV. Alas, the chemical space of BODIPYs stemming from combinatorial introduction of -- even a few dozen -- substituents is too large for brute-force high-throughput modeling. To navigate this vast space, we select 77,412 molecules and train a kernel-based quantum machine learning model providing < 2% hold-out error. Further reuse of the results presented here to navigate the entire BODIPY universe comprising over 253 giga (253 x 10^9) molecules is demonstrated by inverse-designing candidates with desired target excitation energies.

Submitted 28 October, 2021; v1 submitted 11 October, 2021; originally announced October 2021.

Comments: references updated, some key papers cited

arXiv:2108.13736 [pdf, ps, other]

doi 10.1103/PhysRevE.105.044203

Dynamics of nondegenerate solitons in long-wave short-wave resonance interaction system

Authors: S. Stalin, R. Ramakrishnan, M. Lakshmanan

Abstract: In this paper, we study the dynamics of an interesting class of vector solitons in the long wave-short wave resonance interaction (LSRI) system. The model that we consider here describes the nonlinear interaction of the long-wave and two-short waves and it generically appears in several physical settings. To derive this class of nondegenerate vector soliton solutions we adopt the Hirota bilinear method with the more general form of admissible seed solutions with nonidentical distinct propagation constants. We express the resultant fundamental as well as multi-soliton solutions in a compact way using Gram-determinants. The general fundamental vector soliton solution possesses several interesting properties. For instance, the double-hump or a single-hump profile structure including a special flattop profile form results when the soliton propagates in all the components with identical velocities. Interestingly, in the case of nonidentical velocities, the soliton number is increased to two in the long-wave (LW) component, while a single-humped soliton propagates in the two short-wave (SW) components. We establish through a detailed analysis that the nondegenerate multi-solitons in contrast to the already known vector solitons (with identical wave numbers) can undergo three types of elastic collision scenarios: (i) shape preserving, (ii) shape altering, and (iii) a novel shape changing collision, depending on the choice of the soliton parameters.
In addition, we point out the coexistence of nondegenerate and degenerate solitons simultaneously along with the associated physical consequences. We also indicate the physical realizations of these general vector solitons in nonlinear optics, hydrodynamics, and Bose-Einstein condensates. Our results are generic and they will be useful in these physical systems and other closely related systems including plasma physics.

Submitted 13 March, 2022; v1 submitted 31 August, 2021; originally announced August 2021.

Comments: Accepted for Publication in Physical Review E (2022)

arXiv:2106.14260 [pdf, other]

Nondegenerate bright solitons in coupled nonlinear Schrödinger systems: Recent developments on optical vector solitons

Authors: S. Stalin, R. Ramakrishnan, M. Lakshmanan

Abstract: Nonlinear dynamics of an optical pulse or a beam continue to be one of the active areas of research in the field of optical solitons. Especially, in multi-mode fibers or fiber arrays and photorefractive materials, the vector solitons display rich nonlinear phenomena. Due to their fascinating and intriguing novel properties, the theory of optical vector solitons has been developed considerably both from theoretical and experimental points of view leading to soliton based promising potential applications. In the recent past, many types of vector solitons have been identified both in the integrable and non-integrable coupled nonlinear Schrödinger (CNLS) equations framework. In this article, we review some of the recent progress in understanding the dynamics of the so called nondegenerate vector bright solitons in nonlinear optics, where the fundamental soliton can have more than one propagation constant. We address this theme by considering the integrable two CNLS family of equations, namely Manakov system, mixed 2-CNLS system, coherently CNLS system, generalized CNLS system and two-component long-wave short-wave resonance interaction (LSRI) system. In these models, we discuss the existence of nondegenerate vector solitons and their associated novel multi-hump geometrical profile nature by deriving their analytical forms through the Hirota bilinear method. Then we reveal the novel collision properties of the nondegenerate solitons in the Manakov system as an example.
The asymptotic analysis shows that the nondegenerate solitons, in general, undergo three types of elastic collisions without any energy redistribution among the modes. Further, we show that the energy sharing collision exhibiting vector solitons arises as a special case of the newly reported nondegenerate vector solitons. Finally, we point out the possible further developments in this subject and potential applications.

Submitted 27 June, 2021; originally announced June 2021.

Comments: Accepted for publication in 'Photonics' and to appear in the special issue on 'Optical Solitons: Current Status'

arXiv:2105.07707 [pdf, ps, other]

B-splines on the Heisenberg group

Authors: Santi R. Das, Peter R. Massopust, Radha Ramakrishnan

Abstract: In this paper, we introduce a class of $B$-splines on the Heisenberg group $\mathbb{H}$ and study their fundamental properties. Unlike the classical case, we prove that there does not exist any sequence $\{α_n\}_{n\in\mathbb{N}}$ such that $L_{(-n.-\frac{n}{2},-α_n)}φ_n(x,y,t)=L_{(-n.-\frac{n}{2},-α_n)}φ_n(-x,-y,-t)$, for $n\geq 2$, where $L_{(x,y,t)}$ denotes the left translation on $\mathbb{H}$. We further investigate the problem of finding an equivalent condition for the system of left translates to form a frame sequence or a Riesz sequence in terms of twisted translates. We also find a sufficient condition for obtaining an oblique dual of the system $\{L_{(2k,l,m)}g:k,l,m\in\mathbb{Z}\}$ for a certain class of functions $g\in L^2(\mathbb{H})$. These concepts are illustrated by some examples. Finally, we make some remarks about $B$-splines regarding these results.

Submitted 15 December, 2022; v1 submitted 17 May, 2021; originally announced May 2021.

MSC Class: 42C15; 41A15; 43A30

arXiv:2103.15171 [pdf, other]

A Bayesian Approach to Identifying Representational Errors

Authors: Ramya Ramakrishnan, Vaibhav Unhelkar, Ece Kamar, Julie Shah

Abstract: Trained AI systems and expert decision makers can make errors that are often difficult to identify and understand. Determining the root cause for these errors can improve future decisions. This work presents Generative Error Model (GEM), a generative model for inferring representational errors based on observations of an actor's behavior (either simulated agent, robot, or human). The model considers two sources of error: those that occur due to representational limitations -- "blind spots" -- and non-representational errors, such as those caused by noise in execution or systematic errors present in the actor's policy. Disambiguating these two error types allows for targeted refinement of the actor's policy (i.e., representational errors require perceptual augmentation, while other errors can be reduced through methods such as improved training or attention support). We present a Bayesian inference algorithm for GEM and evaluate its utility in recovering representational errors on multiple domains. Results show that our approach can recover blind spots of both reinforcement learning agents as well as human users.

Submitted 28 March, 2021; originally announced March 2021.

arXiv:2102.06506 [pdf, ps, other]

Multihumped nondegenerate fundamental bright solitons in $N$-coupled nonlinear Schrödinger system

Authors: R. Ramakrishnan, S. Stalin, M. Lakshmanan

Abstract: In this letter we report the existence of nondegenerate fundamental bright soliton solution for coupled multi-component nonlinear Schrödinger equations of Manakov type. To derive this class of nondegenerate vector soliton solutions, we adopt the Hirota bilinear method with appropriate general class of seed solutions. Very interestingly the obtained nondegenerate fundamental soliton solution of the $N$-coupled nonlinear Schrödinger (CNLS) system admits multi-hump natured intensity profiles. We explicitly demonstrate this specific property by considering the nondegenerate soliton solutions for $3$ and $4$-CNLS systems. We also point out the existence of a special class of partially nondegenerate soliton solutions by imposing appropriate restrictions on the wavenumbers in the already obtained completely nondegenerate soliton solution. Such class of soliton solutions can also exhibit multi-hump profile structures. Finally, we present the stability analysis of nondegenerate fundamental soliton of the $3$-CNLS system as an example. The numerical results confirm the stability of triple-humped profile nature against perturbations of 5\% and 10\% white noise. The multi-hump nature of nondegenerate fundamental soliton solution will be useful in multi-level optical communication applications with enhanced flow of data in multi-mode fibers.

Submitted 12 February, 2021; originally announced February 2021.

Comments: Accepted for publication in Journal of Physics A: Mathematical and Theoretical as a Fast Track Communication

arXiv:2101.11359 [pdf]

An explainable Transformer-based deep learning model for the prediction of incident heart failure

Authors: Shishir Rao, Yikuan Li, Rema Ramakrishnan, Abdelaali Hassaine, Dexter Canoy, John Cleland, Thomas Lukasiewicz, Gholamreza Salimi-Khorshidi, Kazem Rahimi

Abstract: Predicting the incidence of complex chronic conditions such as heart failure is challenging. Deep learning models applied to rich electronic health records may improve prediction but remain unexplainable hampering their wider use in medical practice. We developed a novel Transformer deep-learning model for more accurate and yet explainable prediction of incident heart failure involving 100,071 patients from longitudinal linked electronic health records across the UK. On internal 5-fold cross validation and held-out external validation, our model achieved 0.93 and 0.93 area under the receiver operator curve and 0.69 and 0.70 area under the precision-recall curve, respectively and outperformed existing deep learning models. Predictor groups included all community and hospital diagnoses and medications contextualised within the age and calendar year for each patient's clinical encounter. The importance of contextualised medical information was revealed in a number of sensitivity analyses, and our perturbation method provided a way of identifying factors contributing to risk. Many of the identified risk factors were consistent with existing knowledge from clinical and epidemiological research but several new associations were revealed which had not been considered in expert-driven risk prediction models.

Submitted 27 January, 2021; originally announced January 2021.

arXiv:2012.15619 [pdf, other]

Machine Learning Modeling of Materials with a Group-Subgroup Structure

Authors: Prakriti Kayastha, Raghunathan Ramakrishnan

Abstract: Crystal structures connected by continuous phase transitions are linked through mathematical relations between crystallographic groups and their subgroups. In the present study, we introduce group-subgroup machine learning (GS-ML) and show that including materials with small unit cells in the training set decreases out-of-sample prediction errors for materials with large unit cells. GS-ML incurs the least training cost to reach 2-3% target accuracy compared to other ML approaches. Since available materials datasets are heterogeneous providing insufficient examples for realizing the group-subgroup structure, we present the "FriezeRMQ1D" dataset with 8393 Q1D organometallic materials uniformly distributed across 7 frieze groups. Furthermore, by comparing the performances of FCHL and 1-hot representations, we show GS-ML to capture subgroup information efficiently when the descriptor encodes structural information. The proposed approach is generic and extendable to symmetry abstractions such as spin-, valency-, or charge order.

Submitted 27 April, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

Comments: Minor revision

arXiv:2010.02635 [pdf, other]

Troubleshooting Unstable Molecules in Chemical Space

Authors: Salini Senthil, Sabyasachi Chakraborty, Raghunathan Ramakrishnan

Abstract: A key challenge in automated chemical compound space explorations is ensuring veracity in minimum energy geometries---to preserve intended bonding connectivities. We discuss an iterative high-throughput workflow for connectivity preserving geometry optimizations exploiting the nearness between quantum mechanical models. The methodology is benchmarked on the QM9 dataset comprising DFT-level properties of 133,885 small molecules; of which 3,054 have questionable geometric stability. We successfully troubleshoot 2,988 molecules and ensure a bijective mapping between desired Lewis formulae and final geometries. Our workflow, based on DFT and post-DFT methods, identifies 66 molecules as unstable; 52 contain $-{\rm NNO}-$, the rest are strained due to pyramidal sp$^2$ C. In the curated dataset, we inspect molecules with long CC bonds and identify ultralong contestants ($r>1.70$~Å) supported by topological analysis of electron density. We hope the proposed strategy will play a role in big data quantum chemistry initiatives.

Submitted 15 October, 2020; v1 submitted 6 October, 2020; originally announced October 2020.

arXiv:2009.12519 [pdf, other]

doi 10.1063/5.0041717

High-Throughput Design of Peierls and Charge Density Wave Phases in Q1D Organometallic Materials

Authors: Prakriti Kayastha, Raghunathan Ramakrishnan

Abstract: Soft-phonon modes of an undistorted phase encode a material's preference for symmetry lowering. However, the evidence is sparse for the relationship between an unstable phonon wavevector's reciprocal and the number of formula units in the stable distorted phase. This "1/q*-criterion" holds great potential for the first-principles design of materials, especially in low-dimension. We validate the approach on the Q1D materials space containing 1199 ring-metal units and identify candidates that are stable in undistorted (1 unit), Peierls (2 units), charge density wave (3-5 units), or long wave (>5 units) phases. We highlight materials exhibiting gap-opening as well as an uncommon gap-closing Peierls transition, and discuss an example case stabilized as a charge density wave insulator. We present the data generated for this study through an interactive publicly accessible Big Data analytics platform (http://moldis.tifrh.res.in/data/rmq1d) facilitating limitless and seamless data-mining explorations.

Submitted 22 January, 2021; v1 submitted 26 September, 2020; originally announced September 2020.

arXiv:2009.07426 [pdf, ps, other]

doi 10.1103/PhysRevE.102.042212

Nondegenerate Solitons and their Collisions in Manakov System

Authors: R. Ramakrishnan, S. Stalin, M. Lakshmanan

Abstract: Recently, we have shown that the Manakov equation can admit a more general class of nondegenerate vector solitons, which can undergo collision without any intensity redistribution in general among the modes, associated with distinct wave numbers, besides the already known energy exchanging solitons corresponding to identical wave numbers. In the present comprehensive paper, we discuss in detail the various special features of the reported nondegenerate vector solitons. To bring out these details, we derive the exact forms of such vector one-, two- and three-soliton solutions through Hirota bilinear method and they are rewritten in more compact forms using Gram determinants. The presence of distinct wave numbers allows the nondegenerate fundamental soliton to admit various profiles such as double-hump, flat-top and single-hump structures. We explain the formation of double-hump structure in the fundamental soliton when the relative velocity of the two modes tends to zero. More critical analysis shows that the nondegenerate fundamental solitons can undergo shape preserving as well as shape altering collisions under appropriate conditions. The shape changing collision occurs between the modes of nondegenerate solitons when the parameters are fixed suitably. Then we observe the coexistence of degenerate and nondegenerate solitons when the wave numbers are restricted appropriately in the obtained two-soliton solution. In such a situation we find the degenerate soliton induces shape changing behavior of nondegenerate soliton during the collision process.
By performing suitable asymptotic analysis we analyze the consequences that occur in each of the collision scenario. Finally we point out that the previously known class of energy exchanging vector bright solitons, with identical wave numbers, turns out to be a special case of the newly derived nondegenerate solitons.

Submitted 15 September, 2020; originally announced September 2020.

Comments: Accepted for publication in Phys. Rev. E (2020)

arXiv:2009.06814 [pdf, other]

Revving up 13C NMR shielding predictions across chemical space: Benchmarks for atoms-in-molecules kernel machine learning with new data for 134 kilo molecules

Authors: Amit Gupta, Sabyasachi Chakraborty, Raghunathan Ramakrishnan

Abstract: The requirement for accelerated and quantitatively accurate screening of nuclear magnetic resonance spectra across the small molecules chemical compound space is two-fold: (1) a robust `local' machine learning (ML) strategy capturing the effect of neighbourhood on an atom's `near-sighted' property -- chemical shielding; (2) an accurate reference dataset generated with a state-of-the-art first principles method for training. Herein we report the QM9-NMR dataset comprising isotropic shielding of over 0.8 million C atoms in 134k molecules of the QM9 dataset in gas and five common solvent phases. Using these data for training, we present benchmark results for the prediction transferability of kernel-ridge regression models with popular local descriptors. Our best model, trained on 100k samples, accurately predicts isotropic shielding of 50k `hold-out' atoms with a mean error of less than $1.9$ ppm. For rapid prediction of new query molecules, the models were trained on geometries from an inexpensive theory. Furthermore, by using a $Δ$-ML strategy, we quench the error below $1.4$ ppm. Finally, we test the transferability on non-trivial benchmark sets that include benchmark molecules comprising 10 to 17 heavy atoms and drugs.

Submitted 3 December, 2020; v1 submitted 14 September, 2020; originally announced September 2020.
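
As a rough illustration of the kernel-ridge regression and $Δ$-ML ideas mentioned in the abstract above, the sketch below learns only the difference between a cheap baseline and an expensive target, then adds the learned correction back to the baseline. Everything here is an assumption made for illustration: the one-dimensional input, the `baseline` and `target` functions, the Gaussian kernel width, and the regularization strength. It does not reproduce the paper's descriptors, data, or reported errors.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_krr(xs, ys, sigma, lam):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    n = len(xs)
    kernel = [
        [gaussian_kernel(xs[i], xs[j], sigma) + (lam if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return solve(kernel, ys)


def predict_krr(xs, alphas, sigma, x_new):
    """Prediction is a kernel-weighted sum over the training points."""
    return sum(a * gaussian_kernel(x, x_new, sigma) for x, a in zip(xs, alphas))


def baseline(x):
    """Hypothetical inexpensive model."""
    return math.sin(x)


def target(x):
    """Hypothetical expensive reference; differs from the baseline smoothly."""
    return math.sin(x) + 0.1 * x


SIGMA = 0.5  # assumed kernel width
LAM = 1e-8   # assumed regularization strength

train_x = [0.5 * i for i in range(11)]  # 0.0, 0.5, ..., 5.0
deltas = [target(x) - baseline(x) for x in train_x]
alphas = fit_krr(train_x, deltas, SIGMA, LAM)


def predict_delta_ml(x):
    """Delta-ML prediction: cheap baseline plus learned correction."""
    return baseline(x) + predict_krr(train_x, alphas, SIGMA, x)
```

The design choice worth noting is that the regression target is the correction `target - baseline`, which in this toy setup is smoother than the target itself; whether that holds for a given property and baseline has to be checked case by case.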

arXiv:2007.06436 [pdf, other]

doi 10.1063/5.0032713

Critical Benchmarking of the G4(MP2) Model, the Correlation Consistent Composite Approach and Popular Density Functional Approximations on a Probabilistically Pruned Benchmark Dataset of Formation Enthalpies

Authors: Sambit Kumar Das, Sabyasachi Chakraborty, Raghunathan Ramakrishnan

Abstract: First-principles calculation of the standard formation enthalpy, $ΔH_f^\circ$ (298K), in such large scale as required by chemical space explorations, is amenable only with density functional approximations (DFAs) and some composite wave function theories (cWFTs). Alas, the accuracies of popular range-separated hybrid, `rung-4' DFAs, and cWFTs that offer the best accuracy-vs.-cost trade-off have as yet been established only for datasets predominantly comprising small molecules, hence, their transferability to larger datasets remains vague. In this study, we present an extended benchmark dataset of over 1600 values of $ΔH_f^\circ$ for structurally and electronically diverse molecules. We apply quartile-ranking based on boundary-corrected kernel density estimation to filter outliers and arrive at Probabilistically Pruned Enthalpies of 1694 compounds (PPE1694). For this dataset, we rank the prediction accuracies of G4, G4(MP2), ccCA, CBS-QB3 and 23 popular DFAs using conventional and probabilistic error metrics. We discuss systematic prediction errors and highlight the role an empirical higher-level correction (HLC) plays in the G4(MP2) model. Furthermore, we comment on uncertainties associated with the reference empirical data for atoms and the systematic errors stemming from these that grow with the molecular size. We believe these findings will aid in identifying meaningful application domains for quantum thermochemical methods.

Submitted 28 December, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

Comments: Major Revision. Please ignore the previous versions

arXiv:2006.05442 [pdf, other]

Tensor train decompositions on recurrent networks

Authors: Alejandro Murua, Ramchalam Ramakrishnan, Xinlin Li, Rui Heng Yang, Vahid Partovi Nia

Abstract: Recurrent neural networks (RNN) such as long-short-term memory (LSTM) networks are essential in a multitude of daily life tasks such as speech, language, video, and multimodal learning. The shift from cloud to edge computation intensifies the need to contain the growth of RNN parameters. Current research on RNN shows that despite the performance obtained on convolutional neural networks (CNN), keeping a good performance in compressed RNNs is still a challenge. Most of the literature on compression focuses on CNNs using matrix product operator (MPO) tensor trains. However, matrix product state (MPS) tensor trains have more attractive features than MPOs, in terms of storage reduction and computing time at inference. We show that MPS tensor trains should be at the forefront of LSTM network compression through a theoretical analysis and practical experiments on an NLP task.

Submitted 9 June, 2020; originally announced June 2020.

arXiv:2004.09659 [pdf, other]

doi 10.1039/D0CP01396J

Quantum-chemistry-aided identification, synthesis and experimental validation of model systems for conformationally controlled reaction studies: Separation of the conformers of 2,3-dibromobuta-1,3-diene in the gas phase

Authors: Ardita Kilaj, Hong Gao, Diana Tahchieva, Raghunathan Ramakrishnan, Daniel Bachmann, Dennis Gillingham, O. Anatole von Lilienfeld, Jochen Küpper, Stefan Willitsch

Abstract: The Diels-Alder cycloaddition, in which a diene reacts with a dienophile to form a cyclic compound, counts among the most important tools in organic synthesis. Achieving a precise understanding of its mechanistic details on the quantum level requires new experimental and theoretical methods. Here, we present an experimental approach that separates different diene conformers in a molecular beam as a prerequisite for the investigation of their individual cycloaddition reaction kinetics and dynamics under single-collision conditions in the gas phase. A low- and high-level quantum-chemistry-based screening of more than one hundred dienes identified 2,3-dibromobutadiene (DBB) as an optimal candidate for efficient separation of its gauche and s-trans conformers by electrostatic deflection. A preparation method for DBB was developed which enabled the generation of dense molecular beams of this compound. The theoretical predictions of the molecular properties of DBB were validated by the successful separation of the conformers in the molecular beam. A marked difference in photofragment ion yields of the two conformers upon femtosecond-laser pulse ionization was observed, pointing at a pronounced conformer-specific fragmentation dynamics of ionized DBB. Our work sets the stage for a rigorous examination of mechanistic models of cycloaddition reactions under controlled conditions in the gas phase.

Submitted 20 April, 2020; originally announced April 2020.

Comments: 12 pages, 7 figures

arXiv:2003.10170 [pdf, other]

Deep Bayesian Gaussian Processes for Uncertainty Estimation in Electronic Health Records

Authors: Yikuan Li, Shishir Rao, Abdelaali Hassaine, Rema Ramakrishnan, Yajie Zhu, Dexter Canoy, Gholamreza Salimi-Khorshidi, Thomas Lukasiewicz, Kazem Rahimi

Abstract: One major impediment to the wider use of deep learning for clinical decision making is the difficulty of assigning a level of confidence to model predictions. Currently, deep Bayesian neural networks and sparse Gaussian processes are the main two scalable uncertainty estimation methods. However, deep Bayesian neural network suffers from lack of expressiveness, and more expressive models such as deep kernel learning, which is an extension of sparse Gaussian process, captures only the uncertainty from the higher level latent space. Therefore, the deep learning model under it lacks interpretability and ignores uncertainty from the raw data. In this paper, we merge features of the deep Bayesian learning framework with deep kernel learning to leverage the strengths of both methods for more comprehensive uncertainty estimation. Through a series of experiments on predicting the first incidence of heart failure, diabetes and depression applied to large-scale electronic medical records, we demonstrate that our method is better at capturing uncertainty than both Gaussian processes and deep Bayesian neural networks in terms of indicating data insufficiency and distinguishing true positive and false positive predictions, with a comparable generalisation performance. Furthermore, by assessing the accuracy and area under the receiver operating characteristic curve over the predictive probability, we show that our method is less susceptible to making overconfident predictions, especially for the minority class in imbalanced datasets. Finally, we demonstrate how uncertainty information derived by the model can inform risk factor analysis towards model interpretability.

Submitted 23 March, 2020; originally announced March 2020.

Comments: 21 pages

arXiv:1912.03985 [pdf, ps, other]

Nondegenerate soliton solutions in certain coupled nonlinear Schrödinger systems

Authors: S. Stalin, R. Ramakrishnan, M. Lakshmanan

Abstract: In this paper, we report a more general class of nondegenerate soliton solutions, associated with two distinct wave numbers in different modes, for a certain class of physically important integrable two component nonlinear Schrödinger type equations through bilinearization procedure. In particular, we consider coupled nonlinear Schrödinger (CNLS) equations (both focusing as well as mixed type nonlinearities), coherently coupled nonlinear Schrödinger (CCNLS) equations and long-wave-short-wave resonance interaction (LSRI) system. We point out that the obtained general form of soliton solutions exhibit novel profile structures than the previously known degenerate soliton solutions corresponding to identical wave numbers in both the modes. We show that such degenerate soliton solutions can be recovered from the newly derived nondegenerate soliton solutions as limiting cases.

Submitted 9 December, 2019; originally announced December 2019.

Comments: Accepted for Publication in Physics Letters A (2019)

arXiv:1911.10620 [pdf, other]

doi 10.1063/5.0009196

Charge-Transfer Selectivity and Quantum Interference in Real-Time Electron Dynamics: Gaining Insights from Time-Dependent Configuration Interaction Simulations

Authors: Raghunathan Ramakrishnan

Abstract: Many-electron wavepacket dynamics based on time-dependent configuration interaction (TDCI) is a numerically rigorous approach to quantitatively model electron-transfer across molecular junctions. TDCI simulations of cyanobenzene thiolates---para- and meta-linked to an acceptor gold atom---show donor states \emph{conjugating} with the benzene $π$-network to allow better through-molecule electron migration in the para isomer compared to the meta counterpart. For dynamics involving \emph{non-conjugating} states, we find electron-injection to stem exclusively from distance-dependent non-resonant quantum mechanical tunneling, in which case the meta isomer exhibits better dynamics. Computed trend in donor-to-acceptor net-electron transfer through differently linked azulene bridges agrees with the trend seen in low-bias conductivity measurements. Disruption of $π$-conjugation has been shown to be the cause of diminished electron-injection through the 1,3-azulene, a pathological case for graph-based diagnosis of destructive quantum interference. Furthermore, we demonstrate quantum interference of many-electron wavefunctions to drive para- vs. meta- selectivity in the coherent evolution of superposed $π$(CN)- and $σ$(NC-C)-type wavepackets. Analyses reveal that in the para-linked benzene, $σ$ and $π$ MOs localized at the donor terminal are \emph{in-phase} leading to constructive interference of electron density distribution while phase-flip of one of the MOs in the meta isomer results in destructive interference. These findings suggest that \emph{a priori} detection of orbital phase-flip and quantum coherence conditions can aid in molecular device design strategies.

Submitted 26 March, 2020; v1 submitted 24 November, 2019; originally announced November 2019.

arXiv:1911.00231 [pdf, other]

Extending Relational Query Processing with ML Inference

Authors: Konstantinos Karanasos, Matteo Interlandi, Doris Xin, Fotis Psallidas, Rathijit Sen, Kwanghyun Park, Ivan Popivanov, Supun Nakandal, Subru Krishnan, Markus Weimer, Yuan Yu, Raghu Ramakrishnan, Carlo Curino

Abstract: The broadening adoption of machine learning in the enterprise is increasing the pressure for strict governance and cost-effective performance, in particular for the common and consequential steps of model storage and inference. The RDBMS provides a natural starting point, given its mature infrastructure for fast data access and processing, along with support for enterprise features (e.g., encryption, auditing, high-availability). To take advantage of all of the above, we need to address a key concern: Can in-RDBMS scoring of ML models match (outperform?) the performance of dedicated frameworks? We answer the above positively by building Raven, a system that leverages native integration of ML runtimes (i.e., ONNX Runtime) deep within SQL Server, and a unified intermediate representation (IR) to enable advanced cross-optimizations between ML and DB operators. In this optimization space, we discover the most exciting research opportunities that combine DB/Compiler/ML thinking. Our initial evaluation on real data demonstrates performance gains of up to 5.5x from the native integration of ML in SQL Server, and up to 24x from cross-optimizations--we will demonstrate Raven live during the conference talk.

Submitted 1 November, 2019; originally announced November 2019.

arXiv:1909.08234 [pdf, other]

doi 10.4204/EPTCS.306.14

Value of Information in Probabilistic Logic Programs

Authors: Sarthak Ghosh, C. R. Ramakrishnan

Abstract: In medical decision making, we have to choose among several expensive diagnostic tests such that the certainty about a patient's health is maximized while remaining within the bounds of resources like time and money. The expected increase in certainty in the patient's condition due to performing a test is called the value of information (VoI) for that test. In general, VoI relates to acquiring additional information to improve decision-making based on probabilistic reasoning in an uncertain system. This paper presents a framework for acquiring information based on VoI in uncertain systems modeled as Probabilistic Logic Programs (PLPs). Optimal decision-making in uncertain systems modeled as PLPs have already been studied before. But, acquiring additional information to further improve the results of making the optimal decision has remained open in this context. We model decision-making in an uncertain system with a PLP and a set of top-level queries, with a set of utility measures over the distributions of these queries. The PLP is annotated with a set of atoms labeled as "observable"; in the medical diagnosis example, the observable atoms will be results of diagnostic tests. Each observable atom has an associated cost. This setting of optimally selecting observations based on VoI is more general than that considered by any prior work. Given a limited budget, optimally choosing observable atoms based on VoI is intractable in general. We give a greedy algorithm for constructing a "conditional plan" of observations: a schedule where the selection of what atom to observe next depends on earlier observations. We show that, preempting the algorithm anytime before completion provides a usable result, the result improves over time, and, in the absence of a well-defined budget, converges to the optimal solution.

Submitted 18 September, 2019; originally announced September 2019.

Comments: In Proceedings ICLP 2019, arXiv:1909.07646

ACM Class: I.2.4

Journal ref: EPTCS 306, 2019, pp. 71-84

arXiv:1909.04567 [pdf, other]

Differentiable Mask for Pruning Convolutional and Recurrent Networks

Authors: Ramchalam Kinattinkara Ramakrishnan, Eyyüb Sari, Vahid Partovi Nia

Abstract: Pruning is one of the most effective model reduction techniques. Deep networks require massive computation and such models need to be compressed to bring them on edge devices. Most existing pruning techniques are focused on vision-based models like convolutional networks, while text-based models are still evolving. The emergence of multi-modal multi-task learning calls for a general method that works on vision and text architectures simultaneously. We introduce a \emph{differentiable mask}, that induces sparsity on various granularity to fill this gap. We apply our method successfully to prune weights, filters, subnetwork of a convolutional architecture, as well as nodes of a recurrent network.

Submitted 29 April, 2020; v1 submitted 10 September, 2019; originally announced September 2019.

arXiv:1909.00084 [pdf, other]

Cloudy with high chance of DBMS: A 10-year prediction for Enterprise-Grade ML

Authors: Ashvin Agrawal, Rony Chatterjee, Carlo Curino, Avrilia Floratou, Neha Gowdal, Matteo Interlandi, Alekh Jindal, Konstantinos Karanasos, Subru Krishnan, Brian Kroth, Jyoti Leeka, Kwanghyun Park, Hiren Patel, Olga Poppe, Fotis Psallidas, Raghu Ramakrishnan, Abhishek Roy, Karla Saur, Rathijit Sen, Markus Weimer, Travis Wright, Yiwen Zhu

Abstract: Machine learning (ML) has proven itself in high-value web applications such as search ranking and is emerging as a powerful tool in a much broader range of enterprise scenarios including voice recognition and conversational understanding for customer support, autotuning for videoconferencing, intelligent feedback loops in large-scale sysops, manufacturing and autonomous vehicle management, complex financial predictions, just to name a few. Meanwhile, as the value of data is increasingly recognized and monetized, concerns about securing valuable data and risks to individual privacy have been growing. Consequently, rigorous data management has emerged as a key requirement in enterprise settings. How will these trends (ML growing popularity, and stricter data governance) intersect? What are the unmet requirements for applying ML in enterprise settings? What are the technical challenges for the DB community to solve? In this paper, we present our vision of how ML and database systems are likely to come together, and early steps we take towards making this vision a reality.

Submitted 27 December, 2019; v1 submitted 30 August, 2019; originally announced September 2019.

arXiv:1904.00775 [pdf, other]

Deep Demosaicing for Edge Implementation

Authors: Ramchalam Kinattinkara Ramakrishnan, Shangling Jui, Vahid Partovi Nia

Abstract: Most digital cameras use sensors coated with a Color Filter Array (CFA) to capture channel components at every pixel location, resulting in a mosaic image that does not contain pixel values in all channels. Current research on reconstructing these missing channels, also known as demosaicing, introduces many artifacts, such as zipper effect and false color. Many deep learning demosaicing techniques outperform other classical techniques in reducing the impact of artifacts. However, most of these models tend to be over-parametrized. Consequently, edge implementation of the state-of-the-art deep learning-based demosaicing algorithms on low-end edge devices is a major challenge. We provide an exhaustive search of deep neural network architectures and obtain a pareto front of Color Peak Signal to Noise Ratio (CPSNR) as the performance criterion versus the number of parameters as the model complexity that beats the state-of-the-art. Architectures on the pareto front can then be used to choose the best architecture for a variety of resource constraints. Simple architecture search methods such as exhaustive search and grid search require some conditions of the loss function to converge to the optimum. We clarify these conditions in a brief theoretical study.

Submitted 23 May, 2019; v1 submitted 26 March, 2019; originally announced April 2019.

Comments: Accepted in the 16th International Conference of Image Analysis and Recognition (ICIAR 2019)

arXiv:1901.00649 [pdf, other]

doi 10.1063/1.5088083

The Chemical Space of B, N-substituted Polycyclic Aromatic Hydrocarbons: Combinatorial Enumeration and High-Throughput First-Principles Modeling

Authors: Sabyasachi Chakraborty, Prakriti Kayastha, Raghunathan Ramakrishnan

Abstract: Combinatorial introduction of heteroatoms in the two-dimensional framework of aromatic hydrocarbons opens up possibilities to design compound libraries exhibiting desirable photovoltaic and photochemical properties. Exhaustive enumeration and first-principles characterization of this chemical space provide indispensable insights for rational compound design strategies. Here, for the smallest seventy-seven Kekulean-benzenoid polycyclic systems, we reveal combinatorial substitution of C atom pairs with the isosteric and isoelectronic B, N pairs to result in 7,453,041,547,842 (7.4 tera) unique molecules. We present comprehensive frequency distributions of this chemical space, analyze trends and discuss a symmetry-controlled selectivity manifestable in synthesis product-yield. Furthermore, by performing high-throughput ab initio density functional theory calculations of over thirty-three thousand (33k) representative molecules, we discuss quantitative trends in the structural stability and inter-property relationships across heteroarenes. Our results indicate a significant fraction of the 33k molecules to be electronically active in the 1.5-2.5 eV region, encompassing the most intense region of the solar spectrum, indicating their suitability as potential light-harvesting molecular components in photo-catalyzed solar cells.

Submitted 22 February, 2019; v1 submitted 3 January, 2019; originally announced January 2019.

arXiv:1810.01331 [pdf, ps, other]

doi 10.1103/PhysRevLett.122.043901

Nondegenerate solitons in Manakov system

Authors: S. Stalin, R. Ramakrishnan, M. Senthilvelan, M. Lakshmanan

Abstract: It is known that Manakov equation which describes wave propagation in two mode optical fibers, photorefractive materials, etc. can admit solitons which allow energy redistribution between the modes on collision that also leads to logical computing. In this paper, we point out that Manakov system can admit more general type of nondegenerate fundamental solitons corresponding to different wave numbers, which undergo collisions without any energy redistribution. The previously known class of solitons which allows energy redistribution among the modes turns out to be a special case corresponding to solitary waves with identical wave numbers in both the modes and travelling with the same velocity. We trace out the reason behind such a possibility and analyze the physical consequences.

Submitted 8 January, 2019; v1 submitted 2 October, 2018; originally announced October 2018.

Comments: Slightly modified version: Accepted for Publication in Phys. Rev. Letters (2019)

Journal ref: Phys. Rev. Lett. 122, 043901 (2019)

arXiv:1806.07834 [pdf, other]

A Look at Motion Planning for Autonomous Vehicles at an Intersection

Authors: Shravan Krishnan, Govind Aadithya R, Rahul Ramakrishnan, Vijay Arvindh, Sivanathan K

Abstract: Autonomous Vehicles are currently being tested in a variety of scenarios. As we move towards Autonomous Vehicles, how should intersections look? To answer that question, we break down an intersection management into the different conundrums and scenarios involved in the trajectory planning and current approaches to solve them. Then, a brief analysis of current works in autonomous intersection is conducted. With a critical eye, we try to delve into the discrepancies of existing solutions while presenting some critical and important factors that have been addressed. Furthermore, open issues that have to be addressed are also emphasized. We also try to answer the question of how to benchmark intersection management algorithms by providing some factors that impact autonomous navigation at intersection.

Submitted 7 September, 2018; v1 submitted 20 June, 2018; originally announced June 2018.

Comments: Accepted for presentation at ITSC 2018, Final Version

arXiv:1805.08966 [pdf, other]

Discovering Blind Spots in Reinforcement Learning

Authors: Ramya Ramakrishnan, Ece Kamar, Debadeepta Dey, Julie Shah, Eric Horvitz

Abstract: Agents trained in simulation may make errors in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult to discover because the agent cannot predict them a priori. We propose using oracle feedback to learn a predictive model of these blind spots to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: The agent does not have the appropriate features to represent the true state of the world and thus cannot distinguish among numerous states. We formalize the problem of discovering blind spots in RL as a noisy supervised learning problem with class imbalance. We learn models to predict blind spots in unseen regions of the state space by combining techniques for label aggregation, calibration, and supervised learning. The models take into consideration noise emerging from different forms of oracle feedback, including demonstrations and corrections. We evaluate our approach on two domains and show that it achieves higher predictive performance than baseline methods, and that the learned model can be used to selectively query an oracle at execution time to prevent errors. We also empirically analyze the biases of various feedback types and how they influence the discovery of blind spots.

Submitted 23 May, 2018; originally announced May 2018.

Comments: To appear at AAMAS 2018

arXiv:1804.10237 [pdf, ps, other]

Constraint-Based Inference in Probabilistic Logic Programs

Authors: Arun Nampally, Timothy Zhang, C. R. Ramakrishnan

Abstract: Probabilistic Logic Programs (PLPs) generalize traditional logic programs and allow the encoding of models combining logical structure and uncertainty. In PLP, inference is performed by summarizing the possible worlds which entail the query in a suitable data structure, and using it to compute the answer probability. Systems such as ProbLog, PITA, etc., use propositional data structures like explanation graphs, BDDs, SDDs, etc., to represent the possible worlds. While this approach saves inference time due to substructure sharing, there are a number of problems where a more compact data structure is possible. We propose a data structure called Ordered Symbolic Derivation Diagram (OSDD) which captures the possible worlds by means of constraint formulas. We describe a program transformation technique to construct OSDDs via query evaluation, and give procedures to perform exact and approximate inference over OSDDs. Our approach has two key properties. Firstly, the exact inference procedure is a generalization of traditional inference, and results in speedup over the latter in certain settings. Secondly, the approximate technique is a generalization of likelihood weighting in Bayesian Networks, and allows us to perform sampling-based inference with lower rejection rate and variance. We evaluate the effectiveness of the proposed techniques through experiments on several problems. This paper is under consideration for acceptance in TPLP.

Submitted 26 April, 2018; originally announced April 2018.

Comments: Paper presented at the 34th International Conference on Logic Programming (ICLP 2018), Oxford, UK, July 14 to July 17, 2018. 18 pages, LaTeX, 5 PDF figures (arXiv:YYMM.NNNNN)

arXiv:1802.06033 [pdf, other]

Torsional potentials of glyoxal, oxalyl halides and their thiocarbonyl derivatives: Challenges for popular density functional approximations

Authors: D. Tahchieva, D. Bakowies, R. Ramakrishnan, O. A. von Lilienfeld

Abstract: The reliability of popular density functionals was studied for the description of torsional profiles of 36 molecules: glyoxal, oxalyl halides and their thiocarbonyl derivatives. HF and eighteen functionals of varying complexity, from local density to range-separated hybrid approximations and double-hybrid, have been considered and benchmarked against CCSD(T)-level rotational profiles. For molecules containing heavy halogens, all functionals except M05-2X and M06-2X fail to reproduce barrier heights accurately and a number of functionals introduce spurious minima. Dispersion corrections show no improvement. Calibrated torsion-corrected atom-centered potentials rectify the shortcomings of PBE and also improve on $σ$-hole based intermolecular binding in dimers and crystals.

Submitted 8 December, 2020; v1 submitted 16 February, 2018; originally announced February 2018.

arXiv:1802.00873 [pdf, other]

Machine Learning Modeling of Wigner Intracule Functionals for Two Electrons in One Dimension

Authors: Rutvij Vihang Bhavsar, Raghunathan Ramakrishnan

Abstract: In principle, many-electron correlation energy can be precisely computed from a reduced Wigner distribution function ($\mathcal{W}$) thanks to a universal functional transformation ($\mathcal{F}$), whose formal existence is akin to that of the exchange-correlation functional in density functional theory. While the exact dependence of $\mathcal{F}$ on $\mathcal{W}$ is unknown, a few approximate parametric models have been proposed in the past. Here, for a dataset of 923 one-dimensional external potentials with two interacting electrons, we apply machine learning to model $\mathcal{F}$ within the kernel Ansatz. We deal with over-fitting of the kernel to a specific region of phase-space by a one-step regularization not depending on any hyperparameters. Reference correlation energies have been computed by performing exact and Hartree--Fock calculations using discrete variable representation. The resulting models require $\mathcal{W}$ calculated at the Hartree--Fock level as input while yielding monotonous decay in the predicted correlation energies of new molecules reaching sub-chemical accuracy with training.

Submitted 21 January, 2019; v1 submitted 2 February, 2018; originally announced February 2018.

arXiv:1801.07286 [pdf, other]

doi 10.1016/j.cplett.2019.02.004

Exact separation of radial and angular correlation energies in two-electron atoms

Authors: Anjana R Kammath, Raghunathan Ramakrishnan

Abstract: Partitioning of helium atom's correlation energy into radial and angular contributions, although of fundamental interest, has eluded critical scrutiny. Conventionally, radial and angular correlation energies of helium atom are defined for its ground state as deviations, from Hartree--Fock and exact values, of the energy obtained using a purely radial wavefunction devoid of any explicit dependence on the interelectronic distance. Here, we show this rationale to associate the contribution from radial-angular coupling entirely to the angular part underestimating the radial one, thereby also incorrectly predict non-vanishing residual radial probability densities. We derive analytic matrix elements for the high-precision Hylleraas basis set framework to seamlessly uncouple the angular correlation energy from its radial counterpart. The resulting formula agrees with numerical cubature yielding precise purely angular correlation energies for the ground as well as excited states. Our calculations indicate 60.2% of helium's correlation energy to arise from strictly radial interactions; when excluding the contribution from the radial-angular coupling, this value drops to 41.3%.

Submitted 1 February, 2019; v1 submitted 22 January, 2018; originally announced January 2018.

arXiv:1612.03505 [pdf, ps, other]

doi 10.1109/ICASSP.2017.7952638

Convolutional Neural Networks for Passive Monitoring of a Shallow Water Environment using a Single Sensor

Authors: Eric L. Ferguson, Rishi Ramakrishnan, Stefan B. Williams, Craig T. Jin

Abstract: A cost effective approach to remote monitoring of protected areas such as marine reserves and restricted naval waters is to use passive sonar to detect, classify, localize, and track marine vessel activity (including small boats and autonomous underwater vehicles). Cepstral analysis of underwater acoustic data enables the time delay between the direct path arrival and the first multipath arrival to be measured, which in turn enables estimation of the instantaneous range of the source (a small boat). However, this conventional method is limited to ranges where the Lloyd's mirror effect (interference pattern formed between the direct and first multipath arrivals) is discernible. This paper proposes the use of convolutional neural networks (CNNs) for the joint detection and ranging of broadband acoustic noise sources such as marine vessels in conjunction with a data augmentation approach for improving network performance in varied signal-to-noise ratio (SNR) situations. Performance is compared with a conventional passive sonar ranging method for monitoring marine vessel activity using real data from a single hydrophone mounted above the sea floor. It is shown that CNNs operating on cepstrum data are able to detect the presence and estimate the range of transiting vessels at greater distances than the conventional method.

Submitted 11 December, 2016; originally announced December 2016.

Comments: Final draft for IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2017. 5 pages, 4 figures

arXiv:1611.09007 [pdf, other]

Hyperspectral CNN Classification with Limited Training Samples

Authors: Lloyd Windrim, Rishi Ramakrishnan, Arman Melkumyan, Richard Murphy

Abstract: Hyperspectral imaging sensors are becoming increasingly popular in robotics applications such as agriculture and mining, and allow per-pixel thematic classification of materials in a scene based on their unique spectral signatures. Recently, convolutional neural networks have shown remarkable performance for classification tasks, but require substantial amounts of labelled training data. This data must sufficiently cover the variability expected to be encountered in the environment. For hyperspectral data, one of the main variations encountered outdoors is due to incident illumination, which can change in spectral shape and intensity depending on the scene geometry. For example, regions occluded from the sun have a lower intensity and their incident irradiance is skewed towards shorter wavelengths. In this work, a data augmentation strategy based on relighting is used during training of a hyperspectral convolutional neural network. It allows training to occur in the outdoor environment given only a small labelled region, which does not need to sufficiently represent the geometric variability of the entire scene. This is important for applications where obtaining large amounts of training data is laborious, hazardous or difficult, such as labelling pixels within shadows. Radiometric normalisation approaches for pre-processing the hyperspectral data are analysed and it is shown that methods based on the raw pixel data are sufficient to be used as input for the classifier.
This removes the need for external hardware such as calibration boards, which can restrict the application of hyperspectral sensors in robotics applications. Experiments to evaluate the classification system are carried out on two datasets captured from a field-based platform.

Submitted 28 November, 2016; originally announced November 2016.

Comments: 10 pages, 6 figures

arXiv:1611.07435 [pdf, other]

Genetic optimization of training sets for improved machine learning models of molecular properties

Authors: Nicholas J. Browning, Raghunathan Ramakrishnan, O. Anatole von Lilienfeld, Ursula Röthlisberger

Abstract: The training of molecular models of quantum mechanical properties based on statistical machine learning requires large datasets which exemplify the map from chemical structure to molecular property. Intelligent a priori selection of training examples is often difficult or impossible to achieve as prior knowledge may be sparse or unavailable. Ordinarily, representative selection of training molecules from such datasets is achieved through random sampling. We use genetic algorithms for the optimization of training set composition consisting of tens of thousands of small organic molecules. The resulting machine learning models are considerably more accurate with respect to small randomly selected training sets: mean absolute errors for out-of-sample predictions are reduced to ~25% for enthalpies, free energies, and zero-point vibrational energy, to ~50% for heat-capacity, electron-spread, and polarizability, and by more than ~20% for electronic properties such as frontier orbital eigenvalues or dipole-moments. We discuss and present optimized training sets consisting of 10 molecular classes for all molecular properties studied. We show that these classes can be used to design improved training sets for the generation of machine learning models of the same properties in similar but unrelated molecular sets.

Submitted 24 November, 2016; v1 submitted 22 November, 2016; originally announced November 2016.

Comments: 9 pages, 6 figures

arXiv:1608.05763 [pdf, ps, other]

Inference in Probabilistic Logic Programs using Lifted Explanations

Authors: Arun Nampally, C. R. Ramakrishnan

Abstract: In this paper, we consider the problem of lifted inference in the context of Prism-like probabilistic logic programming languages. Traditional inference in such languages involves the construction of an explanation graph for the query and computing probabilities over this graph. When evaluating queries over probabilistic logic programs with a large number of instances of random variables, traditional methods treat each instance separately. For many programs and queries, we observe that explanations can be summarized into substantially more compact structures, which we call lifted explanation graphs. In this paper, we define lifted explanation graphs and operations over them. In contrast to existing lifted inference techniques, our method for constructing lifted explanations naturally generalizes existing methods for constructing explanation graphs. To compute the probability of query answers, we solve recurrences generated from the lifted graphs. We show examples where the use of our technique reduces the asymptotic complexity of inference.

Submitted 19 August, 2016; originally announced August 2016.

arXiv:1604.06118 [pdf, other]

XPL: An extended probabilistic logic for probabilistic transition systems

Authors: Andrey Gorlin, C. R. Ramakrishnan

Abstract: Generalized Probabilistic Logic (GPL) is a temporal logic, based on the modal mu-calculus, for specifying properties of reactive probabilistic systems. We explore XPL, an extension to GPL allowing the semantics of nondeterminism present in Markov decision processes (MDPs). XPL is expressive enough that a number of independently studied problems--- such as termination of Recursive MDPs (RMDPs), PCTL* model checking of MDPs, and reachability for Branching MDPs--- can all be cast as model checking over XPL. Termination of multi-exit RMDPs is undecidable; thus, model checking in XPL is undecidable in general. We define a subclass, called separable XPL, for which model checking is decidable. Decidable problems such as termination of 1-exit RMDPs, PCTL* model checking of MDPs, and reachability for Branching MDPs can be reduced to model checking separable XPL. Thus, XPL forms a uniform framework for studying problems involving systems with non-deterministic and probabilistic behaviors, while separable XPL provides a way to solve decidable fragments of these problems.

Submitted 9 May, 2017; v1 submitted 20 April, 2016; originally announced April 2016.

ACM Class: I.2.3

arXiv:1510.07512 [pdf, other]

Machine Learning, Quantum Mechanics, and Chemical Compound Space

Authors: Raghunathan Ramakrishnan, O. Anatole von Lilienfeld

Abstract: We review recent studies dealing with the generation of machine learning models of molecular and solid properties. The models are trained and validated using standard quantum chemistry results obtained for organic molecules and materials selected from chemical space at random.

Submitted 12 May, 2016; v1 submitted 26 October, 2015; originally announced October 2015.

arXiv:1509.08439 [pdf, other]

Hyper-Fisher Vectors for Action Recognition

Authors: Sanath Narayan, Kalpathi R. Ramakrishnan

Abstract: In this paper, a novel encoding scheme combining Fisher vector and bag-of-words encodings has been proposed for recognizing action in videos. The proposed Hyper-Fisher vector encoding is the sum of local Fisher vectors which are computed based on the traditional Bag-of-Words (BoW) encoding. Thus, the proposed encoding is a simple yet effective representation compared with the traditional Fisher Vector encoding. By extensive evaluation on challenging action recognition datasets, viz., Youtube, Olympic Sports, UCF50 and HMDB51, we show that the proposed Hyper-Fisher Vector encoding improves the recognition performance by around 2-3% compared to the improved Fisher Vector encoding. We also perform experiments to show that the performance of the Hyper-Fisher Vector is robust to the dictionary size of the BoW encoding.

Submitted 28 September, 2015; originally announced September 2015.

arXiv:1509.02847 [pdf, other]

doi 10.1063/1.4947217

Fast and accurate predictions of covalent bonds in chemical space

Authors: K. Y. Samuel Chang, Stijn Fias, Raghunathan Ramakrishnan, O. Anatole von Lilienfeld

Abstract: We assess the predictive accuracy of perturbation theory based estimates of changes in covalent bonding due to linear alchemical interpolations among molecules. We have investigated $σ$ bonding to hydrogen, as well as $σ$ and $π$ bonding between main-group elements, occurring in small sets of iso-valence-electronic molecular species with elements drawn from second to fourth rows in the $p$-block of the periodic table. Numerical evidence suggests that first order estimates of covalent bonding potentials can achieve chemical accuracy if (i) the alchemical interpolation is vertical (fixed geometry), (ii) involves molecules containing elements in the third and fourth row of the periodic table, and (iii) a reference geometry is optimized. In this case, changes in the bonding potential become near-linear in the coupling parameter, resulting in analytical predictions with very high accuracy ($\sim$1 kcal/mol). Second order estimates deteriorate the prediction. If initial and final molecules differ not only in composition but also in geometry, all estimates become substantially worse, with second order being slightly more accurate than first order. The independent particle approximation to the second order perturbation performs poorly when compared to the coupled perturbed or finite difference approach. Taylor series expansions up to fourth order of the potential energy curve of highly symmetric systems indicate a finite radius of convergence, as illustrated for the alchemical stretching of H$_2^+$.
Numerical results are presented for (i) covalent bonds to hydrogen in 12 molecules with 8 valence electrons; (ii) main-group single bonds in 9 molecules with 14 valence electrons; (iii) main-group double bonds in 9 molecules with 12 valence electrons; (iv) main-group triple bonds in 9 molecules with 10 valence electrons; (v) H$_2^+$ single bond with 1 electron.

Submitted 13 January, 2016; v1 submitted 9 September, 2015; originally announced September 2015.

arXiv:1505.00350 [pdf, other]

doi 10.1021/acs.jpclett.5b01456

Machine Learning for Quantum Mechanical Properties of Atoms in Molecules

Authors: Matthias Rupp, Raghunathan Ramakrishnan, O. Anatole von Lilienfeld

Abstract: We introduce machine learning models of quantum mechanical observables of atoms in molecules. Instant out-of-sample predictions for proton and carbon nuclear chemical shifts, atomic core level excitations, and forces on atoms reach accuracies on par with density functional theory reference. Locality is exploited within non-linear regression via local atom-centered coordinate systems. The approach is validated on a diverse set of 9k small organic molecules. Linear scaling of computational cost in system size is demonstrated for saturated polymers with up to sub-mesoscale lengths.

Submitted 25 August, 2015; v1 submitted 2 May, 2015; originally announced May 2015.

Journal ref: Journal of Physical Chemistry Letters 6(16): 3309-3313, 2015

arXiv:1504.01966 [pdf, other]

doi 10.1063/1.4928757

Electronic Spectra from TDDFT and Machine Learning in Chemical Space

Authors: Raghunathan Ramakrishnan, Mia Hartmann, Enrico Tapavicza, O. Anatole von Lilienfeld

Abstract: Due to its favorable computational efficiency, time-dependent (TD) density functional theory (DFT) enables the prediction of electronic spectra in a high-throughput manner across chemical space. Its predictions, however, can be quite inaccurate. We resolve this issue with machine learning models trained on deviations of reference second-order approximate coupled-cluster singles and doubles (CC2) spectra from TDDFT counterparts, or even from DFT gap. We applied this approach to low-lying singlet-singlet vertical electronic spectra of over 20 thousand synthetically feasible small organic molecules with up to eight CONF atoms. The prediction errors decay monotonically as a function of training set size. For a training set of 10 thousand molecules, CC2 excitation energies can be reproduced to within $\pm$0.1 eV for the remaining molecules. Analysis of our spectral database via chromophore counting suggests that even higher accuracies can be achieved. Based on the evidence collected, we discuss open challenges associated with data-driven modeling of high-lying spectra, and transition intensities.

Submitted 4 July, 2015; v1 submitted 8 April, 2015; originally announced April 2015.

arXiv:1503.04987 [pdf, other]

doi 10.1021/acs.jctc.5b00099

Big Data meets Quantum Chemistry Approximations: The $Δ$-Machine Learning Approach

Authors: Raghunathan Ramakrishnan, Pavlo O. Dral, Matthias Rupp, O. Anatole von Lilienfeld

Abstract: Chemically accurate and comprehensive studies of the virtual space of all possible molecules are severely limited by the computational cost of quantum chemistry. We introduce a composite strategy that adds machine learning corrections to computationally inexpensive approximate legacy quantum methods. After training, highly accurate predictions of enthalpies, free energies, entropies, and electron correlation energies are possible, for significantly larger molecular sets than used for training. For thermochemical properties of up to 16k constitutional isomers of C$_7$H$_{10}$O$_2$ we present numerical evidence that chemical accuracy can be reached. We also predict electron correlation energy in post Hartree-Fock methods, at the computational cost of Hartree-Fock, and we establish a qualitative relationship between molecular entropy and electron correlation. The transferability of our approach is demonstrated, using semi-empirical quantum chemistry and machine learning models trained on 1 and 10\% of 134k organic molecules, to reproduce enthalpies of all remaining molecules at density functional theory level of accuracy.

Submitted 17 March, 2015; originally announced March 2015.
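
The composite strategy in the abstract above (a cheap baseline method plus a learned correction) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function name `fit_delta_model`, the single scalar descriptor `xs`, and the straight-line least-squares fit, which merely stands in for whatever regression model the paper actually uses.

```python
def fit_delta_model(xs, baseline, target):
    """Toy Delta-learning sketch: fit a line to the residuals target - baseline.

    xs       : one scalar descriptor per molecule (hypothetical feature)
    baseline : property values from the inexpensive method
    target   : property values from the expensive reference method
    Returns a function predict(x, b) = b + learned_correction(x).
    """
    residuals = [t - b for t, b in zip(target, baseline)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = sum(residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, residuals)) / sxx
    intercept = mean_r - slope * mean_x

    def predict(x, b):
        # Baseline value plus the learned correction for descriptor x.
        return b + slope * x + intercept

    return predict


# Synthetic data: the "reference" differs from the baseline by 0.5*x + 0.1.
xs = [0.0, 1.0, 2.0, 3.0]
baseline = [1.0, 2.0, 3.0, 4.0]
target = [b + 0.5 * x + 0.1 for x, b in zip(xs, baseline)]
predict = fit_delta_model(xs, baseline, target)
```

The point of the design is that the model only has to learn the difference between the two levels of theory, so a new prediction still requires one baseline calculation for the query molecule.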

arXiv:1502.04563 [pdf, other]

Many Molecular Properties from One Kernel in Chemical Space

Authors: Raghunathan Ramakrishnan, O. Anatole von Lilienfeld

Abstract: We introduce property-independent kernels for machine learning modeling of arbitrarily many molecular properties. The kernels encode molecular structures for training sets of varying size, as well as similarity measures sufficiently diffuse in chemical space to sample over all training molecules. Corresponding molecular reference properties provided, they enable the instantaneous generation of ML models which can systematically be improved through the addition of more data. This idea is exemplified for single kernel based modeling of internal energy, enthalpy, free energy, heat capacity, polarizability, electronic spread, zero-point vibrational energy, energies of frontier orbitals, HOMO-LUMO gap, and the highest fundamental vibrational wavenumber. Models of these properties are trained and tested using 112 kilo organic molecules of similar size. Resulting models are discussed as well as the kernels' use for generating and using other property models.

Submitted 17 March, 2015; v1 submitted 16 February, 2015; originally announced February 2015.

arXiv:1501.03879 [pdf, other]

A new ADMM algorithm for the Euclidean median and its application to robust patch regression

Authors: Kunal N. Chaudhury, K. R. Ramakrishnan

Abstract: The Euclidean Median (EM) of a set of points $Ω$ in a Euclidean space is the point x minimizing the (weighted) sum of the Euclidean distances of x to the points in $Ω$. While there exists no closed-form expression for the EM, it can nevertheless be computed using iterative methods such as the Weiszfeld algorithm. The EM has classically been used as a robust estimator of centrality for multivariate data. It was recently demonstrated that the EM can be used to perform robust patch-based denoising of images by generalizing the popular Non-Local Means algorithm. In this paper, we propose a novel algorithm for computing the EM (and its box-constrained counterpart) using variable splitting and the method of augmented Lagrangian. The attractive feature of this approach is that the subproblems involved in the ADMM-based optimization of the augmented Lagrangian can be resolved using simple closed-form projections. The proposed ADMM solver is used for robust patch-based image denoising and is shown to exhibit faster convergence compared to an existing solver.

Submitted 16 January, 2015; originally announced January 2015.

Comments: 5 pages, 3 figures, 1 table. To appear in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, April 19-24, 2015
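
For readers unfamiliar with the Euclidean median defined in the abstract above, the following is a minimal sketch of the classical Weiszfeld-style fixed-point iteration that the abstract mentions as an existing method. It is not the ADMM solver proposed in the paper. The function name, the defaults, and the `eps` clamp (a crude guard for the case where an iterate lands on a data point) are assumptions made for this illustration.

```python
import math


def euclidean_median(points, weights=None, iters=500, tol=1e-10, eps=1e-12):
    """Approximate the weighted Euclidean median by Weiszfeld-style iteration."""
    dim = len(points[0])
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    # Start from the weighted centroid.
    x = [sum(w * p[k] for w, p in zip(weights, points)) / total for k in range(dim)]
    for _ in range(iters):
        num = [0.0] * dim
        den = 0.0
        for w, p in zip(weights, points):
            # Clamp the distance so a coincident data point does not divide by zero.
            d = max(math.dist(x, p), eps)
            for k in range(dim):
                num[k] += w * p[k] / d
            den += w / d
        new_x = [v / den for v in num]
        if math.dist(new_x, x) < tol:
            return new_x
        x = new_x
    return x


# Three copies of the origin outvote a single distant point, unlike the mean.
robust = euclidean_median([(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (10.0, 0.0)])
center = euclidean_median([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
```

In the first example the centroid is (2.5, 0) while the iteration converges towards the origin, which illustrates the robustness to outliers that the abstract refers to.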

arXiv:1405.6341 [pdf, other]

doi 10.1145/2696454.2696455

Efficient Model Learning for Human-Robot Collaborative Tasks

Authors: Stefanos Nikolaidis, Keren Gu, Ramya Ramakrishnan, Julie Shah

Abstract: We present a framework for learning human user models from joint-action demonstrations that enables the robot to compute a robust policy for a collaborative task with a human. The learning takes place completely automatically, without any human intervention. First, we describe the clustering of demonstrated action sequences into different human types using an unsupervised learning algorithm. These demonstrated sequences are also used by the robot to learn a reward function that is representative for each type, through the employment of an inverse reinforcement learning algorithm. The learned model is then used as part of a Mixed Observability Markov Decision Process formulation, wherein the human type is a partially observable variable. With this framework, we can infer, either offline or online, the human type of a new user that was not included in the training set, and can compute a policy for the robot that will be aligned to the preference of this new user and will be robust to deviations of the human actions from prior demonstrations. Finally we validate the approach using data collected in human subject experiments, and conduct proof-of-concept demonstrations in which a person performs a collaborative task with a small industrial robot.

Submitted 24 May, 2014; originally announced May 2014.

ACM Class: I.2.6; I.2.8; I.2.9

Journal ref: Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction (HRI 2015)

arXiv:1403.6036 [pdf, other]

Adaptive MCMC-Based Inference in Probabilistic Logic Programs

Authors: Arun Nampally, C. R. Ramakrishnan

Abstract: Probabilistic Logic Programming (PLP) languages enable programmers to specify systems that combine logical models with statistical knowledge. The inference problem, to determine the probability of query answers in PLP, is intractable in general, thereby motivating the need for approximate techniques. In this paper, we present a technique for approximate inference of conditional probabilities for PLP queries. It is an Adaptive Markov Chain Monte Carlo (MCMC) technique, where the distribution from which samples are drawn is modified as the Markov Chain is explored. In particular, the distribution is progressively modified to increase the likelihood that a generated sample is consistent with evidence. In our context, each sample is uniquely characterized by the outcomes of a set of random variables. Inspired by reinforcement learning, our technique propagates rewards to random variable/outcome pairs used in a sample based on whether the sample was consistent or not. The cumulative rewards of each outcome are used to derive a new "adapted distribution" for each random variable. For a sequence of samples, the distributions are progressively adapted after each sample. For a query with "Markovian evaluation structure", we show that the adapted distribution of samples converges to the query's conditional probability distribution. For Markovian queries, we present a modified adaptation process that can be used in adaptive MCMC as well as adaptive independent sampling.
We empirically evaluate the effectiveness of the adaptive sampling methods for queries with and without Markovian evaluation structure.

Submitted 24 March, 2014; originally announced March 2014.

arXiv:1307.2918 [pdf, other]

Fourier series of atomic radial distribution functions: A molecular fingerprint for machine learning models of quantum chemical properties

Authors: O. Anatole von Lilienfeld, Raghunathan Ramakrishnan, Matthias Rupp, Aaron Knoll

Abstract: We introduce a fingerprint representation of molecules based on a Fourier series of atomic radial distribution functions. This fingerprint is unique (except for chirality), continuous, and differentiable with respect to atomic coordinates and nuclear charges. It is invariant with respect to translation, rotation, and nuclear permutation, and requires no pre-conceived knowledge about chemical bonding, topology, or electronic orbitals. As such it meets many important criteria for a good molecular representation, suggesting its usefulness for machine learning models of molecular properties trained across chemical compound space. To assess the performance of this new descriptor we have trained machine learning models of molecular enthalpies of atomization for training sets with up to 10k organic molecules, drawn at random from a published set of 134k organic molecules. We validate the descriptor on all remaining molecules of the 134k set. For a training set of 5k molecules the fingerprint descriptor achieves a mean absolute error of 8.0 kcal/mol. This is slightly worse than the performance attained using the Coulomb matrix, another popular alternative, reaching 6.2 kcal/mol for the same training and test sets.

Submitted 17 March, 2015; v1 submitted 10 July, 2013; originally announced July 2013.

arXiv:1303.3517 [pdf, other]

Iterative MapReduce for Large Scale Machine Learning

Authors: Joshua Rosen, Neoklis Polyzotis, Vinayak Borkar, Yingyi Bu, Michael J. Carey, Markus Weimer, Tyson Condie, Raghu Ramakrishnan

Abstract: Large datasets ("Big Data") are becoming ubiquitous because the potential value in deriving insights from data, across a wide range of business and scientific applications, is increasingly recognized. In particular, machine learning - one of the foundational disciplines for data analysis, summarization and inference - on Big Data has become routine at most organizations that operate large clouds, usually based on systems such as Hadoop that support the MapReduce programming paradigm. It is now widely recognized that while MapReduce is highly scalable, it suffers from a critical weakness for machine learning: it does not support iteration. Consequently, one has to program around this limitation, leading to fragile, inefficient code. Further, reliance on the programmer is inherently flawed in a multi-tenanted cloud environment, since the programmer does not have visibility into the state of the system when his or her program executes. Prior work has sought to address this problem by either developing specialized systems aimed at stylized applications, or by augmenting MapReduce with ad hoc support for saving state across iterations (driven by an external loop). In this paper, we advocate support for looping as a first-class construct, and propose an extension of the MapReduce programming paradigm called Iterative MapReduce.
We then develop an optimizer for a class of Iterative MapReduce programs that cover most machine learning techniques, provide theoretical justifications for the key optimization steps, and empirically demonstrate that system-optimized programs for significant machine learning tasks are competitive with state-of-the-art specialized solutions.

Submitted 13 March, 2013; originally announced March 2013.

arXiv:1204.4736 [pdf, other]

Model Checking with Probabilistic Tabled Logic Programming

Authors: Andrey Gorlin, C. R. Ramakrishnan, Scott A. Smolka

Abstract: We present a formulation of the problem of probabilistic model checking as one of query evaluation over probabilistic logic programs. To the best of our knowledge, our formulation is the first of its kind, and it covers a rich class of probabilistic models and probabilistic temporal logics. The inference algorithms of existing probabilistic logic-programming systems are well defined only for queries with a finite number of explanations. This restriction prohibits the encoding of probabilistic model checkers, where explanations correspond to executions of the system being model checked. To overcome this restriction, we propose a more general inference algorithm that uses finite generative structures (similar to automata) to represent families of explanations. The inference algorithm computes the probability of a possibly infinite set of explanations directly from the finite generative structure. We have implemented our inference algorithm in XSB Prolog, and use this implementation to encode probabilistic model checkers for a variety of temporal logics, including PCTL and GPL (which subsumes PCTL*). Our experiment results show that, despite the highly declarative nature of their encodings, the model checkers constructed in this manner are competitive with their native implementations.

Submitted 20 April, 2012; originally announced April 2012.

Showing 51–100 of 105 results for author: Ramakrishnan, R