-
American Family Cohort, a data resource description
Authors:
Deepa Balraj,
Ayin Vala,
Shiying Hao,
Melanie Philofsky,
Anna Tsvetkova,
Elena Trach,
Shravani Priya Narra,
Oleg Zhuk,
Mary Shamkhorskaya,
Jim Singer,
Joseph Mesterhazy,
Somalee Datta,
Isabella Chu,
David Rehkopf
Abstract:
This manuscript is a research resource description and presents a large and novel Electronic Health Records (EHR) data resource, American Family Cohort (AFC). The AFC data is derived from Centers for Medicare and Medicaid Services (CMS) certified American Board of Family Medicine (ABFM) PRIME registry. The PRIME registry is the largest national Qualified Clinical Data Registry (QCDR) for Primary Care. The data is converted to a popular common data model, the Observational Health Data Sciences and Informatics (OHDSI) Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM).
The resource presents approximately 90 million encounters for 7.5 million patients. All patients have age, gender, and address information, and 73% report race. Nearly 93% of patients have lab data in LOINC, 86% have medication data in RxNorm, 93% have diagnoses in SNOMED and ICD, 81% have procedures in HCPCS or CPT, and 61% have insurance information. The richness, breadth, and diversity of this research-accessible and research-ready data are expected to accelerate observational studies in many diverse areas. We expect this resource to facilitate research for many years to come.
Submitted 22 September, 2023;
originally announced September 2023.
-
Transformer-Based Multi-Aspect Multi-Granularity Non-Native English Speaker Pronunciation Assessment
Authors:
Yuan Gong,
Ziyi Chen,
Iek-Heng Chu,
Peng Chang,
James Glass
Abstract:
Automatic pronunciation assessment is an important technology to help self-directed language learners. While pronunciation quality has multiple aspects including accuracy, fluency, completeness, and prosody, previous efforts typically only model one aspect (e.g., accuracy) at one granularity (e.g., at the phoneme-level). In this work, we explore modeling multi-aspect pronunciation assessment at multiple granularities. Specifically, we train a Goodness Of Pronunciation feature-based Transformer (GOPT) with multi-task learning. Experiments show that GOPT achieves the best results on speechocean762 with a public automatic speech recognition (ASR) acoustic model trained on Librispeech.
Submitted 6 May, 2022;
originally announced May 2022.
-
Deep Neural Networks for Accurate Predictions of Garnet Stability
Authors:
Weike Ye,
Chi Chen,
Zhenbin Wang,
Iek-Heng Chu,
Shyue Ping Ong
Abstract:
Predicting the stability of crystals is one of the central problems in materials science. Today, density functional theory (DFT) calculations are the computational tool of choice to obtain energies of crystals with quantitative accuracy. Despite algorithmic and computing advances, DFT calculations remain comparatively expensive and scale poorly with system size. Here we show that deep neural networks utilizing just two descriptors - the Pauling electronegativity and ionic radii - can predict the DFT formation energies of C3A2D3O12 garnets with extremely low mean absolute errors of 7-8 meV/atom, an order of magnitude improvement over previous machine learning models and well within the limits of DFT accuracy. Further extension to mixed garnets with little loss in accuracy can be achieved using a binary encoding scheme that introduces minimal increase in descriptor dimensionality. Our results demonstrate that generalizable deep-learning models for quantitative crystal stability prediction can be built on a small set of chemically-intuitive descriptors. Such models provide the means to rapidly traverse vast chemical spaces to accurately identify stable compositions, accelerating the discovery of novel materials with potentially superior properties.
Submitted 5 December, 2017;
originally announced December 2017.
-
Predicting the Volumes of Crystals
Authors:
Iek-Heng Chu,
Sayan Roychowdhury,
Daehui Han,
Anubhav Jain,
Shyue Ping Ong
Abstract:
New crystal structures are frequently derived by performing ionic substitutions on known crystal structures. These derived structures are then used in further experimental analysis, or as the initial guess for structural optimization in electronic structure calculations, both of which usually require a reasonable guess of the lattice parameters. In this work, we propose two lattice prediction schemes to improve the initial guess of a candidate crystal structure. The first scheme relies on a one-to-one mapping of species in the candidate crystal structure to a known crystal structure, while the second scheme relies on data-mined minimum atom pair distances to predict the crystal volume of the candidate crystal structure and does not require a reference structure. We demonstrate that the two schemes can effectively predict the volumes within mean absolute errors (MAE) as low as 3.8% and 8.2%. We also discuss the various factors that may impact the performance of the schemes. Implementations for both schemes are available in the open-source pymatgen software.
Submitted 4 December, 2017;
originally announced December 2017.
-
Accurate Force Field for Molybdenum by Machine Learning Large Materials Data
Authors:
Chi Chen,
Zhi Deng,
Richard Tran,
Hanmei Tang,
Iek-Heng Chu,
Shyue Ping Ong
Abstract:
In this work, we present a highly accurate spectral neighbor analysis potential (SNAP) model for molybdenum (Mo) developed through the rigorous application of machine learning techniques on large materials data sets. Despite Mo's importance as a structural metal, existing force fields for Mo based on the embedded atom and modified embedded atom methods still do not provide satisfactory accuracy on many properties. We will show that by fitting to the energies, forces and stress tensors of a large density functional theory (DFT)-computed dataset on a diverse set of Mo structures, a Mo SNAP model can be developed that achieves close to DFT accuracy in the prediction of a broad range of properties, including energies, forces, stresses, elastic constants, melting point, phonon spectra, surface energies, grain boundary energies, etc. We will outline a systematic model development process, which includes a rigorous approach to structural selection based on principal component analysis, as well as a differential evolution algorithm for optimizing the hyperparameters in the model fitting so that both the model error and the property prediction error can be simultaneously lowered. We expect that this newly developed Mo SNAP model will find broad applications in large-scale, long-time scale simulations.
Submitted 28 June, 2017;
originally announced June 2017.
-
Electronic Structure Descriptor for Discovery of Narrow-Band Red-Emitting Phosphors
Authors:
Zhenbin Wang,
Iek-Heng Chu,
Fei Zhou,
Shyue Ping Ong
Abstract:
Narrow-band red-emitting phosphors are a critical component in phosphor-converted light-emitting diodes for highly efficient illumination-grade lighting. In this work, we report the discovery of a quantitative descriptor for narrow-band Eu2+-activated emission identified through a comparison of the electronic structure of known narrow-band and broad-band phosphors. We find that a narrow emission bandwidth is characterized by a large splitting of more than 0.1 eV between the two highest Eu2+ 4f7 bands. By incorporating this descriptor in a high throughput first principles screening of 2,259 nitride compounds, we identify five promising new nitride hosts for Eu2+-activated red-emitting phosphors that are predicted to exhibit good chemical stability, thermal quenching resistance and quantum efficiency, as well as narrow-band emission. Our findings provide important insights into the emission characteristics of rare-earth activators in phosphor hosts, and a general strategy for the discovery of phosphors with a desired emission peak and bandwidth.
Submitted 15 April, 2016;
originally announced April 2016.
-
All-electron self-consistent GW in the Matsubara-time domain: implementation and benchmarks of semiconductors and insulators
Authors:
Iek-Heng Chu,
Jonathan P. Trinastic,
Yun-Peng Wang,
Adolfo G. Eguiluz,
Anton Kozhevnikov,
Thomas C. Schulthess,
Hai-Ping Cheng
Abstract:
The GW approximation is a well-known method to improve electronic structure predictions calculated within density functional theory. In this work, we have implemented a computationally efficient GW approach that calculates central properties within the Matsubara-time domain using the modified version of Elk, the full-potential linearized augmented plane wave (FP-LAPW) package. Continuous-pole expansion (CPE), a recently proposed analytic continuation method, has been incorporated and compared to the widely used Pade approximation. Full crystal symmetry has been employed for computational speedup. We have applied our approach to 18 well-studied semiconductors/insulators that cover a wide range of band gaps computed at the levels of single-shot G0W0, partially self-consistent GW0, and fully self-consistent GW (scGW). Our calculations show that G0W0 leads to band gaps that agree well with experiment for the case of simple s-p electron systems, whereas scGW is required for improving the band gaps in 3d electron systems. In addition, GW0 almost always predicts larger band gap values compared to scGW, likely due to the substantial underestimation of screening effects. Both the CPE method and Pade approximation lead to similar band gaps for most systems except strontium titanate, suggesting further investigation into the latter approximation is necessary for strongly correlated systems. Our computed band gaps serve as important benchmarks for the accuracy of the Matsubara-time GW approach.
Submitted 6 February, 2016;
originally announced February 2016.
-
Preventing rapid energy loss from electron-hole pairs to phonons in graphene quantum dots
Authors:
Jonathan Trinastic,
Iek-Heng Chu,
Hai-Ping Cheng
Abstract:
In semiconductors, photoexcited electrons and holes (carriers) initially occupy high-energy states, but quickly lose energy to phonons and relax to the band edge within a picosecond [1]. Increasing the lifetime of carriers in light-absorbing materials is necessary to improve open-circuit voltage in photovoltaics [2], charge separation in organic solar cells [3], and charge transfer in photodetection devices [4]. Here we demonstrate long lifetimes over one hundred picoseconds for electron-hole pairs in graphene quantum dots (GQDs) due to large transition energies and weak coupling to excitonic states below the fundamental band gap. This possibility for a large transition energy to bound excitons is due to graphene's poor screening, illustrating a unique mechanism in this QD to occupy higher-energy states for long timescales. GQD edges can be terminated with either armchair or zigzag carbon patterns, and this edge structure changes excited state lifetimes by orders of magnitude. These results indicate nanoscale control of carrier lifetimes in optoelectronics.
Submitted 14 April, 2015;
originally announced April 2015.
-
Electron transport in graphene/graphene side-contact junction by plane-wave multiple scattering method
Authors:
Xiang-Guo Li,
Iek-Heng Chu,
X. -G. Zhang,
Hai-Ping Cheng
Abstract:
Electron transport in graphene is along the sheet, but junction devices are often made by stacking different sheets together in a "side-contact" geometry which causes the current to flow perpendicular to the sheets within the device. Such geometry presents a challenge to first-principles transport methods. We solve this problem by implementing a plane-wave based multiple scattering theory for electron transport. This implementation improves the computational efficiency over the existing plane-wave transport code, scales better for parallelization over a large number of nodes, and does not require the current direction to be along a lattice axis. As a first application, we calculate the tunneling current through a side-contact graphene junction formed by two separate graphene sheets with the edges overlapping each other. We find that transport properties of this junction depend strongly on the AA or AB stacking within the overlapping region as well as the vacuum gap between two graphene sheets. Such transport behaviors are explained in terms of carbon orbital orientation, hybridization, and delocalization as the geometry is varied.
Submitted 6 April, 2015;
originally announced April 2015.
-
All-Electron GW Quasiparticle Band Structures of Group 14 Nitride Compounds
Authors:
Iek-Heng Chu,
Anton Kozhenikov,
Thomas C. Schulthess,
Hai-Ping Cheng
Abstract:
We have investigated the group 14 nitrides (M$_3$N$_4$) in the spinel phase ($γ$-M$_3$N$_4$ with M= C, Si, Ge and Sn) and $β$ phase ($β$-M$_3$N$_4$ with M= Si, Ge and Sn) using density functional theory with the local density approximation and the GW approximation. The Kohn-Sham energies of these systems have been first calculated within the framework of full-potential linearized augmented plane waves and then corrected using single-shot G$_0$W$_0$ calculations, which we have implemented in the modified version of the Elk full-potential LAPW code. Direct band gaps at the $Γ$ point have been found for spinel-type nitrides $γ$-M$_3$N$_4$ with M= Si, Ge and Sn. The corresponding GW-corrected band gaps agree with experiment. We have also found that the GW calculations with and without the plasmon-pole approximation give very similar results, even when the system contains semi-core $d$ electrons. These spinel-type nitrides are novel materials for potential optoelectronic applications because of their direct and tunable band gaps.
Submitted 17 April, 2014;
originally announced April 2014.
-
Using Light-Switching Molecules to Modulate Charge Mobility in a Quantum Dot Array
Authors:
Iek-Heng Chu,
Jonathan Trinastic,
Lin-Wang Wang,
Hai-Ping Cheng
Abstract:
We have studied the electron hopping in a two-CdSe quantum dot system linked by an azobenzene-derived light-switching molecule. This system can be considered as a prototype of a QD supercrystal. Following the computational strategies given in our recent work [Chu et al. J. Phys. Chem. C 115, 21409 (2011)], we have investigated the effects of molecular attachment, molecular isomer (trans and cis) and QD size on electron hopping rate using Marcus theory. Our results indicate that molecular attachment has a large impact on the system for both isomers. In the most energetically favorable attachment, the cis isomer provides significantly greater coupling between the two QDs and hence the electron hopping rate is greater compared to the trans isomer. As a result, the carrier mobility of the QD array in the low carrier density, weak external electric field regime is several orders of magnitude higher in the cis compared to the trans configuration. This is the first demonstration of mobility modulation using QDs and azobenzene that could lead to a new type of switching device.
Submitted 9 January, 2014;
originally announced January 2014.
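The abstract above names Marcus theory as the basis for the electron hopping rate. As a minimal sketch of the standard non-adiabatic Marcus rate expression, k = (2π/ħ)|V|² (4πλk_BT)^(-1/2) exp(-(ΔG + λ)² / (4λk_BT)), the function below may help make the dependence on coupling concrete. The function name, parameter names, and the numerical inputs in the usage note are illustrative assumptions and are not taken from the paper.

```python
import math

HBAR_EV_S = 6.582119569e-16    # reduced Planck constant in eV*s
KB_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV/K


def marcus_rate(coupling_ev, reorg_ev, delta_g_ev=0.0, temperature_k=300.0):
    """Non-adiabatic Marcus hopping rate in 1/s.

    coupling_ev   : electronic coupling |V| between the two sites (eV)
    reorg_ev      : reorganization energy lambda (eV), must be > 0
    delta_g_ev    : free-energy difference between final and initial state (eV)
    temperature_k : temperature (K)
    All parameter names are hypothetical and chosen for illustration only.
    """
    kbt = KB_EV_PER_K * temperature_k
    prefactor = (2.0 * math.pi / HBAR_EV_S) * coupling_ev ** 2
    density = 1.0 / math.sqrt(4.0 * math.pi * reorg_ev * kbt)
    activation = math.exp(-((delta_g_ev + reorg_ev) ** 2) / (4.0 * reorg_ev * kbt))
    return prefactor * density * activation
```

Because the rate scales with the square of the coupling, a tenfold larger coupling (for example, comparing `marcus_rate(0.01, 0.2)` with `marcus_rate(0.001, 0.2)`, both with made-up inputs) gives a hundredfold larger rate at fixed reorganization energy. This is the kind of sensitivity that could produce an orders-of-magnitude mobility difference between isomers.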
-
First-Principles Studies of Photoinduced Charge Transfer in Noncovalently Functionalized Carbon Nanotubes
Authors:
Iek-Heng Chu,
Dmitri S. Kilin,
Hai-Ping Cheng
Abstract:
We have studied the energetics, electronic structure, optical excitation, and electron relaxation of dinitromethane molecules (CH$_{2}$N$_{2}$O$_{4}$) adsorbed on semiconducting carbon nanotubes (CNTs) of chiral index (n,0) (n=7, 10, 13, 16, 19). Using first-principles density functional theory (DFT) with generalized gradient approximations and van der Waals corrections, we have calculated adsorption energies of dinitropentylpyrene, in which the dinitromethane is linked to the pyrene via an aliphatic chain, on a CNT. A 75.26 kJ/mol binding energy has been found, which explains why such aliphatic chain-pyrene units can be and have been used in experiments to bind functional molecules to CNTs. The calculated electronic structures show that the dinitromethane introduces a localized state inside the band gap of CNT systems of n=10, 13, 16 and 19; such a state can trap an electron when the CNT is photoexcited. We have therefore investigated the dynamics of intra-band relaxations using the reduced density matrix formalism in conjunction with DFT. For pristine CNTs, we have found that the calculated charge relaxation constants agree well with the experimental time scales. Upon adsorption, these constants are modified, but there is not a clear trend for the direction and magnitude of the change. Nevertheless, our calculations predict that electron relaxation in the conduction band is faster than hole relaxation in the valence band for CNTs with and without molecular adsorbates.
Submitted 14 June, 2013;
originally announced June 2013.
-
Normal Parameters for an Analytic Description of the CMB Cosmological Parameter Likelihood
Authors:
Mike Chu,
Manoj Kaplinghat,
Lloyd Knox
Abstract:
The normal parameters are a non-linear transformation of the cosmological parameters whose likelihood function is very well-approximated by a normal distribution. This transformation serves as an extreme form of data compression allowing for practically instantaneous calculation of the likelihood of any given model, as long as the model is in the parameter space originally considered. The compression makes all the information about cosmological parameter constraints from a given set of experiments available in a useable manner. Here we explicitly define the normal parameters that work for the current CMB data, and give their mean and covariance matrix which best fit the likelihood function calculated by the Monte Carlo Markov Chain method. Along with standard parameter estimation results, we propose that future CMB parameter analyses define normal parameters and quote their mean and covariance matrix.
Submitted 1 October, 2003; v1 submitted 20 December, 2002;
originally announced December 2002.
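The abstract above describes how a quoted mean and covariance matrix allow near-instant likelihood evaluation. A minimal sketch of that evaluation follows, assuming the model has already been mapped into normal-parameter space. The non-linear transformation itself is not given in the abstract and is not reproduced here, and the function names and the toy numbers in the usage note are illustrative assumptions.

```python
import math


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def gaussian_log_likelihood(params, mean, cov):
    """Log of a multivariate normal density evaluated at `params`.

    params, mean : lists of normal-parameter values (hypothetical inputs)
    cov          : covariance matrix as a list of lists
    """
    n = len(mean)
    lower = cholesky(cov)
    diff = [p - m for p, m in zip(params, mean)]
    # Forward substitution solves L y = diff, so chi2 = y.y = diff^T C^-1 diff.
    y = []
    for i in range(n):
        s = sum(lower[i][k] * y[k] for k in range(i))
        y.append((diff[i] - s) / lower[i][i])
    chi2 = sum(v * v for v in y)
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(n))
    return -0.5 * (chi2 + log_det + n * math.log(2.0 * math.pi))
```

A call such as `gaussian_log_likelihood([1.0, 1.0], [0.0, 0.0], [[2.0, 1.0], [1.0, 2.0]])` (toy numbers, not CMB values) costs only a small matrix factorization. This is why a compressed description of this kind can be evaluated almost instantly compared with recomputing the full likelihood.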