-
Distributed Lag Interaction Model with Index Modification
Authors:
Danielle Demateis,
Sandra India Aldana,
Robert O. Wright,
Rosalind Wright,
Andrea Baccarelli,
Elena Colicino,
Ander Wilson,
Kayleigh P. Keller
Abstract:
Epidemiological evidence supports an association between exposure to air pollution during pregnancy and birth and child health outcomes. Typically, such associations are estimated by regressing an outcome on daily or weekly measures of exposure during pregnancy using a distributed lag model. However, these associations may be modified by multiple factors. We propose a distributed lag interaction model with index modification that allows for effect modification of a functional predictor by a weighted average of multiple modifiers. Our model allows for simultaneous estimation of modifier index weights and the exposure-time-response function via a spline cross-basis in a Bayesian hierarchical framework. Through simulations, we show that our model outperforms competing methods when there are multiple modifiers of unknown importance. We apply our proposed method to a Colorado birth cohort to estimate the association between birth weight and air pollution modified by a neighborhood-vulnerability index, and to a Mexican birth cohort to estimate the association between birthing-parent cardio-metabolic endpoints and air pollution modified by a birthing-parent lifetime stress index.
Submitted 8 April, 2025;
originally announced April 2025.
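Illustration (not from the paper): the abstract describes effect modification by a weighted average of multiple modifiers. The sketch below shows only that idea, assuming weights on the unit simplex and, purely for illustration, a lag effect that changes linearly with the index. The paper itself uses a spline cross-basis in a Bayesian hierarchical model; the function names and the linear form here are hypothetical.

```python
from typing import Sequence


def modifier_index(modifiers: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of modifiers; weights must be non-negative and sum to 1."""
    if len(modifiers) != len(weights):
        raise ValueError("modifiers and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie on the unit simplex")
    return sum(w * m for w, m in zip(weights, modifiers))


def cumulative_effect(
    exposures: Sequence[float],
    base_effects: Sequence[float],
    interaction_effects: Sequence[float],
    index: float,
) -> float:
    """Sum of lag-specific effects times exposures.

    Assumption (hypothetical): the effect at each lag is
    base_effect + interaction_effect * index.
    """
    if not (len(exposures) == len(base_effects) == len(interaction_effects)):
        raise ValueError("all per-lag sequences must have the same length")
    return sum(
        (b + g * index) * x
        for x, b, g in zip(exposures, base_effects, interaction_effects)
    )


# Two modifiers, equally weighted, then a two-lag exposure history.
idx = modifier_index([1.0, 3.0], [0.5, 0.5])
effect = cumulative_effect([1.0, 2.0], [0.1, 0.2], [0.0, 0.1], idx)
```

In the model described by the abstract the weights are estimated jointly with the exposure-time-response function rather than fixed in advance as they are here.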
-
Compositional Outcomes and Environmental Mixtures: the Dirichlet Bayesian Weighted Quantile Sum Regression
Authors:
Hachem Saddiki,
Joshua L. Warren,
Corina Lesseur,
Elena Colicino
Abstract:
Environmental mixture approaches do not accommodate compositional outcomes, which consist of vectors constrained to the unit simplex. This limitation poses challenges in effectively evaluating the associations between multiple concurrent environmental exposures and their respective impacts on this type of outcome. As a result, there is a pressing need for analytical methods that can more accurately assess the complexity of these relationships. Here, we extend the Bayesian weighted quantile sum regression (BWQS) framework to jointly model compositional outcomes and environmental mixtures using a Dirichlet distribution with a multinomial logit link function. The proposed approach, named Dirichlet-BWQS (DBWQS), allows for the simultaneous estimation of mixture weights associated with each exposure mixture component as well as the association between the overall exposure mixture index and each outcome proportion. We assess the performance of DBWQS regression on extensive simulated data and a real scenario where we investigate the associations between environmental chemical mixtures and DNA methylation-derived placental cell composition, using publicly available data (GSE75248). We also compare our findings with results considering environmental mixtures and each outcome component. Finally, we make our proposed method publicly available in the R package "xbwqs" (https://github.com/hasdk/xbwqs).
Submitted 28 March, 2025; v1 submitted 27 March, 2025;
originally announced March 2025.
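Illustration (not from the paper or the xbwqs package): the abstract mentions a Dirichlet distribution with a multinomial logit link. The sketch below shows one common way such a link maps unconstrained linear predictors to mean proportions on the unit simplex, and how those means could be scaled into Dirichlet concentration parameters. The reference-category convention, the precision parameter, and the function names are assumptions made for this example.

```python
import math
from typing import List, Sequence


def multinomial_logit_means(linear_predictors: Sequence[float]) -> List[float]:
    """Map K-1 linear predictors to K proportions that sum to 1.

    Assumption: the first category is the reference, with its predictor fixed at 0.
    """
    etas = [0.0] + list(linear_predictors)
    shift = max(etas)  # subtract the maximum for numerical stability
    exps = [math.exp(eta - shift) for eta in etas]
    total = sum(exps)
    return [e / total for e in exps]


def dirichlet_concentrations(means: Sequence[float], precision: float) -> List[float]:
    """Scale mean proportions by a positive precision to get Dirichlet parameters."""
    if precision <= 0:
        raise ValueError("precision must be positive")
    return [precision * mu for mu in means]


# Three outcome components with equal linear predictors give equal mean proportions.
means = multinomial_logit_means([0.0, 0.0])
alphas = dirichlet_concentrations(means, 30.0)
```

In a BWQS-style model each linear predictor would include the weighted mixture index; that part is omitted here.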
-
Missing data interpolation in integrative multi-cohort analysis with disparate covariate information
Authors:
Ekaterina Smirnova,
Yongqi Zhong,
Rasha Alsaadawi,
Xu Ning,
Amii Kress,
Jordan Kuiper,
Mingyu Zhang,
Kristen Lyall,
Sheenas Martenies,
Akram Alshawabkeh,
Catherine Bulka,
Carlos Camargo,
Jaeun Choi,
Elena Colicino,
Anne Dunlop,
Michael Elliott,
Assiamira Ferrara,
Tebeb Gebrestadik,
Jiang Gui,
Kylie Harrall,
Tina Hartert,
Barry Lester,
Andrew Manigault,
Justin Manjourides,
Yu Ni, et al. (4 additional authors not shown)
Abstract:
Integrative analysis of datasets generated by multiple cohorts is a widely used approach for increasing sample size, precision of population estimators, and generalizability of analysis results in epidemiological studies. However, individual cohort datasets often lack some of the variables of interest for an integrative analysis because those variables were not collected as part of the original study. Such cohort-level missingness poses methodological challenges to the integrative analysis, since missing variables have traditionally: (1) been removed from the data for complete case analysis; or (2) been completed by missing data interpolation techniques using data with the same covariate distribution from other studies. In most integrative-analysis studies, neither approach is optimal, because it leads either to losing the majority of study covariates or to challenges in specifying which cohorts follow the same distributions. We propose a novel approach to identify the studies with the same distributions that could be used for completing the cohort-level missing information. Our methodology relies on (1) identifying sub-groups of cohorts with similar covariate distributions using cohort identity random forest prediction models followed by clustering; and then (2) applying a recursive pairwise distribution test for high-dimensional data to these sub-groups. Extensive simulation studies show that cohorts with the same distribution are correctly grouped together in almost all simulation settings. Application of our methods to two ECHO-wide Cohort Studies reveals that the cohorts grouped together reflect the similarities in study design. The methods are implemented in the R software package relate.
Submitted 1 November, 2022;
originally announced November 2022.