-
Representative dietary behavior patterns and associations with cardiometabolic outcomes in Puerto Rico using a Bayesian latent class analysis for non-probability samples
Authors:
Stephanie M. Wu,
Abrania Marrero,
Matthew R. Williams,
Terrance D. Savitsky,
Josiemer Mattei,
José Rodríguez-Orengo,
Briana J. K. Stephenson
Abstract:
There is limited understanding of how dietary behaviors cluster together and influence cardiometabolic health at a population level in Puerto Rico. Data availability is scarce, particularly outside of urban areas, and is often limited to non-probability sample (NPS) data where sample inclusion mechanisms are unknown. In order to generalize results to the broader Puerto Rican population, adjustments are necessary to account for selection bias but are difficult to implement for NPS data. Although Bayesian latent class models enable summaries of dietary behavior variables through underlying patterns, they have not yet been adapted to the NPS setting. We propose a novel Weighted Overfitted Latent Class Analysis for Non-probability samples (WOLCAN). WOLCAN utilizes a quasi-randomization framework to (1) model pseudo-weights for an NPS using Bayesian additive regression trees (BART) and a reference probability sample, and (2) integrate the pseudo-weights within a weighted pseudo-likelihood approach for Bayesian latent class analysis, while propagating pseudo-weight uncertainty into parameter estimation. A stacked sample approach is used to allow shared individuals between the NPS and the reference sample. We evaluate model performance through simulations and apply WOLCAN to data from the Puerto Rico Observational Study of Psychosocial, Environmental, and Chronic Disease Trends (PROSPECT). We identify dietary behavior patterns for adults in Puerto Rico aged 30 to 75 and examine their associations with type 2 diabetes, hypertension, and hypercholesterolemia. Our findings suggest that an out-of-home eating pattern is associated with a higher likelihood of these cardiometabolic outcomes compared to a nutrition-sensitive pattern. WOLCAN effectively reveals generalizable dietary behavior patterns and demonstrates relevant applications in studying diet-disease relationships.
Submitted 10 March, 2025;
originally announced March 2025.
-
A Bayesian Mixture Model Approach to Examining Neighborhood Social Determinants of Health Disparities in Endometrial Cancer Care in Massachusetts
Authors:
Carmen B. Rodríguez,
Stephanie M. Wu,
Stephanie Alimena,
Alecia J McGregor,
Briana JK Stephenson
Abstract:
Many studies have examined social determinants of health (SDoH) independently, overlooking their interconnected nature. Our study uses a multidimensional approach to construct a neighborhood-level measure that explores how multiple SDoH jointly impact care received for endometrial cancer (EC) patients in Massachusetts (MA). Using 2015-2019 American Community Survey data, we implemented a Bayesian multivariate Bernoulli mixture model to identify neighborhoods with similar SDoH features in MA. Five neighborhood SDoH (NSDoH) profiles were derived and characterized: (1) advantaged non-Hispanic White; (2) disadvantaged racially/ethnically diverse, more renter-occupied housing with limited English proficiency; (3) working class, lower educational attainment; (4) racially/ethnically diverse and greater economic security and educational attainment; and (5) racially/ethnically diverse, more renter-occupied housing with limited English proficiency. Based on residential information, we assigned these profiles to EC patients in the Massachusetts Cancer Registry. We used these profile assignments as the primary exposure in a Bayesian logistic regression to estimate the odds of receiving optimal EC care, adjusting for patient-level sociodemographic and clinical characteristics. NSDoH profiles were not significantly associated with receiving optimal EC care. However, compared to patients assigned to Profile 1, patients in all other profiles had lower odds of receiving optimal care. Our findings demonstrate how a flexible model-based clustering approach can account for the interconnected and multidimensional nature of NSDoH in a practical and interpretable way. Deriving and geospatially mapping NSDoH profiles may allow for identifying areas of need and inform targeted public health interventions tailored to each neighborhood's specific social determinants to improve healthcare delivery.
Submitted 29 May, 2025; v1 submitted 9 December, 2024;
originally announced December 2024.
-
Derivation of outcome-dependent dietary patterns for low-income women obtained from survey data using a Supervised Weighted Overfitted Latent Class Analysis
Authors:
Stephanie M. Wu,
Matthew R. Williams,
Terrance D. Savitsky,
Briana J. K. Stephenson
Abstract:
Poor diet quality is a key modifiable risk factor for hypertension and disproportionately impacts low-income women. Analyzing diet-driven hypertensive outcomes in this demographic is challenging due to the complexity of dietary data and selection bias when the data come from surveys, a main data source for understanding diet-disease relationships in understudied populations. Supervised Bayesian model-based clustering methods summarize dietary data into latent patterns that holistically capture relationships among foods and a known health outcome but do not sufficiently account for complex survey design. This leads to biased estimation and inference and lack of generalizability of the patterns. To address this, we propose a supervised weighted overfitted latent class analysis (SWOLCA) based on a Bayesian pseudo-likelihood approach that integrates sampling weights into an exposure-outcome model for discrete data. Our model adjusts for stratification, clustering, and informative sampling, and handles modifying effects via interaction terms within a Markov chain Monte Carlo Gibbs sampling algorithm. Simulation studies confirm that the SWOLCA model exhibits good performance in terms of bias, precision, and coverage. Using data from the National Health and Nutrition Examination Survey (2015-2018), we demonstrate the utility of our model by characterizing dietary patterns associated with hypertensive outcomes among low-income women in the United States.
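The weighted pseudo-likelihood idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' SWOLCA implementation: it covers only an unsupervised latent class likelihood for categorical items (no outcome model, no Gibbs sampler), and it assumes the sampling weights are rescaled to sum to the sample size. The function and argument names are hypothetical.

```python
import math

def normalize_weights(weights):
    """Rescale sampling weights so they sum to the sample size n (an assumption of this sketch)."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]

def weighted_log_pseudo_likelihood(data, weights, class_probs, item_probs):
    """
    Weighted log pseudo-likelihood of a latent class model.

    data[i][j]          observed category (0-based) of item j for individual i
    weights[i]          sampling weight of individual i
    class_probs[k]      probability of belonging to latent class k
    item_probs[k][j][c] probability of category c on item j within class k
    """
    w = normalize_weights(weights)
    total = 0.0
    for x_i, w_i in zip(data, w):
        # Marginal likelihood of individual i, summing over latent classes.
        mixture = 0.0
        for pi_k, theta_k in zip(class_probs, item_probs):
            lik = pi_k
            for j, c in enumerate(x_i):
                lik *= theta_k[j][c]
            mixture += lik
        # Each log-likelihood contribution is scaled by the normalized weight.
        total += w_i * math.log(mixture)
    return total

# Toy example: two binary items, two latent classes, two individuals.
class_probs = [0.5, 0.5]
item_probs = [
    [[0.8, 0.2], [0.3, 0.7]],  # class 0
    [[0.1, 0.9], [0.6, 0.4]],  # class 1
]
data = [[0, 1], [0, 0]]
weights = [3.0, 1.0]
log_pl = weighted_log_pseudo_likelihood(data, weights, class_probs, item_probs)
```

With equal weights the expression reduces to the ordinary log-likelihood, so the weights only change how much each individual's contribution counts.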
Submitted 28 June, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Bayesian estimation methods for survey data with potential applications to health disparities research
Authors:
Stephanie M. Wu,
Briana Joy K. Stephenson
Abstract:
Understanding how and why certain communities bear a disproportionate burden of disease is challenging due to the scarcity of data on these communities. Surveys provide a useful avenue for accessing hard-to-reach populations, as many surveys specifically oversample understudied and vulnerable populations. When survey data is used for analysis, it is important to account for the complex survey design that gave rise to the data, in order to avoid biased conclusions. The field of Bayesian survey statistics aims to account for such survey design while leveraging the advantages of Bayesian models, which can flexibly handle sparsity through borrowing of information and provide a coherent inferential framework to easily obtain variances for complex models and data types. For these reasons, Bayesian survey methods seem uniquely well-poised for health disparities research, where heterogeneity and sparsity are frequent considerations. This review discusses three main approaches found in the Bayesian survey methodology literature: 1) multilevel regression and post-stratification, 2) weighted pseudolikelihood-based methods, and 3) synthetic population generation. We discuss advantages and disadvantages of each approach, examine recent applications and extensions, and consider how these approaches may be leveraged to improve research in population health equity.
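As a rough illustration of the post-stratification step in the first approach listed above, the sketch below combines cell-level estimates into a population estimate using known population counts. It assumes the cell estimates are already available (in multilevel regression and post-stratification they would come from a fitted multilevel model, which is not shown); the function name, cell labels, and numbers are hypothetical.

```python
def poststratify(cell_estimates, cell_population_counts):
    """Population-weighted average of cell-level estimates.

    cell_estimates[c]         estimated mean or proportion in cell c
    cell_population_counts[c] known population size of cell c
    """
    total = sum(cell_population_counts.values())
    weighted_sum = sum(
        cell_estimates[c] * n_c for c, n_c in cell_population_counts.items()
    )
    return weighted_sum / total

# Toy example with two hypothetical demographic cells.
estimates = {"cell_a": 0.2, "cell_b": 0.5}
counts = {"cell_a": 300, "cell_b": 100}
population_estimate = poststratify(estimates, counts)
```

Cells that are over-represented in the sample but small in the population are down-weighted by their population counts, which is what corrects for an unrepresentative sample.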
Submitted 26 July, 2023; v1 submitted 8 March, 2023;
originally announced March 2023.