-
BreakLoops: A New Feature for the Multi-Gene, Multi-Cancer Family History-Based Model, Fam3Pro
Authors:
Nicolas Kubista,
Ryan Hernandez-Cancela,
Jianfeng Ke,
Romain Berquet,
Gavin Lee,
Giovanni Parmigiani,
Danielle Braun
Abstract:
Previously, we presented PanelPRO, now known as Fam3PRO, an open-source R package for multi-gene, multi-cancer risk modeling with pedigree data. The initial release could not handle pedigrees that contained cyclic structures called loops, which occur when relatives mate. Here, we present a graph-based function called breakloops that can detect and break loops in any pedigree. The core algorithm identifies the optimal set of loop breakers when individuals in a loop have exactly one parental mating, and extends to handle cases where individuals have multiple parental matings. The algorithm transforms complex pedigrees by strategically creating clones of key individuals to disrupt cycles while minimizing computational complexity. Our extensive testing demonstrates that this new feature can handle a wide variety of pedigree structures. The breakloops function is available in Fam3Pro version 2.0.0. This advancement enables Fam3Pro to assess cancer risk in a wider range of family structures, enhancing its applicability in clinical settings.
Submitted 1 May, 2025;
originally announced May 2025.
-
Interpretability in Parameter Space: Minimizing Mechanistic Description Length with Attribution-based Parameter Decomposition
Authors:
Dan Braun,
Lucius Bushnaq,
Stefan Heimersheim,
Jake Mendel,
Lee Sharkey
Abstract:
Mechanistic interpretability aims to understand the internal mechanisms learned by neural networks. Despite recent progress toward this goal, it remains unclear how best to decompose neural network parameters into mechanistic components. We introduce Attribution-based Parameter Decomposition (APD), a method that directly decomposes a neural network's parameters into components that (i) are faithful to the parameters of the original network, (ii) require a minimal number of components to process any input, and (iii) are maximally simple. Our approach thus optimizes for a minimal length description of the network's mechanisms. We demonstrate APD's effectiveness by successfully identifying ground truth mechanisms in multiple toy experimental settings: Recovering features from superposition; separating compressed computations; and identifying cross-layer distributed representations. While challenges remain to scaling APD to non-toy models, our results suggest solutions to several open problems in mechanistic interpretability, including identifying minimal circuits in superposition, offering a conceptual foundation for 'features', and providing an architecture-agnostic framework for neural network decomposition.
Submitted 7 February, 2025; v1 submitted 24 January, 2025;
originally announced January 2025.
-
The penetrance R package for Estimation of Age Specific Risk in Family-based Studies
Authors:
Nicolas Kubista,
Danielle Braun,
Giovanni Parmigiani
Abstract:
Reliable tools and software for penetrance (age-specific risk among those who carry a genetic variant) estimation are critical to improving clinical decision making and risk assessment for hereditary syndromes. We introduce penetrance, an open-source R package available on CRAN, to estimate age-specific penetrance using family-history pedigree data. The package employs a Bayesian estimation approach, allowing for the incorporation of prior knowledge through the specification of priors for the parameters of the carrier distribution. It also includes options to impute missing ages during the estimation process, addressing incomplete age information which is not uncommon in pedigree datasets. Our open-source software provides a flexible and user-friendly tool for researchers to estimate penetrance in complex family-based studies, facilitating improved genetic risk assessment in hereditary syndromes.
Submitted 26 March, 2025; v1 submitted 27 November, 2024;
originally announced November 2024.
-
Treatment Effect Heterogeneity and Importance Measures for Multivariate Continuous Treatments
Authors:
Heejun Shin,
Antonio Linero,
Michelle Audirac,
Kezia Irene,
Danielle Braun,
Joseph Antonelli
Abstract:
Estimating the joint effect of a multivariate, continuous exposure is crucial, particularly in environmental health where interest lies in simultaneously evaluating the impact of multiple environmental pollutants on health. We develop novel methodology that addresses two key issues for estimation of treatment effects of multivariate, continuous exposures. We use nonparametric Bayesian methodology that is flexible to ensure our approach can capture a wide range of data generating processes. Additionally, we allow the effect of the exposures to be heterogeneous with respect to covariates. Treatment effect heterogeneity has not been well explored in the causal inference literature for multivariate, continuous exposures, and therefore we introduce novel estimands that summarize the nature and extent of the heterogeneity, and propose estimation procedures for new estimands related to treatment effect heterogeneity. We provide theoretical support for the proposed models in the form of posterior contraction rates and show that it works well in simulated examples both with and without heterogeneity. Our approach is motivated by a study of the health effects of simultaneous exposure to the components of PM$_{2.5}$, where we find that the negative health effects of exposure to environmental pollutants are exacerbated by low socioeconomic status, race and age.
Submitted 3 January, 2025; v1 submitted 13 April, 2024;
originally announced April 2024.
-
Adjusting for Ascertainment Bias in Meta-Analysis of Penetrance for Cancer Risk
Authors:
Thanthirige Lakshika M. Ruberu,
Danielle Braun,
Giovanni Parmigiani,
Swati Biswas
Abstract:
Multi-gene panel testing allows efficient detection of pathogenic variants in cancer susceptibility genes including moderate-risk genes such as ATM and PALB2. A growing number of studies examine the risk of breast cancer (BC) conferred by pathogenic variants of such genes. A meta-analysis combining the reported risk estimates can provide an overall age-specific risk of developing BC, i.e., penetrance for a gene. However, estimates reported by case-control studies often suffer from ascertainment bias. Currently there are no methods available to adjust for such ascertainment bias in this setting. We consider a Bayesian random-effects meta-analysis method that can synthesize different types of risk measures and extend it to incorporate studies with ascertainment bias. This is achieved by introducing a bias term in the model and assigning appropriate priors. We validate the method through a simulation study and apply it to estimate BC penetrance for carriers of pathogenic variants of ATM and PALB2 genes. Our simulations show that the proposed method results in more accurate and precise penetrance estimates compared to when no adjustment is made for ascertainment bias or when such biased studies are discarded from the analysis. The estimated overall BC risk for individuals with pathogenic variants in (1) ATM is 5.77% (3.22%-9.67%) by age 50 and 26.13% (20.31%-32.94%) by age 80; (2) PALB2 is 12.99% (6.48%-22.23%) by age 50 and 44.69% (34.40%-55.80%) by age 80. The proposed method allows for meta-analyses to include studies with ascertainment bias resulting in a larger number of studies included and thereby more robust estimates.
Submitted 22 February, 2024;
originally announced February 2024.
-
Partial identification and unmeasured confounding with multiple treatments and multiple outcomes
Authors:
Suyeon Kang,
Alexander Franks,
Michelle Audirac,
Danielle Braun,
Joseph Antonelli
Abstract:
Estimating the health effects of multiple air pollutants is a crucial problem in public health, but one that is difficult due to unmeasured confounding bias. Motivated by this issue, we develop a framework for partial identification of causal effects in the presence of unmeasured confounding in settings with multiple treatments and multiple outcomes. Under a factor confounding assumption, we show that joint partial identification regions for multiple estimands can be more informative than considering partial identification for individual estimands one at a time. We show how assumptions related to the strength of confounding or magnitude of plausible effect sizes for one estimand can reduce the partial identification regions for other estimands. As a special case of this result, we explore how negative control assumptions reduce partial identification regions and discuss conditions under which point identification can be obtained. We develop novel computational approaches to finding partial identification regions under a variety of these assumptions. We then estimate the causal effect of PM2.5 components on a variety of public health outcomes in the United States Medicare cohort, where we find that, in particular, the detrimental effect of black carbon is robust to the potential presence of unmeasured confounding bias.
Submitted 20 June, 2025; v1 submitted 20 November, 2023;
originally announced November 2023.
-
CausalGPS: An R Package for Causal Inference With Continuous Exposures
Authors:
Naeem Khoshnevis,
Xiao Wu,
Danielle Braun
Abstract:
Quantifying the causal effects of continuous exposures on outcomes of interest is critical for social, economic, health, and medical research. However, most existing software packages focus on binary exposures. We develop the CausalGPS R package that implements a collection of algorithms to provide algorithmic solutions for causal inference with continuous exposures. CausalGPS implements a causal inference workflow, with algorithms based on generalized propensity scores (GPS) as the core, extending propensity scores (the probability of a unit being exposed given pre-exposure covariates) from binary to continuous exposures. As the first step, the package implements efficient and flexible estimations of the GPS, allowing multiple user-specified modeling options. As the second step, the package provides two ways to adjust for confounding: weighting and matching, generating weighted and matched data sets, respectively. Lastly, the package provides built-in functions to fit flexible parametric, semi-parametric, or non-parametric regression models on the weighted or matched data to estimate the exposure-response function relating the outcome with the exposures. The computationally intensive tasks are implemented in C++, and efficient shared-memory parallelization is achieved by OpenMP API. This paper outlines the main components of the CausalGPS R package and demonstrates its application to assess the effect of long-term exposure to PM2.5 on educational attainment using zip code-level data from the contiguous United States from 2000-2016.
Submitted 30 September, 2023;
originally announced October 2023.
-
A spatial interference approach to account for mobility in air pollution studies with multivariate continuous treatments
Authors:
Heejun Shin,
Danielle Braun,
Kezia Irene,
Michelle Audirac,
Joseph Antonelli
Abstract:
We develop new methodology to improve our understanding of the causal effects of multivariate air pollution exposures on public health. Typically, exposure to air pollution for an individual is measured at their home geographic region, though people travel to different regions with potentially different levels of air pollution. To account for this, we incorporate estimates of the mobility of individuals from cell phone mobility data to get an improved estimate of their exposure to air pollution. We treat this as an interference problem, where individuals in one geographic region can be affected by exposures in other regions due to mobility into those areas. We propose policy-relevant estimands and derive expressions showing the extent of bias one would obtain by ignoring this mobility. We additionally highlight the benefits of the proposed interference framework relative to a measurement error framework for accounting for mobility. We develop novel estimation strategies to estimate causal effects that account for this spatial spillover utilizing flexible Bayesian methodology. Lastly, we use the proposed methodology to study the health effects of ambient air pollution on mortality among Medicare enrollees in the United States.
Submitted 22 May, 2024; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Bayesian Meta-Analysis of Penetrance for Cancer Risk
Authors:
Thanthirige Lakshika M. Ruberu,
Danielle Braun,
Giovanni Parmigiani,
Swati Biswas
Abstract:
Multi-gene panel testing allows many cancer susceptibility genes to be tested quickly at a lower cost making such testing accessible to a broader population. Thus, more patients carrying pathogenic germline mutations in various cancer-susceptibility genes are being identified. This creates a great opportunity, as well as an urgent need, to counsel these patients about appropriate risk reducing management strategies. Counseling hinges on accurate estimates of age-specific risks of developing various cancers associated with mutations in a specific gene, i.e., penetrance estimation. We propose a meta-analysis approach based on a Bayesian hierarchical random-effects model to obtain penetrance estimates by integrating studies reporting different types of risk measures (e.g., penetrance, relative risk, odds ratio) while accounting for the associated uncertainties. After estimating posterior distributions of the parameters via a Markov chain Monte Carlo algorithm, we estimate penetrance and credible intervals. We investigate the proposed method and compare with an existing approach via simulations based on studies reporting risks for two moderate-risk breast cancer susceptibility genes, ATM and PALB2. Our proposed method is far superior in terms of coverage probability of credible intervals and mean square error of estimates. Finally, we apply our method to estimate the penetrance of breast cancer among carriers of pathogenic mutations in the ATM gene.
Submitted 12 June, 2023; v1 submitted 4 April, 2023;
originally announced April 2023.
-
Hierarchically Structured Task-Agnostic Continual Learning
Authors:
Heinke Hihn,
Daniel A. Braun
Abstract:
One notable weakness of current machine learning algorithms is the poor ability of models to solve new problems without forgetting previously acquired knowledge. The Continual Learning paradigm has emerged as a protocol to systematically investigate settings where the model sequentially observes samples generated by a series of tasks. In this work, we take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle that facilitates a trade-off between learning and forgetting. We derive this principle from a Bayesian perspective and show its connections to previous approaches to continual learning. Based on this principle, we propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths through the network which is governed by a gating policy. Equipped with a diverse and specialized set of parameters, each path can be regarded as a distinct sub-network that learns to solve tasks. To improve expert allocation, we introduce diversity objectives, which we evaluate in additional ablation studies. Importantly, our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, as is the case with many existing continual learning algorithms. Due to the general formulation based on generic utility functions, we can apply this optimality principle to a large variety of learning problems, including supervised learning, reinforcement learning, and generative modeling. We demonstrate the competitive performance of our method on continual reinforcement learning and variants of the MNIST, CIFAR-10, and CIFAR-100 datasets.
Submitted 14 November, 2022;
originally announced November 2022.
-
Evaluation of Model-Based PM$_{2.5}$ Estimates for Exposure Assessment During Wildfire Smoke Episodes in the Western U.S.
Authors:
Ellen M. Considine,
Jiayuan Hao,
Priyanka deSouza,
Danielle Braun,
Colleen E. Reid,
Rachel C. Nethery
Abstract:
Investigating the health impacts of wildfire smoke requires data on people's exposure to fine particulate matter (PM$_{2.5}$) across space and time. In recent years, it has become common to use machine learning models to fill gaps in monitoring data. However, it remains unclear how well these models are able to capture spikes in PM$_{2.5}$ during and across wildfire events. Here, we evaluate the accuracy of two sets of high-coverage and high-resolution machine learning-derived PM$_{2.5}$ data sets created by Di et al. (2021) and Reid et al. (2021). In general, the Reid estimates are more accurate than the Di estimates when compared to independent validation data from mobile smoke monitors deployed by the US Forest Service. However, both models tend to severely under-predict PM$_{2.5}$ on high-pollution days. Our findings complement other recent studies calling for increased air pollution monitoring in the western US and support the inclusion of wildfire-specific monitoring observations and predictor variables in model-based estimates of PM$_{2.5}$. Lastly, we call for more rigorous error quantification of machine-learning derived exposure data sets, with special attention to extreme events.
Submitted 9 January, 2023; v1 submitted 3 September, 2022;
originally announced September 2022.
-
Investigating Use of Low-Cost Sensors to Increase Accuracy and Equity of Real-Time Air Quality Information
Authors:
Ellen M. Considine,
Danielle Braun,
Leila Kamareddine,
Rachel C. Nethery,
Priyanka deSouza
Abstract:
Environmental Protection Agency (EPA) air quality (AQ) monitors, the gold standard for measuring air pollutants, are sparsely positioned across the US due to their costliness. Low-cost sensors (LCS) are increasingly being used by the public to fill in the gaps in AQ monitoring; however, LCS are not as accurate as EPA monitors. In this work, we investigate factors impacting the differences between an individual's true (unobserved) exposure to fine particulate matter (PM2.5) and the exposure reported by their nearest AQ instrument, which could be either an EPA monitor or an LCS. Three factors contributing to these differences are (1) distance to the nearest AQ instrument, (2) local variability in AQ, and (3) device measurement error. We examine the contributions of each component to the overall error in reported AQ using simulations based on California data. The simulations explore different combinations of hypothetical LCS placement strategies (at schools, near major roads, and in environmentally and socioeconomically marginalized census tracts) for different numbers of LCS, with varying plausible amounts of LCS device measurement error. For each scenario, we evaluate the accuracy of daily AQ information available from individuals' nearest AQ instrument with respect to absolute errors and misclassifications of the Air Quality Index, stratified by socioeconomic and demographic characteristics. We illustrate how real-time AQ reporting could be improved (or, in some cases, worsened) by using LCS, both for the population overall and for marginalized communities specifically. This work has implications for the integration of LCS into real-time AQ reporting platforms.
Submitted 4 January, 2023; v1 submitted 6 May, 2022;
originally announced May 2022.
-
Aggregating human judgment probabilistic predictions of COVID-19 transmission, burden, and preventative measures
Authors:
Allison Codi,
Damon Luk,
David Braun,
Juan Cambeiro,
Tamay Besiroglu,
Eva Chen,
Luis Enrique Urtubey de Cèsaris,
Paolo Bocchini,
Thomas McAndrew
Abstract:
Aggregated human judgment forecasts for COVID-19 targets of public health importance are accurate, often outperforming computational models. Our work shows aggregated human judgment forecasts for infectious agents are timely, accurate, and adaptable, and can be used as a tool to aid public health decision making during outbreaks.
Submitted 14 April, 2022; v1 submitted 5 April, 2022;
originally announced April 2022.
-
Chimeric forecasting: combining probabilistic predictions from computational models and human judgment
Authors:
Thomas McAndrew,
Allison Codi,
Juan Cambeiro,
Tamay Besiroglu,
David Braun,
Eva Chen,
Luis Enrique Urtubey de Cesaris,
Damon Luk
Abstract:
Forecasts of the trajectory of an infectious agent can help guide public health decision making. A traditional approach to forecasting fits a computational model to structured data and generates a predictive distribution. However, human judgment has access to the same data as computational models plus experience, intuition, and subjective data. We propose a chimeric ensemble -- a combination of computational and human judgment forecasts -- as a novel approach to predicting the trajectory of an infectious agent. Each month from January 2021 to June 2021, we asked two generalist crowds, using the same criteria as the COVID-19 Forecast Hub, to submit a predictive distribution over incident cases and deaths at the US national level either two or three weeks into the future, and combined these human judgment forecasts with forecasts from computational models submitted to the COVID-19 Forecast Hub into a chimeric ensemble. We find that a chimeric ensemble, compared to an ensemble including only computational models, improves predictions of incident cases and shows similar performance for predictions of incident deaths. A chimeric ensemble is a flexible, supportive public health tool and shows promising results for predictions of the spread of an infectious agent.
Submitted 20 February, 2022;
originally announced February 2022.
-
Mixture-of-Variational-Experts for Continual Learning
Authors:
Heinke Hihn,
Daniel A. Braun
Abstract:
One weakness of machine learning algorithms is the poor ability of models to solve new problems without forgetting previously acquired knowledge. The Continual Learning (CL) paradigm has emerged as a protocol to systematically investigate settings where the model sequentially observes samples generated by a series of tasks. In this work, we take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle that facilitates a trade-off between learning and forgetting. We discuss this principle from a Bayesian perspective and show its connections to previous approaches to CL. Based on this principle, we propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths through the network which is governed by a gating policy. Due to the general formulation based on generic utility functions, we can apply this optimality principle to a large variety of learning problems, including supervised learning, reinforcement learning, and generative modeling. We demonstrate the competitive performance of our method in continual supervised learning and in continual reinforcement learning.
Submitted 1 March, 2022; v1 submitted 25 October, 2021;
originally announced October 2021.
-
Estimating a Causal Exposure Response Function with a Continuous Error-Prone Exposure: A Study of Fine Particulate Matter and All-Cause Mortality
Authors:
Kevin P. Josey,
Priyanka deSouza,
Xiao Wu,
Danielle Braun,
Rachel Nethery
Abstract:
Numerous studies have examined the associations between long-term exposure to fine particulate matter (PM2.5) and adverse health outcomes. Recently, many of these studies have begun to employ high-resolution predicted PM2.5 concentrations, which are subject to measurement error. Previous approaches for exposure measurement error correction have either been applied in non-causal settings or have only considered a categorical exposure. Moreover, most procedures have failed to account for uncertainty induced by error correction when fitting an exposure-response function (ERF). To remedy these deficiencies, we develop a multiple imputation framework that combines regression calibration and Bayesian techniques to estimate a causal ERF. We demonstrate how the output of the measurement error correction steps can be seamlessly integrated into a Bayesian additive regression trees (BART) estimator of the causal ERF. We also demonstrate how locally-weighted smoothing of the posterior samples from BART can be used to create a more accurate ERF estimate. Our proposed approach also properly propagates the exposure measurement error uncertainty to yield accurate standard error estimates. We assess the robustness of our proposed approach in an extensive simulation study. We then apply our methodology to estimate the effects of PM2.5 on all-cause mortality among Medicare enrollees in New England from 2000-2012.
Submitted 28 November, 2022; v1 submitted 30 September, 2021;
originally announced September 2021.
-
Statistical methods for Mendelian models with multiple genes and cancers
Authors:
Jane W. Liang,
Gregory E. Idos,
Christine Hong,
Stephen B. Gruber,
Giovanni Parmigiani,
Danielle Braun
Abstract:
Risk evaluation to identify individuals who are at greater risk of cancer as a result of heritable pathogenic variants is a valuable component of individualized clinical management. Using principles of Mendelian genetics, Bayesian probability theory, and variant-specific knowledge, Mendelian models derive the probability of carrying a pathogenic variant and developing cancer in the future, based on family history. Existing Mendelian models are widely employed, but are generally limited to specific genes and syndromes. However, the upsurge of multi-gene panel germline testing has spurred the discovery of many new gene-cancer associations that are not presently accounted for in these models. We have developed PanelPRO, a flexible, efficient Mendelian risk prediction framework that can incorporate an arbitrary number of genes and cancers, overcoming the computational challenges that arise because of the increased model complexity. We implement an eleven-gene, eleven-cancer model, the largest Mendelian model created thus far, based on this framework. Using simulations and a clinical cohort with germline panel testing data, we evaluate model performance, validate the reverse-compatibility of our approach with existing Mendelian models, and illustrate its usage. Our implementation is freely available for research use in the PanelPRO R package.
Submitted 7 May, 2022; v1 submitted 27 August, 2021;
originally announced August 2021.
-
SNIP: An Adaptation of Sorted Neighborhood Methods for Deduplicating Pedigree Data
Authors:
Theodore Huang,
Matthew Ploenzke,
Danielle Braun
Abstract:
Pedigree data contain family history information that is used to analyze hereditary diseases. These clinical data sets may contain duplicate records due to the same family visiting a clinic multiple times or a clinician entering multiple versions of the family for testing purposes. Drawing inferences from the data, or using them for training or validation, without removing the duplicates could lead to invalid conclusions, so identifying the duplicates is essential. Since family structures can be complex, existing deduplication algorithms cannot be applied directly. We first motivate the importance of deduplication by examining the impact of pedigree duplicates on the training and validation of a familial risk prediction model. We then introduce an unsupervised algorithm, which we call SNIP (Sorted NeIghborhood for Pedigrees), that builds on the sorted neighborhood method to efficiently find and classify pairwise comparisons by leveraging the inherent hierarchical nature of the pedigrees. We conduct a simulation study to assess the performance of the algorithm and find parameter configurations where the algorithm is able to accurately detect the duplicates. We then apply the method to data from the Risk Service, which includes over 300,000 pedigrees at high risk of hereditary cancers, and uncover large clusters of potential duplicate families. After removing 104,520 pedigrees (33% of original data), the resulting Risk Service dataset can now be used for future analysis, training, and validation. The algorithm is implemented in the R package snipR, available at https://github.com/bayesmendel/snipR.
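The sorted neighborhood method that SNIP builds on can be sketched in a few lines. The Python example below shows only the generic technique (sort records by a key, then compare each record with its nearest neighbors in the sorted order); it is not the SNIP algorithm itself, and the function name, the `key` argument, and the `window` parameter are hypothetical choices made for illustration.

```python
def sorted_neighborhood_pairs(records, key, window):
    """Return candidate duplicate pairs using a generic sorted neighborhood pass.

    Records are sorted by `key`, and each record is paired only with the
    next `window - 1` records in sorted order, rather than with every
    other record.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    ordered = sorted(records, key=key)
    pairs = []
    for i, record in enumerate(ordered):
        # Only neighbors inside the sliding window become candidate pairs.
        for neighbor in ordered[i + 1 : i + window]:
            pairs.append((record, neighbor))
    return pairs


# Toy usage: integers stand in for pedigree records, sorted by their value.
candidates = sorted_neighborhood_pairs([3, 1, 2, 10], key=lambda r: r, window=2)
```

The appeal of this design is that the number of comparisons grows with the number of records times the window size, instead of with the square of the number of records.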
Submitted 19 August, 2021;
originally announced August 2021.
-
Prediction of Hereditary Cancers Using Neural Networks
Authors:
Zoe Guan,
Giovanni Parmigiani,
Danielle Braun,
Lorenzo Trippa
Abstract:
Family history is a major risk factor for many types of cancer. Mendelian risk prediction models translate family histories into cancer risk predictions based on knowledge of cancer susceptibility genes. These models are widely used in clinical practice to help identify high-risk individuals. Mendelian models leverage the entire family history, but they rely on many assumptions about cancer susceptibility genes that are either unrealistic or challenging to validate due to low mutation prevalence. Training more flexible models, such as neural networks, on large databases of pedigrees can potentially lead to accuracy gains. In this paper, we develop a framework to apply neural networks to family history data and investigate their ability to learn inherited susceptibility to cancer. While there is an extensive literature on neural networks and their state-of-the-art performance in many tasks, there is little work applying them to family history data. We propose adaptations of fully-connected neural networks and convolutional neural networks to pedigrees. In data simulated under Mendelian inheritance, we demonstrate that our proposed neural network models are able to achieve nearly optimal prediction performance. Moreover, when the observed family history includes misreported cancer diagnoses, neural networks are able to outperform the Mendelian BRCAPRO model embedding the correct inheritance laws. Using a large dataset of over 200,000 family histories, the Risk Service cohort, we train prediction models for future risk of breast cancer. We validate the models using data from the Cancer Genetics Network.
Submitted 25 June, 2021;
originally announced June 2021.
-
Extending Models Via Gradient Boosting: An Application to Mendelian Models
Authors:
Theodore Huang,
Gregory Idos,
Christine Hong,
Stephen Gruber,
Giovanni Parmigiani,
Danielle Braun
Abstract:
Improving existing widely-adopted prediction models is often a more efficient and robust path to progress than training new models from scratch. Existing models may (a) incorporate complex mechanistic knowledge, (b) leverage proprietary information, and (c) have surmounted barriers to adoption. Compared to model training, model improvement and modification receive little attention. In this paper we propose a general approach to model improvement: we combine gradient boosting with any previously developed model to improve model performance while retaining important existing characteristics. To exemplify, we consider the context of Mendelian models, which estimate the probability of carrying genetic mutations that confer susceptibility to disease by using family pedigrees and health histories of family members. Via simulations we show that integration of gradient boosting with an existing Mendelian model can produce an improved model that outperforms both that model and the model built using gradient boosting alone. We illustrate the approach on genetic testing data from the USC-Stanford Cancer Genetics Hereditary Cancer Panel (HCP) study.
Submitted 13 May, 2021;
originally announced May 2021.
-
A Bayesian Gaussian Process for Estimating a Causal Exposure Response Curve in Environmental Epidemiology
Authors:
Boyu Ren,
Xiao Wu,
Danielle Braun,
Natesh Pillai,
Francesca Dominici
Abstract:
Motivated by environmental policy questions, we address the challenges of estimation, change point detection, and uncertainty quantification of a causal exposure-response function (CERF). Under a potential outcome framework, the CERF describes the relationship between a continuously varying exposure (or treatment) and its causal effect on an outcome. We propose a new Bayesian approach that relies on a Gaussian process (GP) model to estimate the CERF nonparametrically. To achieve the desired separation of design and analysis phases, we parametrize the covariance (kernel) function of the GP to mimic matching via a Generalized Propensity Score (GPS). The hyper-parameters as well as the form of the kernel function of the GP are chosen to optimize covariate balance. Our approach achieves automatic uncertainty evaluation of the CERF with high computational efficiency, and enables change point detection through inference on derivatives of the CERF. We provide theoretical results showing the correspondence between our Bayesian GP framework and traditional approaches in causal inference for estimating causal effects of a continuous exposure. We apply the methods to 520,711 ZIP-code-level observations to estimate the causal effect of long-term exposures to PM2.5, ozone, and NO2 on all-cause mortality among Medicare enrollees in the US. A computationally efficient implementation of the proposed GP models is provided in the GPCERF R package, which is available on CRAN.
Submitted 25 January, 2023; v1 submitted 7 May, 2021;
originally announced May 2021.
-
Assessing the causal effects of a stochastic intervention in time series data: Are heat alerts effective in preventing deaths and hospitalizations?
Authors:
Xiao Wu,
Kate R. Weinberger,
Gregory A. Wellenius,
Francesca Dominici,
Danielle Braun
Abstract:
The methodological development of this paper is motivated by the need to address the following scientific question: does the issuance of heat alerts prevent adverse health effects? Our goal is to address this question within a causal inference framework in the context of time series data. A key challenge is that causal inference methods require the overlap assumption to hold: each unit (i.e., a day) must have a positive probability of receiving the treatment (i.e., issuing a heat alert on that day). In our motivating example, the overlap assumption is often violated: the probability of issuing a heat alert on a cooler day is zero. To overcome this challenge, we propose a stochastic intervention for time series data which is implemented via an incremental time-varying propensity score (ItvPS). The ItvPS intervention is executed by multiplying the probability of issuing a heat alert on day $t$ -- conditional on past information up to day $t$ -- by an odds ratio $δ_t$. First, we introduce a new class of causal estimands that relies on the ItvPS intervention. We provide theoretical results to show that these causal estimands can be identified and estimated under a weaker version of the overlap assumption. Second, we propose nonparametric estimators based on the ItvPS and derive an upper bound for the variances of these estimators. Third, we extend this framework to multi-site time series using a spatial meta-analysis approach. Fourth, we show that the proposed estimators perform well in terms of bias and root mean squared error via simulations. Finally, we apply our proposed approach to estimate the causal effects of increasing the probability of issuing heat alerts on each warm-season day in reducing deaths and hospitalizations among Medicare enrollees in $2,837$ U.S. counties.
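The ItvPS intervention described in this abstract can be illustrated numerically. The Python sketch below assumes the shift acts on the odds scale, so that the odds of issuing an alert on day $t$ are multiplied by $δ_t$; the function name `shift_propensity` and this exact functional form are illustrative assumptions and are not taken from the paper.

```python
def shift_propensity(p, delta):
    """Return the probability whose odds equal `delta` times the odds of `p`.

    Assumed form: q = delta * p / (delta * p + 1 - p).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    return delta * p / (delta * p + 1.0 - p)


# Doubling the odds of an alert on a day with a 50% baseline probability.
shifted = shift_propensity(0.5, 2.0)
# A cooler day with zero baseline probability keeps probability zero.
cool_day = shift_propensity(0.0, 2.0)
```

Under this assumed form, a day with zero probability of an alert stays at zero, which suggests why an intervention of this kind need not rely on the full overlap assumption.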
Submitted 29 August, 2022; v1 submitted 20 February, 2021;
originally announced February 2021.
-
Specialization in Hierarchical Learning Systems
Authors:
Heinke Hihn,
Daniel A. Braun
Abstract:
Joining multiple decision-makers together is a powerful way to obtain more sophisticated decision-making systems, but it requires addressing the questions of division of labor and specialization. We investigate to what extent information constraints in hierarchies of experts not only provide a principled method for regularization but also enforce specialization. In particular, we devise an information-theoretically motivated on-line learning rule that allows partitioning of the problem space into multiple sub-problems that can be solved by the individual experts. We demonstrate two different ways to apply our method: (i) partitioning problems based on individual data samples and (ii) based on sets of data samples representing tasks. Approach (i) equips the system with the ability to solve complex decision-making problems by finding an optimal combination of local expert decision-makers. Approach (ii) leads to decision-makers specialized in solving families of tasks, which equips the system with the ability to solve meta-learning problems. We show the broad applicability of our approach on a range of problems including classification, regression, density estimation, and reinforcement learning problems, both in the standard machine learning setup and in a meta-learning setting.
Submitted 3 November, 2020;
originally announced November 2020.
-
PanelPRO: A R package for multi-syndrome, multi-gene risk modeling for individuals with a family history of cancer
Authors:
Gavin Lee,
Qing Zhang,
Jane W. Liang,
Theodore Huang,
Christine Choirat,
Giovanni Parmigiani,
Danielle Braun
Abstract:
Identifying individuals who are at high risk of cancer due to inherited germline mutations is critical for effective implementation of personalized prevention strategies. Most existing models to identify these individuals focus on specific syndromes by including family and personal history for a small number of cancers. Recent evidence from multi-gene panel testing has shown that many syndromes once thought to be distinct are overlapping, motivating the development of models that incorporate family history information on several cancers and predict mutations for more comprehensive panels of genes.
One such class of models is Mendelian risk prediction models, which use family history information and Mendelian laws of inheritance to estimate the probability of carrying genetic mutations, as well as future risk of developing associated cancers. To flexibly model the complexity of many cancer-mutation associations, we present a new software tool called PanelPRO, an R package that extends the previously developed BayesMendel R package to user-selected lists of susceptibility genes and associated cancers. The model identifies individuals at an increased risk of carrying cancer susceptibility gene mutations and predicts future risk of developing hereditary cancers associated with those genes. Additional functionalities adjust for prophylactic interventions, known genetic testing results, and risk modifiers such as race and ancestry. The package comes with a customizable database with default parameter values estimated from published studies.
The PanelPRO package is open-source and provides a fast and flexible back-end for multi-gene, multi-cancer risk modeling with pedigree data. The software enables the identification of high-risk individuals, which will have an impact on personalized prevention strategies for cancer and individualized decision making about genetic testing.
Submitted 24 October, 2020;
originally announced October 2020.
-
Combining Breast Cancer Risk Prediction Models
Authors:
Zoe Guan,
Theodore Huang,
Anne Marie McCarthy,
Kevin S. Hughes,
Alan Semine,
Hajime Uno,
Lorenzo Trippa,
Giovanni Parmigiani,
Danielle Braun
Abstract:
Accurate risk stratification is key to reducing cancer morbidity through targeted screening and preventative interventions. Numerous breast cancer risk prediction models have been developed, but they often give predictions with conflicting clinical implications. Integrating information from different models may improve the accuracy of risk predictions, which would be valuable for both clinicians and patients. BRCAPRO and BCRAT are two widely used models based on largely complementary sets of risk factors. BRCAPRO is a Bayesian model that uses detailed family history information to estimate the probability of carrying a BRCA1/2 mutation, as well as future risk of breast and ovarian cancer, based on mutation prevalence and penetrance (age-specific probability of developing cancer given genotype). BCRAT uses a relative hazard model based on first-degree family history and non-genetic risk factors. We consider two approaches for combining BRCAPRO and BCRAT: 1) modifying the penetrance functions in BRCAPRO using relative hazard estimates from BCRAT, and 2) training an ensemble model that takes as input BRCAPRO and BCRAT predictions. We show that the combination models achieve performance gains over BRCAPRO and BCRAT in simulations and data from the Cancer Genetics Network.
Submitted 31 July, 2020;
originally announced August 2020.
-
Hierarchical Expert Networks for Meta-Learning
Authors:
Heinke Hihn,
Daniel A. Braun
Abstract:
The goal of meta-learning is to train a model on a variety of learning tasks, such that it can adapt to new problems within only a few iterations. Here we propose a principled information-theoretic model that optimally partitions the underlying problem space such that specialized expert decision-makers solve the resulting sub-problems. To drive this specialization we impose the same kind of information processing constraints both on the partitioning and the expert decision-makers. We argue that this specialization leads to efficient adaptation to new tasks. To demonstrate the generality of our approach we evaluate three meta-learning domains: image classification, regression, and reinforcement learning.
Submitted 9 September, 2020; v1 submitted 31 October, 2019;
originally announced November 2019.
-
An Information-theoretic On-line Learning Principle for Specialization in Hierarchical Decision-Making Systems
Authors:
Heinke Hihn,
Sebastian Gottwald,
Daniel A. Braun
Abstract:
Information-theoretic bounded rationality describes utility-optimizing decision-makers whose limited information-processing capabilities are formalized by information constraints. One of the consequences of bounded rationality is that resource-limited decision-makers can join together to solve decision-making problems that are beyond the capabilities of each individual. Here, we study an information-theoretic principle that drives division of labor and specialization when decision-makers with information constraints are joined together. We devise an on-line learning rule of this principle that learns a partitioning of the problem space such that it can be solved by specialized linear policies. We demonstrate the approach for decision-making problems whose complexity exceeds the capabilities of individual decision-makers, but can be solved by combining the decision-makers optimally. The strength of the model is that it is abstract and principled, yet has direct applications in classification, regression, reinforcement learning and adaptive control.
Submitted 5 December, 2019; v1 submitted 26 July, 2019;
originally announced July 2019.
-
Matching on Generalized Propensity Scores with Continuous Exposures
Authors:
Xiao Wu,
Fabrizia Mealli,
Marianthi-Anna Kioumourtzoglou,
Francesca Dominici,
Danielle Braun
Abstract:
In the context of a binary treatment, matching is a well-established approach in causal inference. However, in the context of a continuous treatment or exposure, matching is still underdeveloped. We propose an innovative matching approach to estimate an average causal exposure-response function under the setting of continuous exposures that relies on the generalized propensity score (GPS). Our approach maintains the following attractive features of matching: a) clear separation between the design and the analysis; b) robustness to model misspecification or to the presence of extreme values of the estimated GPS; c) straightforward assessment of covariate balance. We first introduce an assumption of identifiability, called local weak unconfoundedness. Under this assumption and mild smoothness conditions, we provide theoretical guarantees that our proposed matching estimator attains point-wise consistency and asymptotic normality. In simulations, our proposed matching approach outperforms existing methods under settings of model misspecification or the presence of extreme values of the estimated GPS. We apply our proposed method to estimate the average causal exposure-response function between long-term PM$_{2.5}$ exposure and all-cause mortality among 68.5 million Medicare enrollees, 2000-2016. We found strong evidence of a harmful effect of long-term PM$_{2.5}$ exposure on mortality. Code for the proposed matching approach is provided in the CausalGPS R package, which is available on CRAN and provides a computationally efficient implementation.
Submitted 18 August, 2021; v1 submitted 16 December, 2018;
originally announced December 2018.
-
Bounded Rational Decision-Making with Adaptive Neural Network Priors
Authors:
Heinke Hihn,
Sebastian Gottwald,
Daniel A. Braun
Abstract:
Bounded rationality investigates utility-optimizing decision-makers with limited information-processing power. In particular, information-theoretic bounded rationality models formalize resource constraints abstractly in terms of relative Shannon information, namely the Kullback-Leibler divergence between the agent's prior and posterior policy. Between prior and posterior lies an anytime deliberation process that can be instantiated by sample-based evaluations of the utility function through Markov Chain Monte Carlo (MCMC) optimization. The simplest model assumes a fixed prior and can relate abstract information-theoretic processing costs to the number of sample evaluations. However, more advanced models would also address the question of learning, that is, how the prior is adapted over time such that generated prior proposals become more efficient. In this work we investigate generative neural networks as priors that are optimized concurrently with anytime sample-based decision-making processes such as MCMC. We evaluate this approach on toy examples.
Submitted 4 September, 2018;
originally announced September 2018.
-
airpred: A Flexible R Package Implementing Methods for Predicting Air Pollution
Authors:
M. Benjamin Sabath,
Qian Di,
Danielle Braun,
Joel Schwarz,
Francesca Dominici,
Christine Choirat
Abstract:
Fine particulate matter (PM$_{2.5}$) is one of the criteria air pollutants regulated by the Environmental Protection Agency in the United States. There is strong evidence that ambient exposure to PM$_{2.5}$ increases risk of mortality and hospitalization. Large-scale epidemiological studies on the health effects of PM$_{2.5}$ provide the necessary evidence base for lowering the safety standards and inform regulatory policy. However, ambient monitors of PM$_{2.5}$ (as well as monitors for other pollutants) are sparsely located across the U.S., and therefore studies based only on the levels of PM$_{2.5}$ measured from the monitors would inevitably exclude large portions of the population. One approach to resolving this issue has been developing models to predict local PM$_{2.5}$, NO$_2$, and ozone based on satellite, meteorological, and land use data. Predicting levels of air pollution in unmonitored areas typically requires developing a prediction model that relies on large amounts of input data and is highly computationally intensive. We have developed a flexible R package that allows environmental health researchers to design and train spatio-temporal models capable of predicting multiple pollutants, including PM$_{2.5}$. We utilize H2O, an open source big data platform, to achieve both performance and scalability when used in conjunction with cloud or cluster computing systems.
Submitted 30 October, 2018; v1 submitted 29 May, 2018;
originally announced May 2018.
-
Causal inference in the context of an error prone exposure: air pollution and mortality
Authors:
Xiao Wu,
Danielle Braun,
Marianthi-Anna Kioumourtzoglou,
Christine Choirat,
Qian Di,
Francesca Dominici
Abstract:
We propose a new approach for estimating causal effects when the exposure is measured with error and confounding adjustment is performed via a generalized propensity score (GPS). Using validation data, we propose a regression calibration (RC)-based adjustment for a continuous error-prone exposure combined with GPS to adjust for confounding (RC-GPS). The outcome analysis is conducted after transforming the corrected continuous exposure into a categorical exposure. We consider confounding adjustment in the context of GPS subclassification, inverse probability treatment weighting (IPTW) and matching. In simulations with varying degrees of exposure error and confounding bias, RC-GPS eliminates bias from exposure error and confounding compared to standard approaches that rely on the error-prone exposure. We applied RC-GPS to a rich data platform to estimate the causal effect of long-term exposure to fine particles ($PM_{2.5}$) on mortality in New England for the period from 2000 to 2012. The main study consists of $2,202$ zip codes covered by $217,660$ 1km $\times$ 1km grid cells with yearly mortality rates, yearly $PM_{2.5}$ averages estimated from a spatio-temporal model (error-prone exposure) and several potential confounders. The internal validation study includes a subset of 83 1km $\times$ 1km grid cells within 75 zip codes from the main study with error-free yearly $PM_{2.5}$ exposures obtained from monitor stations. Under assumptions of non-interference and weak unconfoundedness, using matching we found that exposure to moderate levels of $PM_{2.5}$ ($8 <$ $PM_{2.5}$ $\leq 10\ {\rm μg/m^3}$) causes a $2.8\%$ ($95\%$ CI: $0.6\%, 3.6\%$) increase in all-cause mortality compared to low exposure ($PM_{2.5}$ $\leq 8\ {\rm μg/m^3}$).
Submitted 28 June, 2018; v1 submitted 2 December, 2017;
originally announced December 2017.
-
Hierarchical State Abstractions for Decision-Making Problems with Computational Constraints
Authors:
Daniel T. Larsson,
Daniel Braun,
Panagiotis Tsiotras
Abstract:
In this semi-tutorial paper, we first review the information-theoretic approach to account for the computational costs incurred during the search for optimal actions in a sequential decision-making problem. The traditional (MDP) framework ignores computational limitations while searching for optimal policies, essentially assuming that the acting agent is perfectly rational and aims for exact optim…
▽ More
In this semi-tutorial paper, we first review the information-theoretic approach to account for the computational costs incurred during the search for optimal actions in a sequential decision-making problem. The traditional (MDP) framework ignores computational limitations while searching for optimal policies, essentially assuming that the acting agent is perfectly rational and aims for exact optimality. Using the free-energy, a variational principle is introduced that accounts not only for the value of a policy alone, but also considers the cost of finding this optimal policy. The solution of the variational equations arising from this formulation can be obtained using familiar Bellman-like value iterations from dynamic programming (DP) and the Blahut-Arimoto (BA) algorithm from rate distortion theory. Finally, we demonstrate the utility of the approach for generating hierarchies of state abstractions that can be used to best exploit the available computational resources. A numerical example showcases these concepts for a path-planning problem in a grid world environment.
Submitted 22 October, 2017;
originally announced October 2017.
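For readers unfamiliar with the Blahut-Arimoto iteration mentioned in the abstract above, the following is a minimal sketch of its textbook rate-distortion form, not the authors' implementation. The function name `blahut_arimoto`, the distortion matrix, and the trade-off parameter `beta` are illustrative assumptions; the paper applies the same alternating scheme to state abstractions rather than to a generic distortion.

```python
import math

def blahut_arimoto(p_x, distortion, beta, iters=200):
    """Textbook Blahut-Arimoto iteration for rate-distortion (illustrative).

    p_x        : source distribution over inputs x
    distortion : distortion[i][j] = d(x_i, y_j)
    beta       : trade-off parameter; larger beta tolerates less distortion
    Returns the conditional p(y|x) and the marginal q(y).
    """
    n_x, n_y = len(p_x), len(distortion[0])
    q = [1.0 / n_y] * n_y  # start from a uniform marginal over y
    for _ in range(iters):
        # Step 1: p(y|x) proportional to q(y) * exp(-beta * d(x, y))
        cond = []
        for i in range(n_x):
            w = [q[j] * math.exp(-beta * distortion[i][j]) for j in range(n_y)]
            z = sum(w)
            cond.append([v / z for v in w])
        # Step 2: q(y) = sum_x p(x) p(y|x)
        q = [sum(p_x[i] * cond[i][j] for i in range(n_x)) for j in range(n_y)]
    return cond, q
```

With `beta = 0` the conditional ignores the input entirely (a maximally coarse abstraction), while a large `beta` drives it toward a deterministic mapping that minimizes distortion.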
-
Information-Theoretic Bounded Rationality
Authors:
Pedro A. Ortega,
Daniel A. Braun,
Justin Dyer,
Kee-Eung Kim,
Naftali Tishby
Abstract:
Bounded rationality, that is, decision-making and planning under resource limitations, is widely regarded as an important open problem in artificial intelligence, reinforcement learning, computational neuroscience and economics. This paper offers a consolidated presentation of a theory of bounded rationality based on information-theoretic ideas. We provide a conceptual justification for using the free energy functional as the objective function for characterizing bounded-rational decisions. This functional possesses three crucial properties: it controls the size of the solution space; it has Monte Carlo planners that are exact, yet bypass the need for exhaustive search; and it captures model uncertainty arising from lack of evidence or from interacting with other agents having unknown intentions. We discuss the single-step decision-making case, and show how to extend it to sequential decisions using equivalence transformations. This extension yields a very general class of decision problems that encompass classical decision rules (e.g. EXPECTIMAX and MINIMAX) as limit cases, as well as trust- and risk-sensitive planning.
Submitted 21 December, 2015;
originally announced December 2015.
-
Abstraction in decision-makers with limited information processing capabilities
Authors:
Tim Genewein,
Daniel A. Braun
Abstract:
A distinctive property of human and animal intelligence is the ability to form abstractions by neglecting irrelevant information, which allows structure to be separated from noise. From an information-theoretic point of view, abstractions are desirable because they allow for very efficient information processing. In artificial systems, abstractions are often implemented through computationally costly formations of groups or clusters. In this work we establish the relation between the free-energy framework for decision-making and rate-distortion theory, and demonstrate how the application of rate-distortion theory to decision-making leads to the emergence of abstractions. We argue that abstractions are induced by a limit in information processing capacity.
Submitted 19 December, 2013; v1 submitted 16 December, 2013;
originally announced December 2013.
-
Precision Measurements of Temperature and Chemical Potential of Quantum Gases
Authors:
Ugo Marzolino,
Daniel Braun
Abstract:
We investigate the sensitivity with which the temperature and the chemical potential characterizing quantum gases can be measured. We calculate the corresponding quantum Fisher information matrices for both fermionic and bosonic gases. For the latter, particular attention is devoted to the situation close to the Bose-Einstein condensation transition, which we examine not only for the standard scenario in three dimensions, but also for generalized condensation in lower dimensions, where the bosons condense in a subspace of Hilbert space instead of a unique ground state, as well as condensation at fixed volume or fixed pressure. We show that Bose-Einstein condensation can lead to sub-shot-noise sensitivity for the measurement of the chemical potential. We also examine the influence of interactions on the sensitivity in three different models, and show that mean-field and contact interactions deteriorate the sensitivity, but only slightly for experimentally accessible weak interactions.
Submitted 24 November, 2013; v1 submitted 12 August, 2013;
originally announced August 2013.
-
Generalized Thompson Sampling for Sequential Decision-Making and Causal Inference
Authors:
Pedro A. Ortega,
Daniel A. Braun
Abstract:
Recently, it has been shown how sampling actions from the predictive distribution over the optimal action (sometimes called Thompson sampling) can be applied to solve sequential adaptive control problems when the optimal policy is known for each possible environment. The predictive distribution can then be constructed by a Bayesian superposition of the optimal policies, weighted by their posterior probability, which is updated by Bayesian inference and causal calculus. Here we discuss three important features of this approach. First, we discuss to what extent such Thompson sampling can be regarded as a natural consequence of the Bayesian modeling of policy uncertainty. Second, we show how Thompson sampling can be used to study interactions between multiple adaptive agents, thus opening up an avenue of game-theoretic analysis. Third, we show how Thompson sampling can be applied to infer causal relationships when interacting with an environment in a sequential fashion. In summary, our results suggest that Thompson sampling might not merely be a useful heuristic, but a principled method to address problems of adaptive sequential decision-making and causal inference.
Submitted 18 March, 2013;
originally announced March 2013.
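As a concrete illustration of the basic idea in the abstract above (sample an environment from the posterior, then act optimally for that sample), here is a minimal sketch of Thompson sampling on a Bernoulli bandit. It is the standard Beta-Bernoulli special case, not the generalized scheme of the paper; the function name `thompson_bernoulli`, the uniform Beta(1, 1) priors, and the arm means are assumptions chosen for illustration.

```python
import random

def thompson_bernoulli(true_means, rounds=2000, seed=0):
    """Beta-Bernoulli Thompson sampling (illustrative special case).

    true_means : hidden success probability of each arm
    Returns how often each arm was pulled.
    """
    rng = random.Random(seed)
    n = len(true_means)
    wins = [1] * n    # Beta posterior parameter alpha (prior Beta(1, 1))
    losses = [1] * n  # Beta posterior parameter beta
    pulls = [0] * n
    for _ in range(rounds):
        # Sample one plausible environment from the current posterior.
        samples = [rng.betavariate(wins[a], losses[a]) for a in range(n)]
        # Act optimally for the sampled environment.
        arm = max(range(n), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        # Bayesian update of the posterior for the chosen arm.
        wins[arm] += reward
        losses[arm] += 1 - reward
        pulls[arm] += 1
    return pulls
```

Because actions are drawn in proportion to their posterior probability of being optimal, exploration fades on its own as evidence accumulates, and most pulls end up on the arm with the highest mean.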
-
A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function
Authors:
Pedro A. Ortega,
Jordi Grau-Moya,
Tim Genewein,
David Balduzzi,
Daniel A. Braun
Abstract:
We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions. Previous work has focused on representing possible functions explicitly, which leads to a two-step procedure of first, doing inference over the function space and second, finding the extrema of these functions. Here we skip the representation step and directly model the distribution over extrema. To this end, we devise a non-parametric conjugate prior based on a kernel regressor. The resulting posterior distribution directly captures the uncertainty over the maximum of the unknown function. We illustrate the effectiveness of our model by optimizing a noisy, high-dimensional, non-convex objective function.
Submitted 10 November, 2012; v1 submitted 8 June, 2012;
originally announced June 2012.
-
Free Energy and the Generalized Optimality Equations for Sequential Decision Making
Authors:
Pedro A. Ortega,
Daniel A. Braun
Abstract:
The free energy functional has recently been proposed as a variational principle for bounded rational decision-making, since it instantiates a natural trade-off between utility gains and information processing costs that can be axiomatically derived. Here we apply the free energy principle to general decision trees that include both adversarial and stochastic environments. We derive generalized sequential optimality equations that not only include the Bellman optimality equations as a limit case, but also lead to well-known decision rules such as Expectimax, Minimax and Expectiminimax. We show how these decision rules can be derived from a single free energy principle that assigns a resource parameter to each node in the decision tree. These resource parameters express a concrete computational cost that can be measured as the number of samples that are needed from the distribution that belongs to each node. The free energy principle therefore provides the normative basis for generalized optimality equations that account for both adversarial and stochastic environments.
Submitted 17 May, 2012;
originally announced May 2012.
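The way a single resource parameter can recover several decision rules, as described in the abstract above, can be illustrated with the log-sum-exp form commonly used for the free energy of a single node, $\frac{1}{\beta}\log\sum_x p(x)\,e^{\beta U(x)}$. The sketch below assumes that form; the function name `free_energy_value` and the example numbers are illustrative and not taken from the paper.

```python
import math

def free_energy_value(utilities, probs, beta):
    """Certainty-equivalent value (1/beta) * log sum_x p(x) exp(beta * U(x)).

    beta -> +inf : approaches max U (maximizing node)
    beta -> 0    : approaches the expectation of U (chance node)
    beta -> -inf : approaches min U (adversarial node)
    Assumes all entries of probs are strictly positive.
    """
    if beta == 0:
        return sum(p * u for p, u in zip(probs, utilities))
    scaled = [beta * u for u in utilities]
    m = max(scaled)  # subtract the maximum exponent for numerical stability
    log_sum = m + math.log(sum(p * math.exp(s - m) for p, s in zip(probs, scaled)))
    return log_sum / beta
```

Assigning a different `beta` to each node of a decision tree then interpolates between maximizing, averaging, and minimizing at that node, which is how Expectimax-like and Minimax-like backups can arise as limit cases of one expression.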