-
Hybrid Summary Statistics
Authors:
T. Lucas Makinen,
Ce Sui,
Benjamin D. Wandelt,
Natalia Porqueres,
Alan Heavens
Abstract:
We present a way to capture high-information posteriors from training sets that are sparsely sampled over the parameter space for robust simulation-based inference. In physical inference problems, we can often apply domain knowledge to define traditional summary statistics to capture some of the information in a dataset. We show that augmenting these statistics with neural network outputs to maxim…
▽ More
We present a way to capture high-information posteriors from training sets that are sparsely sampled over the parameter space for robust simulation-based inference. In physical inference problems, we can often apply domain knowledge to define traditional summary statistics to capture some of the information in a dataset. We show that augmenting these statistics with neural network outputs to maximise the mutual information improves information extraction compared to neural summaries alone or their concatenation to existing summaries and makes inference robust in settings with low training data. We introduce 1) two loss formalisms to achieve this and 2) apply the technique to two different cosmological datasets to extract non-Gaussian parameter information.
△ Less
Submitted 9 October, 2024;
originally announced October 2024.
-
CHARM: Creating Halos with Auto-Regressive Multi-stage networks
Authors:
Shivam Pandey,
Chirag Modi,
Benjamin D. Wandelt,
Deaglan J. Bartlett,
Adrian E. Bayer,
Greg L. Bryan,
Matthew Ho,
Guilhem Lavaux,
T. Lucas Makinen,
Francisco Villaescusa-Navarro
Abstract:
To maximize the amount of information extracted from cosmological datasets, simulations that accurately represent these observations are necessary. However, traditional simulations that evolve particles under gravity by estimating particle-particle interactions (N-body simulations) are computationally expensive and prohibitive to scale to the large volumes and resolutions necessary for the upcomin…
▽ More
To maximize the amount of information extracted from cosmological datasets, simulations that accurately represent these observations are necessary. However, traditional simulations that evolve particles under gravity by estimating particle-particle interactions (N-body simulations) are computationally expensive and prohibitive to scale to the large volumes and resolutions necessary for the upcoming datasets. Moreover, modeling the distribution of galaxies typically involves identifying virialized dark matter halos, which is also a time- and memory-consuming process for large N-body simulations, further exacerbating the computational cost. In this study, we introduce CHARM, a novel method for creating mock halo catalogs by matching the spatial, mass, and velocity statistics of halos directly from the large-scale distribution of the dark matter density field. We develop multi-stage neural spline flow-based networks to learn this mapping at redshift z=0.5 directly with computationally cheaper low-resolution particle mesh simulations instead of relying on the high-resolution N-body simulations. We show that the mock halo catalogs and painted galaxy catalogs have the same statistical properties as obtained from $N$-body simulations in both real space and redshift space. Finally, we use these mock catalogs for cosmological inference using redshift-space galaxy power spectrum, bispectrum, and wavelet-based statistics using simulation-based inference, performing the first inference with accelerated forward model simulations and finding unbiased cosmological constraints with well-calibrated posteriors. The code was developed as part of the Simons Collaboration on Learning the Universe and is publicly available at \url{https://github.com/shivampcosmo/CHARM}.
△ Less
Submitted 13 September, 2024;
originally announced September 2024.
-
Hybrid summary statistics: neural weak lensing inference beyond the power spectrum
Authors:
T. Lucas Makinen,
Alan Heavens,
Natalia Porqueres,
Tom Charnock,
Axel Lapel,
Benjamin D. Wandelt
Abstract:
In inference problems, we often have domain knowledge which allows us to define summary statistics that capture most of the information content in a dataset. In this paper, we present a hybrid approach, where such physics-based summaries are augmented by a set of compressed neural summary statistics that are optimised to extract the extra information that is not captured by the predefined summarie…
▽ More
In inference problems, we often have domain knowledge which allows us to define summary statistics that capture most of the information content in a dataset. In this paper, we present a hybrid approach, where such physics-based summaries are augmented by a set of compressed neural summary statistics that are optimised to extract the extra information that is not captured by the predefined summaries. The resulting statistics are very powerful inputs to simulation-based or implicit inference of model parameters. We apply this generalisation of Information Maximising Neural Networks (IMNNs) to parameter constraints from tomographic weak gravitational lensing convergence maps to find summary statistics that are explicitly optimised to complement angular power spectrum estimates. We study several dark matter simulation resolutions in low- and high-noise regimes. We show that i) the information-update formalism extracts at least $3\times$ and up to $8\times$ as much information as the angular power spectrum in all noise regimes, ii) the network summaries are highly complementary to existing 2-point summaries, and iii) our formalism allows for networks with smaller, physically-informed architectures to match much larger regression networks with far fewer simulations needed to obtain asymptotically optimal inference.
△ Less
Submitted 26 July, 2024;
originally announced July 2024.
-
Optimal Neural Summarisation for Full-Field Weak Lensing Cosmological Implicit Inference
Authors:
Denise Lanzieri,
Justine Zeghal,
T. Lucas Makinen,
Alexandre Boucaud,
Jean-Luc Starck,
François Lanusse
Abstract:
Traditionally, weak lensing cosmological surveys have been analyzed using summary statistics motivated by their analytically tractable likelihoods, or by their ability to access higher-order information, at the cost of requiring Simulation-Based Inference (SBI) approaches. While informative, these statistics are neither designed nor guaranteed to be statistically sufficient. With the rise of deep…
▽ More
Traditionally, weak lensing cosmological surveys have been analyzed using summary statistics motivated by their analytically tractable likelihoods, or by their ability to access higher-order information, at the cost of requiring Simulation-Based Inference (SBI) approaches. While informative, these statistics are neither designed nor guaranteed to be statistically sufficient. With the rise of deep learning, it becomes possible to create summary statistics optimized to extract the full data information. We compare different neural summarization strategies proposed in the weak lensing literature, to assess which loss functions lead to theoretically optimal summary statistics to perform full-field inference. In doing so, we aim to provide guidelines and insights to the community to help guide future neural-based inference analyses. We design an experimental setup to isolate the impact of the loss function used to train neural networks. We have developed the sbi_lens JAX package, which implements an automatically differentiable lognormal wCDM LSST-Y10 weak lensing simulator. The explicit full-field posterior obtained using the Hamiltonian Monte Carlo sampler gives us a ground truth to which to compare different compression strategies. We provide theoretical insight into the loss functions used in the literature and show that some do not necessarily lead to sufficient statistics (e.g. Mean Square Error (MSE)), while those motivated by information theory (e.g. Variational Mutual Information Maximization (VMIM)) can. Our numerical experiments confirm these insights and show, in our simulated wCDM scenario, that the Figure of Merit (FoM) of an analysis using neural summaries optimized under VMIM achieves 100% of the reference Omega_c - sigma_8 full-field FoM, while an analysis using neural summaries trained under MSE achieves only 81% of the same reference FoM.
△ Less
Submitted 16 February, 2025; v1 submitted 15 July, 2024;
originally announced July 2024.
-
LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology
Authors:
Matthew Ho,
Deaglan J. Bartlett,
Nicolas Chartier,
Carolina Cuesta-Lazaro,
Simon Ding,
Axel Lapel,
Pablo Lemos,
Christopher C. Lovell,
T. Lucas Makinen,
Chirag Modi,
Viraj Pandya,
Shivam Pandey,
Lucia A. Perez,
Benjamin Wandelt,
Greg L. Bryan
Abstract:
This paper presents the Learning the Universe Implicit Likelihood Inference (LtU-ILI) pipeline, a codebase for rapid, user-friendly, and cutting-edge machine learning (ML) inference in astrophysics and cosmology. The pipeline includes software for implementing various neural architectures, training schemata, priors, and density estimators in a manner easily adaptable to any research workflow. It i…
▽ More
This paper presents the Learning the Universe Implicit Likelihood Inference (LtU-ILI) pipeline, a codebase for rapid, user-friendly, and cutting-edge machine learning (ML) inference in astrophysics and cosmology. The pipeline includes software for implementing various neural architectures, training schemata, priors, and density estimators in a manner easily adaptable to any research workflow. It includes comprehensive validation metrics to assess posterior estimate coverage, enhancing the reliability of inferred results. Additionally, the pipeline is easily parallelizable and is designed for efficient exploration of modeling hyperparameters. To demonstrate its capabilities, we present real applications across a range of astrophysics and cosmology problems, such as: estimating galaxy cluster masses from X-ray photometry; inferring cosmology from matter power spectra and halo point clouds; characterizing progenitors in gravitational wave signals; capturing physical dust parameters from galaxy colors and luminosities; and establishing properties of semi-analytic models of galaxy formation. We also include exhaustive benchmarking and comparisons of all implemented methods as well as discussions about the challenges and pitfalls of ML inference in astronomical sciences. All code and examples are made publicly available at https://github.com/maho3/ltu-ili.
△ Less
Submitted 2 July, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Fishnets: Information-Optimal, Scalable Aggregation for Sets and Graphs
Authors:
T. Lucas Makinen,
Justin Alsing,
Benjamin D. Wandelt
Abstract:
Set-based learning is an essential component of modern deep learning and network science. Graph Neural Networks (GNNs) and their edge-free counterparts Deepsets have proven remarkably useful on ragged and topologically challenging datasets. The key to learning informative embeddings for set members is a specified aggregation function, usually a sum, max, or mean. We propose Fishnets, an aggregatio…
▽ More
Set-based learning is an essential component of modern deep learning and network science. Graph Neural Networks (GNNs) and their edge-free counterparts Deepsets have proven remarkably useful on ragged and topologically challenging datasets. The key to learning informative embeddings for set members is a specified aggregation function, usually a sum, max, or mean. We propose Fishnets, an aggregation strategy for learning information-optimal embeddings for sets of data for both Bayesian inference and graph aggregation. We demonstrate that i) Fishnets neural summaries can be scaled optimally to an arbitrary number of data objects, ii) Fishnets aggregations are robust to changes in data distribution, unlike standard deepsets, iii) Fishnets saturate Bayesian information content and extend to regimes where MCMC techniques fail and iv) Fishnets can be used as a drop-in aggregation scheme within GNNs. We show that by adopting a Fishnets aggregation scheme for message passing, GNNs can achieve state-of-the-art performance versus architecture size on ogbn-protein data over existing benchmarks with a fraction of learnable parameters and faster training time.
△ Less
Submitted 28 June, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Field-level inference of cosmic shear with intrinsic alignments and baryons
Authors:
Natalia Porqueres,
Alan Heavens,
Daniel Mortlock,
Guilhem Lavaux,
T. Lucas Makinen
Abstract:
We construct a field-based Bayesian Hierarchical Model for cosmic shear that includes, for the first time, the important astrophysical systematics of intrinsic alignments and baryon feedback, in addition to a gravity model. We add to the BORG-WL framework the tidal alignment and tidal torquing model (TATT) for intrinsic alignments and compare them with the non-linear alignment (NLA) model. With sy…
▽ More
We construct a field-based Bayesian Hierarchical Model for cosmic shear that includes, for the first time, the important astrophysical systematics of intrinsic alignments and baryon feedback, in addition to a gravity model. We add to the BORG-WL framework the tidal alignment and tidal torquing model (TATT) for intrinsic alignments and compare them with the non-linear alignment (NLA) model. With synthetic data, we have shown that adding intrinsic alignments and sampling the TATT parameters does not reduce the constraining power of the method and the field-based approach lifts the weak lensing degeneracy. We add baryon effects at the field level using the enthalpy gradient descent (EGD) model. This model displaces the dark matter particles without knowing whether they belong to a halo and allows for self-calibration of the model parameters, which are inferred from the data. We have also illustrated the effects of model misspecification for the baryons. The resulting model now contains the most important physical effects and is suitable for application to data.
△ Less
Submitted 10 April, 2023;
originally announced April 2023.
-
The Cosmic Graph: Optimal Information Extraction from Large-Scale Structure using Catalogues
Authors:
T. Lucas Makinen,
Tom Charnock,
Pablo Lemos,
Natalia Porqueres,
Alan Heavens,
Benjamin D. Wandelt
Abstract:
We present an implicit likelihood approach to quantifying cosmological information over discrete catalogue data, assembled as graphs. To do so, we explore cosmological parameter constraints using mock dark matter halo catalogues. We employ Information Maximising Neural Networks (IMNNs) to quantify Fisher information extraction as a function of graph representation. We a) demonstrate the high sensi…
▽ More
We present an implicit likelihood approach to quantifying cosmological information over discrete catalogue data, assembled as graphs. To do so, we explore cosmological parameter constraints using mock dark matter halo catalogues. We employ Information Maximising Neural Networks (IMNNs) to quantify Fisher information extraction as a function of graph representation. We a) demonstrate the high sensitivity of modular graph structure to the underlying cosmology in the noise-free limit, b) show that graph neural network summaries automatically combine mass and clustering information through comparisons to traditional statistics, c) demonstrate that networks can still extract information when catalogues are subject to noisy survey cuts, and d) illustrate how nonlinear IMNN summaries can be used as asymptotically optimal compressed statistics for Bayesian simulation-based inference. We reduce the area of joint $Ω_m, σ_8$ parameter constraints with small ($\sim$100 object) halo catalogues by a factor of 42 over the two-point correlation function, and demonstrate that the networks automatically combine mass and clustering information. This work utilises a new IMNN implementation over graph data in Jax, which can take advantage of either numerical or auto-differentiability. We also show that graph IMNNs successfully compress simulations away from the fiducial model at which the network is fitted, indicating a promising alternative to n-point statistics in catalogue simulation-based analyses.
△ Less
Submitted 22 December, 2022; v1 submitted 11 July, 2022;
originally announced July 2022.
-
Exoplanet atmosphere evolution: emulation with neural networks
Authors:
James G. Rogers,
Clàudia Janó Muñoz,
James E. Owen,
T. Lucas Makinen
Abstract:
Atmospheric mass-loss is known to play a leading role in sculpting the demographics of small, close-in exoplanets. Knowledge of how such planets evolve allows one to ``rewind the clock'' to infer the conditions in which they formed. Here, we explore the relationship between a planet's core mass and their atmospheric mass after protoplanetary disc dispersal by exploiting XUV photoevaporation as an…
▽ More
Atmospheric mass-loss is known to play a leading role in sculpting the demographics of small, close-in exoplanets. Knowledge of how such planets evolve allows one to ``rewind the clock'' to infer the conditions in which they formed. Here, we explore the relationship between a planet's core mass and their atmospheric mass after protoplanetary disc dispersal by exploiting XUV photoevaporation as an evolutionary process. Historically, this style of inference problem would be computationally infeasible due to the large number of planet models required; however, we make use of a novel atmospheric evolution emulator which utilises neural networks to provide three orders of magnitude in speedup. First, we provide proof-of-concept for this emulator on a real problem, by inferring the initial atmospheric conditions to the TOI-270 multi-planet system. Using the emulator we find near-indistinguishable results when compared to original model. We then apply the emulator to the more complex inference problem, which aims to find the initial conditions for a sample of \textit{Kepler}, \textit{K2} and \textit{TESS} planets with well-constrained masses and radii. We demonstrate there is a relationship between core masses and the atmospheric mass that they retain after disc dispersal, and this trend is consistent with the `boil-off' scenario, in which close-in planets undergo dramatic atmospheric escape during disc dispersal. Thus, it appears the exoplanet population is consistent with the idea that close-in exoplanets initially acquired large massive atmospheres, the majority of which is lost during disc dispersal; before the final population is sculpted by atmospheric loss over 100~Myr to Gyr timescales.
△ Less
Submitted 8 January, 2023; v1 submitted 28 October, 2021;
originally announced October 2021.
-
Lossless, Scalable Implicit Likelihood Inference for Cosmological Fields
Authors:
T. Lucas Makinen,
Tom Charnock,
Justin Alsing,
Benjamin D. Wandelt
Abstract:
We present a comparison of simulation-based inference to full, field-based analytical inference in cosmological data analysis. To do so, we explore parameter inference for two cases where the information content is calculable analytically: Gaussian random fields whose covariance depends on parameters through the power spectrum; and correlated lognormal fields with cosmological power spectra. We co…
▽ More
We present a comparison of simulation-based inference to full, field-based analytical inference in cosmological data analysis. To do so, we explore parameter inference for two cases where the information content is calculable analytically: Gaussian random fields whose covariance depends on parameters through the power spectrum; and correlated lognormal fields with cosmological power spectra. We compare two inference techniques: i) explicit field-level inference using the known likelihood and ii) implicit likelihood inference with maximally informative summary statistics compressed via Information Maximising Neural Networks (IMNNs). We find that a) summaries obtained from convolutional neural network compression do not lose information and therefore saturate the known field information content, both for the Gaussian covariance and the lognormal cases, b) simulation-based inference using these maximally informative nonlinear summaries recovers nearly losslessly the exact posteriors of field-level inference, bypassing the need to evaluate expensive likelihoods or invert covariance matrices, and c) even for this simple example, implicit, simulation-based likelihood incurs a much smaller computational cost than inference with an explicit likelihood. This work uses a new IMNNs implementation in $\texttt{Jax}$ that can take advantage of fully-differentiable simulation and inference pipeline. We also demonstrate that a single retraining of the IMNN summaries effectively achieves the theoretically maximal information, enhancing the robustness to the choice of fiducial model where the IMNN is trained.
△ Less
Submitted 17 July, 2021; v1 submitted 15 July, 2021;
originally announced July 2021.
-
deep21: a Deep Learning Method for 21cm Foreground Removal
Authors:
T. Lucas Makinen,
Lachlan Lancaster,
Francisco Villaescusa-Navarro,
Peter Melchior,
Shirley Ho,
Laurence Perreault-Levasseur,
David N. Spergel
Abstract:
We seek to remove foreground contaminants from 21cm intensity mapping observations. We demonstrate that a deep convolutional neural network (CNN) with a UNet architecture and three-dimensional convolutions, trained on simulated observations, can effectively separate frequency and spatial patterns of the cosmic neutral hydrogen (HI) signal from foregrounds in the presence of noise. Cleaned maps rec…
▽ More
We seek to remove foreground contaminants from 21cm intensity mapping observations. We demonstrate that a deep convolutional neural network (CNN) with a UNet architecture and three-dimensional convolutions, trained on simulated observations, can effectively separate frequency and spatial patterns of the cosmic neutral hydrogen (HI) signal from foregrounds in the presence of noise. Cleaned maps recover cosmological clustering statistics within 10% at all relevant angular scales and frequencies. This amounts to a reduction in prediction variance of over an order of magnitude on small angular scales ($\ell > 300$), and improved accuracy for small radial scales ($k_{\parallel} > 0.17\ \rm h\ Mpc^{-1})$ compared to standard Principal Component Analysis (PCA) methods. We estimate posterior confidence intervals for the network's prediction by training an ensemble of UNets. Our approach demonstrates the feasibility of analyzing 21cm intensity maps, as opposed to derived summary statistics, for upcoming radio experiments, as long as the simulated foreground model is sufficiently realistic. We provide the code used for this analysis on Github https://github.com/tlmakinen/deep21 as well as a browser-based tutorial for the experiment and UNet model via the accompanying http://bit.ly/deep21-colab Colab notebook.
△ Less
Submitted 1 June, 2021; v1 submitted 29 October, 2020;
originally announced October 2020.