
Cosmology with Topological Deep Learning

Authors: Jun-Young Lee, Francisco Villaescusa-Navarro

Abstract: The standard cosmological model with cold dark matter posits a hierarchical formation of structures. We introduce topological neural networks (TNNs), implemented as message-passing neural networks on higher-order structures, to effectively capture the topological information inherent in these hierarchies that traditional graph neural networks (GNNs) fail to account for. Our approach not only considers the vertices and edges that comprise a graph but also extends to higher-order cells such as tetrahedra, clusters, and hyperedges. This enables message-passing between these heterogeneous structures within a combinatorial complex. Furthermore, our TNNs are designed to conserve the $E(3)$-invariance, which refers to the symmetry arising from invariance against translations, reflections, and rotations. When applied to the Quijote suite, our TNNs achieve a significant reduction in the mean squared error. Compared to our GNNs, which lack higher-order message-passing, ClusterTNNs show improvements of up to 22% in $Ω_{\rm m}$ and 34% in $σ_8$ jointly, while the best FullTNN achieves an improvement of up to 60% in $σ_8$. In the context of the CAMELS suite, our models yield results comparable to the current GNN benchmark, albeit with a slight decrease in performance. We emphasize that our topology and symmetry-aware neural networks provide enhanced expressive power in modeling the large-scale structures of our universe.

Submitted 29 May, 2025; originally announced May 2025.

Comments: 19 pages, 5 figures; submitted to ApJ
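
The $E(3)$ invariance named in the abstract above can be illustrated with a toy check. The sketch below is not the authors' network or code; it only assumes that pairwise distances between points are one simple example of a feature unchanged by rotations, reflections, and translations. The function names `pairwise_distances` and `transform`, and all numerical values, are hypothetical choices made for this illustration.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Sorted list of Euclidean distances between all pairs of 3D points."""
    return sorted(math.dist(p, q) for p, q in combinations(points, 2))


def transform(points, angle, translation, reflect=False):
    """Rotate about the z axis by `angle`, optionally mirror x -> -x, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    tx, ty, tz = translation
    out = []
    for x, y, z in points:
        xr, yr = c * x - s * y, s * x + c * y  # rotation in the x-y plane
        if reflect:
            xr = -xr  # reflection through the y-z plane
        out.append((xr + tx, yr + ty, z + tz))
    return out


# Illustrative point set (arbitrary values, not simulation data).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
moved = transform(points, angle=0.7, translation=(5.0, -2.0, 1.5), reflect=True)

before = pairwise_distances(points)
after = pairwise_distances(moved)

# The distances agree up to floating-point rounding.
invariant = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12) for a, b in zip(before, after)
)
```

A model whose inputs are built only from quantities like these would give the same output for a transformed copy of the same point set, which is the property the abstract describes.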

arXiv:2505.13620 [pdf, ps, other]

Field-Level Comparison and Robustness Analysis of Cosmological N-Body Simulations

Authors: Adrian E. Bayer, Francisco Villaescusa-Navarro, Sammy Sharief, Romain Teyssier, Lehman H. Garrison, Laurence Perreault-Levasseur, Greg L. Bryan, Marco Gatti, Eli Visbal

Abstract: We present the first field-level comparison of cosmological N-body simulations, considering various widely used codes: Abacus, CUBEP$^3$M, Enzo, Gadget, Gizmo, PKDGrav, and Ramses. Unlike previous comparisons focused on summary statistics, we conduct a comprehensive field-level analysis: evaluating statistical similarity, quantifying implications for cosmological parameter inference, and identifying the regimes in which simulations are consistent. We begin with a traditional comparison using the power spectrum, cross-correlation coefficient, and visual inspection of the matter field. We follow this with a statistical out-of-distribution (OOD) analysis to quantify distributional differences between simulations, revealing insights not captured by the traditional metrics. We then perform field-level simulation-based inference (SBI) using convolutional neural networks (CNNs), training on one simulation and testing on others, including a full hydrodynamic simulation for comparison. We identify several causes of OOD behavior and biased inference, finding that resolution effects, such as those arising from adaptive mesh refinement (AMR), have a significant impact. Models trained on non-AMR simulations fail catastrophically when evaluated on AMR simulations, introducing larger biases than those from hydrodynamic effects. Differences in resolution, even when using the same N-body code, likewise lead to biased inference.
We attribute these failures to a CNN's sensitivity to small-scale fluctuations, particularly in voids and filaments, and demonstrate that appropriate smoothing brings the simulations into statistical agreement. Our findings motivate the need for careful data filtering and the use of field-level OOD metrics, such as PQMass, to ensure robust inference.

Submitted 19 May, 2025; originally announced May 2025.

Comments: 14 pages, 7 figures, 1 table

arXiv:2504.17839 [pdf, other]

Interpreting Cosmological Information from Neural Networks in the Hydrodynamic Universe

Authors: Arnab Lahiry, Adrian E. Bayer, Francisco Villaescusa-Navarro

Abstract: What happens when a black box (neural network) meets a black box (simulation of the Universe)? Recent work has shown that convolutional neural networks (CNNs) can infer cosmological parameters from the matter density field in the presence of complex baryonic processes. A key question that arises is, which parts of the cosmic web is the neural network obtaining information from? We shed light on the matter by identifying the Fourier scales, density scales, and morphological features of the cosmic web that CNNs pay most attention to. We find that CNNs extract cosmological information from both high and low density regions: overdense regions provide the most information per pixel, while underdense regions -- particularly deep voids and their surroundings -- contribute significantly due to their large spatial extent and coherent spatial features. Remarkably, we demonstrate that there is negligible degradation in cosmological constraining power after aggressive cutting in both maximum Fourier scale and density. Furthermore, we find similar results when considering both hydrodynamic and gravity-only simulations, implying that neural networks can marginalize over baryonic effects with minimal loss in cosmological constraining power. Our findings point to practical strategies for optimal and robust field-level cosmological inference in the presence of uncertainly modeled astrophysics.

Submitted 24 April, 2025; originally announced April 2025.

Comments: 14 pages, 11 figures, 1 table

arXiv:2504.06919 [pdf, other]

Correcting for interloper contamination in the power spectrum with neural networks

Authors: Marina S. Cagliari, Azadeh Moradinezhad Dizgah, Francisco Villaescusa-Navarro

Abstract: Modern slitless spectroscopic surveys, such as Euclid and the Roman Space Telescope, collect vast numbers of galaxy spectra but suffer from low signal-to-noise ratios. This often leads to incorrect redshift assignments when relying on a single emission line, due to noise spikes or contamination from non-target emission lines, commonly referred to as redshift interlopers. We propose a machine learning approach to correct the impact of interlopers at the level of measured summary statistics, focusing on the power spectrum monopole and line interlopers as a proof of concept. To model interloper effects, we use halo catalogs from the Quijote simulations as proxies for galaxies, displacing a fraction of halos by the distance corresponding to the redshift offset between target and interloper galaxies. This yields contaminated catalogs with varying interloper fractions across a wide range of cosmologies from the Quijote suite. We train a neural network on the power spectrum monopole, alone or combined with the bispectrum monopole, from contaminated mocks to estimate the interloper fraction and reconstruct the cleaned power spectrum. We evaluate performance in two settings: one with fixed cosmology and another where cosmological parameters vary under broad priors. In the fixed case, the network recovers the interloper fraction and corrects the power spectrum to better than 1% accuracy. When cosmology varies, performance degrades, but adding bispectrum information significantly improves results, reducing the interloper fraction error by 40-60%.
We also study the method's performance as a function of the size of the training set and find that optimal strategies depend on the correlation between target and interloper samples: bispectrum information aids performance when target and interloper galaxies are uncorrelated, while tighter priors are more effective when the two are strongly correlated.

Submitted 9 April, 2025; originally announced April 2025.

Comments: 15 pages, 6 figures, 3 tables. Data available at https://quijote-simulations.readthedocs.io/en/latest/interlopers.html . Public code available at https://github.com/mcagliari/NoInterNet
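
The contamination step described in the abstract above (displacing a fraction of halos by a fixed distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the NoInterNet code: the function name `contaminate`, the choice of the z axis as the line of sight, the periodic wrap, and the example values for the shift and box size are all hypothetical.

```python
import random


def contaminate(positions, fraction, shift, box_size, seed=0):
    """Displace a random fraction of halos along the z axis (assumed line of sight).

    positions: list of (x, y, z) tuples inside a periodic box of side box_size.
    Returns the contaminated catalog and the set of displaced indices.
    """
    rng = random.Random(seed)  # seeded so the selection is reproducible
    n_interlopers = round(fraction * len(positions))
    chosen = set(rng.sample(range(len(positions)), n_interlopers))
    contaminated = []
    for i, (x, y, z) in enumerate(positions):
        if i in chosen:
            z = (z + shift) % box_size  # wrap around the periodic box
        contaminated.append((x, y, z))
    return contaminated, chosen


# Toy catalog of 100 halos; the shift and box size are illustrative values only.
positions = [(float(i), float(2 * i), float(10 * i)) for i in range(100)]
contaminated, chosen = contaminate(
    positions, fraction=0.1, shift=90.0, box_size=1000.0
)
```

In a sketch like this the interloper fraction is the single knob that a network would then be trained to recover from the summary statistics of the contaminated catalog.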

arXiv:2503.22654 [pdf, other]

On the effects of parameters on galaxy properties in CAMELS and the predictability of $Ω_{\rm m}$

Authors: Gabriella Contardo, Roberto Trotta, Serafina Di Gioia, David W. Hogg, Francisco Villaescusa-Navarro

Abstract: Recent analyses of cosmological hydrodynamic simulations from CAMELS have shown that machine learning models can predict the parameter describing the total matter content of the universe, $Ω_{\rm m}$, from the features of a single galaxy. We investigate the statistical properties of two of these simulation suites, IllustrisTNG and ASTRID, confirming that $Ω_{\rm m}$ induces a strong displacement on the distribution of galaxy features. We also observe that most other parameters have little to no effect on the distribution, except for the stellar-feedback parameter $A_{SN1}$, which introduces some near-degeneracies that can be broken with specific features. These two properties explain the predictability of $Ω_{\rm m}$. We use Optimal Transport to further measure the effect of parameters on the distribution of galaxy properties, which is found to be consistent with physical expectations. However, we observe discrepancies between the two simulation suites, both in the effect of $Ω_{\rm m}$ on the galaxy properties and in the distributions themselves at identical parameter values. Thus, although $Ω_{\rm m}$'s signature can be easily detected within a given simulation suite using just a single galaxy, applying this result to real observational data may prove significantly more challenging.

Submitted 28 March, 2025; originally announced March 2025.

Comments: 19 pages, 9 figures. Comments welcome!

arXiv:2503.13755 [pdf, other]

How many simulations do we need for simulation-based inference in cosmology?

Authors: Anirban Bairagi, Benjamin Wandelt, Francisco Villaescusa-Navarro

Abstract: How many simulations do we need to train machine learning methods to extract information available from summary statistics of the cosmological density field? Neural methods have shown the potential to extract non-linear information available from cosmological data. Success depends critically on having sufficient simulations for training the networks and appropriate network architectures. In the first detailed convergence study of neural network training for cosmological inference, we show that currently available simulation suites, such as the Quijote Latin Hypercube (LH) with 2000 simulations, do not provide sufficient training data for a generic neural network to reach the optimal regime, even for the dark matter power spectrum, and in an idealized case. We discover an empirical neural scaling law that predicts how much information a neural network can extract from a highly informative summary statistic, the dark matter power spectrum, as a function of the number of simulations used to train the network, for a wide range of architectures and hyperparameters. We combine this result with the Cramér-Rao information bound to forecast the number of training simulations needed for near-optimal information extraction. To verify our method we created the largest publicly released simulation data set in cosmology, the Big Sobol Sequence (BSQ), consisting of 32,768 $Λ$CDM N-body simulations uniformly covering the $Λ$CDM parameter space.
Our method enables efficient planning of simulation campaigns for machine learning applications in cosmology, while the BSQ dataset provides an unprecedented resource for studying the convergence behavior of neural networks in cosmological parameter inference. Our results suggest that new large simulation suites or new training approaches will be necessary to achieve information-optimal parameter inference from non-linear simulations.

Submitted 17 March, 2025; originally announced March 2025.

arXiv:2503.01956 [pdf, other]

Lost in the FoG: Pitfalls of Models for Large-Scale Hydrogen Distributions

Authors: Calvin Osinga, Benedikt Diemer, Francisco Villaescusa-Navarro

Abstract: Large-scale HI surveys and their cross-correlations with galaxy distributions have immense potential as cosmological probes. Interpreting these measurements requires theoretical models that must incorporate redshift-space distortions (RSDs), such as the Kaiser and fingers-of-God (FoG) effects, and differences in the tracer and matter distributions via the tracer bias. These effects are commonly approximated with assumptions that should be tested on simulated distributions. In this work, we use the hydrodynamical simulation suite IllustrisTNG to assess the performance of models of $z \leq 1$ HI auto and HI-galaxy cross-power spectra, finding that the models employed by recent observations introduce errors comparable to or exceeding their measurement uncertainties. In particular, neglecting FoG causes $\gtrsim 10\%$ deviations between the modeled and simulated power spectra at $k \gtrsim 0.1$ $h$ / Mpc, larger than assuming a constant bias which reaches the same error threshold at slightly smaller scales. However, even without these assumptions, models can still err by $\sim 10\%$ on relevant scales. These remaining errors arise from multiple RSD damping sources on HI clustering, which are not sufficiently described with a single FoG term. Overall, our results highlight the need for an improved understanding of RSDs to harness the capabilities of future measurements of HI distributions.

Submitted 17 March, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

Comments: 29 pages, 16 figures

arXiv:2502.17568 [pdf, other]

Cosmology with One Galaxy: Auto-Encoding the Galaxy Properties Manifold

Authors: Amanda Lue, Shy Genel, Marc Huertas-Company, Francisco Villaescusa-Navarro, Matthew Ho

Abstract: Cosmological simulations like CAMELS and IllustrisTNG characterize hundreds of thousands of galaxies using various internal properties. Previous studies have demonstrated that machine learning can be used to infer the cosmological parameter $Ω_m$ from the internal properties of even a single randomly selected simulated galaxy. This ability was hypothesized to originate from galaxies occupying a low-dimensional manifold within a higher-dimensional galaxy property space, which shifts with variations in $Ω_m$. In this work, we investigate how galaxies occupy the high-dimensional galaxy property space, particularly the effect of $Ω_m$ and other cosmological and astrophysical parameters on the putative manifold. We achieve this by using an autoencoder with an Information-Ordered Bottleneck (IOB), a neural layer with adaptive compression, to perform dimensionality reduction on individual galaxy properties from CAMELS simulations, which are run with various combinations of cosmological and astrophysical parameters. We find that for an autoencoder trained on the fiducial set of parameters, the reconstruction error increases significantly when the test set deviates from fiducial values of $Ω_m$ and $A_{\text{SN1}}$, indicating that these parameters shift galaxies off the fiducial manifold. In contrast, variations in other parameters such as $σ_8$ cause negligible error changes, suggesting galaxies shift along the manifold. These findings provide direct evidence that the ability to infer $Ω_m$ from individual galaxies is tied to the way $Ω_m$ shifts the manifold.
Physically, this implies that parameters like $σ_8$ produce galaxy property changes resembling natural scatter, while parameters like $Ω_m$ and $A_{\text{SN1}}$ create unsampled properties, extending beyond the natural scatter in the fiducial model.

Submitted 24 February, 2025; originally announced February 2025.

Comments: 10 pages, 6 figures. To be submitted to ApJ

arXiv:2502.13589 [pdf, other]

CASCO: Cosmological and AStrophysical parameters from Cosmological simulations and Observations III. The physics behind the emergence of the golden mass scale

Authors: C. Tortora, V. Busillo, N. R. Napolitano, L. V. E. Koopmans, G. Covone, S. Genel, F. Villaescusa-Navarro, M. Silvestrini

Abstract: Different studies have suggested the emergence of the so-called golden mass, corresponding to a virial mass of $\sim 10^{12} \, M_{\rm \odot}$ and a stellar mass of $\sim 5 \times 10^{10} \, M_{\rm \odot}$. This mass scale marks a maximum in star formation efficiency, where galaxies are minimally affected by processes like SN and AGN feedback. We use CAMELS cosmological simulations, based on the IllustrisTNG subgrid, to study the origin of this mass scale and whether it persists when varying feedback from SN and AGN. We focus on the correlation between the total-to-stellar mass within the half-mass radius and stellar mass, which follows an inverted bell-shaped trend, with a minimum at the golden mass. SN feedback processes impact the emergence of the golden mass, which shifts to lower mass for high values of wind velocity and energy. We find that most AGN feedback parameters influence the emergence of the golden mass, altering the correlation slope at high mass: the black hole radiative efficiency is the most impactful, followed by the black hole feedback factor and quasar threshold. ETGs preserve the inverted bell-shaped trend, while LTGs have monotonically decreasing DM fractions with mass, with mild indication of an inversion only at low redshift, confirming results from observations. When connecting with global quantities, we see that star formation efficiency shows a bell-shaped trend peaking at the golden mass, with behaviours that mirror the central quantities.
In ETGs a peak at lower mass is seen, while LTGs mirror the behaviour in the central quantity, with mild indication of a maximum in the stellar fraction only at low redshift. Overall, we find that the emergence of the golden mass is driven by the SN- and AGN-feedback and appears earlier in cosmic time for stronger-feedback simulations, which quench star formation faster in the most massive galaxies. (abridged)

Submitted 19 February, 2025; originally announced February 2025.

Comments: Submitted to A&A, 11 pages, 6 figures, 4 tables

arXiv:2502.13242 [pdf, other]

Learning the Universe: $3\ h^{-1}{\rm Gpc}$ Tests of a Field Level $N$-body Simulation Emulator

Authors: Matthew T. Scoggins, Matthew Ho, Francisco Villaescusa-Navarro, Drew Jamieson, Ludvig Doeser, Greg L. Bryan

Abstract: We apply and test a field-level emulator for non-linear cosmic structure formation in a volume matching next-generation surveys. Inferring the cosmological parameters and initial conditions from which the particular galaxy distribution of our Universe was seeded can be achieved by comparing simulated data to observational data. Previous work has focused on building accelerated forward models that efficiently mimic these simulations. One of these accelerated forward models uses machine learning to apply a non-linear correction to the linear $z=0$ Zeldovich approximation (ZA) fields, closely matching the cosmological statistics in the $N$-body simulation. This emulator was trained and tested at $(h^{-1}{\rm Gpc})^3$ volumes, although cosmological inference requires significantly larger volumes. We test this emulator at $(3\ h^{-1}{\rm Gpc})^3$ by comparing emulator outputs to $N$-body simulations for eight unique cosmologies. We consider several summary statistics, applied to both the raw particle fields and the dark matter (DM) haloes. We find that the power spectrum, bispectrum and wavelet statistics of the raw particle fields agree with the $N$-body simulations within ${\sim} 5 \%$ at most scales. For the haloes, we find a similar agreement between the emulator and the $N$-body for power spectrum and bispectrum, though a comparison of the stacked profiles of haloes shows that the emulator has slight errors in the positions of particles in the highly non-linear interior of the halo.
At these large $(3\ h^{-1}{\rm Gpc})^3$ volumes, the emulator can create $z=0$ particle fields in a thousandth of the time required for $N$-body simulations and will be a useful tool for large-scale cosmological inference. This is a Learning the Universe publication.

Submitted 18 February, 2025; originally announced February 2025.

Comments: 9 pages, 7 figures. This is a Learning the Universe publication

arXiv:2502.13239 [pdf, other]

Towards Robustness Across Cosmological Simulation Models TNG, SIMBA, ASTRID, and EAGLE

Authors: Yongseok Jo, Shy Genel, Anirvan Sengupta, Benjamin Wandelt, Rachel Somerville, Francisco Villaescusa-Navarro

Abstract: The rapid advancement of large-scale cosmological simulations has opened new avenues for cosmological and astrophysical research. However, the increasing diversity among cosmological simulation models presents a challenge to robustness. In this work, we develop the Model-Insensitive ESTimator (MIEST), a machine that can robustly estimate the cosmological parameters, $Ω_m$ and $σ_8$, from neutral hydrogen maps of simulation models in the CAMELS project: TNG, SIMBA, ASTRID, and EAGLE. An estimator is considered robust if it possesses a consistent predictive power across all simulations, including those used during the training phase. We train our machine using multiple simulation models and ensure that it only extracts common features between the models while disregarding the model-specific features. This allows us to develop a novel model that is capable of accurately estimating parameters across a range of simulation models, without being biased towards any particular model. Upon investigating the latent space (a set of summary statistics), we find that the implementation of robustness leads to the blending of latent variables across different models, demonstrating the removal of model-specific features. In comparison to a standard machine lacking robustness, the average performance of MIEST on simulations unseen during the training phase improves by $\sim17$% for $Ω_m$ and $\sim 38$% for $σ_8$.
By using a machine learning approach that can extract robust, yet physical features, we hope to improve our understanding of galaxy formation and evolution in a (subgrid) model-insensitive manner, and ultimately, gain insight into the underlying physical processes responsible for robustness. This is a Learning the Universe publication.

Submitted 18 February, 2025; originally announced February 2025.

Comments: This is a Learning the Universe publication. 26 pages, 11 figures

arXiv:2412.05662 [pdf, other]

Probing massive neutrinos and modified gravity with redshift-space morphologies and anisotropies of large-scale structure

Authors: Wei Liu, Liang Wu, Francisco Villaescusa-Navarro, Marco Baldi, Georgios Valogiannis, Wenjuan Fang

Abstract: Strong degeneracy exists between some modified gravity (MG) models and massive neutrinos because the enhanced structure growth produced by modified gravity can be suppressed due to the free-streaming massive neutrinos. Previous works showed this degeneracy can be broken with non-Gaussian or velocity information. Therefore in this work, we focus on the large-scale structure (LSS) in redshift space and investigate for the first time the possibility of using the non-Gaussian information and velocity information captured by the 3D scalar Minkowski functionals (MFs) and the 3D Minkowski tensors (MTs) to break this degeneracy. Based on the Quijote and Quijote-MG simulations, we find the imprints on redshift space LSS left by the Hu-Sawicki $f(R)$ gravity can be discriminated from those left by massive neutrinos with these statistics.
With the Fisher information formalism, we first show how the MTs extract information with their perpendicular and parallel elements for both low- and high-density regions; then we compare constraints from the power spectrum monopole and MFs in real space with those in redshift space, and investigate how the constraining power is further improved with anisotropies captured by the quadrupole and hexadecapole of the power spectrum and the MTs; finally, we combine the power spectrum multipoles with MFs plus MTs and find the constraints from the power spectrum multipoles on $Ω_{\mathrm{m}}, h, σ_8$, $M_ν$, and $f_{R_0}$ can be improved, because they are complemented with non-Gaussian information, by a factor of 3.4, 3.0, 3.3, 3.3, and 1.9 on small scales ($k_{\rm{max}}=0.5~h\rm{Mpc}^{-1},\ R_G=5~h^{-1}\rm{Mpc}$), and 2.8, 2.2, 3.4, 3.4, and 1.5 on large scales ($k_{\rm{max}}=0.25~h\rm{Mpc}^{-1},\ R_G=10~h^{-1}\rm{Mpc}$).

Submitted 28 April, 2025; v1 submitted 7 December, 2024; originally announced December 2024.

Comments: 47 pages, 17 figures, 5 tables, accepted by JCAP

arXiv:2412.04559 [pdf, other]

doi 10.3847/1538-4357/adc450

X-raying CAMELS: Constraining Baryonic Feedback in the Circum-Galactic Medium with the CAMELS simulations and eRASS X-ray Observations

Authors: Erwin T. Lau, Daisuke Nagai, Ákos Bogdán, Isabel Medlock, Benjamin D. Oppenheimer, Nicholas Battaglia, Daniel Anglés-Alcázar, Shy Genel, Yueying Ni, Francisco Villaescusa-Navarro

Abstract: The circumgalactic medium (CGM) around massive galaxies plays a crucial role in regulating star formation and feedback. Using the CAMELS simulation suite, we develop emulators for the X-ray surface brightness profile and the X-ray luminosity--stellar mass scaling relation to investigate how stellar and AGN feedback shape the X-ray properties of the hot CGM. Our analysis shows that at CGM scales ($10^{12} \lesssim M_{\rm halo}/M_\odot \lesssim 10^{13}$, $10\lesssim r/{\rm kpc} \lesssim 400$), stellar feedback more significantly impacts the X-ray properties than AGN feedback within the parameters studied. Comparing the emulators to recent eROSITA All-Sky Survey observations, it was found that stronger feedback than currently implemented in the IllustrisTNG, SIMBA, and Astrid simulations is required to match observed CGM properties. However, adopting these enhanced feedback parameters causes deviations in the stellar-mass-halo-mass relations from observational constraints below the group mass scale. This tension suggests possible unaccounted systematics in X-ray CGM observations or inadequacies in the feedback models of cosmological simulations.

Submitted 14 February, 2025; v1 submitted 5 December, 2024; originally announced December 2024.

Comments: 14 pages, 6 figures, ApJ accepted. Updated to match the accepted version

Journal ref: 2025, ApJ, 984, 190

arXiv:2411.14377 [pdf, other]

The constraining power of the Marked Power Spectrum: an analytical study

Authors: Marco Marinucci, Gabriel Jung, Michele Liguori, Andrea Ravenni, Francesco Spezzati, Adam Andrews, Marco Baldi, William R. Coulton, Dionysios Karagiannis, Francisco Villaescusa-Navarro, Benjamin Wandelt

Abstract: The marked power spectrum - a two-point correlation function of a transformed density field - has emerged as a promising tool for extracting cosmological information from the large-scale structure of the Universe. In this work, we present the first comprehensive analytical study of the marked power spectrum's sensitivity to primordial non-Gaussianity (PNG) of the non-local type. We extend previous effective field theory frameworks to incorporate PNG, developing a complete theoretical model that we validate against the Quijote simulation suite. Through a systematic Fisher analysis, we compare the constraining power of the marked power spectrum against traditional approaches combining the power spectrum and bispectrum (P+B). We explore different choices of mark parameters to evaluate their impact on parameter constraints, particularly focusing on equilateral and orthogonal PNG as well as neutrino masses. Our analysis shows that while marking up underdense regions yields optimal constraints in the low shot-noise regime, the marked power spectrum's performance for discrete tracers with BOSS-like number densities does not surpass that of P+B analysis at mildly non-linear scales ($k \lesssim 0.25 \,h/\text{Mpc}$). However, the marked approach offers several practical advantages, including simpler estimation procedures and potentially more manageable systematic effects. Our theoretical framework reveals how the marked power spectrum incorporates higher-order correlation information through terms resembling tree-level bispectra and power spectrum convolutions. This work establishes a robust foundation for applying marked statistics to future large-volume surveys.

Submitted 21 November, 2024; originally announced November 2024.

Comments: 18 pages, 7 figures

arXiv:2411.13960 [pdf, other]

Learning the Universe: Cosmological and Astrophysical Parameter Inference with Galaxy Luminosity Functions and Colours

Authors: Christopher C. Lovell, Tjitske Starkenburg, Matthew Ho, Daniel Anglés-Alcázar, Romeel Davé, Austen Gabrielpillai, Kartheik Iyer, Alice E. Matthews, William J. Roper, Rachel Somerville, Laura Sommovigo, Francisco Villaescusa-Navarro

Abstract: We perform the first direct cosmological and astrophysical parameter inference from the combination of galaxy luminosity functions and colours using a simulation-based inference approach. Using the Synthesizer code we simulate the dust-attenuated ultraviolet--near infrared stellar emission from galaxies in thousands of cosmological hydrodynamic simulations from the CAMELS suite, including the Swift-EAGLE, Illustris-TNG, Simba & Astrid galaxy formation models. For each galaxy we calculate the rest-frame luminosity in a number of photometric bands, including the SDSS $\textit{ugriz}$ and GALEX FUV & NUV filters; this dataset represents the largest catalogue of synthetic photometry based on hydrodynamic galaxy formation simulations produced to date, totalling >200 million sources. From these we compile luminosity functions and colour distributions, and find clear dependencies on both cosmology and feedback. We then perform simulation-based (likelihood-free) inference using these distributions, and obtain constraints on both cosmological and astrophysical parameters. Both colour distributions and luminosity functions provide complementary information on certain parameters when performing inference. Most interestingly, we achieve constraints on $σ_8$, describing the clustering of matter. This is attributable to the fact that the photometry encodes the star formation--metal enrichment history of each galaxy; galaxies in a universe with a higher $σ_8$ tend to form earlier and have higher metallicities, which leads to redder colours. We find that a model trained on one galaxy formation simulation generalises poorly when applied to another, and attribute this to differences in the subgrid prescriptions, and lack of flexibility in our emission modelling. The photometric catalogues are publicly available at: https://camels.readthedocs.io/ .

Submitted 21 November, 2024; originally announced November 2024.

Comments: 28 pages, 20 figures, submitted to MNRAS. Comments and feedback welcome!

arXiv:2410.21225 [pdf, other]

Boosting HI-Galaxy Cross-Clustering Signal through Higher-Order Cross-Correlations

Authors: Eishica Chand, Arka Banerjee, Simon Foreman, Francisco Villaescusa-Navarro

Abstract: After reionization, neutral hydrogen (HI) traces the large-scale structure (LSS) of the Universe, enabling HI intensity mapping (IM) to capture the LSS in 3D and constrain key cosmological parameters. We present a new framework utilizing higher-order cross-correlations to study HI clustering around galaxies, tested using real-space data from the IllustrisTNG300 simulation. This approach computes the joint distributions of $k$-nearest neighbor ($k$NN) optical galaxies and the HI brightness temperature field smoothed at relevant scales (the $k$NN-field framework), providing sensitivity to all higher-order cross-correlations, unlike two-point statistics. To simulate HI data from actual surveys, we add random thermal noise and apply a simple foreground cleaning model, filtering out Fourier modes of the brightness temperature field with $k_\parallel < k_{\rm min,\parallel}$. Under current levels of thermal noise and foreground cleaning, typical of a Canadian Hydrogen Intensity Mapping Experiment (CHIME)-like survey, the HI-galaxy cross-correlation signal in our simulations, using the $k$NN-field framework, is detectable at $>30σ$ across $r = [3,12] \, h^{-1}$Mpc. In contrast, the detectability of the standard two-point correlation function (2PCF) over the same scales depends strongly on the foreground filter: a sharp $k_\parallel$ filter can spuriously boost detection to $8σ$ due to position-space ringing, whereas a less sharp filter yields no detection. Nonetheless, we conclude that $k$NN-field cross-correlations are robustly detectable across a broad range of foreground filtering and thermal noise conditions, suggesting their potential for enhanced constraining power over 2PCFs.

Submitted 28 October, 2024; originally announced October 2024.

Comments: 14 pages, 9 figures (with 2 figures included in the appendix), 2 tables. Comments are welcome

arXiv:2410.16361 [pdf, other]

doi 10.3847/1538-4357/ada442

Quantifying Baryonic Feedback on Warm-Hot Circumgalactic Medium in CAMELS Simulations

Authors: Isabel Medlock, Chloe Neufeld, Daisuke Nagai, Daniel Anglés Alcázar, Shy Genel, Benjamin Oppenheimer, Xavier Sims, Priyanka Singh, Francisco Villaescusa-Navarro

Abstract: The baryonic physics shaping galaxy formation and evolution are complex, spanning a vast range of scales and making them challenging to model. Cosmological simulations rely on subgrid models that produce significantly different predictions. Understanding how models of stellar and active galactic nuclei (AGN) feedback affect baryon behavior across different halo masses and redshifts is essential. Using the SIMBA and IllustrisTNG suites from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project, we explore the effect of parameters governing the subgrid implementation of stellar and AGN feedback. We find that while IllustrisTNG shows higher cumulative feedback energy across all halos, SIMBA demonstrates a greater spread of baryons, quantified by the closure radius and circumgalactic medium (CGM) gas fraction. This suggests that feedback in SIMBA couples more effectively to baryons and drives them more efficiently within the host halo. There is evidence that different feedback modes are highly interrelated in these subgrid models. Parameters controlling stellar feedback efficiency significantly impact AGN feedback, as seen in the suppression of black hole mass growth and delayed activation of AGN feedback to higher mass halos with increasing stellar feedback efficiency in both simulations. Additionally, AGN feedback efficiency parameters affect the CGM gas fraction at low halo masses in SIMBA, hinting at complex, non-linear interactions between AGN and SNe feedback modes. Overall, we demonstrate that stellar and AGN feedback are intimately interwoven, especially at low redshift, due to subgrid implementation, resulting in halo property effects that might initially seem counterintuitive.

Submitted 2 January, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

Comments: 29 pages, 8 figures, accepted to ApJ

arXiv:2410.10942 [pdf, other]

Cosmological and Astrophysical Parameter Inference from Stacked Galaxy Cluster Profiles Using CAMELS-zoomGZ

Authors: Elena Hernández-Martínez, Shy Genel, Francisco Villaescusa-Navarro, Ulrich P. Steinwandel, Max E. Lee, Erwin T. Lau, David N. Spergel

Abstract: We present a study on the inference of cosmological and astrophysical parameters using stacked galaxy cluster profiles. Utilizing the CAMELS-zoomGZ simulations, we explore how various cluster properties--such as X-ray surface brightness, gas density, temperature, metallicity, and Compton-y profiles--can be used to predict parameters within the 28-dimensional parameter space of the IllustrisTNG model. Through neural networks, we achieve a high correlation coefficient of 0.97 or above for all cosmological parameters, including $Ω_{\rm m}$, $H_0$, and $σ_8$, and over 0.90 for the remaining astrophysical parameters, showcasing the effectiveness of these profiles for parameter inference. We investigate the impact of different radial cuts, with bins ranging from $0.1R_{200c}$ to $0.7R_{200c}$, to simulate current observational constraints. Additionally, we perform a noise sensitivity analysis, adding up to 40% Gaussian noise (corresponding to signal-to-noise ratios as low as 2.5), revealing that key parameters such as $Ω_{\rm m}$, $H_0$, and the IMF slope remain robust even under extreme noise conditions. We also compare the performance of full radial profiles against integrated quantities, finding that profiles generally lead to more accurate parameter inferences. Our results demonstrate that stacked galaxy cluster profiles contain crucial information on both astrophysical processes within groups and clusters and the underlying cosmology of the universe. This underscores their significance for interpreting the complex data expected from next-generation surveys and reveals, for the first time, their potential as a powerful tool for parameter inference.

Submitted 14 October, 2024; originally announced October 2024.

Comments: Submitted to ApJ

arXiv:2409.20507 [pdf, other]

Constraining Cosmology with Simulation-based inference and Optical Galaxy Cluster Abundance

Authors: Moonzarin Reza, Yuanyuan Zhang, Camille Avestruz, Louis E. Strigari, Simone Shevchuk, Francisco Villaescusa-Navarro

Abstract: We test the robustness of simulation-based inference (SBI) in the context of cosmological parameter estimation from galaxy cluster counts and masses in simulated optical datasets. We construct "simulations" using analytical models for the galaxy cluster halo mass function (HMF) and for the observed richness (number of observed member galaxies) to train and test the SBI method. We compare the SBI parameter posterior samples to those from an MCMC analysis that uses the same analytical models to construct predictions of the observed data vector. The two methods exhibit comparable performance, with reliable constraints derived for the primary cosmological parameters ($Ω_m$ and $σ_8$) and the richness-mass relation parameters. We also perform out-of-domain tests with observables constructed from galaxy cluster-sized halos in the Quijote simulations. Again, the SBI and MCMC results have comparable posteriors, with similar uncertainties and biases. Unsurprisingly, upon evaluating the SBI method on thousands of simulated data vectors that span the parameter space, SBI exhibits worsened posterior calibration metrics in the out-of-domain application. We note that such calibration tests with MCMC are less computationally feasible and highlight the potential use of SBI to stress-test limitations of analytical models, such as those used to construct models for inference with MCMC.

Submitted 30 September, 2024; originally announced September 2024.

arXiv:2409.09124 [pdf, other]

CHARM: Creating Halos with Auto-Regressive Multi-stage networks

Authors: Shivam Pandey, Chirag Modi, Benjamin D. Wandelt, Deaglan J. Bartlett, Adrian E. Bayer, Greg L. Bryan, Matthew Ho, Guilhem Lavaux, T. Lucas Makinen, Francisco Villaescusa-Navarro

Abstract: To maximize the amount of information extracted from cosmological datasets, simulations that accurately represent these observations are necessary. However, traditional simulations that evolve particles under gravity by estimating particle-particle interactions (N-body simulations) are computationally expensive and prohibitive to scale to the large volumes and resolutions necessary for the upcoming datasets. Moreover, modeling the distribution of galaxies typically involves identifying virialized dark matter halos, which is also a time- and memory-consuming process for large N-body simulations, further exacerbating the computational cost. In this study, we introduce CHARM, a novel method for creating mock halo catalogs by matching the spatial, mass, and velocity statistics of halos directly from the large-scale distribution of the dark matter density field. We develop multi-stage neural spline flow-based networks to learn this mapping at redshift z=0.5 directly with computationally cheaper low-resolution particle mesh simulations instead of relying on the high-resolution N-body simulations. We show that the mock halo catalogs and painted galaxy catalogs have the same statistical properties as obtained from N-body simulations in both real space and redshift space. Finally, we use these mock catalogs for cosmological inference using redshift-space galaxy power spectrum, bispectrum, and wavelet-based statistics using simulation-based inference, performing the first inference with accelerated forward model simulations and finding unbiased cosmological constraints with well-calibrated posteriors. The code was developed as part of the Simons Collaboration on Learning the Universe and is publicly available at https://github.com/shivampcosmo/CHARM.

Submitted 13 September, 2024; originally announced September 2024.

Comments: 12 pages and 8 figures. This is a Learning the Universe Publication

arXiv:2409.02980 [pdf, other]

How DREAMS are made: Emulating Satellite Galaxy and Subhalo Populations with Diffusion Models and Point Clouds

Authors: Tri Nguyen, Francisco Villaescusa-Navarro, Siddharth Mishra-Sharma, Carolina Cuesta-Lazaro, Paul Torrey, Arya Farahi, Alex M. Garcia, Jonah C. Rose, Stephanie O'Neil, Mark Vogelsberger, Xuejian Shen, Cian Roche, Daniel Anglés-Alcázar, Nitya Kallivayalil, Julian B. Muñoz, Francis-Yan Cyr-Racine, Sandip Roy, Lina Necib, Kassidy E. Kollmann

Abstract: The connection between galaxies and their host dark matter (DM) halos is critical to our understanding of cosmology, galaxy formation, and DM physics. To maximize the return of upcoming cosmological surveys, we need an accurate way to model this complex relationship. Many techniques have been developed to model this connection, from Halo Occupation Distribution (HOD) to empirical and semi-analytic models to hydrodynamic simulations. Hydrodynamic simulations can incorporate more detailed astrophysical processes but are computationally expensive; HODs, on the other hand, are computationally cheap but have limited accuracy. In this work, we present NeHOD, a generative framework based on a variational diffusion model and a Transformer, for painting galaxies/subhalos on top of DM with the accuracy of hydrodynamic simulations but at a computational cost similar to HOD. By modeling galaxies/subhalos as point clouds, instead of binning or voxelization, we can resolve small spatial scales down to the resolution of the simulations. For each halo, NeHOD predicts the positions, velocities, masses, and concentrations of its central and satellite galaxies. We train NeHOD on the TNG-Warm DM suite of the DREAMS project, which consists of 1024 high-resolution zoom-in hydrodynamic simulations of Milky Way-mass halos with varying warm DM mass and astrophysical parameters. We show that our model captures the complex relationships between subhalo properties as a function of the simulation parameters, including the mass functions, stellar-halo mass relations, concentration-mass relations, and spatial clustering. Our method can be used for a large variety of downstream applications, from galaxy clustering to strong lensing studies.

Submitted 4 September, 2024; originally announced September 2024.

Comments: Submitted to ApJ; 30 + 6 pages; 11 + 4 figures; Comments welcomed

arXiv:2408.07699 [pdf, other]

Field-level Emulation of Cosmic Structure Formation with Cosmology and Redshift Dependence

Authors: Drew Jamieson, Yin Li, Francisco Villaescusa-Navarro, Shirley Ho, David N. Spergel

Abstract: We present a field-level emulator for large-scale structure, capturing the cosmology dependence and the time evolution of cosmic structure formation. The emulator maps linear displacement fields to their corresponding nonlinear displacements from N-body simulations at specific redshifts. Designed as a neural network, the emulator incorporates style parameters that encode dependencies on $Ω_{\rm m}$ and the linear growth factor $D(z)$ at redshift $z$. We train our model on the six-dimensional N-body phase space, predicting particle velocities as the time derivative of the model's displacement outputs. This innovation results in significant improvements in training efficiency and model accuracy. Tested on diverse cosmologies and redshifts not seen during training, the emulator achieves percent-level accuracy on scales of $k\sim 1~h\,{\rm Mpc}^{-1}$ at $z=0$, with improved performance at higher redshifts. We compare predicted structure formation histories with N-body simulations via merger trees, finding consistent merger event sequences and statistical properties.

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2407.18647 [pdf, other]

doi 10.1088/1475-7516/2024/11/061

Towards unveiling the large-scale nature of gravity with the wavelet scattering transform

Authors: Georgios Valogiannis, Francisco Villaescusa-Navarro, Marco Baldi

Abstract: We present the first application of the Wavelet Scattering Transform (WST) in order to constrain the nature of gravity using the three-dimensional (3D) large-scale structure of the universe. Utilizing the Quijote-MG N-body simulations, we can reliably model the 3D matter overdensity field for the f(R) Hu-Sawicki modified gravity (MG) model down to $k_{\rm max}=0.5$ h/Mpc. Combining these simulations with the Quijote $ν$CDM collection, we then conduct a Fisher forecast of the marginalized constraints obtained on gravity using the WST coefficients and the matter power spectrum at redshift z=0. Our results demonstrate that the WST substantially improves upon the 1$σ$ error obtained on the parameter that captures deviations from standard General Relativity (GR), yielding a tenfold improvement compared to the corresponding matter power spectrum result. At the same time, the WST also enhances the precision on the $Λ$CDM parameters and the sum of neutrino masses, by factors of 1.2-3.4 compared to the matter power spectrum. Despite the overall reduction in the WST performance when we focus on larger scales, it still provides a $4.5\times$ tighter 1$σ$ error for the MG parameter at $k_{\rm max}=0.2$ h/Mpc, highlighting its great sensitivity to the underlying gravity theory. This first proof-of-concept study reaffirms the constraining properties of the WST technique and paves the way for exciting future applications in order to perform precise large-scale tests of gravity with the new generation of cutting-edge cosmological data.

Submitted 26 July, 2024; originally announced July 2024.

Comments: 19 pages, 15 figures, 1 table

Journal ref: JCAP11(2024)061

arXiv:2407.06641 [pdf, other]

Cosmological simulations of scale-dependent primordial non-Gaussianity

Authors: Marco Baldi, Emanuele Fondi, Dionysios Karagiannis, Lauro Moscardini, Andrea Ravenni, William R. Coulton, Gabriel Jung, Michele Liguori, Marco Marinucci, Licia Verde, Francisco Villaescusa-Navarro, Benjamin D. Wandelt

Abstract: We present the results of a set of cosmological N-body simulations with standard $Λ$CDM cosmology but characterized by a scale-dependent primordial non-Gaussianity of the local type featuring a power-law dependence of the $f_{\rm NL}^{\rm loc}(k)$ at large scales followed by a saturation to a constant value at smaller scales where non-linear growth leads to the formation of collapsed cosmic structures. Such models are built to ensure consistency with current Cosmic Microwave Background bounds on primordial non-Gaussianity while still allowing for large effects of the non-Gaussian statistics on the properties of non-linear structure formation. We show the impact of such scale-dependent non-Gaussian scenarios on a wide range of properties of the resulting cosmic structures, such as the non-linear matter power spectrum, the halo and sub-halo mass functions, the concentration-mass relation, and the halo and void density profiles, and we highlight for the first time that some of these models might mimic the effects of Warm Dark Matter for several of these observables.

Submitted 11 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

Comments: 21 pages, 9 figures, 2 tables; to be submitted to JCAP

arXiv:2406.15546 [pdf, other]

The Impact of Non-Gaussian Primordial Tails on Cosmological Observables

Authors: William R. Coulton, Oliver H. E. Philcox, Francisco Villaescusa-Navarro

Abstract: Whilst current observational evidence favors a close-to-Gaussian spectrum of primordial perturbations, there exist many models of the early Universe that predict this distribution to have exponentially enhanced or suppressed tails. In this work, we generate realizations of the primordial potential with non-Gaussian tails via a phenomenological model; these are then evolved numerically to obtain maps of the cosmic microwave background (CMB) and large-scale structure (LSS). In the CMB maps, our added non-Gaussianity manifests as a localized enhancement of hot and cold spots, which would be expected to contribute to $N$-point functions up to large $N$. Such models are indirectly constrained by Planck trispectrum bounds, which restrict the changes in the temperature fluctuations to $O(10\,μ\mathrm{K})$. In the late-time Universe, we find that tailed cosmologies lead to a halo mass function enhanced at high masses, as expected. Furthermore, significant scale-dependent bias in the halo-halo and halo-matter power spectrum is also sourced, which arises from the squeezed limit of large $N$-point functions that are implicitly generated through the enhancement of the tails. These results underscore that a detection of scale-dependent bias alone cannot be used to rule out single field inflation, but can be used together with other statistics to probe a wide range of primordial processes.

Submitted 21 June, 2024; originally announced June 2024.

Comments: Comments welcome!

arXiv:2405.13491 [pdf, other]

doi 10.1051/0004-6361/202450810

Euclid. I. Overview of the Euclid mission

Authors: Euclid Collaboration, Y. Mellier, Abdurro'uf, J. A. Acevedo Barroso, A. Achúcarro, J. Adamek, R. Adam, G. E. Addison, N. Aghanim, M. Aguena, V. Ajani, Y. Akrami, A. Al-Bahlawan, A. Alavi, I. S. Albuquerque, G. Alestas, G. Alguero, A. Allaoui, S. W. Allen, V. Allevato, A. V. Alonso-Tetilla, B. Altieri, A. Alvarez-Candal, S. Alvi, A. Amara, et al. (1115 additional authors not shown)

Abstract: The current standard model of cosmology successfully describes a variety of measurements, but the nature of its main ingredients, dark matter and dark energy, remains unknown. Euclid is a medium-class mission in the Cosmic Vision 2015-2025 programme of the European Space Agency (ESA) that will provide high-resolution optical imaging, as well as near-infrared imaging and spectroscopy, over about 14,000 deg$^2$ of extragalactic sky. In addition to accurate weak lensing and clustering measurements that probe structure formation over half of the age of the Universe, its primary probes for cosmology, these exquisite data will enable a wide range of science. This paper provides a high-level overview of the mission, summarising the survey characteristics, the various data-processing steps, and data products. We also highlight the main science objectives and expected performance.

Submitted 24 September, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

Comments: Accepted for publication in the A&A special issue 'Euclid on Sky'

Journal ref: A&A 697, A1 (2025)

arXiv:2405.13119 [pdf, other]

doi 10.3847/1538-4357/adc99d

Cosmology from Point Clouds with Dark Matter Halos from the Quijote Simulations

Authors: Atrideb Chatterjee, Francisco Villaescusa-Navarro

Abstract: We train a novel deep learning architecture to perform likelihood-free inference on the value of the cosmological parameters from halo catalogs of the Quijote N-body simulations. Our model takes as input a halo catalog where each halo is characterized by its position, mass, and velocity modulus. By construction, our model is E(3) invariant and is designed to extract information hierarchically. Unlike graph neural networks, it does not require the transformation of the input halo (or galaxy) catalog into a graph. Given its simplicity, our model can process point clouds with large numbers of points. We discuss the advantages of this class of methods but also point out their limitations and potential ways to improve them for cosmological data.

Submitted 22 May, 2025; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: Accepted in ApJ

arXiv:2405.05598 [pdf, other]

Denoising Diffusion Delensing Delight: Reconstructing the Non-Gaussian CMB Lensing Potential with Diffusion Models

Authors: Thomas Flöss, William R. Coulton, Adriaan J. Duivenvoorden, Francisco Villaescusa-Navarro, Benjamin D. Wandelt

Abstract: Optimal extraction of cosmological information from observations of the Cosmic Microwave Background critically relies on our ability to accurately undo the distortions caused by weak gravitational lensing. In this work, we demonstrate the use of denoising diffusion models in performing Bayesian lensing reconstruction. We show that score-based generative models can produce accurate, uncorrelated samples from the CMB lensing convergence map posterior, given noisy CMB observations. To validate our approach, we compare the samples of our model to those obtained using established Hamiltonian Monte Carlo methods, which assume a Gaussian lensing potential. We then go beyond this assumption of Gaussianity, and train and validate our model on non-Gaussian lensing data, obtained by ray-tracing N-body simulations. We demonstrate that in this case, samples from our model have accurate non-Gaussian statistics beyond the power spectrum. The method provides an avenue towards more efficient and accurate lensing reconstruction, that does not rely on an approximate analytic description of the posterior probability. The reconstructed lensing maps can be used as an unbiased tracer of the matter distribution, and to improve delensing of the CMB, resulting in more precise cosmological parameter inference.

Submitted 6 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: 12 pages, 10 figures. v2: typo in one of the equations fixed, references added

arXiv:2405.00766 [pdf, other]

Introducing the DREAMS Project: DaRk mattEr and Astrophysics with Machine learning and Simulations

Authors: Jonah C. Rose, Paul Torrey, Francisco Villaescusa-Navarro, Mariangela Lisanti, Tri Nguyen, Sandip Roy, Kassidy E. Kollmann, Mark Vogelsberger, Francis-Yan Cyr-Racine, Mikhail V. Medvedev, Shy Genel, Daniel Anglés-Alcázar, Nitya Kallivayalil, Bonny Y. Wang, Belén Costanza, Stephanie O'Neil, Cian Roche, Soumyodipta Karmakar, Alex M. Garcia, Ryan Low, Shurui Lin, Olivia Mostow, Akaxia Cruz, Andrea Caputo, Arya Farahi, et al. (5 additional authors not shown)

Abstract: We introduce the DREAMS project, an innovative approach to understanding the astrophysical implications of alternative dark matter models and their effects on galaxy formation and evolution. The DREAMS project will ultimately comprise thousands of cosmological hydrodynamic simulations that simultaneously vary over dark matter physics, astrophysics, and cosmology in modeling a range of systems -- from galaxy clusters to ultra-faint satellites. Such extensive simulation suites can provide adequate training sets for machine-learning-based analyses. This paper introduces two new cosmological hydrodynamical suites of Warm Dark Matter, each comprised of 1024 simulations generated using the Arepo code. One suite consists of uniform-box simulations covering a $(25~h^{-1}~{\rm Mpc})^3$ volume, while the other consists of Milky Way zoom-ins with sufficient resolution to capture the properties of classical satellites. For each simulation, the Warm Dark Matter particle mass is varied along with the initial density field and several parameters controlling the strength of baryonic feedback within the IllustrisTNG model. We provide two examples, separately utilizing emulators and Convolutional Neural Networks, to demonstrate how such simulation suites can be used to disentangle the effects of dark matter and baryonic physics on galactic properties. The DREAMS project can be extended further to include different dark matter models, galaxy formation physics, and astrophysical targets. In this way, it will provide an unparalleled opportunity to characterize uncertainties on predictions for small-scale observables, leading to robust predictions for testing the particle physics nature of dark matter on these scales.

Submitted 1 May, 2024; originally announced May 2024.

Comments: 28 pages, 8 figures, DREAMS website: https://www.dreams-project.org

arXiv:2403.10648 [pdf, other]

Debiasing with Diffusion: Probabilistic reconstruction of Dark Matter fields from galaxies with CAMELS

Authors: Victoria Ono, Core Francisco Park, Nayantara Mudur, Yueying Ni, Carolina Cuesta-Lazaro, Francisco Villaescusa-Navarro

Abstract: Galaxies are biased tracers of the underlying cosmic web, which is dominated by dark matter components that cannot be directly observed. Galaxy formation simulations can be used to study the relationship between dark matter density fields and galaxy distributions. However, this relationship can be sensitive to assumptions in cosmology and astrophysical processes embedded in the galaxy formation models, that remain uncertain in many aspects. In this work, we develop a diffusion generative model to reconstruct dark matter fields from galaxies. The diffusion model is trained on the CAMELS simulation suite that contains thousands of state-of-the-art galaxy formation simulations with varying cosmological parameters and sub-grid astrophysics. We demonstrate that the diffusion model can predict the unbiased posterior distribution of the underlying dark matter fields from the given stellar mass fields, while being able to marginalize over uncertainties in cosmological and astrophysical models. Interestingly, the model generalizes to simulation volumes approximately 500 times larger than those it was trained on, and across different galaxy formation models. Code for reproducing these results can be found at https://github.com/victoriaono/variational-diffusion-cdm

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.10609 [pdf, other]

Zooming by in the CARPoolGP lane: new CAMELS-TNG simulations of zoomed-in massive halos

Authors: Max E. Lee, Shy Genel, Benjamin D. Wandelt, Benjamin Zhang, Ana Maria Delgado, Shivam Pandey, Erwin T. Lau, Christopher Carr, Harrison Cook, Daisuke Nagai, Daniel Angles-Alcazar, Francisco Villaescusa-Navarro, Greg L. Bryan

Abstract: Galaxy formation models within cosmological hydrodynamical simulations contain numerous parameters with non-trivial influences over the resulting properties of simulated cosmic structures and galaxy populations. It is computationally challenging to sample these high dimensional parameter spaces with simulations, particularly for halos in the high-mass end of the mass function. In this work, we develop a novel sampling and reduced variance regression method, CARPoolGP, which leverages built-in correlations between samples in different locations of high dimensional parameter spaces to provide an efficient way to explore parameter space and generate low variance emulations of summary statistics. We use this method to extend the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) to include a set of 768 zoom-in simulations of halos in the mass range of $10^{13} - 10^{14.5} M_\odot\,h^{-1}$ that span a 28-dimensional parameter space in the IllustrisTNG model. With these simulations and the CARPoolGP emulation method, we explore parameter trends in the Compton $Y-M$, black hole mass-halo mass, and metallicity-mass relations, as well as thermodynamic profiles and quenched fractions of satellite galaxies. We use these emulations to provide a physical picture of the complex interplay between supernova and active galactic nuclei feedback. We then use emulations of the $Y-M$ relation of massive halos to perform Fisher forecasts on astrophysical parameters for future Sunyaev-Zeldovich observations and find a significant improvement in forecasted constraints. We publicly release both the simulation suite and CARPoolGP software package.

Submitted 15 March, 2024; originally announced March 2024.

Comments: The manuscript was submitted to arxiv after receiving and responding to comments from the first referee report

arXiv:2403.02313 [pdf]

doi 10.3847/1538-4357/ad3070

Probing the Circum-Galactic Medium with Fast Radio Bursts: Insights from the CAMELS Simulations

Authors: Isabel Medlock, Daisuke Nagai, Priyanka Singh, Benjamin Oppenheimer, Daniel Anglés Alcázar, Francisco Villaescusa-Navarro

Abstract: Most diffuse baryons, including the circumgalactic medium (CGM) surrounding galaxies and the intergalactic medium (IGM) in the cosmic web, remain unmeasured and unconstrained. Fast Radio Bursts (FRBs) offer an unparalleled method to measure the electron dispersion measures (DMs) of ionized baryons. Their distribution can resolve the missing baryon problem, and constrain the history of feedback theorized to impart significant energy to the CGM and IGM. We analyze the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS), using three suites: IllustrisTNG, SIMBA, and Astrid, each varying 6 parameters (2 cosmological & 4 astrophysical feedback), for a total of 183 distinct simulation models. We find significantly different predictions between the fiducial models of the suites, owing to their different implementations of feedback. SIMBA exhibits the strongest feedback, leading to the smoothest distribution of baryons, reducing the sightline-to-sightline variance in DMs between z=0-1. Astrid has the weakest feedback and the largest variance. We calculate FRB CGM measurements as a function of galaxy impact parameter, with SIMBA showing the weakest DMs due to aggressive AGN feedback and Astrid the strongest. Within each suite, the largest differences are due to varying AGN feedback. IllustrisTNG shows the most sensitivity to supernova feedback, but this is due to the change in the AGN feedback strengths, demonstrating that black holes, not stars, are most capable of redistributing baryons in the IGM and CGM. We compare our statistics directly to recent observations, paving the way for the use of FRBs to constrain the physics of galaxy formation and evolution.

Submitted 11 July, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: 15 pages, 7 figures, Accepted to ApJ

arXiv:2403.00490 [pdf, other]

Quijote-PNG: Optimizing the summary statistics to measure Primordial non-Gaussianity

Authors: Gabriel Jung, Andrea Ravenni, Michele Liguori, Marco Baldi, William R. Coulton, Francisco Villaescusa-Navarro, Benjamin D. Wandelt

Abstract: We apply a suite of different estimators to the Quijote-PNG halo catalogues to find the best approach to constrain Primordial non-Gaussianity (PNG) at non-linear cosmological scales, up to $k_{\rm max} = 0.5 \, h\,{\rm Mpc}^{-1}$. The set of summary statistics considered in our analysis includes the power spectrum, bispectrum, halo mass function, marked power spectrum, and marked modal bispectrum. Marked statistics are used here for the first time in the context of PNG study. We perform a Fisher analysis to estimate their cosmological information content, showing substantial improvements when marked observables are added to the analysis. Starting from these summaries, we train deep neural networks (NN) to perform likelihood-free inference of cosmological and PNG parameters. We assess the performance of different subsets of summary statistics; in the case of $f_\mathrm{NL}^\mathrm{equil}$, we find that a combination of the power spectrum and a suitable marked power spectrum outperforms the combination of power spectrum and bispectrum, the baseline statistics usually employed in PNG analysis. A minimal pipeline to analyse the statistics we identified can be implemented either with our ML algorithm or via more traditional estimators, if these are deemed more reliable.

Submitted 1 March, 2024; originally announced March 2024.

Comments: 13 pages, 10 figures

arXiv:2402.10997 [pdf, other]

Cosmological multifield emulator

Authors: Sambatra Andrianomena, Sultan Hassan, Francisco Villaescusa-Navarro

Abstract: We demonstrate the use of a deep network to learn the distribution of data from state-of-the-art hydrodynamic simulations of the CAMELS project. To this end, we train a generative adversarial network to generate images composed of three different channels that represent gas density (Mgas), neutral hydrogen density (HI), and magnetic field amplitudes (B). We consider an unconstrained model and another scenario where the model is conditioned on the matter density $Ω_{\rm m}$ and the amplitude of density fluctuations $σ_{8}$. We find that the generated images are visually on a par with the data in quality. Quantitatively, we find that our model generates maps whose statistical properties, quantified by the probability distribution function of pixel values and auto-power spectra, agree reasonably well with those of the real maps. Moreover, the cross-correlations between fields in all maps produced by the emulator are in good agreement with those of the real images, which indicates that our model generates instances whose maps in all three channels describe the same physical region. Furthermore, a CNN regressor, which has been trained to extract $Ω_{\rm m}$ and $σ_{8}$ from the CAMELS multifield dataset, recovers the cosmology from the maps generated by our conditional model, achieving $R^{2}$ = 0.96 and 0.83 for $Ω_{\rm m}$ and $σ_{8}$ respectively. This further demonstrates the great capability of the model to mimic CAMELS data. Our model can be useful for generating data that are required to analyze the information from upcoming multi-wavelength cosmological surveys.

Submitted 23 October, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: 18 pages, 10 figures, 1 table

arXiv:2401.17940 [pdf, other]

Can we constrain warm dark matter masses with individual galaxies?

Authors: Shurui Lin, Francisco Villaescusa-Navarro, Jonah Rose, Paul Torrey, Arya Farahi, Kassidy E. Kollmann, Alex M. Garcia, Sandip Roy, Nitya Kallivayalil, Mark Vogelsberger, Yi-Fu Cai, Wentao Luo

Abstract: We study the impact of warm dark matter mass on the internal properties of individual galaxies using a large suite of 1,024 state-of-the-art cosmological hydrodynamic simulations from the DREAMS project. We take individual galaxies' properties from the simulations, which have different cosmologies, astrophysics, and warm dark matter masses, and train normalizing flows to learn the posterior of the parameters. We find that our models cannot infer the value of the warm dark matter mass, even when the values of the cosmological and astrophysical parameters are given explicitly. This result holds for galaxies with stellar mass larger than $2\times10^8 M_\odot/h$ at both low and high redshifts. We calculate the mutual information and find no significant dependence between the WDM mass and galaxy properties. On the other hand, our models can infer the value of $Ω_{\rm m}$ with a $\sim10\%$ accuracy from the properties of individual galaxies while marginalizing astrophysics and warm dark matter masses.

Submitted 31 January, 2024; originally announced January 2024.

Comments: 13 pages, 8 figures

arXiv:2401.15891 [pdf, other]

doi 10.1093/mnras/staf355

A field-level emulator for modeling baryonic effects across hydrodynamic simulations

Authors: Divij Sharma, Biwei Dai, Francisco Villaescusa-Navarro, Uros Seljak

Abstract: We develop a new and simple method to model baryonic effects at the field level relevant for weak lensing analyses. We analyze thousands of state-of-the-art hydrodynamic simulations from the CAMELS project, each with different cosmology and strength of feedback, and we find that the cross-correlation coefficient between full hydrodynamic and N-body simulations is very close to 1 down to $k\sim10~h{\rm Mpc}^{-1}$. This suggests that modeling baryonic effects at the field level down to these scales only requires N-body simulations plus a correction to the mode's amplitude given by: $\sqrt{P_{\rm hydro}(k)/P_{\rm nbody}(k)}$. In this paper, we build an emulator for this quantity, using Gaussian processes, that is flexible enough to reproduce results from thousands of hydrodynamic simulations that have different cosmologies, astrophysics, subgrid physics, volumes, resolutions, and at different redshifts. Our emulator is accurate at the percent level and exhibits a range of validation superior to previous studies. This method and our emulator enable field-level simulation-based inference analyses and accounting for baryonic effects in weak lensing analyses.
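As an illustration of the amplitude correction described in this abstract, the sketch below rescales each Fourier mode of an N-body overdensity field by $\sqrt{P_{\rm hydro}(k)/P_{\rm nbody}(k)}$ while leaving the phases untouched. It is a minimal sketch under stated assumptions, not the authors' code: the function name `apply_baryon_correction`, the tabulated ratio passed in as `k_bins` and `ratio`, and the linear interpolation in $k$ are all hypothetical choices made here, and the ratio would in practice come from an emulator such as the one the paper describes.

```python
import numpy as np

def apply_baryon_correction(delta_nbody, k_bins, ratio, box_size):
    """Rescale Fourier amplitudes of a cubic N-body overdensity field.

    delta_nbody : (n, n, n) real array, N-body overdensity on a grid
    k_bins      : 1D increasing array of wavenumbers (hypothetical table)
    ratio       : 1D array, P_hydro(k) / P_nbody(k) at k_bins (hypothetical)
    box_size    : box side length; k is then in inverse units of box_size
    """
    n = delta_nbody.shape[0]
    spacing = box_size / n
    # Wavenumbers along the two full axes and the half (real-FFT) axis.
    k_full = 2.0 * np.pi * np.fft.fftfreq(n, d=spacing)
    k_half = 2.0 * np.pi * np.fft.rfftfreq(n, d=spacing)
    kx, ky, kz = np.meshgrid(k_full, k_full, k_half, indexing="ij")
    k_mag = np.sqrt(kx**2 + ky**2 + kz**2)

    # Interpolate the tabulated power ratio onto every mode. np.interp holds
    # the end values outside the tabulated range (an assumption of this sketch).
    ratio_k = np.interp(k_mag.ravel(), k_bins, ratio).reshape(k_mag.shape)
    transfer = np.sqrt(ratio_k)  # real and positive, so phases are preserved

    delta_k = np.fft.rfftn(delta_nbody)
    return np.fft.irfftn(delta_k * transfer, s=delta_nbody.shape)
```

Because the transfer function is real and positive, the corrected field keeps a cross-correlation coefficient of 1 with the input N-body field, which is the regime the abstract says holds down to $k\sim10~h{\rm Mpc}^{-1}$.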

Submitted 29 January, 2024; originally announced January 2024.

Comments: 12 pages, 9 figures. Comments welcome

arXiv:2311.10088 [pdf, other]

doi 10.1088/1475-7516/2024/02/048

Taming assembly bias for primordial non-Gaussianity

Authors: Emanuele Fondi, Licia Verde, Francisco Villaescusa-Navarro, Marco Baldi, William R. Coulton, Gabriel Jung, Dionysios Karagiannis, Michele Liguori, Andrea Ravenni, Benjamin D. Wandelt

Abstract: Primordial non-Gaussianity of the local type induces a strong scale-dependent bias on the clustering of halos in the late-time Universe. This signature is particularly promising to provide constraints on the non-Gaussianity parameter $f_{\rm NL}$ from galaxy surveys, as the bias amplitude grows with scale and becomes important on large, linear scales. However, there is a well-known degeneracy between the real prize, the $f_{\rm NL}$ parameter, and the (non-Gaussian) assembly bias, i.e., the halo formation history-dependent contribution to the amplitude of the signal, which could seriously compromise the ability of large-scale structure surveys to constrain $f_{\rm NL}$. We show how the assembly bias can be modeled and constrained, thus almost completely recovering the power of galaxy surveys to competitively constrain primordial non-Gaussianity. In particular, studying hydrodynamical simulations, we find that a proxy for the halo properties that determine assembly bias can be constructed from photometric properties of galaxies. Using a prior on the assembly bias guided by this proxy degrades the statistical errors on $f_{\rm NL}$ only mildly compared to an ideal case where the assembly bias is perfectly known. The systematic error on $f_{\rm NL}$ that the proxy induces can be safely kept under control.

Submitted 2 February, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: 30 pages, 13 figures. v2: minor updates to match accepted version

Journal ref: JCAP02(2024)048

arXiv:2311.01588 [pdf, other]

Domain Adaptive Graph Neural Networks for Constraining Cosmological Parameters Across Multiple Data Sets

Authors: Andrea Roncoli, Aleksandra Ćiprijanović, Maggie Voetberg, Francisco Villaescusa-Navarro, Brian Nord

Abstract: Deep learning models have been shown to outperform methods that rely on summary statistics, like the power spectrum, in extracting information from complex cosmological data sets. However, due to differences in the subgrid physics implementation and numerical approximations across different simulation suites, models trained on data from one cosmological simulation show a drop in performance when tested on another. Similarly, models trained on any of the simulations would also likely experience a drop in performance when applied to observational data. Training on data from two different suites of the CAMELS hydrodynamic cosmological simulations, we examine the generalization capabilities of Domain Adaptive Graph Neural Networks (DA-GNNs). By utilizing GNNs, we capitalize on their capacity to capture structured scale-free cosmological information from galaxy distributions. Moreover, by including unsupervised domain adaptation via Maximum Mean Discrepancy (MMD), we enable our models to extract domain-invariant features. We demonstrate that DA-GNN achieves higher accuracy and robustness on cross-dataset tasks (up to $28\%$ better relative error and up to almost an order of magnitude better $χ^2$). Using data visualizations, we show the effects of domain adaptation on proper latent space data alignment. This shows that DA-GNNs are a promising method for extracting domain-independent cosmological information, a vital step toward robust deep learning for real cosmic survey data.

Submitted 15 April, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

Comments: Accepted in Machine Learning and the Physical Sciences Workshop at NeurIPS 2023; 9 pages, 2 figures, 1 table

Report number: FERMILAB-CONF-23-644-CSAID

arXiv:2310.16884 [pdf, other]

doi 10.1093/mnras/stae1102

Atomic Hydrogen Shows its True Colours: Correlations between HI and Galaxy Colour in Simulations

Authors: Calvin Osinga, Benedikt Diemer, Francisco Villaescusa-Navarro, Elena D'Onghia, Peter Timbie

Abstract: Intensity mapping experiments are beginning to measure the spatial distribution of neutral atomic hydrogen (HI) to constrain cosmological parameters and the large-scale distribution of matter. However, models of the behaviour of HI as a tracer of matter are complicated by galaxy evolution. In this work, we examine the clustering of HI in relation to galaxy colour, stellar mass, and HI mass in IllustrisTNG at $z$ = 0, 0.5, and 1. We compare the HI-red and HI-blue galaxy cross-power spectra, finding that HI-red has an amplitude 1.5 times higher than HI-blue at large scales. The cross-power spectra intersect at $\approx 3$ Mpc in real space and $\approx 10$ Mpc in redshift space, consistent with $z \approx 0$ observations. We show that HI clustering increases with galaxy HI mass and depends weakly on detection limits in the range $M_{\mathrm{HI}} \leq 10^8 M_\odot$. In terms of $M_\star$, we find blue galaxies in the greatest stellar mass bin cluster more than blue galaxies in other stellar mass bins. Red galaxies in the greatest stellar mass bin, however, cluster the weakest amongst red galaxies. These trends arise due to central-satellite compositions. Centrals correlate less with HI for increasing stellar mass, whereas satellites correlate more, irrespective of colour. Despite the clustering relationships with stellar mass, we find that the cross-power spectra are largely insensitive to detection limits in HI and galaxy surveys. Counter-intuitively, all auto and cross-power spectra for red and blue galaxies and HI decrease with time at all scales in IllustrisTNG. We demonstrate that processes associated with quenching contribute to this trend. The complex interplay between HI and galaxies underscores the importance of understanding baryonic effects when interpreting the large-scale clustering of HI, blue, and red galaxies at $z \leq 1$.

Submitted 22 April, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

Comments: 16 pages, 11 figures

arXiv:2310.15234 [pdf, other]

doi 10.1088/1475-7516/2025/01/082

Field-level simulation-based inference with galaxy catalogs: the impact of systematic effects

Authors: Natalí S. M. de Santi, Francisco Villaescusa-Navarro, L. Raul Abramo, Helen Shao, Lucia A. Perez, Tiago Castro, Yueying Ni, Christopher C. Lovell, Elena Hernandez-Martinez, Federico Marinacci, David N. Spergel, Klaus Dolag, Lars Hernquist, Mark Vogelsberger

Abstract: It has been recently shown that a powerful way to constrain cosmological parameters from galaxy redshift surveys is to train graph neural networks to perform field-level likelihood-free inference without imposing cuts on scale. In particular, de Santi et al. (2023) developed models that could accurately infer the value of $Ω_{\rm m}$ from catalogs that only contain the positions and radial velocities of galaxies that are robust to uncertainties in astrophysics and subgrid models. However, observations are affected by many effects, including 1) masking, 2) uncertainties in peculiar velocities and radial distances, and 3) different galaxy selections. Moreover, observations only allow us to measure redshift, intertwining galaxies' radial positions and velocities. In this paper we train and test our models on galaxy catalogs, created from thousands of state-of-the-art hydrodynamic simulations run with different codes from the CAMELS project, that incorporate these observational effects. We find that, although the presence of these effects degrades the precision and accuracy of the models, and increases the fraction of catalogs where the model breaks down, the fraction of galaxy catalogs where the model performs well is over 90 %, demonstrating the potential of these models to constrain cosmological parameters even when applied to real data.

Submitted 26 January, 2025; v1 submitted 23 October, 2023; originally announced October 2023.

Comments: 36 pages, 12 figures. For the reference in the abstract see: de Santi et al. 2023, arXiv:2302.14101

Journal ref: Volume 2025, Number 01, Year 2025, Page 089

arXiv:2310.08634 [pdf, other]

Cosmology with Galaxy Photometry Alone

Authors: ChangHoon Hahn, Francisco Villaescusa-Navarro, Peter Melchior, Romain Teyssier

Abstract: We present the first cosmological constraints using only the observed photometry of galaxies. Villaescusa-Navarro et al. (2022; arXiv:2201.02202) recently demonstrated that the internal physical properties of a single simulated galaxy contain a significant amount of cosmological information. These physical properties, however, cannot be directly measured from observations. In this work, we present how we can go beyond theoretical demonstrations to infer cosmological constraints from actual galaxy observables (e.g. optical photometry) using neural density estimation and the CAMELS suite of hydrodynamical simulations. We find that the cosmological information in the photometry of a single galaxy is limited. However, we combine the constraining power of photometry from many galaxies using hierarchical population inference and place significant cosmological constraints. With the observed photometry of $\sim$20,000 NASA-Sloan Atlas galaxies, we constrain $Ω_m = 0.323^{+0.075}_{-0.095}$ and $σ_8 = 0.799^{+0.088}_{-0.085}$.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 15 pages, 7 figures, submitted to ApJL, comments welcome

arXiv:2309.12048 [pdf, other]

Cosmology with multiple galaxies

Authors: Chaitanya Chawak, Francisco Villaescusa-Navarro, Nicolas Echeverri Rojas, Yueying Ni, ChangHoon Hahn, Daniel Angles-Alcazar

Abstract: Recent works have discovered a relatively tight correlation between $Ω_{\rm m}$ and properties of individual simulated galaxies. Because of this, it has been shown that constraints on $Ω_{\rm m}$ can be placed using the properties of individual galaxies while accounting for uncertainties on astrophysical processes such as feedback from supernova and active galactic nuclei. In this work, we quantify whether using the properties of multiple galaxies simultaneously can tighten those constraints. For this, we train neural networks to perform likelihood-free inference on the value of two cosmological parameters ($Ω_{\rm m}$ and $σ_8$) and four astrophysical parameters using the properties of several galaxies from thousands of hydrodynamic simulations of the CAMELS project. We find that using properties of more than one galaxy increases the precision of the $Ω_{\rm m}$ inference. Furthermore, using multiple galaxies enables the inference of other parameters that were poorly constrained with one single galaxy. We show that the same subset of galaxy properties are responsible for the constraints on $Ω_{\rm m}$ from one and multiple galaxies. Finally, we quantify the robustness of the model and find that without identifying the model range of validity, the model does not perform well when tested on galaxies from other galaxy formation models.

Submitted 21 September, 2023; originally announced September 2023.

Comments: 13 pages, 7 figures

arXiv:2309.07912 [pdf, other]

doi 10.1093/mnras/stad3784

An Observationally Driven Multifield Approach for Probing the Circum-Galactic Medium with Convolutional Neural Networks

Authors: Naomi Gluck, Benjamin D. Oppenheimer, Daisuke Nagai, Francisco Villaescusa-Navarro, Daniel Anglés-Alcázar

Abstract: The circum-galactic medium (CGM) can feasibly be mapped by multiwavelength surveys covering broad swaths of the sky. With multiple large datasets becoming available in the near future, we develop a likelihood-free Deep Learning technique using convolutional neural networks (CNNs) to infer broad-scale physical properties of a galaxy's CGM and its halo mass for the first time. Using CAMELS (Cosmology and Astrophysics with MachinE Learning Simulations) data, including IllustrisTNG, SIMBA, and Astrid models, we train CNNs on Soft X-ray and 21-cm (HI) radio 2D maps to trace hot and cool gas, respectively, around galaxies, groups, and clusters. Our CNNs offer the unique ability to train and test on "multifield" datasets comprised of both HI and X-ray maps, providing complementary information about physical CGM properties and improved inferences. Applying eRASS:4 survey limits shows that X-ray is not powerful enough to infer individual halos with masses $\log(M_{\rm{halo}}/M_{\odot}) < 12.5$. The multifield improves the inference for all halo masses. Generally, the CNN trained and tested on Astrid (SIMBA) can most (least) accurately infer CGM properties. Cross-simulation analysis -- training on one galaxy formation model and testing on another -- highlights the challenges of developing CNNs trained on a single model to marginalize over astrophysical uncertainties and perform robust inferences on real data. The next crucial step in improving the resulting inferences on physical CGM properties hinges on our ability to interpret these deep-learning models.

Submitted 16 January, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

Journal ref: Monthly Notices of the Royal Astronomical Society, Volume 527, Issue 4, February 2024, Pages 10038-10058

arXiv:2309.05850 [pdf, other]

Predicting Interloper Fraction with Graph Neural Networks

Authors: Elena Massara, Francisco Villaescusa-Navarro, Will J. Percival

Abstract: Upcoming emission-line spectroscopic surveys, such as Euclid and the Roman Space Telescope, will be affected by systematic effects due to the presence of interlopers: galaxies whose redshift and distance from us are miscalculated due to line confusion in their emission spectra. Particularly pernicious are interlopers involving the confusion between two lines with close emitted wavelengths, like H$β$ emitters confused as [O III], since those are strongly spatially correlated with the target galaxies. They introduce a particular pattern in the 3D distribution of the observed galaxy catalog that can shift the position of the BAO peak in the galaxy correlation function and bias any cosmological analysis performed with that sample. Here we present a novel method to predict the fraction of interlopers in a galaxy catalog, using Graph Neural Networks (GNNs) to learn the posterior distribution of the interloper fraction while marginalizing over cosmology and galaxy bias. The method is developed using simulations with halos acting as a proxy for galaxies. The GNN can infer the mean and standard deviation of the posterior distribution of interloper fraction using small-scale information that is usually not considered in cosmological analyses. The injection of large-scale information into the graph as a global attribute improves the performance of the GNN when marginalizing over cosmology.

Submitted 11 September, 2023; originally announced September 2023.

Comments: 19 pages, 7 figures

arXiv:2308.13648 [pdf, other]

Emulating Radiative Transfer with Artificial Neural Networks

Authors: Snigdaa S. Sethuram, Rachel K. Cochrane, Christopher C. Hayward, Viviana Acquaviva, Francisco Villaescusa-Navarro, Gergo Popping, John H. Wise

Abstract: Forward-modeling observables from galaxy simulations enables direct comparisons between theory and observations. To generate synthetic spectral energy distributions (SEDs) that include dust absorption, re-emission, and scattering, Monte Carlo radiative transfer is often used in post-processing on a galaxy-by-galaxy basis. However, this is computationally expensive, especially if one wants to make predictions for suites of many cosmological simulations. To alleviate this computational burden, we have developed a radiative transfer emulator using an artificial neural network (ANN), ANNgelina, that can reliably predict SEDs of simulated galaxies using a small number of integrated properties of the simulated galaxies: star formation rate, stellar and dust masses, and mass-weighted metallicities of all star particles and of only star particles with age <10 Myr. Here, we present the methodology and quantify the accuracy of the predictions. We train the ANN on SEDs computed for galaxies from the IllustrisTNG project's TNG50 cosmological magnetohydrodynamical simulation. ANNgelina is able to predict the SEDs of TNG50 galaxies in the ultraviolet (UV) to millimetre regime with a typical median absolute error of ~7 per cent. The prediction error is the greatest in the UV, possibly due to the viewing-angle dependence being greatest in this wavelength regime. Our results demonstrate that our ANN-based emulator is a promising computationally inexpensive alternative for forward-modeling galaxy SEDs from cosmological simulations.

Submitted 25 August, 2023; originally announced August 2023.

arXiv:2307.11832 [pdf, other]

Cosmological baryon spread and impact on matter clustering in CAMELS

Authors: Matthew Gebhardt, Daniel Anglés-Alcázar, Josh Borrow, Shy Genel, Francisco Villaescusa-Navarro, Yueying Ni, Christopher Lovell, Daisuke Nagai, Romeel Davé, Federico Marinacci, Mark Vogelsberger, Lars Hernquist

Abstract: We quantify the cosmological spread of baryons relative to their initial neighboring dark matter distribution using thousands of state-of-the-art simulations from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project. We show that dark matter particles spread relative to their initial neighboring distribution owing to chaotic gravitational dynamics on spatial scales comparable to their host dark matter halo. In contrast, gas in hydrodynamic simulations spreads much further from the initial neighboring dark matter owing to feedback from supernovae (SNe) and Active Galactic Nuclei (AGN). We show that large-scale baryon spread is very sensitive to model implementation details, with the fiducial SIMBA model spreading $\sim$40% of baryons $>$1 Mpc away compared to $\sim$10% for the IllustrisTNG and ASTRID models. Increasing the efficiency of AGN-driven outflows greatly increases baryon spread while increasing the strength of SNe-driven winds can decrease spreading due to non-linear coupling of stellar and AGN feedback. We compare total matter power spectra between hydrodynamic and paired $N$-body simulations and demonstrate that the baryonic spread metric broadly captures the global impact of feedback on matter clustering over variations of cosmological and astrophysical parameters, initial conditions, and galaxy formation models. Using symbolic regression, we find a function that reproduces the suppression of power by feedback as a function of wave number ($k$) and baryonic spread up to $k \sim 10\,h\,{\rm Mpc}^{-1}$ while highlighting the challenge of developing models robust to variations in galaxy formation physics implementation.

Submitted 21 July, 2023; originally announced July 2023.

Comments: 17 pages, 15 figures

arXiv:2307.06967 [pdf, other]

A Hierarchy of Normalizing Flows for Modelling the Galaxy-Halo Relationship

Authors: Christopher C. Lovell, Sultan Hassan, Daniel Anglés-Alcázar, Greg Bryan, Giulio Fabbian, Shy Genel, ChangHoon Hahn, Kartheik Iyer, James Kwon, Natalí de Santi, Francisco Villaescusa-Navarro

Abstract: Using a large sample of galaxies taken from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project, a suite of hydrodynamic simulations varying both cosmological and astrophysical parameters, we train a normalizing flow (NF) to map the probability of various galaxy and halo properties conditioned on astrophysical and cosmological parameters. By leveraging the learnt conditional relationships we can explore a wide range of interesting questions, whilst enabling simple marginalisation over nuisance parameters. We demonstrate how the model can be used as a generative model for arbitrary values of our conditional parameters; we generate halo masses and matched galaxy properties, and produce realisations of the halo mass function as well as a number of galaxy scaling relations and distribution functions. The model represents a unique and flexible approach to modelling the galaxy-halo relationship.

Submitted 13 July, 2023; originally announced July 2023.

Comments: 8 pages, 2 figures, accepted for ICML 2023 Workshop on Machine Learning for Astrophysics

arXiv:2306.11782 [pdf, other]

Signatures of a Parity-Violating Universe

Authors: William R. Coulton, Oliver H. E. Philcox, Francisco Villaescusa-Navarro

Abstract: What would a parity-violating universe look like? We present a numerical and theoretical study of mirror asymmetries in the late universe, using a new suite of $N$-body simulations: QUIJOTE-Odd. These feature parity-violating initial conditions, injected via a simple ansatz for the imaginary primordial trispectrum and evolved into the non-linear regime. We find that the realization-averaged power spectrum, bispectrum, halo mass function, and matter PDF are not affected by our modifications to the initial conditions, deep into the non-linear regime, which we argue arises from rotational and translational invariance. In contrast, the parity-odd trispectrum of matter (measured using a new estimator), shows distinct signatures proportional to the parity-violating parameter, $p_{\rm NL}$, which sets the amplitude of the primordial trispectrum. We additionally find intriguing signatures in the angular momentum of halos, with the primordial trispectrum inducing a non-zero correlation between angular momentum and smoothed velocity field, proportional to $p_{\rm NL}$. Our simulation suite has been made public to facilitate future analyses.

Submitted 20 June, 2023; originally announced June 2023.

Comments: 19 pages, 9 figures. Simulations available at https://quijote-simulations.readthedocs.io/en/latest/odd.html

arXiv:2305.10597 [pdf, other]

doi 10.3847/1538-4357/acfe70

Quijote-PNG: The Information Content of the Halo Mass Function

Authors: Gabriel Jung, Andrea Ravenni, Marco Baldi, William R. Coulton, Drew Jamieson, Dionysios Karagiannis, Michele Liguori, Helen Shao, Licia Verde, Francisco Villaescusa-Navarro, Benjamin D. Wandelt

Abstract: We study signatures of primordial non-Gaussianity (PNG) in the redshift-space halo field on non-linear scales, using a combination of three summary statistics, namely the halo mass function (HMF), power spectrum, and bispectrum. The choice of adding the HMF to our previous joint analysis of power spectrum and bispectrum is driven by a preliminary field-level analysis, in which we train graph neural networks on halo catalogues to infer the PNG $f_\mathrm{NL}$ parameter. The covariance matrix and the responses of our summaries to changes in model parameters are extracted from a suite of halo catalogues constructed from the Quijote-PNG N-body simulations. We consider the three main types of PNG: local, equilateral and orthogonal. Adding the HMF to our previous joint analysis of power spectrum and bispectrum produces two main effects. First, it reduces the equilateral $f_\mathrm{NL}$ predicted errors by roughly a factor $2$, while also producing notable, although smaller, improvements for orthogonal PNG. Second, it helps break the degeneracy between the local PNG amplitude, $f_\mathrm{NL}^\mathrm{local}$, and assembly bias, $b_φ$, without relying on any external prior assumption. Our final forecasts for PNG parameters are $Δf_\mathrm{NL}^\mathrm{local} = 40$, $Δf_\mathrm{NL}^\mathrm{equil} = 210$, $Δf_\mathrm{NL}^\mathrm{ortho} = 91$, on a cubic volume of $1 \left(h^{-1}{\rm Gpc}\right)^3$, with a halo number density of $\bar{n}\sim 5.1 \times 10^{-5}~h^3\mathrm{Mpc}^{-3}$, at $z = 1$, and considering scales up to $k_\mathrm{max} = 0.5~h\,\mathrm{Mpc}^{-1}$.

Submitted 4 February, 2024; v1 submitted 17 May, 2023; originally announced May 2023.

Comments: 17 pages, 11 figures. v3 (minor caption fix)

Journal ref: Astrophys.J. 957 (2023) 1, 50

arXiv:2304.14432 [pdf, other]

Inferring Warm Dark Matter Masses with Deep Learning

Authors: Jonah C. Rose, Paul Torrey, Francisco Villaescusa-Navarro, Mark Vogelsberger, Stephanie O'Neil, Mikhail V. Medvedev, Ryan Low, Rakshak Adhikari, Daniel Angles-Alcazar

Abstract: We present a new suite of over 1,500 cosmological N-body simulations with varied Warm Dark Matter (WDM) models ranging from 2.5 to 30 keV. We use these simulations to train Convolutional Neural Networks (CNNs) to infer WDM particle masses from images of DM field data. Our fiducial setup can make accurate predictions of the WDM particle mass up to 7.5 keV at a 95% confidence level from small maps that cover an area of (25 h$^{-1}$ Mpc)$^2$. We vary the image resolution, simulation resolution, redshift, and cosmology of our fiducial setup to better understand how our model is making predictions. Using these variations, we find that our models are most dependent on simulation resolution, minimally dependent on image resolution, not systematically dependent on redshift, and robust to varied cosmologies. We also find that an important feature to distinguish between WDM models is present with a linear size between 100 and 200 h$^{-1}$ kpc. We compare our fiducial model to one trained on the power spectrum alone and find that our field-level model can make 2x more precise predictions and can make accurate predictions to 2x as massive WDM particle masses when used on the same data. Overall, we find that the field-level data can be used to accurately differentiate between WDM models and contain more information than is captured by the power spectrum. This technique can be extended to more complex DM models and opens up new opportunities to explore alternative DM models in a cosmological environment.

Submitted 27 April, 2023; originally announced April 2023.

Comments: 16 pages, 12 figures

Showing 1–50 of 185 results for author: Villaescusa-Navarro, F