Bridging Literature and the Universe Via A Multi-Agent Large Language Model System

Authors: Xiaowen Zhang, Zhenyu Bi, Patrick Lachance, Xuan Wang, Tiziana Di Matteo, Rupert A. C. Croft

Abstract: As cosmological simulations and their associated software become increasingly complex, physicists face the challenge of searching through vast amounts of literature and user manuals to extract simulation parameters from dense academic papers, each using different models and formats. Translating these parameters into executable scripts remains a time-consuming and error-prone process. To improve efficiency in physics research and accelerate the cosmological simulation process, we introduce SimAgents, a multi-agent system designed to automate both parameter configuration from the literature and preliminary analysis for cosmology research. SimAgents is powered by specialized LLM agents capable of physics reasoning, simulation software validation, and tool execution. These agents collaborate through structured communication, ensuring that extracted parameters are physically meaningful, internally consistent, and software-compliant. We also construct a cosmological parameter extraction evaluation dataset by collecting over 40 simulations in published papers from arXiv and leading journals that cover diverse simulation types. Experiments on the dataset demonstrate strong performance by SimAgents, highlighting its effectiveness and potential to accelerate scientific research for physicists. Our demonstration video is available at: https://youtu.be/w1zLpm_CaWA. The complete system and dataset are publicly available at https://github.com/xwzhang98/SimAgents.

Submitted 15 July, 2025; v1 submitted 11 July, 2025; originally announced July 2025.

Comments: 6 pages, 4 figures

arXiv:2505.20439

The Properties of Little Red Dot Galaxies in the ASTRID Simulation

Authors: Patrick LaChance, Rupert A. C. Croft, Tiziana Di Matteo, Yihao Zhou, Fabio Pacucci, Yueying Ni, Nianyi Chen, Simeon Bird

Abstract: We present simulated counterparts of the ``Little Red Dot'' (LRD) galaxies observed with JWST, using the large cosmological hydrodynamic simulation, ASTRID. We create mock observations of the galaxies ($5 \leq z \leq 8$) in ASTRID, and find seventeen which fit the color and size criteria of LRDs. These LRDs are galaxies with high stellar masses ($\rm log(M_*/M_{\odot}) \geq 9.7$), and massive black holes ($\rm log(M_{BH}/M_{\odot}) \geq 6.8$). The host galaxies are dense, with stellar half mass radii ($\rm 325\,pc \leq r_{{\rm half},*} \leq 620\,pc$), and dust attenuation in the F444W band above 1.25. Their star formation has been recently quenched. They host relatively bright AGN that are dust-obscured and contribute significantly to the rest-frame optical red slope and have relatively low luminosity in the rest-frame ultraviolet, where the host galaxy's stars are more dominant. These LRDs are in an evolutionary phase of miniquenching that is the result of AGN feedback from their massive black holes. The LRDs in ASTRID are bright with F444W magnitudes of $23.5-25.5$. The less massive and fainter galaxies in ASTRID lack the dust concentration necessary to produce the red slope of an LRD, though this could be an effect of limited resolution. Most of the highest Eddington black holes are not LRDs due to their host galaxies having typical dust levels and relatively high star formation rates accompanying their highly accreting black holes, resulting in their spectra being too flat.

Submitted 26 May, 2025; originally announced May 2025.

Comments: 15 pages, 12 figures

arXiv:2504.03848

doi 10.33232/001c.143599

Large-scale surveys of the quasar proximity effect

Authors: Rupert A. C. Croft, Patrick Shaw, Ann-Marsha Alexis, Nianyi Chen, Yihao Zhou, Tiziana Di Matteo, Simeon Bird, Patrick Lachance, Yueying Ni

Abstract: The UV radiation from high redshift quasars causes a local deficit in the neutral hydrogen absorption (Lyman-alpha forest) in their spectra, known as the proximity effect. Measurements from small samples of tens to hundreds of quasars have been used to constrain the global intensity of the UV background radiation, but so far the power of large-scale surveys such as the Sloan Digital Sky Survey and the Dark Energy Spectroscopic Instrument (DESI) survey has not been used to investigate the UV background in more detail. We develop a CDM-based halo model of the quasar proximity effect, which accounts by construction for the fact that quasars reside in overdense regions. We test this model on quasar Lyman-alpha spectra from the ASTRID cosmological hydrodynamic simulation, which includes self-consistent formation of quasar black holes and the intergalactic medium surrounding them. Fitting the model to individual quasar spectra, we constrain two parameters, r_eq (the radius at which the local quasar radiation intensity equals the background), and the quasar bias b_q (related to host halo mass). We find that r_eq can be recovered in an unbiased fashion with a statistical uncertainty of 25-50% from a single quasar spectrum. Applying such fitting to samples of millions of spectra from e.g., DESI would allow measurement of the UVBG intensity and its evolution with redshift with high precision.
We use another, larger-scale, lower resolution simulation (Uchuu) to test how such a large sample of proximity effect measurements could be used to probe the spatial fluctuations in the intergalactic radiation field. We find that the large-scale structure of the UV radiation intensity could be mapped and its power spectrum measured on 100-1000 Mpc/h scales. This could allow the large-scale radiation field to join the density field as a dataset for constraining cosmology and the sources of radiation.

Submitted 21 August, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

Comments: 17 pages, 15 figures, Published in the Open Journal of Astrophysics

Journal ref: The Open Journal of Astrophysics, Volume 8, Aug 2025

arXiv:2408.09051

doi 10.33232/001c.129471

AI-assisted super-resolution cosmological simulations IV: An emulator for deterministic realizations

Authors: Xiaowen Zhang, Patrick Lachance, Ankita Dasgupta, Rupert A. C. Croft, Tiziana Di Matteo, Yueying Ni, Simeon Bird, Yin Li

Abstract: Super-resolution (SR) models in cosmological simulations use deep learning (DL) to rapidly enhance low-resolution (LR) runs with statistically correct fine details. These models preserve large-scale structures by conditioning on an LR version of the simulation. On smaller scales, the generative process is inherently stochastic, producing multiple possible SR realizations with distinct small-scale structures. Validation of reconstructed SR runs from LR simulations requires ensuring that specific statistics of interest are accurately reproduced by comparing SR outputs with target high resolution (HR) runs. In this study, we develop an emulator designed to reproduce the small-scale structures of the target HR simulation with high fidelity. By processing an SR realization alongside the high-resolution initial condition (HRIC), we transform the SR output to emulate the result of a full simulation with that HRIC. By comparing various metrics, from visualization to individual halo measurements, we demonstrate that the emulated SR runs closely align with the target HR simulation, even at length scales an order of magnitude smaller than the corresponding LR run. These results show the potential of this method for efficiently generating accurate simulations and mock observations for large galaxy surveys.

Submitted 6 February, 2025; v1 submitted 16 August, 2024; originally announced August 2024.

Comments: 18 pages, 16 figures

arXiv:2405.11026

doi 10.3847/1538-4357/ad7ff0

Astrometric Jitter as a Detection Diagnostic for Recoiling and Slingshot Supermassive Black Hole Candidates

Authors: Anavi Uppal, Charlotte Ward, Suvi Gezari, Priyamvada Natarajan, Nianyi Chen, Patrick LaChance, Tiziana Di Matteo

Abstract: Supermassive black holes (SMBHs) can be ejected from their galactic centers due to gravitational wave recoil or the slingshot mechanism following a galaxy merger. If an ejected SMBH retains its inner accretion disk, it may be visible as an off-nuclear active galactic nucleus (AGN). At present, only a handful of offset AGNs that are recoil or slingshot candidates have been found, and none have been robustly confirmed. Compiling a large sample of runaway SMBHs would enable us to constrain the mass and spin evolution of binary SMBHs and study feedback effects of displaced AGNs. We adapt the method of varstrometry -- which was developed for Gaia observations to identify off-center, dual, and lensed AGNs -- in order to quickly identify off-nuclear AGNs in optical survey data by looking for an excess of blue versus red astrometric jitter. We apply this to the Pan-STARRS1 3$π$ Survey and report on five new runaway AGN candidates. We focus on ZTF18aajyzfv: a luminous quasar offset by 6.7 $\pm$ 0.2 kpc from an adjacent galaxy at $z$=0.224, and conclude after Keck LRIS spectroscopy and comparison to ASTRID simulation analogs that it is likely a dual AGN. This selection method can be easily adapted to work with data from the soon-to-be commissioned Vera C. Rubin Telescope Legacy Survey of Space and Time (LSST). LSST will have a higher cadence and deeper magnitude limit than Pan-STARRS1, and should permit detection of many more runaway SMBH candidates.

Submitted 8 November, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

Comments: Accepted for publication by ApJ. 13 pages, 12 figures, 3 tables

Journal ref: ApJ 975 286 (2024)

arXiv:2401.16608

doi 10.33232/001c.129991

The evolution of galaxy morphology from redshift z=6 to 3: Mock JWST observations of galaxies in the ASTRID simulation

Authors: Patrick LaChance, Rupert Croft, Yueying Ni, Nianyi Chen, Tiziana Di Matteo, Simeon Bird

Abstract: We present mock JWST observations for more than 250,000 different galaxies from the Astrid simulation with $3 \leq z \leq 6$. The mock observations are made using the BPASS stellar SED model, and a simple dust model. They are then viewed through NIRCam filters, convolved with a PSF, have noise added, and are drizzled together to emulate the Cosmic Evolution Early Release Science (CEERS) survey. We analyse this dataset by computing a number of morphological measures and find our catalog to have comparable statistics to similar mock catalogs, and the first release of CEERS data. We find that most of the Sersic indices of galaxies in our redshift range are lower than observed, with most having n less than one. Additionally, we observe the sizes of galaxies of all masses to increase from redshift z=6 to redshift z=3, consistent with other results. The number of galaxies in our catalog allows us to examine how relationships like the mass-size relation evolve with redshift, and compare the accuracy of a variety of traditional galaxy classification techniques (Sersic fit, Asymmetry-Concentration, and Gini-$M_{20}$) within our redshift range. We find the mass-size relation to be nearly flat at redshift z=6 and to increase consistently as redshift decreases, and find the galaxy classification methods have minimal correlation with each other in our redshift range. We also investigate the impact that different stages of our imaging pipeline have on these morphological measures to determine how robust mock catalogs are to different choices at each step.
Finally, we test incorporating light from AGNs into our pipeline and find that while the population of galaxies that have significant AGN luminosity is low, those galaxies do tend to have higher Sersic indices once the AGN luminosity is added, rectifying some of the systematic bias towards lower Sersic indices present in our dataset.

Submitted 25 February, 2025; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: 17 pages, 14 figures

arXiv:2312.14263

doi 10.33232/001c.124451

z~2 dual AGN host galaxies are disky: stellar kinematics in the ASTRID Simulation

Authors: Ekaterine Dadiani, Tiziana Di Matteo, Nianyi Chen, Patrick Lachance, Yue Shen, Yu-Ching Chen, Rupert Croft, Yueying Ni, Simeon Bird

Abstract: We study dual AGN host galaxy morphologies at $z=2$ using the ASTRID simulation, selecting black hole (BH) pairs with small separation ($Δr<30\rm{kpc}$), high mass ($M_{\text{BH,12}}>10^7M_\odot$), and luminosity ($L_{\text{bol,12}}>10^{43}\rm{erg/s}$). We kinematically decompose (using MORDOR) $\sim1000$ dual AGN hosts into standard components - a `disk' (thin and thick disk, pseudo-bulge) and `bulge' (bulge and halo) and define disk-dominated galaxies by the disk-to-total $D/T\geq0.5$. In ASTRID, $60.9\pm2.1\%$ of dual AGN hosts (independent of separation) are disk-dominated, with the $D/T$ distribution peaking at $\sim0.7$. Notably, hosts of BH pairs have similar morphologies (most either both disk or bulge-dominated). In dual-AGN hosts, the $D/T$ increases from $\sim17\% $ at $M_{\rm *}\sim 10^{9} M_{\odot}$ to $ 64\% $ for $M_{\rm *} \sim 10^{11.5} M_{\odot}$, and the pseudo-bulge is the dominant component of the disk fraction at the high mass end. Moreover, dual AGN hosts exhibit a higher fraction of disk/large pseudo-bulge than single-AGN hosts. The Disk-to-Total ratio is approximately constant with BH mass or AGN luminosity. We also create mock images of dual AGN host galaxies, employing morphological fitting software Statmorph to calculate morphological parameters and compare them with our kinematic decomposition results. Around $83.3\pm2.4\%$ of galaxies display disk-like profiles, of which $\sim60.7\pm2.2\%$ are kinematically confirmed as disks.
Sérsic indices and half-mass radii of dual AGN host galaxies align with observational measurements from HST at $z\sim2$. Around $34\%$ are identified as mergers from the $\text{Gini}-M_{20}$ relation. We find two dual AGN hosted by galaxies that exhibit disk-like Sérsic index $n_{12}<1$ and $(D/T)_{12}>0.5$, which are in remarkable agreement with properties of recently discovered dual quasars in disk galaxies at $z\sim 2$.

Submitted 4 October, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

Comments: 17 pages, 14 figures, submitted to the Open Journal of Astrophysics

Journal ref: Open Journal of Astrophysics, Volume 7, 7th October 2024

arXiv:2307.01276

Fly-by galaxy encounters with multiple black holes produce star-forming linear wakes

Authors: Nianyi Chen, Patrick LaChance, Yueying Ni, Tiziana Di Matteo, Rupert Croft, Priyamvada Natarajan, Simeon Bird

Abstract: We look for simulated star-forming linear wakes such as the one recently discovered by van Dokkum et al. (2023) in the cosmological hydrodynamical simulation ASTRID. Amongst the runaway black holes in ASTRID, none are able to produce clear star-forming wakes. Meanwhile, fly-by encounters, typically involving a compact galaxy (with a central black hole) and a star-forming galaxy (with a duo of black holes) reproduce remarkably well many of the key properties (its length and linearity; recent star formation, etc.) of the observed star-forming linear feature. We predict the feature to persist for approximately 100 Myr in such a system and hence constitute a rare event. The feature contains a partly stripped galaxy (with $M_{\rm gal}=10^9 \sim 10^{10}M_\odot$) and a dual BH system ($M_{\rm BH}=10^5 \sim 10^7\,M_\odot$) in its brightest knot. X-ray emission from AGN in the knot should be detectable in such systems. After $100\sim 200\,{\rm Myrs}$ from the first fly-by, the galaxies merge leaving behind a triple black hole system in a (still) actively star-forming early-type remnant of mass $\sim 5\times 10^{10}\,M_\odot$. Follow-up JWST observations may be key for revealing the nature of these linear features by potentially detecting the older stellar populations constituting the bright knot. Confirmation of such detections may therefore help discriminate a fly-by encounter from a massive BH wake to reveal the origin of such features.

Submitted 3 July, 2023; originally announced July 2023.

Comments: 8 pages, 5 figures, comments welcome

arXiv:2305.12222

doi 10.1093/mnras/stad3940

AI-assisted super-resolution cosmological simulations III: Time evolution

Authors: Xiaowen Zhang, Patrick Lachance, Yueying Ni, Yin Li, Rupert A. C. Croft, Tiziana Di Matteo, Simeon Bird, Yu Feng

Abstract: In this work, we extend our recently developed super-resolution (SR) model for cosmological simulations to produce fully time consistent evolving representations of the particle phase-space distribution. We employ a style-based constrained generative adversarial network (Style-GAN) where the changing cosmic time is an input style parameter to the network. The matter power spectrum and halo mass function agree well with results from high-resolution N-body simulations over the full trained redshift range ($10 \le z \le 0$). Furthermore, we assess the temporal consistency of our SR model by constructing halo merger trees. We examine progenitors, descendants and mass growth along the tree branches. All statistical indicators demonstrate the ability of our SR model to generate satisfactory high-resolution simulations based on low-resolution inputs.

Submitted 20 May, 2023; originally announced May 2023.

Comments: 12 pages, 11 figures, code and movie available in https://github.com/sagasv5-xw/map2map on styled srsgan branch

arXiv:2210.12907

doi 10.1093/mnras/stad2341

Super-resolution simulation of the Fuzzy Dark Matter cosmological model

Authors: Meris Sipp, Patrick LaChance, Rupert Croft, Yueying Ni, Tiziana Di Matteo

Abstract: AI super-resolution, combining deep learning and N-body simulations, has been shown to successfully reproduce the large scale structure and halo abundances in the Lambda Cold Dark Matter cosmological model. Here, we extend its use to models with a different dark matter content, in this case Fuzzy Dark Matter (FDM), in the approximation that the difference is encoded in the initial power spectrum. We focus on redshift z = 2, with simulations that model smaller scales and lower masses, the latter by two orders of magnitude, than has been done in previous AI super-resolution work. We find that the super-resolution technique can reproduce the power spectrum and halo mass function to within a few percent of full high resolution calculations. We also find that halo artifacts, caused by spurious numerical fragmentation of filaments, are equally present in the super-resolution outputs. Although we have not trained the super-resolution algorithm using full quantum pressure FDM simulations, the fact that it performs well at the relevant length and mass scales means that it has promise as a technique which could avoid the very high computational cost of the latter, in some contexts. We conclude that AI super-resolution can become a useful tool to extend the range of dark matter models covered in mock catalogs.

Submitted 23 October, 2022; originally announced October 2022.

Comments: 7 pages, 4 figures

arXiv:2105.01016

doi 10.1093/mnras/stab2113

AI-assisted super-resolution cosmological simulations II: Halo substructures, velocities and higher order statistics

Authors: Yueying Ni, Yin Li, Patrick Lachance, Rupert A. C. Croft, Tiziana Di Matteo, Simeon Bird, Yu Feng

Abstract: In this work, we expand and test the capabilities of our recently developed super-resolution (SR) model to generate high-resolution (HR) realizations of the full phase-space matter distribution, including both displacement and velocity, from computationally cheap low-resolution (LR) cosmological N-body simulations. The SR model enhances the simulation resolution by generating 512 times more tracer particles, extending into the deeply non-linear regime where complex structure formation processes take place. We validate the SR model by deploying the model in 10 test simulations of box size 100 Mpc/h, and examine the matter power spectra, bispectra and 2D power spectra in redshift space. We find the generated SR field matches the true HR result at percent level down to scales of k ~ 10 h/Mpc. We also identify and inspect dark matter halos and their substructures. Our SR model generates visually authentic small-scale structures that cannot be resolved by the LR input and are in good statistical agreement with the real HR results. The SR model performs satisfactorily on the halo occupation distribution, halo correlations in both real and redshift space, and the pairwise velocity distribution, matching the HR results with comparable scatter, thus demonstrating its potential in making mock halo catalogs. The SR technique can be a powerful and promising tool for modelling small-scale galaxy formation physics in large cosmological volumes.

Submitted 17 September, 2021; v1 submitted 3 May, 2021; originally announced May 2021.

Comments: 13 pages, 11 figures, published version

Showing 1–11 of 11 results for author: LaChance, P