-
Theoretical Physics Benchmark (TPBench) -- a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics
Authors:
Daniel J. H. Chung,
Zhiqi Gao,
Yurii Kvasiuk,
Tianyi Li,
Moritz Münchmeyer,
Maja Rudolph,
Frederic Sala,
Sai Chaitanya Tadepalli
Abstract:
We introduce a benchmark to evaluate the capability of AI to solve problems in theoretical physics, focusing on high-energy theory and cosmology. The first iteration of our benchmark consists of 57 problems of varying difficulty, from undergraduate to research level. These problems are novel in the sense that they do not come from public problem collections. We evaluate our data set on various open and closed language models, including o3-mini, o1, DeepSeek-R1, GPT-4o and versions of Llama and Qwen. While we find impressive progress in model performance with the most recent models, our research-level difficulty problems are mostly unsolved. We address challenges of auto-verifiability and grading, and discuss common failure modes. While current state-of-the-art models are still of limited use for researchers, our results show that AI-assisted theoretical physics research may become possible in the near future. We discuss the main obstacles towards this goal and possible strategies to overcome them. The public problems and solutions, results for various models, and updates to the data set and score distribution are available on the dataset website, tpbench.org.
Submitted 19 February, 2025;
originally announced February 2025.
-
Reconstruction of Continuous Cosmological Fields from Discrete Tracers with Graph Neural Networks
Authors:
Yurii Kvasiuk,
Jordan Krywonos,
Matthew C. Johnson,
Moritz Münchmeyer
Abstract:
We develop a hybrid GNN-CNN architecture for the reconstruction of 3-dimensional continuous cosmological matter fields from discrete point clouds, provided by observed galaxy catalogs. Using the CAMELS hydrodynamical cosmological simulations we demonstrate that the proposed architecture allows for an accurate reconstruction of both the dark matter and electron density given observed galaxies and their features. Our approach includes a learned grid assignment scheme that improves over the traditional cloud-in-cell method. Our method can improve cosmological analyses in situations where non-luminous (and thus unobservable) continuous fields need to be estimated from luminous (observable) discrete point cloud tracers.
Submitted 4 November, 2024;
originally announced November 2024.
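For readers unfamiliar with the traditional cloud-in-cell (CIC) method that the abstract above uses as its baseline, the following is a minimal sketch of CIC grid assignment. It is not taken from the paper: the function name `cic_assign`, the periodic box, and the pure-Python nested-list grid are all assumptions made for illustration. Each point shares its weight among the eight nearest cell centers, using trilinear weights.

```python
import math
from itertools import product


def cic_assign(positions, n_grid, box_size, weights=None):
    """Deposit points onto a periodic n_grid^3 mesh with cloud-in-cell weights.

    Hypothetical helper for illustration; not the paper's learned scheme.
    """
    cell = box_size / n_grid
    grid = [[[0.0] * n_grid for _ in range(n_grid)] for _ in range(n_grid)]
    if weights is None:
        weights = [1.0] * len(positions)

    for point, w in zip(positions, weights):
        # Coordinates in cell units, measured relative to cell centers.
        u = [c / cell - 0.5 for c in point]
        base = [math.floor(c) for c in u]
        frac = [c - b for c, b in zip(u, base)]

        # Share the weight among the 8 surrounding cell centers.
        for offsets in product((0, 1), repeat=3):
            share = w
            index = []
            for axis, d in enumerate(offsets):
                share *= frac[axis] if d else 1.0 - frac[axis]
                index.append((base[axis] + d) % n_grid)  # periodic wrap
            i, j, k = index
            grid[i][j][k] += share
    return grid


# A point halfway between two cell centers along x splits its weight evenly.
grid = cic_assign([(1.0, 0.5, 0.5)], n_grid=4, box_size=4.0)
```

Because the trilinear weights sum to one for every point, the total deposited weight equals the total input weight. A learned assignment scheme of the kind the abstract describes would replace these fixed weights with ones produced by a trained model.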
-
A Tale of Two Fields: Neural Network-Enhanced non-Gaussianity Search with Halos
Authors:
Yurii Kvasiuk,
Moritz Münchmeyer,
Kendrick Smith
Abstract:
It was recently shown that neural networks can be combined with the analytic method of scale-dependent bias to obtain a measurement of local primordial non-Gaussianity, which is optimal in the squeezed limit that dominates the signal-to-noise. The method is robust to non-linear physics, but also inherits the statistical precision offered by neural networks applied to very non-linear scales. In prior work, we assumed that the neural network has access to the full matter distribution. In this work, we apply our method to halos. We first describe a novel two-field formalism that is optimal even when the matter distribution is not observed. We show that any N halo fields can be compressed to two fields without losing information, and obtain optimal loss functions to learn these fields. We then apply the method to high-resolution AbacusSummit and AbacusPNG simulations. In the present work, the two neural networks observe the local population statistics, in particular the halo mass and concentration distribution in a patch of the sky. While the traditional mass-binned halo analysis is optimal in practice without further halo properties on AbacusPNG, our novel formalism makes it easy to include additional halo properties such as the halo concentration, which can improve $f_{NL}$ constraints by a factor of a few. We also explore whether shot noise can be lowered with machine learning compared to a traditional reconstruction, finding no improvement for our simulation parameters.
Submitted 1 October, 2024;
originally announced October 2024.
-
An Auto-Differentiable Likelihood Pipeline for the Cross-Correlation of CMB and Large-Scale Structure due to the Kinetic Sunyaev-Zeldovich Effect
Authors:
Yurii Kvasiuk,
Moritz Münchmeyer
Abstract:
We develop an optimization-based maximum likelihood approach to analyze the cross-correlation of the Cosmic Microwave Background (CMB) and large-scale structure induced by the kinetic Sunyaev-Zeldovich (kSZ) effect. Our main goal is to reconstruct the radial velocity field of the universe. While the existing quadratic estimator (QE) is statistically optimal for current and near-term experiments, the likelihood can extract more signal-to-noise in the future. Our likelihood formulation has further advantages over the QE, such as the possibility of jointly fitting cosmological and astrophysical parameters and the possibility of unifying several different kSZ analyses. We implement an auto-differentiable likelihood pipeline in JAX, which is computationally tractable for a realistic survey size and resolution, and evaluate it on the Agora simulation. We also implement a machine learning-based estimate of the electron density given an observed galaxy distribution, which can increase the signal-to-noise for both the QE and the likelihood method.
Submitted 29 February, 2024; v1 submitted 15 May, 2023;
originally announced May 2023.