Cosmological feedback from a halo assembly perspective

Authors: Luisa Lucie-Smith, Hiranya V. Peiris, Andrew Pontzen, Anik Halder, Joop Schaye, Matthieu Schaller, John Helly, Robert J. McGibbon, Willem Elbers

Abstract: The impact of feedback from galaxy formation on cosmological probes is typically quantified in terms of the suppression of the matter power spectrum in hydrodynamical compared to gravity-only simulations. In this paper, we instead study how baryonic feedback impacts halo assembly histories and thereby imprints on cosmological observables. We investigate the sensitivity of the thermal Sunyaev-Zel'dovich effect (tSZ) power spectrum, X-ray number counts, weak lensing and kinetic Sunyaev-Zel'dovich (kSZ) stacked profiles to halo populations as a function of mass and redshift. We then study the imprint of different feedback implementations in the FLAMINGO suite of cosmological simulations on the assembly histories of these halo populations, as a function of radial scale. We find that kSZ profiles target lower-mass halos ($M_{\rm 200m}\sim 10^{13.1}\,\mathrm{M}_\odot$) compared to all other probes considered ($M_{200\mathrm{m}}\sim 10^{15}\,\mathrm{M}_\odot$). Feedback is inefficient in high-mass clusters with $\sim 10^{15} \, \mathrm{M}_\odot$ at $z=0$, but was more efficient at earlier times in the same population, with a $\sim 5$-$10\%$ effect on mass at $2<z<4$ (depending on radial scale). Conversely, for lower-mass halos with $\sim10^{13}\,\mathrm{M}_\odot$ at $z=0$, feedback exhibits a $\sim5$-$20\%$ effect on mass at $z=0$ but had little impact at earlier times ($z>2$). These findings are tied together by noting that, regardless of redshift, feedback most efficiently redistributes baryons when halos reach a mass of $M_{\rm 200m} \simeq {10^{12.8}}\,\mathrm{M}_{\odot}$ and ceases to have any significant effect by the time $M_{\rm 200m} \simeq {10^{15}}\,\mathrm{M}_{\odot}$. We put forward strategies for minimizing sensitivity of lensing analyses to baryonic feedback, and for exploring baryonic resolutions to the unexpectedly low tSZ power in cosmic microwave background observations.

Submitted 23 May, 2025; originally announced May 2025.

Comments: 20 pages, 9 figures. Comments welcome

arXiv:2502.17135 [pdf, other]

Unbiased estimates of the shapes of haloes using the positions of satellite galaxies

Authors: A. Herle, N. E. Chisari, H. Hoekstra, R. J. McGibbon, J. Schaye, M. Schaller, R. Kugel

Abstract: The shapes of dark matter haloes are sensitive to both cosmology and baryon physics, but are difficult to measure observationally. A promising way to constrain them is to use the positions of satellite galaxies as tracers of the underlying dark matter, but there are typically too few galaxies per halo for reliable shape estimates, resulting in biased shapes. We present a method to model sampling noise to correct for the shape bias. We compare our predicted median shape bias with that obtained from the FLAMINGO suite of simulations and find reasonable agreement. We check that our results are robust to resolution effects and baryonic feedback. We also explore the validity of our bias correction at various redshifts and we discuss how our method might be applied to observations in the future. We show that median projected halo axis ratios are on average biased low by 0.31 when they are traced by only 5 satellites. Using the satellite galaxies, the projected host halo axis ratio can be corrected with a residual bias of ~ 0.1, by accounting for sampling bias. Hence, about two-thirds of the projected axis ratio bias can be explained by sampling noise. This enables the statistical measurement of halo shapes at lower masses than previously possible. Our method will also allow improved estimates of halo shapes in cosmological simulations using fewer particles than currently required.

Submitted 24 February, 2025; originally announced February 2025.

arXiv:2502.06932 [pdf, other]

Assessing subhalo finders in cosmological hydrodynamical simulations

Authors: Victor J. Forouhar Moreno, John Helly, Rob McGibbon, Joop Schaye, Matthieu Schaller, Jiaxin Han, Roi Kugel

Abstract: Cosmological simulations are essential for inferring cosmological and galaxy population properties based on forward-modelling, but this typically requires finding the population of (sub)haloes and galaxies that they contain. The properties of said populations vary depending on the algorithm used to find them, which is concerning as it may bias key statistics. We compare how the predicted (sub)halo mass functions, satellite radial distributions and correlation functions vary across algorithms in the dark-matter-only and hydrodynamical versions of the FLAMINGO simulations. We test three representative approaches to finding subhaloes: grouping particles in configuration- (Subfind), phase- (ROCKSTAR and VELOCIraptor) and history-space (HBT-HERONS). We also present HBT-HERONS, a new version of the HBT+ subhalo finder that improves the tracking of subhaloes. We find 10%-level differences in the $M_{\mathrm{200c}}$ mass function, reflecting different field halo definitions and occasional miscentering. The bound mass functions can differ by 75% at the high mass end, even when using the maximum circular velocity as a mass proxy. The number of well-resolved subhaloes differs by up to 20% near $R_{\mathrm{200c}}$, reflecting differences in the assignment of mass to subhaloes and their identification. The predictions of different subhalo finders increasingly diverge towards the centres of the host haloes. The performance of most subhalo finders does not improve with the resolution of the simulation and is worse for hydrodynamical than for dark-matter-only simulations. We conclude that HBT-HERONS is the preferred choice of subhalo finder due to its low computational cost, self-consistently made and robust merger trees, and robust subhalo identification capabilities.

Submitted 10 February, 2025; originally announced February 2025.

Comments: Submitted to MNRAS. 32 pages total: 23 pages of main text and 9 of appendices

arXiv:2501.07677 [pdf, other]

doi 10.1093/mnras/staf519

On the accuracy of dark matter halo merger trees and the consequences for semi-analytic models of galaxy formation

Authors: Ángel Chandro-Gómez, Claudia del P. Lagos, Chris Power, Victor J. Forouhar Moreno, John C. Helly, Cedric G. Lacey, Robert J. McGibbon, Matthieu Schaller, Joop Schaye

Abstract: Galaxy formation and evolution models, such as semi-analytic models, are powerful theoretical tools for predicting how galaxies evolve across cosmic time. These models follow the evolution of galaxies based on the halo assembly histories inferred from large $N$-body cosmological simulations. This process requires codes to identify halos ("halo finder") and to track their time evolution ("tree builder"). While these codes generally perform well, they encounter numerical issues when handling dense environments. In this paper, we present how relevant these issues are in state-of-the-art cosmological simulations. We characterize two major numerical artefacts in halo assembly histories: (i) the non-physical swapping of large amounts of mass between subhalos, and (ii) the sudden formation of already massive subhalos at late cosmic times. We quantify these artefacts for different combinations of halo finder (SUBFIND, VELOCIRAPTOR, HBT-HERONS) and tree builder codes (D-TRESS+DHALO, TREEFROG, HBT-HERONS), finding that in general more than $50\%$ ($80\%$) of the more massive subhalos with $>10^{3}$ ($>10^{4}$) particles at $z=0$ inherit them in most cases. However, HBT-HERONS, which explicitly incorporates temporal information, effectively reduces the occurrence of these artefacts to $5\%$ ($10\%$). We then use the semi-analytic models SHARK and GALFORM to explore how these artefacts impact galaxy formation predictions. We demonstrate that the issues above lead to non-physical predictions in galaxies hosted by affected halos, particularly in SHARK where the modelling of baryons relies on subhalo information. Finally, we propose and implement fixes for the numerical artefacts at the semi-analytic model level, and use SHARK to show the improvements, especially at the high-mass end, after applying them.

Submitted 15 April, 2025; v1 submitted 13 January, 2025; originally announced January 2025.

Comments: Accepted for publication in MNRAS (a new section 3.3.3 discussing results in the context of other galaxy formation models and other minor changes). 32 pages (26 of main body and 6 of appendices)

Journal ref: Monthly Notices of the Royal Astronomical Society, Volume 539, Issue 2, May 2025, Pages 776-807

arXiv:2412.02736 [pdf, other]

doi 10.1093/mnras/staf357

The FLAMINGO project: cosmology with the redshift dependence of weak gravitational lensing peaks

Authors: Jeger C. Broxterman, Matthieu Schaller, Henk Hoekstra, Joop Schaye, Robert J. McGibbon, Victor J. Forouhar Moreno, Roi Kugel, Willem Elbers

Abstract: Weak gravitational lensing (WL) convergence peaks contain valuable cosmological information in the regime of non-linear collapse. Using the FLAMINGO suite of cosmological hydrodynamical simulations, we study the physical origin and redshift distributions of the objects generating WL peaks selected from a WL convergence map mimicking a $\textit{Euclid}$ signal. We match peaks to individual haloes and show that the high signal-to-noise ratio (SNR$~>~5$) WL peaks measured by Stage IV WL surveys primarily trace $M_{\mathrm{200c}} > 10^{14}~\mathrm{M_\odot}$ haloes. We find that the WL peak sample can compete with the purity and completeness of state-of-the-art X-ray and Sunyaev-Zel'dovich cluster abundance inferences. By comparing the distributions predicted by simulation variations that have been calibrated to the observed gas fractions of local clusters and the present-day galaxy stellar mass function, or shifted versions of these, we illustrate that the shape of the redshift distribution of SNR$~>~5$ peaks is insensitive to baryonic physics while it does change with cosmology. The difference highlights the potential of using WL peaks to constrain cosmology. As the WL convergence and redshift number densities of WL peaks scale differently with cosmology and baryonic feedback, WL peak statistics can simultaneously calibrate baryonic feedback and constrain cosmology.

Submitted 25 February, 2025; v1 submitted 3 December, 2024; originally announced December 2024.

Comments: 20 pages, 12 figures (including the appendices). Accepted for publication in MNRAS

Journal ref: MNRAS 538, 755-774 (2025)

arXiv:2410.19905 [pdf, other]

FLAMINGO: combining kinetic SZ effect and galaxy-galaxy lensing measurements to gauge the impact of feedback on large-scale structure

Authors: Ian G. McCarthy, Alexandra Amon, Joop Schaye, Emmanuel Schaan, Raul E. Angulo, Jaime Salcido, Matthieu Schaller, Leah Bigwood, Willem Elbers, Roi Kugel, John C. Helly, Victor J. Forouhar Moreno, Carlos S. Frenk, Robert J. McGibbon, Lurdes Ondaro-Mallea, Marcel P. van Daalen

Abstract: Energetic feedback processes associated with accreting supermassive black holes can expel gas from massive haloes and significantly alter various measures of clustering on ~Mpc scales, potentially biasing the values of cosmological parameters inferred from analyses of large-scale structure (LSS) if not modelled accurately. Here we use the state-of-the-art FLAMINGO suite of cosmological hydrodynamical simulations to gauge the impact of feedback on large-scale structure by comparing to Planck + ACT stacking measurements of the kinetic Sunyaev-Zel'dovich (kSZ) effect of SDSS BOSS galaxies. We make careful like-with-like comparisons to the observations, aided by high precision KiDS and DES galaxy-galaxy lensing measurements of the BOSS galaxies to inform the selection of the simulated galaxies. In qualitative agreement with several recent studies using dark matter only simulations corrected for baryonic effects, we find that the kSZ effect measurements prefer stronger feedback than predicted by simulations which have been calibrated to reproduce the gas fractions of low redshift X-ray-selected groups and clusters. We find that the increased feedback can help to reduce the so-called S8 tension between the observed and CMB-predicted clustering on small scales as probed by cosmic shear (although at the expense of agreement with the X-ray group measurements). However, the increased feedback is only marginally effective at reducing the reported offsets between the predicted and observed clustering as probed by the thermal SZ (tSZ) effect power spectrum and tSZ effect--weak lensing cross-spectrum, both of which are sensitive to higher halo masses than cosmic shear.

Submitted 1 May, 2025; v1 submitted 25 October, 2024; originally announced October 2024.

Comments: 21 pages, 11 figures, MNRAS, accepted

arXiv:2408.17217 [pdf, other]

The FLAMINGO Project: An assessment of the systematic errors in the predictions of models for galaxy cluster counts used to infer cosmological parameters

Authors: Roi Kugel, Joop Schaye, Matthieu Schaller, Victor J. Forouhar Moreno, Robert J. McGibbon

Abstract: Galaxy cluster counts have historically been important for the measurement of cosmological parameters and upcoming surveys will greatly reduce the statistical errors. To exploit the potential of current and future cluster surveys, theoretical uncertainties on the predicted abundance must be smaller than the statistical errors. Models used to predict cluster counts typically combine a model for the dark matter only (DMO) halo mass function (HMF) with an observable - mass relation that is assumed to be a power-law with lognormal scatter. We use the FLAMINGO suite of cosmological hydrodynamical simulations to quantify the biases in the cluster counts and cosmological parameters resulting from the different ingredients of conventional models. For the observable mass proxy we focus on the Compton-Y parameter quantifying the thermal Sunyaev-Zel'dovich effect, which is expected to result in cluster samples that are relatively close to mass-selected samples. We construct three mock samples based on existing (Planck and SPT) and upcoming (Simons Observatory) surveys. We ignore measurement uncertainties and compare the biases in the counts and inferred cosmological parameters to each survey's Poisson errors. We find that widely used models for the DMO HMF differ significantly from each other and from the DMO version of FLAMINGO, leading to significant biases for all three surveys. For upcoming surveys, dramatic improvements are needed for all additional model ingredients, i.e. the functional forms of the fits to the observable-mass scaling relation and the associated scatter, the priors on the scaling relation and the prior on baryonic effects associated with feedback processes on the HMF.

Submitted 17 January, 2025; v1 submitted 30 August, 2024; originally announced August 2024.

Comments: 19 pages, 8 figures. (Updates w.r.t. version 1) Accepted for publication in MNRAS

arXiv:2406.03180 [pdf, other]

The FLAMINGO Project: A comparison of galaxy cluster samples selected on mass, X-ray luminosity, Compton-Y parameter, or galaxy richness

Authors: Roi Kugel, Joop Schaye, Matthieu Schaller, Ian G. McCarthy, Joey Braspenning, John C. Helly, Victor J. Forouhar Moreno, Robert J. McGibbon

Abstract: Galaxy clusters provide an avenue to expand our knowledge of cosmology and galaxy evolution. Because it is difficult to accurately measure the total mass of a large number of individual clusters, cluster samples are typically selected using an observable proxy for mass. Selection effects are therefore a key problem in understanding galaxy cluster statistics. We make use of the $(2.8~\rm{Gpc})^3$ FLAMINGO hydrodynamical simulation to investigate how selection based on X-ray luminosity, thermal Sunyaev-Zeldovich effect or galaxy richness influences the halo mass distribution. We define our selection cuts based on the median value of the observable at a fixed mass and compare the resulting samples to a mass-selected sample. We find that all samples are skewed towards lower mass haloes. For X-ray luminosity and richness cuts below a critical value, scatter dominates over the trend with mass and the median mass becomes biased increasingly low with respect to a mass-selected sample. At $z\leq0.5$, observable cuts corresponding to median halo masses between $M_\text{500c}=10^{14}$ and $10^{15}~\rm{M_{\odot}}$ give nearly unbiased median masses for all selection methods, but X-ray selection results in biased medians for higher masses. For cuts corresponding to median masses $<10^{14}$ at $z\leq0.5$ and for all masses at $z\geq1$, only Compton-Y selection yields nearly unbiased median masses. Importantly, even when the median mass is unbiased, the scatter is not because for each selection the sample is skewed towards lower masses than a mass-selected sample. Each selection leads to a different bias in secondary quantities like cool-core fraction, temperature and gas fraction.

Submitted 24 September, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

Comments: 19 pages, 12 figures (Including the appendix). Accepted for publication in MNRAS. Minor changes w.r.t. version 1

arXiv:2306.07728 [pdf, other]

doi 10.1093/mnras/stad1811

Multi-Epoch Machine Learning 2: Identifying physical drivers of galaxy properties in simulations

Authors: Robert McGibbon, Sadegh Khochfar

Abstract: Using a novel machine learning method, we investigate the buildup of galaxy properties in different simulations, and in various environments within a single simulation. The aim of this work is to show the power of this approach at identifying the physical drivers of galaxy properties within simulations. We compare how the stellar mass is dependent on the value of other galaxy and halo properties at different points in time by examining the feature importance values of a machine learning model. By training the model on IllustrisTNG we show that stars are produced at earlier times in higher density regions of the universe than they are in low density regions. We also apply the technique to the Illustris, EAGLE, and CAMELS simulations. We find that stellar mass is built up in a similar way in EAGLE and IllustrisTNG, but significantly differently in the original Illustris, suggesting that subgrid model physics is more important than the choice of hydrodynamics method. These differences are driven by the efficiency of supernova feedback. Applying principal component analysis to the CAMELS simulations allows us to identify a component associated with the importance of a halo's gravitational potential and another component representing the time at which galaxies form. We discover that the speed of galactic winds is a more critical subgrid parameter than the total energy per unit star formation. Finally we find that the Simba black hole feedback model has a larger effect on galaxy formation than the IllustrisTNG black hole feedback model.

Submitted 13 June, 2023; originally announced June 2023.

Comments: 16 pages, 12 figures, accepted to MNRAS

arXiv:2112.08424 [pdf, other]

doi 10.1093/mnras/stac1269

Multi-Epoch Machine Learning 1: Unravelling Nature vs Nurture for Galaxy Formation

Authors: Robert McGibbon, Sadegh Khochfar

Abstract: We present a novel machine learning method for predicting the baryonic properties of dark matter only subhalos from N-body simulations. Our model is built using the extremely randomized tree (ERT) algorithm and takes subhalo properties over a wide range of redshifts as its input features. We train our model using the IllustrisTNG simulations to predict blackhole mass, gas mass, magnitudes, star formation rate, stellar mass, and metallicity. We compare the results of our method with a baseline model from previous works, and against a model that only considers the mass history of the subhalo. We find that our new model significantly outperforms both of the other models. We then investigate the predictive power of each input by looking at feature importance scores from the ERT algorithm. We produce feature importance plots for each baryonic property, and find that they differ significantly. We identify low redshifts as being most important for predicting star formation rate and gas mass, with high redshifts being most important for predicting stellar mass and metallicity, and consider what this implies for nature vs nurture. We find that the physical properties of galaxies investigated in this study are all driven by nurture and not nature. The only property showing a somewhat stronger impact of nature is the present-day star formation rate of galaxies. Finally we verify that the feature importance plots are discovering physical patterns, and that the trends shown are not an artefact of the ERT algorithm.

Submitted 4 May, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

Comments: Accepted to MNRAS, 15 pages, 8 figures, 2 tables, Main Figure is Fig 5

arXiv:2103.13932 [pdf, other]

QUOTAS: A new research platform for the data-driven investigation of black holes

Authors: Priyamvada Natarajan, Kwok Sun Tang, Robert McGibbon, Sadegh Khochfar, Brian Nord, Steinn Sigurdsson, Joe Tricot, Nico Cappelluti, Daniel George, Jack Hidary

Abstract: We present QUOTAS, a novel research platform for the data-driven investigation of super-massive black hole (SMBH) populations. While SMBH data sets -- observations and simulations -- have grown rapidly in complexity and abundance, our computational environments and analysis tools have not matured commensurately to exhaust opportunities for discovery. Motivated to explore BH host galaxy and the parent dark matter halo connection, in this pilot version of QUOTAS, we assemble and co-locate the high-redshift, luminous quasar population at $z \geq 3$ alongside simulated data of the same epochs. Leveraging machine learning algorithms (ML) we expand simulation volumes that successfully replicate halo populations beyond the training set. Training ML on the Illustris-TNG300 simulation that includes baryonic physics, we populate the larger LEGACY Expanse dark matter-only box with quasars. Our first science results comparing observational and ML simulated quasars at $z \sim 3$ reveal that while the recovered Black Hole Mass Functions and clustering are in good agreement, simulated SMBHs fail to accrete, shine and grow at high enough rates to match observed quasars. We conclude that sub-grid models of mass accretion and SMBH feedback implemented in Illustris-TNG300 do not reproduce their observed mass growth. QUOTAS demonstrates the power of ML, both for analyzing large complex datasets, and offering a unique opportunity to interrogate our theoretical model assumptions. We deploy ML again to derive and devise an optimal survey strategy for bringing the undetected lower luminosity quasar population into view. QUOTAS and all related materials are publicly available at the Google Kaggle platform.

Submitted 14 April, 2023; v1 submitted 25 March, 2021; originally announced March 2021.

Comments: Revised version: 38 pages, 4 tables and 14 figures, accepted for publication in ApJ

arXiv:2008.06431 [pdf, other]

Efficient hyperparameter optimization by way of PAC-Bayes bound minimization

Authors: John J. Cherian, Andrew G. Taube, Robert T. McGibbon, Panagiotis Angelikopoulos, Guy Blanc, Michael Snarski, Daniel D. Richman, John L. Klepeis, David E. Shaw

Abstract: Identifying optimal values for a high-dimensional set of hyperparameters is a problem that has received growing attention given its importance to large-scale machine learning applications such as neural architecture search. Recently developed optimization methods can be used to select thousands or even millions of hyperparameters. Such methods often yield overfit models, however, leading to poor performance on unseen data. We argue that this overfitting results from using the standard hyperparameter optimization objective function. Here we present an alternative objective that is equivalent to a Probably Approximately Correct-Bayes (PAC-Bayes) bound on the expected out-of-sample error. We then devise an efficient gradient-based algorithm to minimize this objective; the proposed method has asymptotic space and time complexity equal to or better than other gradient-based hyperparameter optimization methods. We show that this new method significantly reduces out-of-sample error when applied to hyperparameter optimization problems known to be prone to overfitting.

Submitted 14 August, 2020; originally announced August 2020.

arXiv:1605.02688 [pdf, other]

Theano: A Python framework for fast computation of mathematical expressions

Authors: The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, et al. (88 additional authors not shown)

Abstract: Theano is a Python library that allows one to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.

Submitted 9 May, 2016; originally announced May 2016.

Comments: 19 pages, 5 figures

arXiv:1602.08776 [pdf, other]

doi 10.1063/1.4974306

Identification of simple reaction coordinates from complex dynamics

Authors: Robert T. McGibbon, Brooke E. Husic, Vijay S. Pande

Abstract: Reaction coordinates are widely used throughout chemical physics to model and understand complex chemical transformations. We introduce a definition of the natural reaction coordinate, suitable for condensed phase and biomolecular systems, as a maximally predictive one-dimensional projection. We then show this criterion is uniquely satisfied by a dominant eigenfunction of an integral operator associated with the ensemble dynamics. We present a new sparse estimator for these eigenfunctions which can search through a large candidate pool of structural order parameters and build simple, interpretable approximations that employ only a small number of these order parameters. Example applications with a small molecule's rotational dynamics and simulations of protein conformational change and folding show that this approach can filter through statistical noise to identify simple reaction coordinates from complex dynamics.

Submitted 6 January, 2017; v1 submitted 28 February, 2016; originally announced February 2016.

Comments: 18 pages, 10 figures

arXiv:1504.01804 [pdf, other]

Efficient maximum likelihood parameterization of continuous-time Markov processes

Authors: Robert T. McGibbon, Vijay S. Pande

Abstract: Continuous-time Markov processes over finite state-spaces are widely used to model dynamical processes in many fields of natural and social science. Here, we introduce a maximum likelihood estimator for constructing such models from data observed at a finite time interval. This estimator is dramatically more efficient than prior approaches, enables the calculation of deterministic confidence intervals in all model parameters, and can easily enforce important physical constraints on the models such as detailed balance. We demonstrate and discuss the advantages of these models over existing discrete-time Markov models for the analysis of molecular dynamics simulations.

Submitted 30 June, 2015; v1 submitted 7 April, 2015; originally announced April 2015.

arXiv:1408.5446 [pdf, ps, other]

doi 10.1063/1.4895044

Perspective: Markov Models for Long-Timescale Biomolecular Dynamics

Authors: Christian R. Schwantes, Robert T. McGibbon, Vijay S. Pande

Abstract: Molecular dynamics simulations have the potential to provide atomic-level detail and insight to important questions in chemical physics that cannot be observed in typical experiments. However, simply generating a long trajectory is insufficient, as researchers must be able to transform the data in a simulation trajectory into specific scientific insights. Although this analysis step has often been taken for granted, it deserves further attention as large-scale simulations become increasingly routine. In this perspective, we discuss the application of Markov models to the analysis of large-scale biomolecular simulations. We draw attention to recent improvements in the construction of these models as well as several important open issues. In addition, we highlight recent theoretical advances that pave the way for a new generation of models of molecular kinetics.

Submitted 22 August, 2014; originally announced August 2014.

Comments: 7 pages

arXiv:1407.8083 [pdf, other]

doi 10.1063/1.4916292

Variational cross-validation of slow dynamical modes in molecular kinetics

Authors: Robert T. McGibbon, Vijay S. Pande

Abstract: Markov state models (MSMs) are a widely used method for approximating the eigenspectrum of the molecular dynamics propagator, yielding insight into the long-timescale statistical kinetics and slow dynamical modes of biomolecular systems. However, the lack of a unified theoretical framework for choosing between alternative models has hampered progress, especially for non-experts applying these methods to novel biological systems. Here, we consider cross-validation with a new objective function for estimators of these slow dynamical modes, a generalized matrix Rayleigh quotient (GMRQ), which measures the ability of a rank-$m$ projection operator to capture the slow subspace of the system. It is shown that a variational theorem bounds the GMRQ from above by the sum of the first $m$ eigenvalues of the system's propagator, but that this bound can be violated when the requisite matrix elements are estimated subject to statistical uncertainty. This overfitting can be detected and avoided through cross-validation. These results make it possible to construct Markov state models for protein dynamics in a way that appropriately captures the tradeoff between systematic and statistical errors.

Submitted 27 March, 2015; v1 submitted 30 July, 2014; originally announced July 2014.

Journal ref: J. Chem. Phys. 142, 124105 (2015)

arXiv:1405.1444 [pdf, other]

Understanding Protein Dynamics with L1-Regularized Reversible Hidden Markov Models

Authors: Robert T. McGibbon, Bharath Ramsundar, Mohammad M. Sultan, Gert Kiss, Vijay S. Pande

Abstract: We present a machine learning framework for modeling protein dynamics. Our approach uses L1-regularized, reversible hidden Markov models to understand large protein datasets generated via molecular dynamics simulations. Our model is motivated by three design principles: (1) the requirement of massive scalability; (2) the need to adhere to relevant physical law; and (3) the necessity of providing accessible interpretations, critical for both cellular biology and rational drug design. We present an EM algorithm for learning and introduce a model selection criterion based on the physical notion of convergence in relaxation timescales. We contrast our model with standard methods in biophysics and demonstrate improved robustness. We implement our algorithm on GPUs and apply the method to two large protein simulation datasets generated respectively on the NCSA Bluewaters supercomputer and the Folding@Home distributed computing network. Our analysis identifies the conformational dynamics of the ubiquitin protein critical to cellular signaling, and elucidates the stepwise activation mechanism of the c-Src kinase protein.

Submitted 6 May, 2014; originally announced May 2014.

Journal ref: Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 2014

Showing 1–18 of 18 results for author: McGibbon, R