-
A Scalable Gaussian Process Approach to Shear Mapping with MuyGPs
Authors:
Gregory Sallaberry,
Benjamin W. Priest,
Robert Armstrong,
Michael D. Schneider,
Amanda Muyskens,
Trevor Steil,
Keita Iwabuchi
Abstract:
Analysis of cosmic shear is an integral part of understanding structure growth across cosmic time, which in turn provides us with information about the nature of dark energy. Conventional methods generate \emph{shear maps} from which we can infer the matter distribution in the universe. Current methods (e.g., Kaiser-Squires inversion) for generating these maps, however, are tricky to implement and can introduce bias. Recent alternatives construct a spatial process prior for the lensing potential, which allows for inference of the convergence and shear parameters given lensing shear measurements. Realizing these spatial processes, however, scales cubically in the number of observations, an unacceptable expense as near-term surveys expect billions of correlated measurements. Therefore, we present a linearly scaling shear map construction alternative using a scalable Gaussian Process (GP) prior called MuyGPs. MuyGPs avoids cubic scaling by conditioning interpolation on only nearest neighbors and fits hyperparameters using batched leave-one-out cross-validation. We use a suite of ray-tracing results from N-body simulations to demonstrate that our method can accurately interpolate shear maps, as well as recover the two-point and higher-order correlations. We also show that we can perform these operations at the scale of billions of galaxies on high-performance computing platforms.
Submitted 30 September, 2024;
originally announced October 2024.
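The nearest-neighbor conditioning and leave-one-out cross-validation described in the abstract above can be illustrated with a minimal NumPy sketch. This is not the MuyGPs implementation or its API: the function names (`rbf_kernel`, `local_gp_predict`, `loo_mse`), the squared-exponential kernel, the brute-force neighbor search, and all default parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)


def local_gp_predict(train_x, train_y, test_x, k=10, length_scale=1.0, noise=1e-6):
    """GP posterior mean at each test point, conditioned only on its k nearest
    training points, so each solve costs O(k^3) rather than O(n^3)."""
    means = np.empty(len(test_x))
    for i, x in enumerate(test_x):
        # Brute-force neighbor search (assumption: a real system would use a
        # spatial index or approximate search to keep this step cheap).
        dists = np.linalg.norm(train_x - x, axis=1)
        idx = np.argsort(dists)[:k]
        nn_x, nn_y = train_x[idx], train_y[idx]
        k_nn = rbf_kernel(nn_x, nn_x, length_scale) + noise * np.eye(len(idx))
        k_cross = rbf_kernel(x[None, :], nn_x, length_scale)
        means[i] = (k_cross @ np.linalg.solve(k_nn, nn_y))[0]
    return means


def loo_mse(train_x, train_y, batch_idx, k, length_scale):
    """Leave-one-out mean squared error over a batch of training indices:
    each batch point is predicted from its neighbors with itself held out."""
    errors = []
    for i in batch_idx:
        mask = np.ones(len(train_x), dtype=bool)
        mask[i] = False
        pred = local_gp_predict(
            train_x[mask], train_y[mask], train_x[i : i + 1], k, length_scale
        )
        errors.append((pred[0] - train_y[i]) ** 2)
    return float(np.mean(errors))
```

Under these assumptions, a hyperparameter such as `length_scale` could be chosen by evaluating `loo_mse` on a small random batch for several candidate values and keeping the minimizer; the batch keeps the cost of fitting independent of the full data size.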
-
Stellar Blend Image Classification Using Computationally Efficient Gaussian Processes
Authors:
Chinedu Eleh,
Yunli Zhang,
Rafael Bidese,
Benjamin W. Priest,
Amanda L. Muyskens,
Roberto Molinari,
Nedret Billor
Abstract:
Stellar blends, where two or more stars appear blended in an image, pose a significant visualization challenge in astronomy. Traditionally, distinguishing these blends from single stars has been costly and resource-intensive, involving sophisticated equipment and extensive expert analysis. This is especially problematic for analyzing the vast data volumes from surveys, such as the Legacy Survey of Space and Time (LSST), Sloan Digital Sky Survey (SDSS), Dark Energy Spectroscopic Instrument (DESI), Legacy Imaging Survey and the Zwicky Transient Facility (ZTF). To address these challenges, we apply different normalizations and data embeddings on low-resolution images of single stars and stellar blends, which are passed as inputs into machine learning methods and to a computationally efficient Gaussian process model (MuyGPs). MuyGPs consistently outperforms the benchmarked models, particularly on limited training data. Moreover, MuyGPs with $r^\text{th}$ root local min-max normalization achieves 83.8% accuracy. Furthermore, MuyGPs' ability to produce confidence bands ensures that predictions with low confidence can be redirected to a specialist for efficient human-assisted labeling.
Submitted 27 July, 2024;
originally announced July 2024.
-
Light curve completion and forecasting using fast and scalable Gaussian processes (MuyGPs)
Authors:
Imène R. Goumiri,
Alec M. Dunton,
Amanda L. Muyskens,
Benjamin W. Priest,
Robert E. Armstrong
Abstract:
Temporal variations of apparent magnitude, called light curves, are observational statistics of interest captured by telescopes over long periods of time. Light curves afford the exploration of Space Domain Awareness (SDA) objectives such as object identification or pose estimation as latent variable inference problems. Ground-based observations from commercial off-the-shelf (COTS) cameras remain inexpensive compared to higher-precision instruments; however, limited sensor availability combined with noisier observations can produce gappy time-series data that can be difficult to model. These external factors confound the automated exploitation of light curves, which makes light curve prediction and extrapolation a crucial problem for applications. Traditionally, image or time-series completion problems have been approached with diffusion-based or exemplar-based methods. More recently, Deep Neural Networks (DNNs) have become the tool of choice due to their empirical success at learning complex nonlinear embeddings. However, DNNs often require large training data that are not necessarily available when looking at unique features of a light curve of a single satellite.
In this paper, we present a novel approach to predicting missing and future data points of light curves using Gaussian Processes (GPs). GPs are non-linear probabilistic models that infer posterior distributions over functions and naturally quantify uncertainty. However, the cubic scaling of GP inference and training is a major barrier to their adoption in applications. In particular, a single light curve can feature hundreds of thousands of observations, which is well beyond the practical realization limits of a conventional GP on a single machine. Consequently, we employ MuyGPs, a scalable framework for hyperparameter estimation of GP models that uses nearest neighbors sparsification and local cross-validation. MuyGPs...
Submitted 30 August, 2022;
originally announced August 2022.
-
Gaussian Process Classification for Galaxy Blend Identification in LSST
Authors:
James J. Buchanan,
Michael D. Schneider,
Robert E. Armstrong,
Amanda L. Muyskens,
Benjamin W. Priest,
Ryan J. Dana
Abstract:
A significant fraction of observed galaxies in the Rubin Observatory Legacy Survey of Space and Time (LSST) will overlap at least one other galaxy along the same line of sight, in a so-called "blend." The current standard method of assessing blend likelihood in LSST images relies on counting the number of intensity peaks in the smoothed image of a blend candidate, but the reliability of this procedure has not yet been comprehensively studied. Here we construct a realistic distribution of blended and unblended galaxies through high-fidelity simulations of LSST-like images, and from this we examine the blend classification accuracy of the standard peak-finding method. Furthermore, we develop a novel Gaussian process blend classifier model, and show that this classifier is competitive with both the peak-finding method and a convolutional neural network model. Finally, whereas the peak-finding method does not naturally assign probabilities to its classification estimates, the Gaussian process model does, and we show that the Gaussian process classification probabilities are generally reliable.
Submitted 10 December, 2021; v1 submitted 19 July, 2021;
originally announced July 2021.
-
Star-Galaxy Image Separation with Computationally Efficient Gaussian Process Classification
Authors:
Amanda L. Muyskens,
Imène R. Goumiri,
Benjamin W. Priest,
Michael D. Schneider,
Robert E. Armstrong,
Jason M. Bernstein,
Ryan Dana
Abstract:
We introduce a novel method for discerning optical telescope images of stars from those of galaxies using Gaussian processes (GPs). Although applications of GPs often struggle in high-dimensional data modalities such as optical image classification, we show that a low-dimensional embedding of images into a metric space defined by the principal components of the data suffices to produce high-quality predictions from real large-scale survey data. We develop a novel method of GP classification hyperparameter training that scales approximately linearly in the number of image observations, which allows for application of GP models to large-size Hyper Suprime-Cam (HSC) Subaru Strategic Program data. In our experiments we evaluate the performance of a principal component analysis (PCA) embedded GP predictive model against other machine learning algorithms including a convolutional neural network and an image photometric morphology discriminator. Our analysis shows that our methods compare favorably with current methods in optical image classification while producing posterior distributions from the GP regression that can be used to quantify object classification uncertainty. We further describe how classification uncertainty can be used to efficiently parse large-scale survey imaging data to produce high-confidence object catalogs.
Submitted 3 May, 2021;
originally announced May 2021.
-
Star-Galaxy Separation via Gaussian Processes with Model Reduction
Authors:
Imène R. Goumiri,
Amanda L. Muyskens,
Michael D. Schneider,
Benjamin W. Priest,
Robert E. Armstrong
Abstract:
Modern cosmological surveys such as the Hyper Suprime-Cam (HSC) survey produce a huge volume of low-resolution images of both distant galaxies and dim stars in our own galaxy. Being able to automatically classify these images is a long-standing problem in astronomy and critical to a number of different scientific analyses. Recently, the challenge of "star-galaxy" classification has been approached with Deep Neural Networks (DNNs), which are good at learning complex nonlinear embeddings. However, DNNs are known to overconfidently extrapolate on unseen data and require a large volume of training images that accurately capture the data distribution to be considered reliable. Gaussian Processes (GPs), which infer posterior distributions over functions and naturally quantify uncertainty, haven't been a tool of choice for this task mainly because popular kernels exhibit limited expressivity on complex and high-dimensional data.
In this paper, we present a novel approach to the star-galaxy separation problem that uses GPs and reaps their benefits while solving many of the issues traditionally affecting them for classification of high-dimensional celestial image data. After an initial filtering of the raw data of star and galaxy image cutouts, we first reduce the dimensionality of the input images by using Principal Component Analysis (PCA) before applying GPs using a simple Radial Basis Function (RBF) kernel on the reduced data. Using this method, we greatly improve the accuracy of the classification over a basic application of GPs while improving the computational efficiency and scalability of the method.
Submitted 12 October, 2020;
originally announced October 2020.
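The pipeline outlined in the abstract above (PCA to reduce image dimensionality, then a GP with an RBF kernel on the reduced data) can be sketched in plain NumPy. This is an illustration under stated assumptions, not the authors' code: the helper names (`pca_embed`, `pca_project`, `rbf`, `gp_classify`), the choice to regress on labels coded as +1/-1 and threshold the posterior mean at zero, and the default `length_scale` and `noise` values are all hypothetical.

```python
import numpy as np


def pca_embed(images, n_components):
    """Flatten images and project onto the top principal components.
    Returns the embedding plus the mean and components for reuse on new data."""
    flat = images.reshape(len(images), -1).astype(float)
    mean = flat.mean(axis=0)
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    components = vt[:n_components]
    return (flat - mean) @ components.T, mean, components


def pca_project(images, mean, components):
    """Embed new images with a previously fitted mean and components."""
    flat = images.reshape(len(images), -1).astype(float)
    return (flat - mean) @ components.T


def rbf(a, b, length_scale):
    """RBF (squared-exponential) kernel between the rows of a and b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)


def gp_classify(train_z, train_labels, test_z, length_scale=1.0, noise=1e-3):
    """GP regression on +1/-1 labels (an assumption of this sketch).
    Returns the thresholded class and the posterior variance per test point."""
    k_train = rbf(train_z, train_z, length_scale) + noise * np.eye(len(train_z))
    k_cross = rbf(test_z, train_z, length_scale)
    mean = k_cross @ np.linalg.solve(k_train, train_labels)
    # Posterior variance of the latent function; the prior variance is 1.
    var = 1.0 - np.einsum("ij,ji->i", k_cross, np.linalg.solve(k_train, k_cross.T))
    return np.where(mean >= 0, 1, -1), var
```

The returned variance gives a per-object uncertainty that could be used to flag low-confidence classifications for further review, which is the kind of use the abstracts in this listing describe for GP posteriors.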