-
Improved source classification and performance analysis using Gaia DR3
Authors:
Sara Jamal,
Coryn A. L. Bailer-Jones
Abstract:
The Discrete Source Classifier (DSC) provides probabilistic classification of sources in Gaia Data Release 3 using a Bayesian framework and a global prior. The DSC Combmod classifier in GDR3 achieved for the extragalactic classes (quasars and galaxies) a high completeness of 92%, but a low purity of 22% due to contamination from the far larger star class. However, these single metrics mask significant variation in performance with magnitude and sky position. Furthermore, a better combination of the individual classifiers is possible. Here we compute two-dimensional representations of the completeness and the purity as a function of Galactic latitude and source brightness, and also exclude the Magellanic Clouds where stellar contamination significantly reduces the purity. Reevaluated on a cleaner validation set and without introducing changes to the published GDR3 DSC probabilities themselves, we achieve for Combmod average 2D completenesses of 92% and 95% and average 2D purities of 55% and 89% for the quasar and galaxy classes, respectively. Since the relative proportions of extragalactic objects to stars in Gaia are expected to vary significantly with brightness and latitude, we introduce a new prior as a continuous function of brightness and latitude, and compute new class probabilities. This variable prior only improves the performance by a few percentage points, mostly at the faint end. Significant improvement, however, is obtained by a new additive combination of Specmod and Allosmod. This classifier, Combmod-$α$, achieves average 2D completenesses of 82% and 93% and average 2D purities of 79% and 93% for the quasar and galaxy classes, respectively, when using the global prior. Thus, we achieve a significant improvement in purity for a small loss of completeness. The improvement is most significant for faint quasars where the purity rises from 20% to 62%.
Submitted 22 May, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
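As a rough illustration of the quantities discussed in the abstract above, the following sketch computes completeness and purity for one class from true and predicted labels, and rescales class posteriors when one prior is exchanged for another using Bayes' rule. The function names, the toy labels, and the prior values are assumptions made for this example only; none of them are taken from the DSC pipeline.

```python
def completeness_purity(true_labels, predicted_labels, target):
    """Return (completeness, purity) of `target` for paired label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    # Completeness: fraction of true members that were recovered.
    completeness = tp / (tp + fn) if (tp + fn) else float("nan")
    # Purity: fraction of predicted members that are true members.
    purity = tp / (tp + fp) if (tp + fp) else float("nan")
    return completeness, purity


def reweight_posterior(posterior, old_prior, new_prior):
    """Swap the class prior in a posterior: P_new(k) ∝ P_old(k) * new/old."""
    unnormalised = {
        k: posterior[k] * new_prior[k] / old_prior[k] for k in posterior
    }
    total = sum(unnormalised.values())
    return {k: v / total for k, v in unnormalised.items()}


# Toy labels (hypothetical, not Gaia data).
truth = ["quasar", "quasar", "star", "star", "star", "galaxy"]
predicted = ["quasar", "star", "quasar", "star", "star", "galaxy"]
quasar_completeness, quasar_purity = completeness_purity(truth, predicted, "quasar")

# A posterior obtained under a flat prior, re-expressed under a prior
# in which stars are nine times more common than quasars (made-up numbers).
adjusted = reweight_posterior(
    {"star": 0.5, "quasar": 0.5},
    old_prior={"star": 0.5, "quasar": 0.5},
    new_prior={"star": 0.9, "quasar": 0.1},
)
```

The second function shows why purity is so sensitive to the prior: a source that looks equally like a star or a quasar under a flat prior becomes far more likely to be a star once the rarity of quasars is taken into account.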
-
Discovery of a dormant 33 solar-mass black hole in pre-release Gaia astrometry
Authors:
Gaia Collaboration,
P. Panuzzo,
T. Mazeh,
F. Arenou,
B. Holl,
E. Caffau,
A. Jorissen,
C. Babusiaux,
P. Gavras,
J. Sahlmann,
U. Bastian,
Ł. Wyrzykowski,
L. Eyer,
N. Leclerc,
N. Bauchet,
A. Bombrun,
N. Mowlavi,
G. M. Seabroke,
D. Teyssier,
E. Balbinot,
A. Helmi,
A. G. A. Brown,
A. Vallenari,
T. Prusti,
J. H. J. de Bruijne
, et al. (390 additional authors not shown)
Abstract:
Gravitational waves from black-hole merging events have revealed a population of extra-galactic BHs residing in short-period binaries with masses that are higher than expected based on most stellar evolution models - and also higher than known stellar-origin black holes in our Galaxy. It has been proposed that those high-mass BHs are the remnants of massive metal-poor stars. Gaia astrometry is expected to uncover many Galactic wide-binary systems containing dormant BHs, which may not have been detected before. The study of this population will provide new information on the BH-mass distribution in binaries and shed light on their formation mechanisms and progenitors. As part of the validation efforts in preparation for the fourth Gaia data release (DR4), we analysed the preliminary astrometric binary solutions, obtained by the Gaia Non-Single Star pipeline, to verify their significance and to minimise false-detection rates in high-mass-function orbital solutions. The astrometric binary solution of one source, Gaia BH3, implies the presence of a $32.70 \pm 0.82\,M_\odot$ BH in a binary system with a period of 11.6 yr. Gaia radial velocities independently validate the astrometric orbit. Broad-band photometric and spectroscopic data show that the visible component is an old, very metal-poor giant of the Galactic halo, at a distance of 590 pc. The BH in the Gaia BH3 system is more massive than any other Galactic stellar-origin BH known thus far. The low metallicity of the star companion supports the scenario that metal-poor massive stars are progenitors of the high-mass BHs detected by gravitational-wave telescopes. The Galactic orbit of the system and its metallicity indicate that it might belong to the Sequoia halo substructure. Alternatively, and more plausibly, it could belong to the ED-2 stream, which likely originated from a globular cluster that had been disrupted by the Milky Way.
Submitted 19 April, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Gaia Focused Product Release: Sources from Service Interface Function image analysis -- Half a million new sources in omega Centauri
Authors:
Gaia Collaboration,
K. Weingrill,
A. Mints,
J. Castañeda,
Z. Kostrzewa-Rutkowska,
M. Davidson,
F. De Angeli,
J. Hernández,
F. Torra,
M. Ramos-Lerate,
C. Babusiaux,
M. Biermann,
C. Crowley,
D. W. Evans,
L. Lindegren,
J. M. Martín-Fleitas,
L. Palaversa,
D. Ruz Mieres,
K. Tisanić,
A. G. A. Brown,
A. Vallenari,
T. Prusti,
J. H. J. de Bruijne,
F. Arenou,
A. Barbier
, et al. (378 additional authors not shown)
Abstract:
Gaia's readout window strategy is challenged by very dense fields in the sky. Therefore, in addition to standard Gaia observations, full Sky Mapper (SM) images were recorded for nine selected regions in the sky. A new software pipeline exploits these Service Interface Function (SIF) images of crowded fields (CFs), making use of the availability of the full two-dimensional (2D) information. This new pipeline produced half a million additional Gaia sources in the region of the omega Centauri ($ω$ Cen) cluster, which are published with this Focused Product Release. We discuss the dedicated SIF CF data reduction pipeline, validate its data products, and introduce their Gaia archive table. Our aim is to improve the completeness of the Gaia source inventory in a very dense region in the sky, $ω$ Cen. An adapted version of Gaia's Source Detection and Image Parameter Determination software located sources in the 2D SIF CF images. We validated the results by comparing them to the public Gaia DR3 catalogue and external Hubble Space Telescope data. With this Focused Product Release, 526 587 new sources have been added to the Gaia catalogue in $ω$ Cen. Apart from positions and brightnesses, the additional catalogue contains parallaxes and proper motions, but no meaningful colour information. While SIF CF source parameters generally have a lower precision than nominal Gaia sources, in the cluster centre they increase the depth of the combined catalogue by three magnitudes and improve the source density by a factor of ten. This first SIF CF data publication already adds great value to the Gaia catalogue. It demonstrates what to expect for the fourth Gaia catalogue, which will contain additional sources for all nine SIF CF regions.
Submitted 8 November, 2023; v1 submitted 10 October, 2023;
originally announced October 2023.
-
Gaia Focused Product Release: A catalogue of sources around quasars to search for strongly lensed quasars
Authors:
Gaia Collaboration,
A. Krone-Martins,
C. Ducourant,
L. Galluccio,
L. Delchambre,
I. Oreshina-Slezak,
R. Teixeira,
J. Braine,
J. -F. Le Campion,
F. Mignard,
W. Roux,
A. Blazere,
L. Pegoraro,
A. G. A. Brown,
A. Vallenari,
T. Prusti,
J. H. J. de Bruijne,
F. Arenou,
C. Babusiaux,
A. Barbier,
M. Biermann,
O. L. Creevey,
D. W. Evans,
L. Eyer,
R. Guerra
, et al. (376 additional authors not shown)
Abstract:
Context. Strongly lensed quasars are fundamental sources for cosmology. The Gaia space mission covers the entire sky with the unprecedented resolution of $0.18$" in the optical, making it an ideal instrument to search for gravitational lenses down to the limiting magnitude of 21. Nevertheless, the previous Gaia Data Releases are known to be incomplete for small angular separations such as those expected for most lenses. Aims. We present the Data Processing and Analysis Consortium GravLens pipeline, which was built to analyse all Gaia detections around quasars and to cluster them into sources, thus producing a catalogue of secondary sources around each quasar. We analysed the resulting catalogue to produce scores that indicate source configurations that are compatible with strongly lensed quasars. Methods. GravLens uses the DBSCAN unsupervised clustering algorithm to detect sources around quasars. The resulting catalogue of multiplets is then analysed with several methods to identify potential gravitational lenses. We developed and applied an outlier scoring method, a comparison between the average BP and RP spectra of the components, and we also used an extremely randomised tree algorithm. These methods produce scores to identify the most probable configurations and to establish a list of lens candidates. Results. We analysed the environment of 3 760 032 quasars. A total of 4 760 920 sources, including the quasars, were found within 6" of the quasar positions. This list is given in the Gaia archive. In 87% of cases, the quasar remains a single source, and in 501 385 cases neighbouring sources were detected. We propose a list of 381 lens candidates, of which we identified 49 as the most promising. Beyond these candidates, the associated tables in this Focused Product Release allow the entire community to explore the unique Gaia data for strong lensing studies further.
Submitted 10 October, 2023;
originally announced October 2023.
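The abstract above states that GravLens uses DBSCAN to group individual detections into sources. The sketch below is a minimal pure-Python DBSCAN applied to made-up two-dimensional offsets; it only illustrates the general algorithm. The `eps` and `min_samples` values, the coordinates, and the function name are assumptions for this example and do not reflect the settings or code of the actual pipeline.

```python
import math


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


# Hypothetical detection offsets (arbitrary units) around one position:
# two tight groups of repeated detections and one isolated detection.
detections = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (10.0, 10.0),
]
labels = dbscan(detections, eps=0.3, min_samples=3)
```

A density-based method suits this kind of problem because the number of sources around each quasar is not known in advance, and isolated spurious detections are naturally left unassigned.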
-
Gaia Focused Product Release: Radial velocity time series of long-period variables
Authors:
Gaia Collaboration,
M. Trabucchi,
N. Mowlavi,
T. Lebzelter,
I. Lecoeur-Taibi,
M. Audard,
L. Eyer,
P. García-Lario,
P. Gavras,
B. Holl,
G. Jevardat de Fombelle,
K. Nienartowicz,
L. Rimoldini,
P. Sartoretti,
R. Blomme,
Y. Frémat,
O. Marchal,
Y. Damerdji,
A. G. A. Brown,
A. Guerrier,
P. Panuzzo,
D. Katz,
G. M. Seabroke,
K. Benson
, et al. (382 additional authors not shown)
Abstract:
The third Gaia Data Release (DR3) provided photometric time series of more than 2 million long-period variable (LPV) candidates. Anticipating the publication of full radial-velocity (RV) in DR4, this Focused Product Release (FPR) provides RV time series for a selection of LPVs with high-quality observations. We describe the production and content of the Gaia catalog of LPV RV time series, and the methods used to compute variability parameters published in the Gaia FPR. Starting from the DR3 LPVs catalog, we applied filters to construct a sample of sources with high-quality RV measurements. We modeled their RV and photometric time series to derive their periods and amplitudes, and further refined the sample by requiring compatibility between the RV period and at least one of the $G$, $G_{\rm BP}$, or $G_{\rm RP}$ photometric periods. The catalog includes RV time series and variability parameters for 9 614 sources in the magnitude range $6\lesssim G/{\rm mag}\lesssim 14$, including a flagged top-quality subsample of 6 093 stars whose RV periods are fully compatible with the values derived from the $G$, $G_{\rm BP}$, and $G_{\rm RP}$ photometric time series. The RV time series contain a mean of 24 measurements per source taken unevenly over a duration of about three years. We identify the great majority of sources (88%) as genuine LPVs, with about half of them showing a pulsation period and the other half displaying a long secondary period. The remaining 12% consists of candidate ellipsoidal binaries. Quality checks against RVs available in the literature show excellent agreement. We provide illustrative examples and cautionary remarks. The publication of RV time series for almost 10 000 LPVs constitutes, by far, the largest such database available to date in the literature. The availability of simultaneous photometric measurements gives a unique added value to the Gaia catalog (abridged)
Submitted 9 October, 2023;
originally announced October 2023.
-
`pgmuvi`: Quick and easy Gaussian Process Regression for multi-wavelength astronomical timeseries
Authors:
P. Scicluna,
S. Waterval,
D. A. Vasquez-Torres,
S. Srinivasan,
S. Jamal
Abstract:
Time-domain observations are increasingly important in astronomy, and are often the only way to study certain objects. The volume of time-series data is increasing dramatically as new surveys come online - for example, the Vera Rubin Observatory will produce 15 terabytes of data per night, and its Legacy Survey of Space and Time (LSST) is expected to produce five-year lightcurves for $>10^7$ sources, each consisting of 5 photometric bands. Historically, astronomers have worked with Fourier-based techniques such as the Lomb-Scargle periodogram or information-theoretic approaches; however, in recent years Bayesian and data-driven approaches such as Gaussian Process Regression (GPR) have gained traction. Yet the computational complexity and steep learning curve of GPR have limited its adoption. `pgmuvi` makes GPR of multi-band timeseries accessible to astronomers by building on cutting-edge open-source machine-learning libraries, and hence `pgmuvi` retains the speed and flexibility of GPR while being easy to use. It provides easy access to GPU acceleration and Bayesian inference of the hyperparameters (e.g. the periods), and is able to scale to large datasets.
Submitted 31 July, 2023;
originally announced August 2023.
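For context on the Fourier-based baseline named in the abstract above, the sketch below implements the classical Lomb-Scargle periodogram in pure Python and applies it to a synthetic, unevenly sampled sinusoid. It is not part of `pgmuvi` and does not use its API; the function name, the sampling pattern, the 2.5-unit period, and the frequency grid are all assumptions chosen for illustration, and the returned power is left unnormalised.

```python
import math


def lomb_scargle(times, values, frequencies):
    """Classical (unnormalised) Lomb-Scargle power at each trial frequency."""
    mean = sum(values) / len(values)
    resid = [v - mean for v in values]  # the classical form assumes zero mean
    power = []
    for f in frequencies:
        w = 2.0 * math.pi * f
        # Time offset tau makes the sine and cosine terms orthogonal.
        s2 = sum(math.sin(2.0 * w * t) for t in times)
        c2 = sum(math.cos(2.0 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * w)
        cos_terms = [math.cos(w * (t - tau)) for t in times]
        sin_terms = [math.sin(w * (t - tau)) for t in times]
        yc = sum(y * c for y, c in zip(resid, cos_terms))
        ys = sum(y * s for y, s in zip(resid, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        power.append(0.5 * (yc * yc / cc + ys * ys / ss))
    return power


# Synthetic, unevenly sampled sinusoid with a period of 2.5 time units.
times = [0.37 * i + 0.1 * math.sin(i) for i in range(60)]
values = [math.sin(2.0 * math.pi * t / 2.5) for t in times]

# Trial frequencies from 0.05 to 1.0 cycles per time unit.
frequencies = [0.05 + 0.005 * k for k in range(191)]
power = lomb_scargle(times, values, frequencies)
best_frequency = frequencies[max(range(len(power)), key=power.__getitem__)]
```

A periodogram like this returns a single best frequency per band with no uncertainty attached, which is one reason a probabilistic approach such as GPR can be attractive for multi-band data.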
-
Quasar and galaxy classification using Gaia EDR3 and CatWise2020
Authors:
Arvind C. N. Hughes,
Coryn A. L. Bailer-Jones,
Sara Jamal
Abstract:
In this work, we assess the combined use of Gaia photometry and astrometry with infrared data from CatWISE in improving the identification of extragalactic sources compared to the classification obtained using Gaia data. We evaluate different input feature configurations and prior functions, with the aim of presenting a classification methodology integrating prior knowledge stemming from realistic class distributions in the universe. In our work, we compare different classifiers, namely Gaussian Mixture Models (GMMs), XGBoost and CatBoost, and classify sources into three classes - star, quasar, and galaxy, with the target quasar and galaxy class labels obtained from SDSS16 and the star label from Gaia EDR3. In our approach, we adjust the posterior probabilities to reflect the intrinsic distribution of extragalactic sources in the universe via a prior function. We introduce two priors, a global prior reflecting the overall rarity of quasars and galaxies, and a mixed prior that incorporates in addition the distribution of these sources as a function of Galactic latitude and magnitude. Our best classification performances, in terms of completeness and purity of the galaxy and quasar classes, are achieved using the mixed prior for sources at high latitudes and in the magnitude range G = 18.5 to 19.5. We apply our identified best-performing classifier to three application datasets from Gaia DR3, and find that the global prior is more conservative in what it considers to be a quasar or a galaxy compared to the mixed prior. In particular, when applied to the pure quasar and galaxy candidate samples, we attain a purity of 97% for quasars and 99.9% for galaxies using the global prior, and purities of 96% and 99% respectively using the mixed prior. We conclude our work by discussing the importance of applying adjusted priors portraying realistic class distributions in the universe.
Submitted 11 October, 2022;
originally announced October 2022.
-
On Neural Architectures for Astronomical Time-series Classification with Application to Variable Stars
Authors:
Sara Jamal,
Joshua S. Bloom
Abstract:
Despite the utility of neural networks (NNs) for astronomical time-series classification, the proliferation of learning architectures applied to diverse datasets has thus far hampered a direct intercomparison of different approaches. Here we perform the first comprehensive study of variants of NN-based learning and inference for astronomical time-series, aiming to provide the community with an overview of relative performance and, hopefully, a set of best-in-class choices for practical implementations. In both supervised and self-supervised contexts, we study the effects of different time-series-compatible layer choices, namely dilated temporal convolutional neural networks (dTCNs), Long-Short Term Memory (LSTM) NNs, Gated Recurrent Units (GRUs) and temporal convolutional NNs (tCNNs). We also study the efficacy and performance of encoder-decoder (i.e., autoencoder) networks compared to direct classification networks, different pathways to include auxiliary (non-time-series) metadata, and different approaches to incorporate multi-passband data (i.e., multiple time-series per source). Performance, applied to a sample of 17,604 variable stars from the MACHO survey across 10 imbalanced classes, is measured in training convergence time, classification accuracy, reconstruction error, and generated latent variables. We find that networks with recurrent NNs (RNNs) generally outperform dTCNs and, in many scenarios, yield accuracy similar to that of tCNNs. In learning time and memory requirements, convolution-based layers are more performant. We conclude by discussing the advantages and limitations of deep architectures for variable star classification, with a particular eye towards next-generation surveys such as LSST, WFIRST and ZTF2.
Submitted 11 June, 2020; v1 submitted 19 March, 2020;
originally announced March 2020.
-
Automated reliability assessment for spectroscopic redshift measurements
Authors:
S. Jamal,
V. Le Brun,
O. Le Fèvre,
D. Vibert,
A. Schmitt,
C. Surace,
Y. Copin,
B. Garilli,
M. Moresco,
L. Pozzetti
Abstract:
We present a new approach to automate the spectroscopic redshift reliability assessment based on machine learning (ML) and characteristics of the redshift probability density function (PDF).
We propose to recast the spectroscopic redshift estimation in a Bayesian framework, in order to incorporate all sources of information and uncertainties related to the redshift estimation process, and produce a redshift posterior PDF that will be the starting point for ML algorithms to provide an automated assessment of redshift reliability.
As a use case, public data from the VIMOS VLT Deep Survey is exploited to present and test this new methodology. We first tried to reproduce the existing reliability flags using supervised classification to describe different types of redshift PDFs, but due to the subjective definition of these flags, soon opted for a new homogeneous partitioning of the data into distinct clusters via unsupervised classification. After assessing the accuracy of the new clusters via resubstitution and test predictions, unlabelled data from preliminary mock simulations for the Euclid space mission are projected into this mapping to predict their redshift reliability labels.
Submitted 22 January, 2018; v1 submitted 4 June, 2017;
originally announced June 2017.
-
Noether symmetries and stability of ideal gas solution in Galileon Cosmology
Authors:
N. Dimakis,
Alex Giacomini,
Sameerah Jamal,
Genly Leon,
Andronikos Paliathanasis
Abstract:
A class of generalized Galileon cosmological models, which can be described by a point-like Lagrangian, is considered in order to utilize Noether's Theorem to determine conservation laws for the field equations. In the Friedmann-Lemaître-Robertson-Walker universe, the existence of a nontrivial conservation law indicates the integrability of the field equations. Due to the complexity of the latter, we apply the differential invariants approach in order to construct special power-law solutions and study their stability.
Submitted 3 March, 2017; v1 submitted 6 February, 2017;
originally announced February 2017.