-
StratLearn-z: Improved photo-$z$ estimation from spectroscopic data subject to selection effects
Authors:
Chiara Moretti,
Maximilian Autenrieth,
Riccardo Serra,
Roberto Trotta,
David A. van Dyk,
Andrei Mesinger
Abstract:
A precise measurement of photometric redshifts (photo-z) is key for the success of modern photometric galaxy surveys. Machine learning (ML) methods show great promise in this context, but suffer from covariate shift (CS) in training sets due to selection bias, where interesting sources are underrepresented and the corresponding ML models generalise poorly. We present an application of the StratLearn method to the estimation of photo-z, validating against simulations where we enforce the presence of CS to different degrees. StratLearn is a statistically principled approach that relies on splitting the source and target datasets into strata based on estimated propensity scores (i.e. the probability for an object to be in the source set given its observed covariates). After stratification, two conditional density estimators are fit separately to each stratum, then combined via a weighted average. We benchmark our results against the GPz algorithm, quantifying the performance of the two codes with a set of metrics. Our results show that the StratLearn-z metrics are only marginally affected by the presence of CS, while GPz shows a significant degradation of performance in the photo-z prediction for fainter objects. For the strongest CS scenario, StratLearn-z yields a reduced fraction of catastrophic errors, a factor of 2 improvement in the RMSE and an order of magnitude improvement in the bias. We also assess the quality of the conditional redshift estimates with the probability integral transform (PIT). The PIT distribution obtained from StratLearn-z features far fewer outliers and is symmetric, i.e. the predictions appear to be centered around the true redshift value, although the spread of the conditional redshift distributions is estimated conservatively. Our Julia implementation of the method is available at https://github.com/chiaramoretti/StratLearn-z.
Submitted 30 April, 2025; v1 submitted 30 September, 2024;
originally announced September 2024.
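The stratification and combination steps described in the StratLearn-z abstract can be illustrated with a minimal sketch. This is not the authors' Julia implementation: the function names `assign_strata` and `combine_densities`, the choice of five strata, the use of quantiles of whatever score array is passed in, and the single scalar `weight` are all assumptions made here for illustration. The propensity scores and the two density estimates are taken as given inputs.

```python
import numpy as np


def assign_strata(propensity, n_strata=5):
    """Assign objects to strata by propensity-score quantile (hypothetical helper).

    Stratum 0 holds the highest scores, i.e. the objects most similar to the
    source set. Taking the quantiles over the array passed in is an assumption.
    """
    propensity = np.asarray(propensity, dtype=float)
    # Interior quantile edges that split the scores into n_strata groups.
    edges = np.quantile(propensity, np.linspace(0.0, 1.0, n_strata + 1)[1:-1])
    # searchsorted labels the lowest scores 0; flip so that 0 means highest.
    low_to_high = np.searchsorted(edges, propensity, side="right")
    return (n_strata - 1) - low_to_high


def combine_densities(dens_a, dens_b, weight):
    """Weighted average of two conditional density estimates (hypothetical helper).

    Both estimates are assumed to be evaluated on the same redshift grid;
    `weight` in [0, 1] is the weight given to the first estimator.
    """
    return weight * np.asarray(dens_a) + (1.0 - weight) * np.asarray(dens_b)


# Toy usage with made-up scores and densities.
scores = np.linspace(0.01, 0.99, 100)
strata = assign_strata(scores, n_strata=5)
mixed = combine_densities([1.0, 1.0], [0.0, 2.0], weight=0.25)
```

In a full pipeline the two estimators would be fit within each stratum separately and the weight would be tuned per stratum; the sketch leaves both steps out.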
-
The atmospheric fragmentation of the 1908 Tunguska Cosmic Body: reconsidering the possibility of a ground impact
Authors:
L. Foschini,
L. Gasperini,
C. Stanghellini,
R. Serra,
A. Polonia,
G. Stanghellini
Abstract:
The 1908 June 30 Tunguska Event (TE) is one of the best studied cases of a cosmic body impacting the Earth with global effects. However, significant doubts are still cast on the different proposed event reconstructions, because of the shortage of reliable information and the uncertainties of the available data. In the present work, we revisit the atmospheric fragmentation of the Tunguska Cosmic Body (TCB) by taking into account the possibility that a metre-sized fragment could have caused the formation of Lake Cheko, located about 9 km north-west of the epicentre. We performed order-of-magnitude calculations by using the classical single-body theory for the atmospheric dynamics of comets/asteroids, with the addition of the fragmentation conditions by Foschini (2001). We calibrated the numerical model by using the data of the Chelyabinsk Event (CE) of 2013 February 15. Our work favours the hypothesis that the TCB could have been a rubble-pile asteroid composed of boulders of very different materials, with different mechanical strengths, densities, and porosities. Before the impact, a close encounter with the Earth stripped at least one boulder, which fell alongside the main body and excavated Lake Cheko. We exclude the hypothesis of a single compact asteroid ejecting a metre-sized fragment during, or shortly before, the airburst, because there is no suitable combination of boulder mass and lateral velocity.
Submitted 11 February, 2019; v1 submitted 17 October, 2018;
originally announced October 2018.
-
Measuring the Mass Distribution in Galaxy Clusters
Authors:
Margaret J. Geller,
Antonaldo Diaferio,
Kenneth J. Rines,
Ana Laura Serra
Abstract:
Cluster mass profiles are tests of models of structure formation. Only two current observational methods of determining the mass profile, gravitational lensing and the caustic technique, are independent of the assumption of dynamical equilibrium. Both techniques enable determination of the extended mass profile at radii beyond the virial radius. For 19 clusters, we compare the mass profile based on the caustic technique with weak lensing measurements taken from the literature. This comparison offers a test of systematic issues in both techniques. Around the virial radius, the two methods of mass estimation agree to within about 30%, consistent with the expected errors in the individual techniques. At small radii, the caustic technique overestimates the mass, as expected from numerical simulations. The ratio between the lensing profile and the caustic mass profile at these radii suggests that the weak lensing profiles are a good representation of the true mass profile. At radii larger than the virial radius, the lensing mass profile exceeds the caustic mass profile, possibly as a result of contamination of the lensing profile by large-scale structures within the lensing kernel. We highlight the case of the closely neighboring clusters MS0906+11 and A750 to illustrate the potential seriousness of contamination of the weak lensing signal by unrelated structures.
Submitted 25 September, 2012;
originally announced September 2012.