-
Multimodality for improved CNN photometric redshifts
Authors:
R. Ait-Ouahmed,
S. Arnouts,
J. Pasquet,
M. Treyer,
E. Bertin
Abstract:
Photometric redshift estimation plays a crucial role in modern cosmological surveys for studying the universe's large-scale structures and the evolution of galaxies. Deep learning has emerged as a powerful method to produce accurate photometric redshift estimates from multi-band images of galaxies. Here, we introduce a multimodal approach consisting of the parallel processing of several subsets of image bands, the outputs of which are then merged for further processing through a convolutional neural network (CNN). We evaluate the performance of our method using three surveys: the Sloan Digital Sky Survey (SDSS), the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) and Hyper Suprime-Cam (HSC). By improving the model's ability to capture information embedded in the correlation between different bands, our technique surpasses the state-of-the-art photometric redshift precision. We find that the positive gain does not depend on the specific architecture of the CNN and that it increases with the number of photometric filters available.
Submitted 3 October, 2023;
originally announced October 2023.
-
CNN photometric redshifts in the SDSS at $r\leq 20$
Authors:
M. Treyer,
R. Ait-Ouahmed,
J. Pasquet,
S. Arnouts,
E. Bertin,
D. Fouchez
Abstract:
We release photometric redshifts, reaching $\sim$0.7, for $\sim$14M galaxies at $r\leq 20$ in the 11,500 deg$^2$ of the SDSS north and south galactic caps. These estimates were inferred from a convolutional neural network (CNN) trained on $ugriz$ stamp images of galaxies labelled with a spectroscopic redshift from the SDSS, GAMA and BOSS surveys. Representative training sets of $\sim$370k galaxies were constructed from the much larger combined spectroscopic data to limit biases, particularly those arising from the over-representation of Luminous Red Galaxies. The CNN outputs a redshift classification that offers all the benefits of a well-behaved PDF, with a width efficiently signaling unreliable estimates due to poor photometry or stellar sources. The dispersion, mean bias and rate of catastrophic failures of the median point estimate are of order $σ_{\rm MAD}=0.014$, $\langle Δz_{\rm norm}\rangle=0.0015$, $η(|Δz_{\rm norm}|>0.05)=4\%$ on a representative test sample at $r<19.8$, outperforming currently published estimates. The distributions in narrow intervals of magnitudes of the redshifts inferred for the photometric sample are in good agreement with the results of tomographic analyses. The inferred redshifts also match the photometric redshifts of the redMaPPer galaxy clusters for the probable cluster members. The CNN input and output are available at: https://deepdip.iap.fr/treyer+2023.
Submitted 3 October, 2023;
originally announced October 2023.
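The three figures of merit quoted in the abstract above (dispersion $σ_{\rm MAD}$, mean bias and catastrophic failure rate $η$) can be sketched as follows. This is a minimal illustration assuming the definitions commonly used in the photometric redshift literature, namely $Δz_{\rm norm} = (z_{\rm phot} - z_{\rm spec})/(1 + z_{\rm spec})$ and $σ_{\rm MAD} = 1.4826 \times {\rm median}(|Δz_{\rm norm} - {\rm median}(Δz_{\rm norm})|)$; the listing itself does not spell them out. The function name `photoz_metrics` and the toy arrays are hypothetical.

```python
import numpy as np

def photoz_metrics(z_phot, z_spec, outlier_cut=0.05):
    """Bias, sigma_MAD and outlier fraction of photometric redshifts.

    Assumed definitions (not stated in the listing):
      dz        = (z_phot - z_spec) / (1 + z_spec)
      bias      = mean(dz)
      sigma_mad = 1.4826 * median(|dz - median(dz)|)
      eta       = fraction of objects with |dz| > outlier_cut
    """
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    dz = (z_phot - z_spec) / (1.0 + z_spec)
    bias = dz.mean()
    sigma_mad = 1.4826 * np.median(np.abs(dz - np.median(dz)))
    eta = np.mean(np.abs(dz) > outlier_cut)
    return {"bias": bias, "sigma_mad": sigma_mad, "eta": eta}

# Toy data: three good estimates and one catastrophic failure.
z_spec = np.array([0.10, 0.20, 0.30, 0.40])
z_phot = np.array([0.11, 0.20, 0.29, 0.60])
metrics = photoz_metrics(z_phot, z_spec)
print(metrics)
```

The median-based dispersion is deliberately insensitive to the single outlier in the toy data, which is why the outlier rate is reported separately.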
-
Photometric Redshift Estimation with Convolutional Neural Networks and Galaxy Images: A Case Study of Resolving Biases in Data-Driven Methods
Authors:
Q. Lin,
D. Fouchez,
J. Pasquet,
M. Treyer,
R. Ait Ouahmed,
S. Arnouts,
O. Ilbert
Abstract:
Deep Learning models have been increasingly exploited in astrophysical studies, yet such data-driven algorithms are prone to producing biased outputs detrimental for subsequent analyses. In this work, we investigate two major forms of biases, i.e., class-dependent residuals and mode collapse, in a case study of estimating photometric redshifts as a classification problem using Convolutional Neural Networks (CNNs) and galaxy images with spectroscopic redshifts. We focus on point estimates and propose a set of consecutive steps for resolving the two biases based on CNN models, involving representation learning with multi-channel outputs, balancing the training data and leveraging soft labels. The residuals can be viewed as a function of spectroscopic redshifts or photometric redshifts, and the biases with respect to these two definitions are incompatible and should be treated separately. We suggest that resolving biases in the spectroscopic space is a prerequisite for resolving biases in the photometric space. Experiments show that our methods control biases better than benchmark methods, and exhibit robustness under varying implementation and training conditions, provided the data are of high quality. Our methods hold promise for future cosmological surveys that require a good constraint of biases, and may be applied to regression problems and other studies that make use of data-driven models. Nonetheless, the bias-variance trade-off and the demand on sufficient statistics suggest the need for developing better methodologies and optimizing data usage strategies.
Submitted 20 February, 2022;
originally announced February 2022.
-
Galaxy Image Translation with Semi-supervised Noise-reconstructed Generative Adversarial Networks
Authors:
Qiufan Lin,
Dominique Fouchez,
Jérôme Pasquet
Abstract:
Image-to-image translation with Deep Learning neural networks, particularly with Generative Adversarial Networks (GANs), is one of the most powerful methods for simulating astronomical images. However, current work is limited to utilizing paired images with supervised translation, and there has been little discussion of reconstructing the noise background that encodes instrumental and observational effects. These limitations might be harmful for subsequent scientific applications in astrophysics. Therefore, we aim to develop methods for using unpaired images and preserving noise characteristics in image translation. In this work, we propose a two-way image translation model using GANs that exploits both paired and unpaired images in a semi-supervised manner, and introduce a noise emulating module that is able to learn and reconstruct noise characterized by high-frequency features. By experimenting on multi-band galaxy images from the Sloan Digital Sky Survey (SDSS) and the Canada France Hawaii Telescope Legacy Survey (CFHT), we show that our method recovers global and local properties effectively and outperforms benchmark image translation models. To the best of our knowledge, this work is the first attempt to apply semi-supervised methods and noise reconstruction techniques in astrophysical studies.
Submitted 18 January, 2021;
originally announced January 2021.
-
PhotoWeb redshift: boosting photometric redshift accuracy with large spectroscopic surveys
Authors:
Marko Shuntov,
J. Pasquet,
S. Arnouts,
O. Ilbert,
M. Treyer,
E. Bertin,
S. de la Torre,
Y. Dubois,
D. Fouchez,
K. Kraljic,
C. Laigle,
C. Pichon,
D. Vibert
Abstract:
Improving distance measurements in large imaging surveys is a major challenge to better reveal the distribution of galaxies on a large scale and to link galaxy properties with their environments. Photometric redshifts can be efficiently combined with the cosmic web (CW) extracted from overlapping spectroscopic surveys to improve their accuracy. We apply a similar method using a new generation of photometric redshifts based on a convolutional neural network (CNN). The CNN is trained on the SDSS images with the main galaxy sample (SDSS-MGS, $r \leq 17.8$) and the GAMA spectroscopic redshifts up to $r \sim 19.8$. The mapping of the CW is obtained with 680,000 spectroscopic redshifts from the MGS and BOSS surveys. The redshift probability distribution functions (PDFs), which are well calibrated (unbiased and narrow, $\leq 120$ Mpc), intercept a few CW structures along the line of sight. Combining these PDFs with the density field distribution provides new photometric redshifts, $z_{web}$, whose accuracy is improved by a factor of two (i.e., $σ \sim 0.004(1+z)$) for galaxies with $r \leq 17.8$. For half of them, the distance accuracy is better than 10 cMpc. The narrower the original PDF, the larger the boost in accuracy. No gain is observed for original PDFs wider than 0.03. The final $z_{web}$ PDFs also appear well calibrated. The method performs slightly better for passive galaxies than star-forming ones, and for galaxies in massive groups since these populations better trace the underlying large-scale structure. Reducing the spectroscopic sampling by a factor of 8 still improves the photometric redshift accuracy by 25%. Extending the method to galaxies fainter than the MGS limit still improves the redshift estimates for 70% of the galaxies, with a gain in accuracy of 20% at low $z$ where the resolution of the CW is the highest.
Submitted 24 March, 2020;
originally announced March 2020.
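The PhotoWeb abstract above describes combining a redshift PDF with the density field along the line of sight. One simple way to picture this is a bin-by-bin product followed by renormalization, sketched below. This is only an assumed illustration: the abstract does not give the actual combination rule, and the function name `combine_pdf_with_density`, the redshift grid and the density values are hypothetical.

```python
import numpy as np

def combine_pdf_with_density(pdf, density):
    """Reweight a binned redshift PDF by a line-of-sight density profile.

    Assumption: the combined PDF is proportional to pdf * density in each
    redshift bin. The published method may weight the two terms differently.
    """
    pdf = np.asarray(pdf, dtype=float)
    density = np.asarray(density, dtype=float)
    weighted = pdf * density
    total = weighted.sum()
    if total <= 0.0:
        # No overlap with any structure: fall back to the original PDF.
        return pdf / pdf.sum()
    return weighted / total

# Hypothetical 5-bin example: the photometric PDF peaks in bin 2,
# but a cosmic web structure sits in bin 3.
z_grid = np.linspace(0.0, 0.2, 5)
pdf = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
density = np.array([1.0, 1.0, 1.0, 4.0, 1.0])
pdf_web = combine_pdf_with_density(pdf, density)
z_web = z_grid[np.argmax(pdf_web)]
print(pdf_web, z_web)
```

In this toy case the overdensity pulls the peak of the combined PDF from bin 2 to bin 3, which mirrors the abstract's point that narrow PDFs intercepting few structures benefit most.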
-
PELICAN: deeP architecturE for the LIght Curve ANalysis
Authors:
Johanna Pasquet,
Jérôme Pasquet,
Marc Chaumont,
Dominique Fouchez
Abstract:
We developed a deeP architecturE for the LIght Curve ANalysis (PELICAN) for the characterization and the classification of light curves. It takes light curves as input, without any additional features. PELICAN can deal with the sparsity and the irregular sampling of light curves. It is designed to remove the problem of non-representativeness between the training and test databases coming from the limitations of the spectroscopic follow-up. We applied our methodology on different supernovae light curve databases. First, we evaluated PELICAN on the Supernova Photometric Classification Challenge for which we obtained the best performance ever achieved with a non-representative training database, by reaching an accuracy of 0.811. Then we tested PELICAN on simulated light curves of the LSST Deep Fields for which PELICAN is able to detect 87.4% of supernovae Ia with a precision higher than 98%, by considering a non-representative training database of 2k light curves. PELICAN can be trained on light curves of LSST Deep Fields to classify light curves of the LSST main survey, which have a lower sampling rate and are noisier. In this scenario, it reaches an accuracy of 96.5% with a training database of 2k light curves of the Deep Fields. It constitutes a pivotal result as type Ia supernovae candidates from the main survey might then be used to increase the statistics without additional spectroscopic follow-up. Finally we evaluated PELICAN on real data from the Sloan Digital Sky Survey. PELICAN reaches an accuracy of 86.8% with a training database composed of simulated data and a fraction of 10% of real data. The ability of PELICAN to deal with the different causes of non-representativeness between the training and test databases, and its robustness against survey properties and observational conditions, put it at the forefront of the light curve classification tools for the LSST era.
Submitted 4 January, 2019;
originally announced January 2019.
-
Photometric redshifts from SDSS images using a Convolutional Neural Network
Authors:
Johanna Pasquet,
Emmanuel Bertin,
Marie Treyer,
Stéphane Arnouts,
Dominique Fouchez
Abstract:
We developed a Deep Convolutional Neural Network (CNN), used as a classifier, to estimate photometric redshifts and associated probability distribution functions (PDF) for galaxies in the Main Galaxy Sample of the Sloan Digital Sky Survey at z < 0.4. Our method exploits all the information present in the images without any feature extraction. The input data consist of 64x64 pixel ugriz images centered on the spectroscopic targets, plus the galactic reddening value on the line-of-sight. For training sets of 100k objects or more ($\geq$ 20% of the database), we reach a dispersion $σ_{MAD}$<0.01, significantly lower than the current best one obtained from another machine learning technique on the same sample. The bias is lower than 0.0001, independent of photometric redshift. The PDFs are shown to have very good predictive power. We also find that the CNN redshifts are unbiased with respect to galaxy inclination, and that $σ_{MAD}$ decreases with the signal-to-noise ratio (SNR), achieving values below 0.007 for SNR >100, as in the deep stacked region of Stripe 82. We argue that for most galaxies the precision is limited by the SNR of SDSS images rather than by the method. The success of this experiment at low redshift opens promising perspectives for upcoming surveys.
Submitted 19 June, 2018; v1 submitted 18 June, 2018;
originally announced June 2018.
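The abstract above treats redshift estimation as classification, so the network output is a probability per redshift bin that doubles as a PDF. A minimal sketch of how point estimates could be read off such a binned PDF is given below. The bin edges, the number of bins and the function name `pdf_point_estimates` are hypothetical; the abstract only states that the sample lies at z < 0.4.

```python
import numpy as np

def pdf_point_estimates(probs, z_edges):
    """Mean, median and mode of a binned redshift PDF.

    probs   : class probabilities, one per redshift bin (e.g. a softmax output)
    z_edges : bin edges, length len(probs) + 1 (hypothetical binning)
    """
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()          # guard against rounding drift
    z_edges = np.asarray(z_edges, dtype=float)
    centers = 0.5 * (z_edges[:-1] + z_edges[1:])
    z_mean = np.sum(probs * centers)
    # Median: redshift at which the cumulative distribution reaches 0.5,
    # linearly interpolated between bin edges.
    cdf = np.concatenate([[0.0], np.cumsum(probs)])
    z_median = np.interp(0.5, cdf, z_edges)
    z_mode = centers[np.argmax(probs)]
    return z_mean, z_median, z_mode

# Toy example with four bins spanning 0 <= z < 0.4.
z_edges = np.linspace(0.0, 0.4, 5)
probs = np.array([0.1, 0.6, 0.2, 0.1])
z_mean, z_median, z_mode = pdf_point_estimates(probs, z_edges)
print(z_mean, z_median, z_mode)
```

The median is less sensitive than the mean to a secondary peak far from the main one, which is one reason a median point estimate can be preferable for skewed or multimodal PDFs.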
-
Gravitational birefringence and an exotic formula for redshift
Authors:
Christian Duval,
Johanna Pasquet,
Thomas Schucker,
Andre Tilquin
Abstract:
We compute the birefringence of light in curved Robertson-Walker spacetimes and propose an exotic formula for redshift based on the internal structure of the spinning photon. We then use the Hubble diagram of supernovae to test this formula.
Submitted 27 June, 2018; v1 submitted 26 February, 2018;
originally announced February 2018.
-
Deep learning Approach for Classifying, Detecting and Predicting Photometric Redshifts of Quasars in the Sloan Digital Sky Survey Stripe 82
Authors:
Johanna Pasquet-Itam,
Jérôme Pasquet
Abstract:
We apply a convolutional neural network (CNN) to classify and detect quasars in the Sloan Digital Sky Survey Stripe 82 and also to predict the photometric redshifts of quasars. The network takes the variability of objects into account by converting light curves into images. The width of the images, denoted w, corresponds to the five magnitudes ugriz and the height of the images, denoted h, represents the date of the observation. The CNN provides good results since its precision is 0.988 for a recall of 0.90, compared to a precision of 0.985 for the same recall with a random forest classifier. Moreover, 175 new quasar candidates are found with the CNN considering a fixed recall of 0.97. Combining the probabilities given by the CNN and the random forest improves the performance further, with a precision of 0.99 for a recall of 0.90.
For the redshift predictions, the CNN presents excellent results which are better than those obtained with a feature extraction step and different classifiers (a K-nearest-neighbors, a support vector machine, a random forest and a Gaussian process classifier). Indeed, the accuracy of the CNN reaches 78.09% within |Δz|<0.1, 86.15% within |Δz|<0.2 and 91.2% within |Δz|<0.3, and the value of rms is 0.359. The KNN performs worse in all three |Δz| ranges, with accuracies of 73.72%, 82.46% and 90.09% within |Δz|<0.1, |Δz|<0.2 and |Δz|<0.3 respectively, and an rms of 0.395. So the CNN successfully reduces the dispersion and the catastrophic redshifts of quasars. This new method is very promising for the future of big databases like the Large Synoptic Survey Telescope.
Submitted 7 December, 2017;
originally announced December 2017.
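The quasar abstract above describes converting light curves into images whose width spans the five ugriz magnitudes and whose height encodes the observation date. A minimal sketch of that layout follows. The one-row-per-day binning, the zero fill for missing epochs and the function name `light_curve_to_image` are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

BANDS = "ugriz"  # image width w = 5, one column per band

def light_curve_to_image(days, bands, mags, n_days):
    """Place magnitudes on an (n_days, 5) grid.

    Assumptions: one row per integer day offset, and 0.0 marks
    epochs with no observation in a given band.
    """
    image = np.zeros((n_days, len(BANDS)))
    for day, band, mag in zip(days, bands, mags):
        row = int(day)
        if 0 <= row < n_days:
            image[row, BANDS.index(band)] = mag
    return image

# Hypothetical sparse light curve: g and r on day 0, g again on day 2.
image = light_curve_to_image(
    days=[0, 0, 2],
    bands=["g", "r", "g"],
    mags=[20.1, 19.8, 20.4],
    n_days=4,
)
print(image.shape)
```

A grid like this keeps the time ordering along one axis and the colour information along the other, so a 2D convolution can pick up both variability and band-to-band correlations.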