Conditionally Calibrated Predictive Distributions by Probability-Probability Map: Application to Galaxy Redshift Estimation and Probabilistic Forecasting

Dey, Biprateep; Zhao, David; Newman, Jeffrey A.; Andrews, Brett H.; Izbicki, Rafael; Lee, Ann B.

Statistics > Machine Learning

arXiv:2205.14568v3 (stat)

[Submitted on 29 May 2022 (v1), revised 7 Jul 2023 (this version, v3), latest version 18 May 2025 (v7)]


Abstract: Uncertainty quantification is crucial for assessing the predictive ability of AI algorithms. Much research has been devoted to describing the predictive distribution (PD) $F(y|\mathbf{x})$ of a target variable $y \in \mathbb{R}$ given complex input features $\mathbf{x} \in \mathcal{X}$. However, off-the-shelf PDs (from, e.g., normalizing flows and Bayesian neural networks) often lack conditional calibration, with the probability of occurrence of an event given input $\mathbf{x}$ being significantly different from the predicted probability. Current calibration methods do not fully assess and enforce conditionally calibrated PDs. Here we propose \texttt{Cal-PIT}, a method that addresses both PD diagnostics and recalibration by learning a single probability-probability map from calibration data. The key idea is to regress probability integral transform scores against $\mathbf{x}$. The estimated regression provides interpretable diagnostics of conditional coverage across the feature space. The same regression function morphs the misspecified PD to a re-calibrated PD for all $\mathbf{x}$. We benchmark our corrected prediction bands (a by-product of corrected PDs) against oracle bands and state-of-the-art predictive inference algorithms for synthetic data. We also provide results for two applications: (i) probabilistic nowcasting given sequences of satellite images, and (ii) conditional density estimation of galaxy distances given imaging data (so-called photometric redshift estimation). Our code is available as a Python package at this https URL.
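The key idea in the abstract (regressing probability integral transform scores against $\mathbf{x}$, then reusing that regression to recalibrate) can be illustrated with a toy sketch. This is not the authors' implementation and does not use the Cal-PIT package API. Everything in it is an assumption made for illustration: the synthetic data, the deliberately misspecified initial PD `initial_cdf`, the helper names `r_hat` and `recalibrated_cdf`, and the Gaussian kernel smoother that stands in for whatever regression method the paper uses. It also assumes the regression target is the indicator that the PIT score falls below a level $\gamma$, and that the recalibrated CDF is the estimated regression evaluated at the initial CDF value.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, written with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def initial_cdf(y, x):
    """Hypothetical misspecified PD: N(0, 1) for every x (it ignores x)."""
    return normal_cdf(y)


# Synthetic calibration data (an assumption): the true model is y | x ~ N(x, 1).
rng = random.Random(0)
xs = [rng.uniform(0.0, 2.0) for _ in range(5000)]
ys = [rng.gauss(x, 1.0) for x in xs]

# PIT scores: the initial CDF evaluated at each observed (x, y) pair.
pits = [initial_cdf(y, x) for x, y in zip(xs, ys)]


def r_hat(gamma, x0, bandwidth=0.1):
    """Kernel-smoothed estimate of P(PIT <= gamma | x = x0).

    A simple Nadaraya-Watson smoother, used here only as a stand-in
    for the regression step; it is not the estimator from the paper.
    """
    num = 0.0
    den = 0.0
    for x, pit in zip(xs, pits):
        w = math.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
        num += w * (pit <= gamma)
        den += w
    return num / den


def recalibrated_cdf(y, x0):
    """Recalibrated CDF: the estimated P-P map applied to the initial CDF value."""
    return r_hat(initial_cdf(y, x0), x0)


if __name__ == "__main__":
    # Diagnostic: for a conditionally calibrated PD, r_hat(gamma, x) is close to gamma.
    print("r_hat(0.5, x=1):", round(r_hat(0.5, 1.0), 3))
    # Recalibration: compare the initial and corrected CDF at y = 1, x = 1.
    print("initial CDF:", round(initial_cdf(1.0, 1.0), 3))
    print("recalibrated CDF:", round(recalibrated_cdf(1.0, 1.0), 3))
```

In this toy setup the initial PD is centred at 0 while the data at $x = 1$ are centred at 1, so the true value of $P(\mathrm{PIT} \le 0.5 \mid x = 1)$ is $\Phi(-1) \approx 0.16$ rather than 0.5, which is what the diagnostic is meant to reveal. The true CDF at $y = 1$, $x = 1$ is 0.5, so the recalibrated value should land near that, up to smoothing and sampling error.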

Comments:	21 pages, 11 figures. Under review. Code available as a Python package this https URL
Subjects:	Machine Learning (stat.ML); Instrumentation and Methods for Astrophysics (astro-ph.IM); Machine Learning (cs.LG); Methodology (stat.ME)
Cite as:	arXiv:2205.14568 [stat.ML]
	(or arXiv:2205.14568v3 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2205.14568

Submission history

From: Biprateep Dey
[v1] Sun, 29 May 2022 03:52:44 UTC (3,897 KB)
[v2] Wed, 26 Oct 2022 15:17:22 UTC (4,858 KB)
[v3] Fri, 7 Jul 2023 18:34:02 UTC (7,585 KB)
[v4] Mon, 17 Jul 2023 16:58:54 UTC (7,584 KB)
[v5] Sun, 22 Dec 2024 15:18:26 UTC (9,607 KB)
[v6] Mon, 30 Dec 2024 14:00:24 UTC (8,324 KB)
[v7] Sun, 18 May 2025 13:09:18 UTC (7,119 KB)
