-
arXiv:2505.12578 [pdf, ps, other]
Stacked conformal prediction
Abstract: We consider a method for conformalizing a stacked ensemble of predictive models, showing that the potentially simple form of the meta-learner at the top of the stack enables a procedure with manageable computational cost that achieves approximate marginal validity without requiring the use of a separate calibration sample. Empirical results indicate that the method compares favorably to a standard…
Submitted 7 July, 2025; v1 submitted 18 May, 2025; originally announced May 2025.
Comments: 12 pages, 2 figures
-
Projected random forests and conformal prediction of circular data
Abstract: We apply split conformal prediction techniques to regression problems with circular responses by introducing a suitable conformity score, leading to prediction sets with adaptive arc length and finite-sample coverage guarantees for any circular predictive model under exchangeable data. Leveraging the high performance of existing predictive models designed for linear responses, we analyze a general…
Submitted 25 December, 2024; v1 submitted 31 October, 2024; originally announced October 2024.
Comments: 7 pages; 4 figures
-
arXiv:2307.13124 [pdf, ps, other]
Conformal prediction for frequency-severity modeling
Abstract: We present a model-agnostic framework for the construction of prediction intervals of insurance claims, with finite sample statistical guarantees, extending the technique of split conformal prediction to the domain of two-stage frequency-severity modeling. The framework effectiveness is showcased with simulated and real datasets using classical parametric models and contemporary machine learning m…
Submitted 19 June, 2025; v1 submitted 24 July, 2023; originally announced July 2023.
-
arXiv:2303.02770 [pdf, ps, other]
Universal distribution of the empirical coverage in split conformal prediction
Abstract: When split conformal prediction operates in batch mode with exchangeable data, we determine the exact distribution of the empirical coverage of prediction sets produced for a finite batch of future observables, as well as the exact distribution of its almost sure limit when the batch size goes to infinity. Both distributions are universal, being determined solely by the nominal miscoverage level a…
Submitted 21 September, 2024; v1 submitted 5 March, 2023; originally announced March 2023.
Comments: 6 pages, 1 table
Journal ref: Statistics & Probability Letters, Volume 219, 2025, 110350.
-
arXiv:2112.06101 [pdf, ps, other]
Confidence intervals for the random forest generalization error
Abstract: We show that the byproducts of the standard training process of a random forest yield not only the well known and almost computationally free out-of-bag point estimate of the model generalization error, but also give a direct path to compute confidence intervals for the generalization error which avoids processes of data splitting and model retraining. Besides the low computational cost involved i…
Submitted 11 March, 2022; v1 submitted 11 December, 2021; originally announced December 2021.
Comments: 10 pages
-
Learning a latent pattern of heterogeneity in the innovation rates of a time series of counts
Abstract: We develop a Bayesian hierarchical semiparametric model for phenomena related to time series of counts. The main feature of the model is its capability to learn a latent pattern of heterogeneity in the distribution of the process innovation rates, which are softly clustered through time with the help of a Dirichlet process placed at the top of the model hierarchy. The probabilistic forecasting cap…
Submitted 6 July, 2019; originally announced July 2019.
-
arXiv:1312.2291 [pdf, ps, other]
Predictive analysis of microarray data
Abstract: Microarray gene expression data are analyzed by means of a Bayesian nonparametric model, with emphasis on prediction of future observables, yielding a method for selection of differentially expressed genes and a classifier.
Submitted 10 June, 2014; v1 submitted 8 December, 2013; originally announced December 2013.
-
arXiv:1306.1170 [pdf, ps, other]
On the computation of the marginal likelihood
Abstract: We describe briefly in this note a procedure for consistently estimating the marginal likelihood of a statistical model through a sample from the posterior distribution of the model parameters.
Submitted 10 June, 2014; v1 submitted 3 June, 2013; originally announced June 2013.
-
arXiv:1209.4947 [pdf, ps, other]
Bayesian Analysis of Simple Random Densities
Abstract: A tractable nonparametric prior over densities is introduced which is closed under sampling and exhibits proper posterior asymptotics.
Submitted 10 June, 2014; v1 submitted 21 September, 2012; originally announced September 2012.
Comments: 19 pages; 6 figures