Search | arXiv e-print repository

Parametric Scaling Law of Tuning Bias in Conformal Prediction

Authors: Hao Zeng, Kangdao Liu, Bingyi Jing, Hongxin Wei

Abstract: Conformal prediction is a popular framework of uncertainty quantification that constructs prediction sets with coverage guarantees. To uphold the exchangeability assumption, many conformal prediction methods necessitate an additional holdout set for parameter tuning. Yet, the impact of violating this principle on coverage remains underexplored, making it ambiguous in practical applications. In this work, we empirically find that the tuning bias, the coverage gap introduced by leveraging the same dataset for tuning and calibration, is negligible for simple parameter tuning in many conformal prediction methods. In particular, we observe the scaling law of the tuning bias: this bias increases with parameter space complexity and decreases with calibration set size. Formally, we establish a theoretical framework to quantify the tuning bias and provide rigorous proof for the scaling law of the tuning bias by deriving its upper bound. In the end, we discuss how to reduce the tuning bias, guided by the theories we developed.

Submitted 5 February, 2025; originally announced February 2025.

arXiv:2412.08051

Two-way Node Popularity Model for Directed and Bipartite Networks

Authors: Bing-Yi Jing, Ting Li, Jiangzhou Wang, Ya Wang

Abstract: There has been extensive research on community detection in directed and bipartite networks. However, these studies often fail to consider the popularity of nodes in different communities, which is a common phenomenon in real-world networks. To address this issue, we propose a new probabilistic framework called the Two-Way Node Popularity Model (TNPM). The TNPM also accommodates edges from different distributions within a general sub-Gaussian family. We introduce the Delete-One-Method (DOM) for model fitting and community structure identification, and provide a comprehensive theoretical analysis with novel technical skills dealing with sub-Gaussian generalization. Additionally, we propose the Two-Stage Divided Cosine Algorithm (TSDC) to handle large-scale networks more efficiently. Our proposed methods offer multi-folded advantages in terms of estimation accuracy and computational efficiency, as demonstrated through extensive numerical studies. We apply our methods to two real-world applications, uncovering interesting findings.

Submitted 10 December, 2024; originally announced December 2024.

arXiv:2402.12683

TorchCP: A Python Library for Conformal Prediction

Authors: Jianguo Huang, Jianqing Song, Xuanning Zhou, Bingyi Jing, Hongxin Wei

Abstract: Conformal Prediction (CP) has attracted great attention from the research community due to its strict theoretical guarantees. However, researchers and developers still face challenges of applicability and efficiency when applying CP algorithms to deep learning models. In this paper, we introduce TorchCP, a comprehensive PyTorch-based toolkit to strengthen the usability of CP for deep learning models. TorchCP implements a wide range of post-hoc and training methods of conformal prediction for various machine learning tasks, including classification, regression, GNN, and LLM. Moreover, we provide user-friendly interfaces and extensive evaluations to easily integrate CP algorithms into specific tasks. Our TorchCP toolkit, built entirely with PyTorch, enables high-performance GPU acceleration for deep learning models and mini-batch computation on large-scale datasets. With the LGPL license, the code is open-sourced at https://github.com/ml-stat-Sustech/TorchCP and will be continuously updated.

Submitted 12 December, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

arXiv:2104.09778

Convergence of Gaussian process regression: Optimality, robustness, and relationship with kernel ridge regression

Authors: Wenjia Wang, Bing-Yi Jing

Abstract: In this work, we investigate Gaussian process regression used to recover a function based on noisy observations. We derive upper and lower error bounds for Gaussian process regression with possibly misspecified correlation functions. The optimal convergence rate can be attained even if the smoothness of the imposed correlation function exceeds that of the true correlation function and the sampling scheme is quasi-uniform. As byproducts, we also obtain convergence rates of kernel ridge regression with misspecified kernel function, where the underlying truth is a deterministic function. The convergence rates of Gaussian process regression and kernel ridge regression are closely connected, which is aligned with the relationship between sample paths of Gaussian process and the corresponding reproducing kernel Hilbert space.

Submitted 18 July, 2022; v1 submitted 20 April, 2021; originally announced April 2021.

arXiv:2002.04457

Community Detection on Mixture Multi-layer Networks via Regularized Tensor Decomposition

Authors: Bing-Yi Jing, Ting Li, Zhongyuan Lyu, Dong Xia

Abstract: We study the problem of community detection in multi-layer networks, where pairs of nodes can be related in multiple modalities. We introduce a general framework, i.e., mixture multi-layer stochastic block model (MMSBM), which includes many earlier models as special cases. We propose a tensor-based algorithm (TWIST) to reveal both global/local memberships of nodes, and memberships of layers. We show that the TWIST procedure can accurately detect the communities with small misclassification error as the number of nodes and/or the number of layers increases. Numerical studies confirm our theoretical findings. To our best knowledge, this is the first systematic study on the mixture multi-layer networks using tensor decomposition. The method is applied to two real datasets: worldwide trading networks and malaria parasite genes networks, yielding new and interesting findings.

Submitted 10 February, 2020; originally announced February 2020.

arXiv:1504.00461

doi 10.1214/14-AOS1298

Testing for pure-jump processes for high-frequency data

Authors: Xin-Bing Kong, Zhi Liu, Bing-Yi Jing

Abstract: Pure-jump processes have been increasingly popular in modeling high-frequency financial data, partially due to their versatility and flexibility. In the meantime, several statistical tests have been proposed in the literature to check the validity of using pure-jump models. However, these tests suffer from several drawbacks, such as requiring rather stringent conditions and having slow rates of convergence. In this paper, we propose a different test to check whether the underlying process of high-frequency data can be modeled by a pure-jump process. The new test is based on the realized characteristic function, and enjoys a much faster convergence rate of order $O(n^{1/2})$ (where $n$ is the sample size) versus the usual $o(n^{1/4})$ available for existing tests; it is applicable much more generally than previous tests; for example, it is robust to jumps of infinite variation and flexible modeling of the diffusion component. Simulation studies justify our findings and the test is also applied to some real high-frequency financial data.

Submitted 2 April, 2015; originally announced April 2015.

Comments: Published at http://dx.doi.org/10.1214/14-AOS1298 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS1298

Journal ref: Annals of Statistics 2015, Vol. 43, No. 2, 847-877

arXiv:1211.3230

doi 10.1214/10-AOS833

Nonparametric estimate of spectral density functions of sample covariance matrices: A first step

Authors: Bing-Yi Jing, Guangming Pan, Qi-Man Shao, Wang Zhou

Abstract: The density function of the limiting spectral distribution of general sample covariance matrices is usually unknown. We propose to use kernel estimators which are proved to be consistent. A simulation study is also conducted to show the performance of the estimators.

Submitted 14 November, 2012; originally announced November 2012.

Comments: Published at http://dx.doi.org/10.1214/10-AOS833 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS833

Journal ref: Annals of Statistics 2010, Vol. 38, No. 6, 3724-3750

arXiv:1206.0827

doi 10.1214/12-AOS977

Modeling high-frequency financial data by pure jump processes

Authors: Bing-Yi Jing, Xin-Bing Kong, Zhi Liu

Abstract: It is generally accepted that the asset price processes contain jumps. In fact, pure jump models have been widely used to model asset prices and/or stochastic volatilities. The question is: is there any statistical evidence from the high-frequency financial data to support using pure jump models alone? The purpose of this paper is to develop such a statistical test against the necessity of a diffusion component. The test is very simple to use and yet effective. Asymptotic properties of the proposed test statistic will be studied. Simulation studies and some real-life examples are included to illustrate our results.

Submitted 5 June, 2012; originally announced June 2012.

Comments: Published at http://dx.doi.org/10.1214/12-AOS977 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS977

Journal ref: Annals of Statistics 2012, Vol. 40, No. 2, 759-784

arXiv:0903.3081

doi 10.1214/09-AOP474

On normal approximations to $U$-statistics

Authors: Vidmantas Bentkus, Bing-Yi Jing, Wang Zhou

Abstract: Let ${X_1,...,X_n}$ be i.i.d. random observations. Let $\mathbb{S}=\mathbb{L}+\mathbb{T}$ be a $U$-statistic of order $k\ge2$ where $\mathbb{L}$ is a linear statistic having asymptotic normal distribution, and $\mathbb{T}$ is a stochastically smaller statistic. We show that the rate of convergence to normality for $\mathbb{S}$ can be simply expressed as the rate of convergence to normality for the linear part $\mathbb{L}$ plus a correction term, $(\operatorname{var}\mathbb{T})\ln^2(\operatorname{var}\mathbb{T})$, under the condition ${\mathbb{E}\mathbb{T}^2<\infty}$. An optimal bound without this $\log$ factor is obtained under a lower moment assumption ${\mathbb{E}|\mathbb{T}|^\alpha<\infty}$ for ${\alpha<2}$. Some other related results are also obtained in the paper. Our results extend, refine and yield a number of related known results in the literature.

Submitted 14 December, 2009; v1 submitted 17 March, 2009; originally announced March 2009.

Comments: Published at http://dx.doi.org/10.1214/09-AOP474 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOP-AOP474

MSC Class: 62E20 (Primary)

Journal ref: Annals of Probability 2009, Vol. 37, No. 6, 2174-2199

arXiv:math/0508604

doi 10.1214/009053604000000742

Saddlepoint approximation for Student's t-statistic with no moment conditions

Authors: Bing-Yi Jing, Qi-Man Shao, Wang Zhou

Abstract: A saddlepoint approximation of the Student's t-statistic was derived by Daniels and Young [Biometrika 78 (1991) 169-179] under the very stringent exponential moment condition that requires that the underlying density function go down at least as fast as a Normal density in the tails. This is a severe restriction on the approximation's applicability. In this paper we show that this strong exponential moment restriction can be completely dispensed with, that is, saddlepoint approximation of the Student's t-statistic remains valid without any moment condition. This confirms the folklore that the Student's t-statistic is robust against outliers. The saddlepoint approximation not only provides a very accurate approximation for the Student's t-statistic, but it also can be applied much more widely in statistical inference. As a result, saddlepoint approximations should always be used whenever possible. Some numerical work will be given to illustrate these points.

Submitted 30 August, 2005; originally announced August 2005.

Comments: Published at http://dx.doi.org/10.1214/009053604000000742 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS273

MSC Class: 62E20 (Primary); 60G50 (Secondary)

Journal ref: Annals of Statistics 2004, Vol. 32, No. 6, 2679-2711

Showing 1–10 of 10 results for author: Jing, B