-
Parametric Scaling Law of Tuning Bias in Conformal Prediction
Authors:
Hao Zeng,
Kangdao Liu,
Bingyi Jing,
Hongxin Wei
Abstract:
Conformal prediction is a popular framework for uncertainty quantification that constructs prediction sets with coverage guarantees. To uphold the exchangeability assumption, many conformal prediction methods require an additional holdout set for parameter tuning. Yet the impact of violating this principle on coverage remains underexplored, which leaves its consequences in practical applications unclear. In this work, we empirically find that the tuning bias, the coverage gap introduced by using the same dataset for tuning and calibration, is negligible for simple parameter tuning in many conformal prediction methods. In particular, we observe a scaling law of the tuning bias: the bias increases with parameter space complexity and decreases with calibration set size. Formally, we establish a theoretical framework to quantify the tuning bias and provide a rigorous proof of the scaling law by deriving an upper bound on the bias. Finally, we discuss how to reduce the tuning bias, guided by the theory we develop.
Submitted 5 February, 2025;
originally announced February 2025.
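For orientation, the setting this abstract starts from can be sketched as follows. This is a minimal illustration of standard split conformal calibration with a separate calibration set, not the authors' code or experiments; the function names and the synthetic |N(0,1)| scores are assumptions made only for this example.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for a finite threshold at this level.
        return math.inf
    return sorted(scores)[k - 1]


def empirical_coverage(n_cal=200, n_test=2000, alpha=0.1, seed=0):
    """Fraction of fresh test scores that fall at or below the calibrated threshold."""
    rng = random.Random(seed)
    # Hypothetical nonconformity scores, modeled here as |N(0,1)| draws.
    cal = [abs(rng.gauss(0, 1)) for _ in range(n_cal)]
    test = [abs(rng.gauss(0, 1)) for _ in range(n_test)]
    q = conformal_quantile(cal, alpha)
    return sum(s <= q for s in test) / n_test


if __name__ == "__main__":
    print(empirical_coverage())
```

Because no parameter is tuned on the calibration scores here, calibration and test scores are exchangeable and coverage should sit near 1 - alpha. The tuning bias described in the abstract concerns what happens when the score function's parameters are also chosen on that same calibration set.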
-
Two-way Node Popularity Model for Directed and Bipartite Networks
Authors:
Bing-Yi Jing,
Ting Li,
Jiangzhou Wang,
Ya Wang
Abstract:
There has been extensive research on community detection in directed and bipartite networks. However, these studies often fail to consider the popularity of nodes in different communities, which is a common phenomenon in real-world networks. To address this issue, we propose a new probabilistic framework called the Two-Way Node Popularity Model (TNPM). The TNPM also accommodates edges from different distributions within a general sub-Gaussian family. We introduce the Delete-One-Method (DOM) for model fitting and community structure identification, and provide a comprehensive theoretical analysis with novel technical tools for handling the sub-Gaussian generalization. Additionally, we propose the Two-Stage Divided Cosine Algorithm (TSDC) to handle large-scale networks more efficiently. Our proposed methods offer advantages in both estimation accuracy and computational efficiency, as demonstrated through extensive numerical studies. We apply our methods to two real-world applications, uncovering interesting findings.
Submitted 10 December, 2024;
originally announced December 2024.
-
TorchCP: A Python Library for Conformal Prediction
Authors:
Jianguo Huang,
Jianqing Song,
Xuanning Zhou,
Bingyi Jing,
Hongxin Wei
Abstract:
Conformal Prediction (CP) has attracted great attention from the research community due to its strict theoretical guarantees. However, researchers and developers still face challenges of applicability and efficiency when applying CP algorithms to deep learning models. In this paper, we introduce TorchCP, a comprehensive PyTorch-based toolkit to strengthen the usability of CP for deep learning models. TorchCP implements a wide range of post-hoc and training methods of conformal prediction for various machine learning tasks, including classification, regression, GNN, and LLM. Moreover, we provide user-friendly interfaces and extensive evaluations to easily integrate CP algorithms into specific tasks. Our TorchCP toolkit, built entirely with PyTorch, enables high-performance GPU acceleration for deep learning models and mini-batch computation on large-scale datasets. With the LGPL license, the code is open-sourced at https://github.com/ml-stat-Sustech/TorchCP and will be continuously updated.
Submitted 12 December, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
-
Convergence of Gaussian process regression: Optimality, robustness, and relationship with kernel ridge regression
Authors:
Wenjia Wang,
Bing-Yi Jing
Abstract:
In this work, we investigate Gaussian process regression used to recover a function based on noisy observations. We derive upper and lower error bounds for Gaussian process regression with possibly misspecified correlation functions. The optimal convergence rate can be attained even if the smoothness of the imposed correlation function exceeds that of the true correlation function and the sampling scheme is quasi-uniform. As byproducts, we also obtain convergence rates of kernel ridge regression with misspecified kernel function, where the underlying truth is a deterministic function. The convergence rates of Gaussian process regression and kernel ridge regression are closely connected, which is aligned with the relationship between sample paths of Gaussian process and the corresponding reproducing kernel Hilbert space.
Submitted 18 July, 2022; v1 submitted 20 April, 2021;
originally announced April 2021.
-
Community Detection on Mixture Multi-layer Networks via Regularized Tensor Decomposition
Authors:
Bing-Yi Jing,
Ting Li,
Zhongyuan Lyu,
Dong Xia
Abstract:
We study the problem of community detection in multi-layer networks, where pairs of nodes can be related in multiple modalities. We introduce a general framework, the mixture multi-layer stochastic block model (MMSBM), which includes many earlier models as special cases. We propose a tensor-based algorithm (TWIST) to reveal both global/local memberships of nodes, and memberships of layers. We show that the TWIST procedure can accurately detect the communities with small misclassification error as the number of nodes and/or the number of layers increases. Numerical studies confirm our theoretical findings. To the best of our knowledge, this is the first systematic study of mixture multi-layer networks using tensor decomposition. The method is applied to two real datasets, worldwide trading networks and malaria parasite gene networks, yielding new and interesting findings.
Submitted 10 February, 2020;
originally announced February 2020.
-
Testing for pure-jump processes for high-frequency data
Authors:
Xin-Bing Kong,
Zhi Liu,
Bing-Yi Jing
Abstract:
Pure-jump processes have been increasingly popular in modeling high-frequency financial data, partially due to their versatility and flexibility. In the meantime, several statistical tests have been proposed in the literature to check the validity of using pure-jump models. However, these tests suffer from several drawbacks, such as requiring rather stringent conditions and having slow rates of convergence. In this paper, we propose a different test to check whether the underlying process of high-frequency data can be modeled by a pure-jump process. The new test is based on the realized characteristic function, and enjoys a much faster convergence rate of order $O(n^{1/2})$ (where $n$ is the sample size) versus the usual $o(n^{1/4})$ available for existing tests; it is applicable much more generally than previous tests; for example, it is robust to jumps of infinite variation and flexible modeling of the diffusion component. Simulation studies justify our findings and the test is also applied to some real high-frequency financial data.
Submitted 2 April, 2015;
originally announced April 2015.
-
Nonparametric estimate of spectral density functions of sample covariance matrices: A first step
Authors:
Bing-Yi Jing,
Guangming Pan,
Qi-Man Shao,
Wang Zhou
Abstract:
The density function of the limiting spectral distribution of general sample covariance matrices is usually unknown. We propose to use kernel estimators which are proved to be consistent. A simulation study is also conducted to show the performance of the estimators.
Submitted 14 November, 2012;
originally announced November 2012.
-
Modeling high-frequency financial data by pure jump processes
Authors:
Bing-Yi Jing,
Xin-Bing Kong,
Zhi Liu
Abstract:
It is generally accepted that the asset price processes contain jumps. In fact, pure jump models have been widely used to model asset prices and/or stochastic volatilities. The question is: is there any statistical evidence from the high-frequency financial data to support using pure jump models alone? The purpose of this paper is to develop such a statistical test against the necessity of a diffusion component. The test is very simple to use and yet effective. Asymptotic properties of the proposed test statistic will be studied. Simulation studies and some real-life examples are included to illustrate our results.
Submitted 5 June, 2012;
originally announced June 2012.
-
On normal approximations to $U$-statistics
Authors:
Vidmantas Bentkus,
Bing-Yi Jing,
Wang Zhou
Abstract:
Let $X_1,\ldots,X_n$ be i.i.d. random observations. Let $\mathbb{S}=\mathbb{L}+\mathbb{T}$ be a $U$-statistic of order $k\ge2$ where $\mathbb{L}$ is a linear statistic having asymptotic normal distribution, and $\mathbb{T}$ is a stochastically smaller statistic. We show that the rate of convergence to normality for $\mathbb{S}$ can be simply expressed as the rate of convergence to normality for the linear part $\mathbb{L}$ plus a correction term, $(\operatorname{var}\mathbb{T})\ln^2(\operatorname{var}\mathbb{T})$, under the condition $\mathbb{E}\mathbb{T}^2<\infty$. An optimal bound without this $\log$ factor is obtained under a lower moment assumption $\mathbb{E}|\mathbb{T}|^\alpha<\infty$ for $\alpha<2$. Some other related results are also obtained in the paper. Our results extend, refine and yield a number of related known results in the literature.
Submitted 14 December, 2009; v1 submitted 17 March, 2009;
originally announced March 2009.
-
Saddlepoint approximation for Student's t-statistic with no moment conditions
Authors:
Bing-Yi Jing,
Qi-Man Shao,
Wang Zhou
Abstract:
A saddlepoint approximation of the Student's t-statistic was derived by Daniels and Young [Biometrika 78 (1991) 169-179] under the very stringent exponential moment condition that requires that the underlying density function go down at least as fast as a Normal density in the tails. This is a severe restriction on the approximation's applicability. In this paper we show that this strong exponential moment restriction can be completely dispensed with, that is, saddlepoint approximation of the Student's t-statistic remains valid without any moment condition. This confirms the folklore that the Student's t-statistic is robust against outliers. The saddlepoint approximation not only provides a very accurate approximation for the Student's t-statistic, but it also can be applied much more widely in statistical inference. As a result, saddlepoint approximations should always be used whenever possible. Some numerical work will be given to illustrate these points.
Submitted 30 August, 2005;
originally announced August 2005.