Showing 1–28 of 28 results for author: Yin, G

  1. arXiv:2505.18280  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Feature Preserving Shrinkage on Bayesian Neural Networks via the R2D2 Prior

    Authors: Tsai Hor Chan, Dora Yan Zhang, Guosheng Yin, Lequan Yu

    Abstract: Bayesian neural networks (BNNs) treat neural network weights as random variables, which aim to provide posterior uncertainty estimates and avoid overfitting by performing inference on the posterior weights. However, the selection of appropriate prior distributions remains a challenging task, and BNNs may suffer from catastrophic inflated variance or poor predictive performance when poor choices ar…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: To appear in TPAMI

  2. arXiv:2503.05225  [pdf, other]

    stat.AP stat.ME

    Bayesian analysis of restricted mean survival time adjusted for covariates using pseudo-observations

    Authors: Léa Orsini, Emmanuel Lesaffre, Guosheng Yin, Caroline Brard, David Dejardin, Gwénaël Le Teuff

    Abstract: The difference in restricted mean survival time (RMST) is a clinically meaningful measure to quantify treatment effect in randomized controlled trials, especially when the proportional hazards assumption does not hold. Several frequentist methods exist to estimate RMST adjusted for covariates based on modeling and integrating the survival function. A more natural approach may be a regression model…

    Submitted 27 March, 2025; v1 submitted 7 March, 2025; originally announced March 2025.

  3. arXiv:2501.14197  [pdf, other]

    cs.LG cs.SI stat.ML

    Bi-directional Curriculum Learning for Graph Anomaly Detection: Dual Focus on Homogeneity and Heterogeneity

    Authors: Yitong Hao, Enbo He, Yue Zhang, Guisheng Yin

    Abstract: Graph anomaly detection (GAD) aims to identify nodes from a graph that are significantly different from normal patterns. Most previous studies are model-driven, focusing on enhancing the detection effect by improving the model structure. However, these approaches often treat all nodes equally, neglecting the different contributions of various nodes to the training. Therefore, we introduce graph cu…

    Submitted 23 January, 2025; originally announced January 2025.

    Comments: 8 pages, 5 figures

  4. arXiv:2410.18021  [pdf, other]

    stat.ME math.ST

    Deep Nonparametric Inference for Conditional Hazard Function

    Authors: Wen Su, Kin-Yat Liu, Guosheng Yin, Jian Huang, Xingqiu Zhao

    Abstract: We propose a novel deep learning approach to nonparametric statistical inference for the conditional hazard function of survival time with right-censored data. We use a deep neural network (DNN) to approximate the logarithm of a conditional hazard function given covariates and obtain a DNN likelihood-based estimator of the conditional hazard function. Such an estimation approach renders model flex…

    Submitted 23 October, 2024; originally announced October 2024.

  5. arXiv:2410.07138  [pdf, other]

    q-bio.NC cs.LG stat.AP

    Diagnosis and Pathogenic Analysis of Autism Spectrum Disorder Using Fused Brain Connection Graph

    Authors: Lu Wei, Yi Huang, Guosheng Yin, Fode Zhang, Manxue Zhang, Bin Liu

    Abstract: We propose a model for diagnosing Autism spectrum disorder (ASD) using multimodal magnetic resonance imaging (MRI) data. Our approach integrates brain connectivity data from diffusion tensor imaging (DTI) and functional MRI (fMRI), employing graph neural networks (GNNs) for fused graph classification. To improve diagnostic accuracy, we introduce a loss function that maximizes inter-class and minim…

    Submitted 21 September, 2024; originally announced October 2024.

  6. arXiv:2406.03821  [pdf, other]

    stat.AP stat.ME

    Bayesian generalized method of moments applied to pseudo-observations in survival analysis

    Authors: Léa Orsini, Caroline Brard, Emmanuel Lesaffre, Guosheng Yin, David Dejardin, Gwénaël Le Teuff

    Abstract: Bayesian inference for survival regression modeling offers numerous advantages, especially for decision-making and external data borrowing, but demands the specification of the baseline hazard function, which may be a challenging task. We propose an alternative approach that does not need the specification of this function. Our approach combines pseudo-observations to convert censored data into lo…

    Submitted 6 June, 2024; originally announced June 2024.

  7. arXiv:2404.03198  [pdf, other]

    stat.ME

    Delaunay Weighted Two-sample Test for High-dimensional Data by Incorporating Geometric Information

    Authors: Jiaqi Gu, Ruoxu Tan, Guosheng Yin

    Abstract: Two-sample hypothesis testing is a fundamental problem with various applications, which faces new challenges in the high-dimensional context. To mitigate the issue of the curse of dimensionality, high-dimensional data are typically assumed to lie on a low-dimensional manifold. To incorporate geometric information in the data, we propose to apply the Delaunay triangulation and develop the Delaunay w…

    Submitted 4 April, 2024; originally announced April 2024.

    MSC Class: 62G10; 62G20

  8. arXiv:2402.05395  [pdf, other]

    stat.ME

    Efficient Estimation for Functional Accelerated Failure Time Model

    Authors: Changyu Liu, Wen Su, Kin-Yat Liu, Guosheng Yin, Xingqiu Zhao

    Abstract: We propose a functional accelerated failure time model to characterize effects of both functional and scalar covariates on the time to event of interest, and provide regularity conditions to guarantee model identifiability. For efficient estimation of model parameters, we develop a sieve maximum likelihood approach where parametric and nonparametric coefficients are bundled with an unknown baselin…

    Submitted 7 February, 2024; originally announced February 2024.

  9. arXiv:2303.16532  [pdf, other]

    cs.LG q-fin.ST stat.AP

    Futures Quantitative Investment with Heterogeneous Continual Graph Neural Network

    Authors: Min Hu, Zhizhong Tan, Bin Liu, Guosheng Yin

    Abstract: This study aims to address the challenges of futures price prediction in high-frequency trading (HFT) by proposing a continuous learning factor predictor based on graph neural networks. The model integrates multi-factor pricing theories with real-time market dynamics, effectively bypassing the limitations of existing methods that lack financial theory guidance and ignore various trend signals and…

    Submitted 19 December, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

  10. arXiv:2212.05814  [pdf, other]

    cs.LG stat.ML

    GWRBoost: A geographically weighted gradient boosting method for explainable quantification of spatially-varying relationships

    Authors: Han Wang, Zhou Huang, Ganmin Yin, Yi Bao, Xiao Zhou, Yong Gao

    Abstract: The geographically weighted regression (GWR) is an essential tool for estimating the spatial variation of relationships between dependent and independent variables in geographical contexts. However, GWR suffers from the problem that classical linear regressions, which compose the GWR model, are more prone to be underfitting, especially for significant volume and complex nonlinear data, causing inf…

    Submitted 15 December, 2022; v1 submitted 12 December, 2022; originally announced December 2022.

    Comments: 13 pages, 8 figures, 4 tables

  11. arXiv:2210.00242  [pdf, other]

    stat.ME math.ST

    Causal Effect of Functional Treatment

    Authors: Ruoxu Tan, Wei Huang, Zheng Zhang, Guosheng Yin

    Abstract: We study the causal effect with a functional treatment variable, where practical applications often arise in neuroscience, biomedical sciences, etc. Previous research concerning the effect of a functional variable on an outcome is typically restricted to exploring correlation rather than causality. The generalized propensity score, which is often used to calibrate the selection bias, is not direct…

    Submitted 17 May, 2025; v1 submitted 1 October, 2022; originally announced October 2022.

  12. arXiv:2203.00173  [pdf, other]

    stat.AP stat.CO

    Oncology Dose Finding Using Approximate Bayesian Computation Design

    Authors: Huaqing Jin, Wenbin Du, Guosheng Yin

    Abstract: In the development of new cancer treatment, an essential step is to determine the maximum tolerated dose (MTD) via phase I clinical trials. Generally speaking, phase I trial designs can be classified as either model-based or algorithm-based approaches. Model-based phase I designs are typically more efficient by using all observed data, while there is a potential risk of model misspecification that…

    Submitted 28 February, 2022; originally announced March 2022.

    Comments: 4 figures and 3 tables

  13. arXiv:2102.12262  [pdf, other]

    stat.ME

    PCA Rerandomization

    Authors: Hengtao Zhang, Guosheng Yin, Donald B. Rubin

    Abstract: Mahalanobis distance between treatment group and control group covariate means is often adopted as a balance criterion when implementing a rerandomization strategy. However, this criterion may not work well for high-dimensional cases because it balances all orthogonalized covariates equally. Here, we propose leveraging principal component analysis (PCA) to identify proper subspaces in which Mahala…

    Submitted 24 February, 2021; originally announced February 2021.

  14. arXiv:2102.05223  [pdf, other]

    stat.ME

    Bayesian Knockoff Filter

    Authors: Jiaqi Gu, Guosheng Yin

    Abstract: In many scientific fields, researchers are interested in discovering features with substantial effect on the response from a large number of features while controlling the proportion of false discoveries. By incorporating the knockoff procedure in the Bayesian framework, we develop the Bayesian knockoff filter (BKF) for selecting features that have important effect on the response. In contrast to…

    Submitted 24 February, 2023; v1 submitted 9 February, 2021; originally announced February 2021.

    Comments: 39 pages, 10 figures

  15. arXiv:2102.00796  [pdf, other]

    stat.ME

    Unit Information Prior for Adaptive Information Borrowing from Multiple Historical Datasets

    Authors: Huaqing Jin, Guosheng Yin

    Abstract: In clinical trials, there often exist multiple historical studies for the same or related treatment investigated in the current trial. Incorporating historical data in the analysis of the current study is of great importance, as it can help to gain more information, improve efficiency, and provide a more comprehensive evaluation of treatment. Enlightened by the unit information prior (UIP) concept…

    Submitted 1 February, 2021; originally announced February 2021.

    Comments: 4 figures; 2 tables in manuscript. 2 figures and one table in supplementary

  16. arXiv:2011.10240  [pdf, other]

    stat.ME

    Reconstruct Kaplan–Meier Estimator as M-estimator and Its Confidence Band

    Authors: Jiaqi Gu, Yiwei Fan, Guosheng Yin

    Abstract: The Kaplan–Meier (KM) estimator, which provides a nonparametric estimate of a survival function for time-to-event data, has wide application in clinical studies, engineering, economics and other fields. The theoretical properties of the KM estimator including its consistency and asymptotic distribution have been extensively studied. We reconstruct the KM estimator as an M-estimator by maximizing…

    Submitted 20 November, 2020; originally announced November 2020.

    Comments: 15 pages, 2 figures

  17. arXiv:2009.12690  [pdf, ps, other]

    cs.LG eess.SY stat.ML

    Adaptive Non-reversible Stochastic Gradient Langevin Dynamics

    Authors: Vikram Krishnamurthy, George Yin

    Abstract: It is well known that adding any skew symmetric matrix to the gradient of Langevin dynamics algorithm results in a non-reversible diffusion with improved convergence rate. This paper presents a gradient algorithm to adaptively optimize the choice of the skew symmetric matrix. The resulting algorithm involves a non-reversible diffusion algorithm cross coupled with a stochastic gradient algorithm th…

    Submitted 26 September, 2020; originally announced September 2020.

  18. arXiv:2008.10020  [pdf, ps, other]

    cs.LG eess.SP eess.SY stat.ML

    Multi-kernel Passive Stochastic Gradient Algorithms and Transfer Learning

    Authors: Vikram Krishnamurthy, George Yin

    Abstract: This paper develops a novel passive stochastic gradient algorithm. In passive stochastic approximation, the stochastic gradient algorithm does not have control over the location where noisy gradients of the cost function are evaluated. Classical passive stochastic gradient algorithms use a kernel that approximates a Dirac delta to weigh the gradients based on how far they are evaluated from the de…

    Submitted 7 February, 2021; v1 submitted 23 August, 2020; originally announced August 2020.

  19. arXiv:2006.11674  [pdf, ps, other]

    cs.LG eess.SY stat.ML

    Langevin Dynamics for Adaptive Inverse Reinforcement Learning of Stochastic Gradient Algorithms

    Authors: Vikram Krishnamurthy, George Yin

    Abstract: Inverse reinforcement learning (IRL) aims to estimate the reward function of optimizing agents by observing their response (estimates or actions). This paper considers IRL when noisy estimates of the gradient of a reward function generated by multiple stochastic gradient agents are observed. We present a generalized Langevin dynamics algorithm to estimate the reward function $R(θ)$; specifically,…

    Submitted 18 January, 2021; v1 submitted 20 June, 2020; originally announced June 2020.

  20. arXiv:2002.10883  [pdf, ps, other]

    stat.ME stat.CO

    Demystify Lindley's Paradox by Interpreting P-value as Posterior Probability

    Authors: Guosheng Yin, Haolun Shi

    Abstract: In the hypothesis testing framework, p-value is often computed to determine rejection of the null hypothesis or not. On the other hand, Bayesian approaches typically compute the posterior probability of the null hypothesis to evaluate its plausibility. We revisit Lindley's paradox (Lindley, 1957) and demystify the conflicting results between Bayesian and frequentist hypothesis testing procedures b…

    Submitted 24 February, 2020; originally announced February 2020.

    Comments: arXiv admin note: text overlap with arXiv:1809.08503

  21. arXiv:1906.00350  [pdf, other]

    stat.ML cs.LG

    Nonparametric Functional Approximation with Delaunay Triangulation

    Authors: Yehong Liu, Guosheng Yin

    Abstract: We propose a differentiable nonparametric algorithm, the Delaunay triangulation learner (DTL), to solve the functional approximation problem on the basis of a $p$-dimensional feature space. By conducting the Delaunay triangulation algorithm on the data points, the DTL partitions the feature space into a series of $p$-dimensional simplices in a geometrically optimal way, and fits a linear model wit…

    Submitted 2 June, 2019; originally announced June 2019.

    Comments: 28 pages, 8 figures

  22. The statistical finite element method (statFEM) for coherent synthesis of observation data and model predictions

    Authors: Mark Girolami, Eky Febrianto, Ge Yin, Fehmi Cirak

    Abstract: The increased availability of observation data from engineering systems in operation poses the question of how to incorporate this data into finite element models. To this end, we propose a novel statistical construction of the finite element method that provides the means of synthesising measurement data and finite element models. The Bayesian statistical framework is adopted to treat all the unc…

    Submitted 22 January, 2021; v1 submitted 15 May, 2019; originally announced May 2019.

  23. arXiv:1902.07627  [pdf, other]

    stat.ML cs.LG

    Adaptive Iterative Hessian Sketch via A-Optimal Subsampling

    Authors: Aijun Zhang, Hengtao Zhang, Guosheng Yin

    Abstract: Iterative Hessian sketch (IHS) is an effective sketching method for modeling large-scale data. It was originally proposed by Pilanci and Wainwright (2016; JMLR) based on randomized sketching matrices. However, it is computationally intensive due to the iterative sketch process. In this paper, we analyze the IHS algorithm under the unconstrained least squares problem setting, then propose a determi…

    Submitted 8 March, 2020; v1 submitted 20 February, 2019; originally announced February 2019.

    Comments: To appear in Statistics and Computing

  24. P-value: A Bless or A Curse for Evidence-Based Studies?

    Authors: Haolun Shi, Guosheng Yin

    Abstract: As a convention, p-value is often computed in frequentist hypothesis testing and compared with the nominal significance level of 0.05 to determine whether or not to reject the null hypothesis. The smaller the p-value, the more significant the statistical test. We consider both one-sided and two-sided hypotheses in the composite hypothesis setting. For one-sided hypothesis tests, we establish the e…

    Submitted 22 September, 2018; originally announced September 2018.

    Journal ref: Reconnecting p-value and posterior probability under one- and two-sided tests. The American Statistician 2020

  25. arXiv:1809.01000  [pdf, other]

    cs.CV cs.LG stat.ML

    Bayesian Outdoor Defect Detection

    Authors: Fei Jiang, Guosheng Yin

    Abstract: We introduce a Bayesian defect detector to facilitate the defect detection on the motion blurred images on rough texture surfaces. To enhance the accuracy of Bayesian detection on removing non-defect pixels, we develop a class of reflected non-local prior distributions, which is constructed by using the mode of a distribution to subtract its density. The reflected non-local priors force the Bayes…

    Submitted 30 August, 2018; originally announced September 2018.

  26. arXiv:1806.09039  [pdf, other]

    cs.LG stat.ML

    Parallel Transport Unfolding: A Connection-based Manifold Learning Approach

    Authors: Max Budninskiy, Glorian Yin, Leman Feng, Yiying Tong, Mathieu Desbrun

    Abstract: Manifold learning offers nonlinear dimensionality reduction of high-dimensional datasets. In this paper, we bring geometry processing to bear on manifold learning by introducing a new approach based on metric connection for generating a quasi-isometric, low-dimensional mapping from a sparse and irregular sampling of an arbitrary manifold embedded in a high-dimensional space. Geodesic distances of…

    Submitted 2 November, 2018; v1 submitted 23 June, 2018; originally announced June 2018.

    Comments: Submitted

    MSC Class: 53B05; 53B20; 68T05

  27. Bayesian data augmentation dose finding with continual reassessment method and delayed toxicity

    Authors: Suyu Liu, Guosheng Yin, Ying Yuan

    Abstract: A major practical impediment when implementing adaptive dose-finding designs is that the toxicity outcome used by the decision rules may not be observed shortly after the initiation of the treatment. To address this issue, we propose the data augmentation continual reassessment method (DA-CRM) for dose finding. By naturally treating the unobserved toxicities as missing data, we show that such miss…

    Submitted 8 January, 2014; originally announced January 2014.

    Comments: Published at http://dx.doi.org/10.1214/13-AOAS661 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS661

    Journal ref: Annals of Applied Statistics 2013, Vol. 7, No. 4, 2138-2156

  28. Bayesian phase I/II adaptively randomized oncology trials with combined drugs

    Authors: Ying Yuan, Guosheng Yin

    Abstract: We propose a new integrated phase I/II trial design to identify the most efficacious dose combination that also satisfies certain safety requirements for drug-combination trials. We first take a Bayesian copula-type model for dose finding in phase I. After identifying a set of admissible doses, we immediately move the entire set forward to phase II. We propose a novel adaptive randomization scheme…

    Submitted 8 August, 2011; originally announced August 2011.

    Comments: Published at http://dx.doi.org/10.1214/10-AOAS433 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS433

    Journal ref: Annals of Applied Statistics 2011, Vol. 5, No. 2A, 924-942