Showing 1–19 of 19 results for author: Fan, K

  1. arXiv:2507.10975  [pdf, ps, other]

    stat.ME

    Robust Bayesian high-dimensional variable selection and inference with the horseshoe family of priors

    Authors: Kun Fan, Srijana Subedi, Vishmi Ridmika Dissanayake Pathiranage, Cen Wu

    Abstract: Frequentist robust variable selection has been extensively investigated in high-dimensional regression. Despite success, developing the corresponding statistical inference procedures remains a challenging task. Recently, tackling this challenge from a Bayesian perspective has received much attention. In literature, the two-group spike-and-slab priors that can induce exact sparsity have been demons…

    Submitted 22 July, 2025; v1 submitted 15 July, 2025; originally announced July 2025.

  2. arXiv:2503.16321  [pdf]

    stat.ME

    Balancing the effective sample size in prior across different doses in the curve-free Bayesian decision-theoretic design for dose-finding trials

    Authors: Jiapeng Xu, Dehua Bi, Shenghua Kelly Fan, Bee Leng Lee, Ying Lu

    Abstract: The primary goal of dose allocation in phase I trials is to minimize patient exposure to subtherapeutic or excessively toxic doses, while accurately recommending a phase II dose that is as close as possible to the maximum tolerated dose (MTD). Fan et al. (2012) introduced a curve-free Bayesian decision-theoretic design (CFBD), which leverages the assumption of a monotonic dose-toxicity relationshi…

    Submitted 20 March, 2025; originally announced March 2025.

    Comments: 24 pages

  3. arXiv:2311.07736  [pdf, other]

    stat.ME physics.med-ph

    Use of Expected Utility (EU) to Evaluate Artificial Intelligence-Enabled Rule-Out Devices for Mammography Screening

    Authors: Kwok Lung Fan, Yee Lam Elim Thompson, Weijie Chen, Craig K. Abbey, Frank W Samuelson

    Abstract: Background: An artificial intelligence (AI)-enabled rule-out device may autonomously remove patient images unlikely to have cancer from radiologist review. Many published studies evaluate this type of device by retrospectively applying the AI to large datasets and use sensitivity and specificity as the performance metrics. However, these metrics have fundamental shortcomings because they are bound…

    Submitted 1 October, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  4. arXiv:2208.08265  [pdf, other]

    stat.ML cs.LG

    Semi-Supervised Anomaly Detection Based on Quadratic Multiform Separation

    Authors: Ko-Hui Michael Fan, Chih-Chung Chang, Kuang-Hsiao-Yin Kongguoluo

    Abstract: In this paper we propose a novel method for semi-supervised anomaly detection (SSAD). Our classifier is named QMS22 as its inception was dated 2022 upon the framework of quadratic multiform separation (QMS), a recently introduced classification model. QMS22 tackles SSAD by solving a multi-class classification problem involving both the training set and the test set of the original problem. The cla…

    Submitted 17 August, 2022; originally announced August 2022.

  5. arXiv:2208.04683  [pdf, other]

    cs.CY cs.AI cs.LG stat.AP

    Applying data technologies to combat AMR: current status, challenges, and opportunities on the way forward

    Authors: Leonid Chindelevitch, Elita Jauneikaite, Nicole E. Wheeler, Kasim Allel, Bede Yaw Ansiri-Asafoakaa, Wireko A. Awuah, Denis C. Bauer, Stephan Beisken, Kara Fan, Gary Grant, Michael Graz, Yara Khalaf, Veranja Liyanapathirana, Carlos Montefusco-Pereira, Lawrence Mugisha, Atharv Naik, Sylvia Nanono, Anthony Nguyen, Timothy Rawson, Kessendri Reddy, Juliana M. Ruzante, Anneke Schmider, Roman Stocker, Leonhardt Unruh, Daniel Waruingi, et al. (2 additional authors not shown)

    Abstract: Antimicrobial resistance (AMR) is a growing public health threat, estimated to cause over 10 million deaths per year and cost the global economy 100 trillion USD by 2050 under status quo projections. These losses would mainly result from an increase in the morbidity and mortality from treatment failure, AMR infections during medical procedures, and a loss of quality of life attributed to AMR. Nume…

    Submitted 11 August, 2022; v1 submitted 5 July, 2022; originally announced August 2022.

    Comments: 65 pages, 3 figures

    ACM Class: I.2.1; J.3

  6. arXiv:2202.09784  [pdf, other]

    cs.LG cs.AI cs.CV stat.ME

    Clustering by the Probability Distributions from Extreme Value Theory

    Authors: Sixiao Zheng, Ke Fan, Yanxi Hou, Jianfeng Feng, Yanwei Fu

    Abstract: Clustering is an essential task to unsupervised learning. It tries to automatically separate instances into coherent subsets. As one of the most well-known clustering algorithms, k-means assigns sample points at the boundary to a unique cluster, while it does not utilize the information of sample distribution or density. Comparably, it would potentially be more beneficial to consider the probabili…

    Submitted 20 February, 2022; originally announced February 2022.

    Comments: IEEE Transactions on Artificial Intelligence

  7. arXiv:2112.03912  [pdf, other]

    cs.LG cs.AI stat.ML

    RID-Noise: Towards Robust Inverse Design under Noisy Environments

    Authors: Jia-Qi Yang, Ke-Bin Fan, Hao Ma, De-Chuan Zhan

    Abstract: From an engineering perspective, a design should not only perform well in an ideal condition, but should also resist noises. Such a design methodology, namely robust design, has been widely implemented in the industry for product quality control. However, classic robust design requires a lot of evaluations for a single design target, while the results of these evaluations could not be reused for a…

    Submitted 7 December, 2021; originally announced December 2021.

    Comments: AAAI'22

  8. arXiv:2110.04925  [pdf, ps, other]

    stat.ML cs.LG

    Quadratic Multiform Separation: A New Classification Model in Machine Learning

    Authors: Ko-Hui Michael Fan, Chih-Chung Chang, Kuang-Hsiao-Yin Kongguoluo

    Abstract: In this paper we present a new classification model in machine learning. Our result is threefold: 1) The model produces comparable predictive accuracy to that of most common classification models. 2) It runs significantly faster than most common classification models. 3) It has the ability to identify a portion of unseen samples for which class labels can be found with much higher predictive accur…

    Submitted 17 August, 2022; v1 submitted 10 October, 2021; originally announced October 2021.

  9. arXiv:2107.08533  [pdf, other]

    stat.ME

    Sparse group variable selection for gene-environment interactions in the longitudinal study

    Authors: Fei Zhou, Xi Lu, Jie Ren, Kun Fan, Shuangge Ma, Cen Wu

    Abstract: Penalized variable selection for high dimensional longitudinal data has received much attention as accounting for the correlation among repeated measurements and providing additional and essential information for improved identification and prediction performance. Despite the success, in longitudinal studies the potential of penalization methods is far from fully understood for accommodating struc…

    Submitted 18 July, 2021; originally announced July 2021.

  10. arXiv:2102.11772  [pdf, other]

    stat.ME

    Identifying Gene-environment interactions with robust marginal Bayesian variable selection

    Authors: Xi Lu, Kun Fan, Jie Ren, Cen Wu

    Abstract: In high-throughput genetics studies, an important aim is to identify gene-environment interactions associated with the clinical outcomes. Recently, multiple marginal penalization methods have been developed and shown to be effective in G$\times$E studies. However, within the Bayesian framework, marginal variable selection has not received much attention. In this study, we propose a novel marginal…

    Submitted 23 February, 2021; originally announced February 2021.

  11. arXiv:1706.03850  [pdf, other]

    stat.ML cs.CL cs.LG

    Adversarial Feature Matching for Text Generation

    Authors: Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Dinghan Shen, Lawrence Carin

    Abstract: The Generative Adversarial Network (GAN) has achieved great success in generating realistic (real-valued) synthetic data. However, convergence issues and difficulties dealing with discrete data hinder the applicability of GAN to text. We propose a framework for generating realistic text via adversarial training. We employ a long short-term memory network as generator, and a convolutional network a…

    Submitted 18 November, 2017; v1 submitted 12 June, 2017; originally announced June 2017.

    Comments: Accepted by ICML 2017

  12. arXiv:1703.09766  [pdf, other]

    stat.ML cs.LG

    Unifying the Stochastic Spectral Descent for Restricted Boltzmann Machines with Bernoulli or Gaussian Inputs

    Authors: Kai Fan

    Abstract: Stochastic gradient descent based algorithms are typically used as the general optimization tools for most deep learning models. A Restricted Boltzmann Machine (RBM) is a probabilistic generative model that can be stacked to construct deep architectures. For RBM with Bernoulli inputs, non-Euclidean algorithm such as stochastic spectral descent (SSD) has been specifically designed to speed up the c…

    Submitted 28 March, 2017; originally announced March 2017.

  13. arXiv:1611.05559  [pdf, other]

    stat.ML cs.LG

    Boosting Variational Inference

    Authors: Fangjian Guo, Xiangyu Wang, Kai Fan, Tamara Broderick, David B. Dunson

    Abstract: Variational inference (VI) provides fast approximations of a Bayesian posterior in part because it formulates posterior approximation as an optimization problem: to find the closest distribution to the exact posterior over some family of distributions. For practical reasons, the family of distributions in VI is usually constrained so that it does not include the exact posterior, even as a limit po…

    Submitted 1 March, 2017; v1 submitted 16 November, 2016; originally announced November 2016.

    Comments: 17 pages, 7 figures

  14. arXiv:1602.07800  [pdf, other]

    stat.ML stat.ME

    Towards Unifying Hamiltonian Monte Carlo and Slice Sampling

    Authors: Yizhe Zhang, Xiangyu Wang, Changyou Chen, Ricardo Henao, Kai Fan, Lawrence Carin

    Abstract: We unify slice sampling and Hamiltonian Monte Carlo (HMC) sampling, demonstrating their connection via the Hamiltonian-Jacobi equation from Hamiltonian mechanics. This insight enables extension of HMC and slice sampling to a broader family of samplers, called Monomial Gamma Samplers (MGS). We provide a theoretical analysis of the mixing performance of such samplers, proving that in the limit of a…

    Submitted 10 January, 2018; v1 submitted 25 February, 2016; originally announced February 2016.

    Comments: updated version

    Journal ref: Advances in Neural Information Processing Systems, pages 1741--1749, year 2016

  15. arXiv:1512.07662  [pdf, other]

    stat.ML

    High-Order Stochastic Gradient Thermostats for Bayesian Learning of Deep Models

    Authors: Chunyuan Li, Changyou Chen, Kai Fan, Lawrence Carin

    Abstract: Learning in deep models using Bayesian methods has generated significant attention recently. This is largely because of the feasibility of modern Bayesian methods to yield scalable learning and inference, while maintaining a measure of uncertainty in the model parameters. Stochastic gradient MCMC algorithms (SG-MCMC) are a family of diffusion-based sampling methods for large-scale Bayesian learnin…

    Submitted 23 December, 2015; originally announced December 2015.

    Comments: AAAI 2016

  16. arXiv:1511.04157  [pdf, other]

    stat.ML

    $k$-means: Fighting against Degeneracy in Sequential Monte Carlo with an Application to Tracking

    Authors: Kai Fan, Katherine Heller

    Abstract: For regular particle filter algorithm or Sequential Monte Carlo (SMC) methods, the initial weights are traditionally dependent on the proposed distribution, the posterior distribution at the current timestamp in the sampled sequence, and the target is the posterior distribution of the previous timestamp. This is technically correct, but leads to algorithms which usually have practical issues with…

    Submitted 12 November, 2015; originally announced November 2015.

  17. arXiv:1509.02866  [pdf, other]

    stat.ML

    Fast Second-Order Stochastic Backpropagation for Variational Inference

    Authors: Kai Fan, Ziteng Wang, Jeff Beck, James Kwok, Katherine Heller

    Abstract: We propose a second-order (Hessian or Hessian-free) based optimization method for variational inference inspired by Gaussian backpropagation, and argue that quasi-Newton optimization can be developed as well. This is accomplished by generalizing the gradient computation in stochastic backpropagation via a reparametrization trick with lower complexity. As an illustrative example, we apply this appr…

    Submitted 28 March, 2017; v1 submitted 9 September, 2015; originally announced September 2015.

    Comments: Accepted by NIPS 2015

  18. arXiv:1509.00110  [pdf, other]

    stat.AP

    Bayesian Models for Heterogeneous Personalized Health Data

    Authors: Kai Fan, Allison E. Aiello, Katherine A. Heller

    Abstract: The purpose of this study is to leverage modern technology (such as mobile or web apps in Beckman et al. (2014)) to enrich epidemiology data and infer the transmission of disease. Homogeneity related research on population level has been intensively studied in previous work. In contrast, we develop hierarchical Graph-Coupled Hidden Markov Models (hGCHMMs) to simultaneously track the spread of infe…

    Submitted 31 August, 2015; originally announced September 2015.

    Comments: 35 pages; Heterogeneous Flu Diffusion, Social Networks, Dynamic Bayesian Modeling

  19. arXiv:1410.7837  [pdf, ps, other]

    stat.AP

    A Novel Non-Parametric Approach to Compare Paired General Statistical Distributions between Two Interventions

    Authors: Kang Li, Kai Fan

    Abstract: Despite of many measures applied for determine the difference between two groups of observations, such as mean value, median value, sample standard deviation and so on, we propose a novel non parametric transformation method based on Mallows distance to investigate the location and variance differences between the two groups. The convexity theory of this method is constructed and thus it is a vi…

    Submitted 28 October, 2014; originally announced October 2014.

    Comments: 12 pages, 5 figures