Showing 1–25 of 25 results for author: Yin, F

Searching in archive stat.
  1. arXiv:2504.04829  [pdf, other]

    cs.LG eess.SP stat.ML

    Attentional Graph Meta-Learning for Indoor Localization Using Extremely Sparse Fingerprints

    Authors: Wenzhong Yan, Feng Yin, Jun Gao, Ao Wang, Yang Tian, Ruizhi Chen

    Abstract: Fingerprint-based indoor localization is often labor-intensive due to the need for dense grids and repeated measurements across time and space. Maintaining high localization accuracy with extremely sparse fingerprints remains a persistent challenge. Existing benchmark methods primarily rely on the measured fingerprints, while neglecting valuable spatial and environmental characteristics. In this p…

    Submitted 7 April, 2025; originally announced April 2025.

  2. arXiv:2503.18309  [pdf, other]

    stat.ML cs.LG eess.SP

    Efficient Transformed Gaussian Process State-Space Models for Non-Stationary High-Dimensional Dynamical Systems

    Authors: Zhidi Lin, Ying Li, Feng Yin, Juan Maroñas, Alexandre H. Thiéry

    Abstract: Gaussian process state-space models (GPSSMs) offer a principled framework for learning and inference in nonlinear dynamical systems with uncertainty quantification. However, existing GPSSMs are limited by the use of multiple independent stationary Gaussian processes (GPs), leading to prohibitive computational and parametric complexity in high-dimensional settings and restricted modeling capacity f…

    Submitted 14 May, 2025; v1 submitted 23 March, 2025; originally announced March 2025.

    Comments: 15 pages, 6 figures

  3. arXiv:2404.01697  [pdf, other]

    stat.ML cs.LG

    Preventing Model Collapse in Gaussian Process Latent Variable Models

    Authors: Ying Li, Zhidi Lin, Feng Yin, Michael Minyi Zhang

    Abstract: Gaussian process latent variable models (GPLVMs) are a versatile family of unsupervised learning models commonly used for dimensionality reduction. However, common challenges in modeling data with GPLVMs include inadequate kernel flexibility and improper selection of the projection noise, leading to a type of model collapse characterized by vague latent representations that do not reflect the unde…

    Submitted 18 June, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: International Conference on Machine Learning (ICML), 2024

  4. arXiv:2312.05910  [pdf, other]

    cs.LG eess.SP stat.ML

    Ensemble Kalman Filtering Meets Gaussian Process SSM for Non-Mean-Field and Online Inference

    Authors: Zhidi Lin, Yiyong Sun, Feng Yin, Alexandre Hoang Thiéry

    Abstract: The Gaussian process state-space models (GPSSMs) represent a versatile class of data-driven nonlinear dynamical system models. However, the presence of numerous latent variables in GPSSM incurs unresolved issues for existing variational inference approaches, particularly under the more realistic non-mean-field (NMF) assumption, including extensive training effort, compromised inference accuracy, a…

    Submitted 22 July, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

    Comments: Gaussian process, state-space model, ensemble Kalman filter, online learning, variational inference

  5. arXiv:2311.16856  [pdf, other]

    cs.LG eess.SP stat.ML

    Attentional Graph Neural Network Is All You Need for Robust Massive Network Localization

    Authors: Wenzhong Yan, Feng Yin, Juntao Wang, Geert Leus, Abdelhak M. Zoubir, Yang Tian

    Abstract: In this paper, we design Graph Neural Networks (GNNs) with attention mechanisms to tackle an important yet challenging nonlinear regression problem: massive network localization. We first review our previous network localization method based on Graph Convolutional Network (GCN), which can exhibit state-of-the-art localization accuracy, even under severe Non-Line-of-Sight (NLOS) conditions, by care…

    Submitted 7 April, 2025; v1 submitted 28 November, 2023; originally announced November 2023.

  6. arXiv:2306.11839  [pdf, other]

    stat.ME cs.LG stat.AP stat.ML

    Should I Stop or Should I Go: Early Stopping with Heterogeneous Populations

    Authors: Hammaad Adam, Fan Yin, Huibin Hu, Neil Tenenholtz, Lorin Crawford, Lester Mackey, Allison Koenecke

    Abstract: Randomized experiments often need to be stopped prematurely due to the treatment having an unintended harmful effect. Existing methods that determine when to stop an experiment early are typically applied to the data in aggregate and do not account for treatment effect heterogeneity. In this paper, we study the early stopping of experiments for harm on heterogeneous populations. We first establish…

    Submitted 27 October, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 (spotlight)

  7. arXiv:2205.14283  [pdf, other]

    stat.ML cs.LG eess.IV eess.SP

    Rethinking Bayesian Learning for Data Analysis: The Art of Prior and Inference in Sparsity-Aware Modeling

    Authors: Lei Cheng, Feng Yin, Sergios Theodoridis, Sotirios Chatzis, Tsung-Hui Chang

    Abstract: Sparse modeling for signal processing and machine learning has been at the focus of scientific research for over two decades. Among others, supervised sparsity-aware learning comprises two major paths paved by: a) discriminative methods and b) generative methods. The latter, more widely known as Bayesian methods, enable uncertainty evaluation w.r.t. the performed predictions. Furthermore, they can…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: 64 pages, 16 figures, 6 tables, 98 references, submitted to IEEE Signal Processing Magazine

  8. arXiv:2110.13527  [pdf, other]

    stat.ME cs.SI q-bio.QM

    Highly Scalable Maximum Likelihood and Conjugate Bayesian Inference for ERGMs on Graph Sets with Equivalent Vertices

    Authors: Fan Yin, Carter T. Butts

    Abstract: The exponential family random graph modeling (ERGM) framework provides a flexible approach for the statistical analysis of networks. As ERGMs typically involve normalizing factors that are costly to compute, practical inference relies on a variety of approximations or other workarounds. Markov Chain Monte Carlo maximum likelihood (MCMC MLE) provides a powerful tool to approximate the MLE of ERGM p…

    Submitted 26 October, 2021; originally announced October 2021.

  9. arXiv:2011.11178  [pdf, other]

    stat.ME stat.AP

    Bayesian Nonparametric Estimation for Point Processes with Spatial Homogeneity: A Spatial Analysis of NBA Shot Locations

    Authors: Fan Yin, Jieying Jiao, Guanyu Hu, Jun Yan

    Abstract: Basketball shot location data provide valuable summary information regarding players to coaches, sports analysts, fans, statisticians, as well as players themselves. Represented by spatial points, such data are naturally analyzed with spatial point process models. We present a novel nonparametric Bayesian method for learning the underlying intensity surface built upon a combination of Dirichlet pr…

    Submitted 22 November, 2020; originally announced November 2020.

  10. Bayesian Analysis of Static Light Scattering Data for Globular Proteins

    Authors: Fan Yin, Domarin Khago, Rachel W. Martin, Carter T. Butts

    Abstract: Static light scattering is a popular physical chemistry technique that enables calculation of physical attributes such as the radius of gyration and the second virial coefficient for a macromolecule (e.g., a polymer or a protein) in solution. The second virial coefficient is a physical quantity that characterizes the magnitude and sign of pairwise interactions between particles, and hence is relat…

    Submitted 31 October, 2020; originally announced November 2020.

  11. arXiv:2010.11653  [pdf, other]

    cs.LG eess.SP stat.ML

    Graph Neural Network for Large-Scale Network Localization

    Authors: Wenzhong Yan, Di Jin, Zhidi Lin, Feng Yin

    Abstract: Graph neural networks (GNNs) are popular to use for classifying structured data in the context of machine learning. But surprisingly, they are rarely applied to regression problems. In this work, we adopt GNN for a classic but challenging nonlinear regression problem, namely the network localization. Our main findings are in order. First, GNN is potentially the best solution to large-scale network…

    Submitted 15 February, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: Accepted by ICASSP 2021, Code available at https://github.com/Yanzongzi/GNN-For-localization

  12. arXiv:2010.08495  [pdf, other]

    stat.ME stat.AP stat.CO

    Analysis of professional basketball field goal attempts via a Bayesian matrix clustering approach

    Authors: Fan Yin, Guanyu Hu, Weining Shen

    Abstract: We propose a Bayesian nonparametric matrix clustering approach to analyze the latent heterogeneity structure in the shot selection data collected from professional basketball players in the National Basketball Association (NBA). The proposed method adopts a mixture of finite mixtures framework and fully utilizes the spatial information via a mixture of matrix normal distribution representation. We…

    Submitted 16 October, 2020; originally announced October 2020.

  13. arXiv:2006.04097  [pdf, ps, other]

    cs.LG stat.ML

    Optimally Combining Classifiers for Semi-Supervised Learning

    Authors: Zhiguo Wang, Liusha Yang, Feng Yin, Ke Lin, Qingjiang Shi, Zhi-Quan Luo

    Abstract: This paper considers semi-supervised learning for tabular data. It is widely known that Xgboost based on tree model works well on the heterogeneous features while transductive support vector machine can exploit the low density separation assumption. However, little work has been done to combine them together for the end-to-end semi-supervised learning. In this paper, we find these two methods have…

    Submitted 7 June, 2020; originally announced June 2020.

  14. arXiv:2004.08064  [pdf, other]

    stat.CO

    Kernel-based Approximate Bayesian Inference for Exponential Family Random Graph Models

    Authors: Fan Yin, Carter T. Butts

    Abstract: Bayesian inference for exponential family random graph models (ERGMs) is a doubly-intractable problem because of the intractability of both the likelihood and posterior normalizing factor. Auxiliary variable based Markov Chain Monte Carlo (MCMC) methods for this problem are asymptotically exact but computationally demanding, and are difficult to extend to modified ERGM families. In this work, we p…

    Submitted 13 July, 2020; v1 submitted 17 April, 2020; originally announced April 2020.

  15. arXiv:2003.03697  [pdf, other]

    cs.DC cs.LG eess.SP eess.SY stat.AP

    FedLoc: Federated Learning Framework for Data-Driven Cooperative Localization and Location Data Processing

    Authors: Feng Yin, Zhidi Lin, Yue Xu, Qinglei Kong, Deshi Li, Sergios Theodoridis, Shuguang Cui

    Abstract: In this overview paper, data-driven learning model-based cooperative localization and location data processing are considered, in line with the emerging machine learning and big data methods. We first review (1) state-of-the-art algorithms in the context of federated learning, (2) two widely used learning models, namely the deep neural network model and the Gaussian process model, and (3) various…

    Submitted 25 May, 2020; v1 submitted 7 March, 2020; originally announced March 2020.

  16. arXiv:2003.00474  [pdf, other]

    cs.LG cs.NI stat.ML

    Scalable Learning Paradigms for Data-Driven Wireless Communication

    Authors: Yue Xu, Feng Yin, Wenjun Xu, Chia-Han Lee, Jiaru Lin, Shuguang Cui

    Abstract: The marriage of wireless big data and machine learning techniques revolutionizes the wireless system by the data-driven philosophy. However, the ever exploding data volume and model complexity will limit centralized solutions to learn and respond within a reasonable time. Therefore, scalability becomes a critical issue to be solved. In this article, we aim to provide a systematic discussion on the…

    Submitted 1 March, 2020; originally announced March 2020.

  17. arXiv:1910.11445  [pdf, other]

    stat.ME stat.CO

    Finite Mixtures of ERGMs for Modeling Ensembles of Networks

    Authors: Fan Yin, Weining Shen, Carter T. Butts

    Abstract: Ensembles of networks arise in many scientific fields, but there are few statistical tools for inferring their generative processes, particularly in the presence of both dyadic dependence and cross-graph heterogeneity. To fill in this gap, we propose characterizing network ensembles via finite mixtures of exponential family random graph models, a framework for parametric statistical modeling of gr…

    Submitted 22 April, 2020; v1 submitted 24 October, 2019; originally announced October 2019.

  18. arXiv:1908.05873  [pdf, other]

    stat.ME stat.AP

    Selection of Exponential-Family Random Graph Models via Held-Out Predictive Evaluation (HOPE)

    Authors: Fan Yin, Nolan Edward Phillips, Carter T. Butts

    Abstract: Statistical models for networks with complex dependencies pose particular challenges for model selection and evaluation. In particular, many well-established statistical tools for selecting between models assume conditional independence of observations and/or conventional asymptotics, and their theoretical foundations are not always applicable in a network modeling context. While simulation-based…

    Submitted 19 August, 2019; v1 submitted 16 August, 2019; originally announced August 2019.

  19. arXiv:1907.03043  [pdf, ps, other]

    cs.LG eess.SP stat.ML

    Gaussian Processes for Analyzing Positioned Trajectories in Sports

    Authors: Yuxin Zhao, Feng Yin, Fredrik Gunnarsson, Fredrik Hultkrantz

    Abstract: Kernel-based machine learning approaches are gaining increasing interest for exploring and modeling large datasets in recent years. Gaussian process (GP) is one example of such kernel-based approaches, which can provide very good performance for nonlinear modeling problems. In this work, we first propose a grey-box modeling approach to analyze the forces in cross country skiing races. To be more pr…

    Submitted 5 July, 2019; originally announced July 2019.

    Comments: 31 pages, 28 figures

  20. arXiv:1906.02387  [pdf, other]

    stat.ML cs.LG

    A General $\mathcal{O}(n^2)$ Hyper-Parameter Optimization for Gaussian Process Regression with Cross-Validation and Non-linearly Constrained ADMM

    Authors: Linning Xu, Feng Yin, Jiawei Zhang, Zhi-Quan Luo, Shuguang Cui

    Abstract: Hyper-parameter optimization remains as the core issue of Gaussian process (GP) for machine learning nowadays. The benchmark method using maximum likelihood (ML) estimation and gradient descent (GD) is impractical for processing big data due to its $O(n^3)$ complexity. Many sophisticated global or local approximation models, for instance, sparse GP, distributed GP, have been proposed to address su…

    Submitted 6 June, 2019; v1 submitted 5 June, 2019; originally announced June 2019.

  21. arXiv:1904.09559  [pdf, ps, other]

    cs.LG eess.SP stat.ML

    Linear Multiple Low-Rank Kernel Based Stationary Gaussian Processes Regression for Time Series

    Authors: Feng Yin, Lishuo Pan, Xinwei He, Tianshi Chen, Sergios Theodoridis, Zhi-Quan Luo

    Abstract: Gaussian processes (GP) for machine learning have been studied systematically over the past two decades and they are by now widely used in a number of diverse applications. However, GP kernel design and the associated hyper-parameter optimization are still hard and to a large extent open problems. In this paper, we consider the task of GP regression for time series modeling and analysis. The under…

    Submitted 21 April, 2019; originally announced April 2019.

    Comments: 15 pages, 5 figures, submitted

  22. arXiv:1902.04763  [pdf, ps, other]

    cs.LG cs.NI eess.SP stat.ML

    Wireless Traffic Prediction with Scalable Gaussian Process: Framework, Algorithms, and Verification

    Authors: Yue Xu, Feng Yin, Wenjun Xu, Jiaru Lin, Shuguang Cui

    Abstract: The cloud radio access network (C-RAN) is a promising paradigm to meet the stringent requirements of the fifth generation (5G) wireless systems. Meanwhile, wireless traffic prediction is a key enabler for C-RANs to improve both the spectrum efficiency and energy efficiency through load-aware network managements. This paper proposes a scalable Gaussian process (GP) framework as a promising solution…

    Submitted 13 February, 2019; originally announced February 2019.

    Journal ref: IEEE Journal on Selected Areas in Communications (Volume: 37, Issue: 6, June 2019)

  23. arXiv:1808.01132  [pdf, other]

    cs.LG cs.DS stat.ML

    Multitask Gaussian Process with Hierarchical Latent Interactions

    Authors: Kai Chen, Twan van Laarhoven, Elena Marchiori, Feng Yin, Shuguang Cui

    Abstract: Multitask Gaussian process (MTGP) is powerful for joint learning of multiple tasks with complicated correlation patterns. However, due to the assembling of additive independent latent functions, all current MTGPs including the salient linear model of coregionalization (LMC) and convolution frameworks cannot effectively represent and learn the hierarchical latent interactions between its latent fun…

    Submitted 2 October, 2021; v1 submitted 3 August, 2018; originally announced August 2018.

    Comments: 7 pages, 17 figures. arXiv admin note: text overlap with arXiv:1808.02266

  24. arXiv:1808.00560  [pdf, other]

    cs.LG stat.ML

    Compressible Spectral Mixture Kernels with Sparse Dependency Structures for Gaussian Processes

    Authors: Kai Chen, Yijue Dai, Feng Yin, Elena Marchiori, Sergios Theodoridis

    Abstract: Spectral mixture (SM) kernels comprise a powerful class of generalized kernels for Gaussian processes (GPs) to describe complex patterns. This paper introduces model compression and time- and phase (TP) modulated dependency structures to the original (SM) kernel for improved generalization of GPs. Specifically, by adopting Bienaymé's identity, we generalize the dependency structure through cross-co…

    Submitted 26 July, 2023; v1 submitted 1 August, 2018; originally announced August 2018.

    Comments: 13 pages

  25. arXiv:1405.1796  [pdf, ps, other]

    stat.CO stat.ME

    Comparisons of penalized least squares methods by simulations

    Authors: Ke Zhang, Fan Yin, Shifeng Xiong

    Abstract: Penalized least squares methods are commonly used for simultaneous estimation and variable selection in high-dimensional linear models. In this paper we compare several prevailing methods including the lasso, nonnegative garrote, and SCAD in this area through Monte Carlo simulations. Criteria for evaluating these methods in terms of variable selection and estimation are presented. This paper focu…

    Submitted 7 May, 2014; originally announced May 2014.