Showing 1–50 of 144 results for author: Yu, S

Searching in archive stat.
  1. arXiv:2505.14731  [pdf]

    stat.AP cs.LG

    Effective climate policies for major emission reductions of ozone precursors: Global evidence from two decades

    Authors: Ningning Yao, Huan Xi, Lang Chen, Zhe Song, Jian Li, Yulei Chen, Baocai Guo, Yuanhang Zhang, Tong Zhu, Pengfei Li, Daniel Rosenfeld, John H. Seinfeld, Shaocai Yu

    Abstract: Despite policymakers deploying various tools to mitigate emissions of ozone (O₃) precursors, such as nitrogen oxides (NOₓ), carbon monoxide (CO), and volatile organic compounds (VOCs), the effectiveness of policy combinations remains uncertain. We employ an integrated framework that couples structural break detection with machine learning to pinpoint effective inter…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: 30 pages, 12 figures

  2. arXiv:2505.12444  [pdf, other]

    stat.ML cs.LG

    High-Dimensional Dynamic Covariance Models with Random Forests

    Authors: Shuguang Yu, Fan Zhou, Yingjie Zhang, Ziqi Chen, Hongtu Zhu

    Abstract: This paper introduces a novel nonparametric method for estimating high-dimensional dynamic covariance matrices with multiple conditioning covariates, leveraging random forests and supported by robust theoretical guarantees. Unlike traditional static methods, our dynamic nonparametric covariance models effectively capture distributional heterogeneity. Furthermore, unlike kernel-smoothing methods, w…

    Submitted 18 May, 2025; originally announced May 2025.

  3. arXiv:2505.12225  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Reward Inside the Model: A Lightweight Hidden-State Reward Model for LLM's Best-of-N sampling

    Authors: Jizhou Guo, Zhaomin Wu, Philip S. Yu

    Abstract: High-quality reward models are crucial for unlocking the reasoning potential of large language models (LLMs), with best-of-N voting demonstrating significant performance gains. However, current reward models, which typically operate on the textual output of LLMs, are computationally expensive and parameter-heavy, limiting their real-world applications. We introduce the Efficient Linear Hidden Stat…

    Submitted 18 May, 2025; originally announced May 2025.

  4. arXiv:2503.04981  [pdf, other]

    stat.ML cs.LG

    Topology-Aware Conformal Prediction for Stream Networks

    Authors: Jifan Zhang, Fangxin Wang, Philip S. Yu, Kaize Ding, Shixiang Zhu

    Abstract: Stream networks, a unique class of spatiotemporal graphs, exhibit complex directional flow constraints and evolving dependencies, making uncertainty quantification a critical yet challenging task. Traditional conformal prediction methods struggle in this setting due to the need for joint predictions across multiple interdependent locations and the intricate spatio-temporal dependencies inherent in…

    Submitted 6 March, 2025; originally announced March 2025.

    Comments: 16 pages, 6 figures

  5. arXiv:2502.05155  [pdf, other]

    cs.LG stat.ML

    Deep Dynamic Probabilistic Canonical Correlation Analysis

    Authors: Shiqin Tang, Shujian Yu, Yining Dong, S. Joe Qin

    Abstract: This paper presents Deep Dynamic Probabilistic Canonical Correlation Analysis (D2PCCA), a model that integrates deep learning with probabilistic modeling to analyze nonlinear dynamical systems. Building on the probabilistic extensions of Canonical Correlation Analysis (CCA), D2PCCA captures nonlinear latent dynamics and supports enhancements such as KL annealing for improved convergence and normal…

    Submitted 7 February, 2025; originally announced February 2025.

    Comments: accepted by ICASSP-25, code is available at https://github.com/marcusstang/D2PCCA

  6. arXiv:2501.16768  [pdf, other]

    stat.ML cs.LG

    Towards the Generalization of Multi-view Learning: An Information-theoretical Analysis

    Authors: Wen Wen, Tieliang Gong, Yuxin Dong, Shujian Yu, Weizhan Zhang

    Abstract: Multi-view learning has drawn widespread attention for its efficacy in leveraging cross-view consensus and complementarity information to achieve a comprehensive representation of data. While multi-view learning has undergone vigorous development and achieved remarkable success, the theoretical understanding of its generalization behavior remains elusive. This paper aims to bridge this gap by devel…

    Submitted 28 January, 2025; originally announced January 2025.

  7. arXiv:2412.07446  [pdf, ps, other]

    cs.AI cs.CL cs.LG stat.ML

    A Causal World Model Underlying Next Token Prediction: Exploring GPT in a Controlled Environment

    Authors: Raanan Y. Rohekar, Yaniv Gurwicz, Sungduk Yu, Estelle Aflalo, Vasudev Lal

    Abstract: Are generative pre-trained transformer (GPT) models, trained only to predict the next token, implicitly learning a world model from which sequences are generated one token at a time? We address this question by deriving a causal interpretation of the attention mechanism in GPT and presenting a causal world model that arises from this interpretation. Furthermore, we propose that GPT models, at infe…

    Submitted 6 July, 2025; v1 submitted 10 December, 2024; originally announced December 2024.

    Comments: International Conference on Machine Learning (ICML), 2025

  8. arXiv:2412.05783  [pdf, other]

    cs.LG stat.ML

    Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning

    Authors: Shuguang Yu, Shuxing Fang, Ruixin Peng, Zhengling Qi, Fan Zhou, Chengchun Shi

    Abstract: This paper studies off-policy evaluation (OPE) in the presence of unmeasured confounders. Inspired by the two-way fixed effects regression model widely used in the panel data literature, we propose a two-way unmeasured confounding assumption to model the system dynamics in causal reinforcement learning and develop a two-way deconfounder algorithm that devises a neural tensor network to simultaneou…

    Submitted 7 December, 2024; originally announced December 2024.

  9. arXiv:2411.00109  [pdf, other]

    stat.ML cs.AI cs.LG

    Prospective Learning: Learning for a Dynamic Future

    Authors: Ashwin De Silva, Rahul Ramesh, Rubing Yang, Siyu Yu, Joshua T Vogelstein, Pratik Chaudhari

    Abstract: In real-world applications, the distribution of the data, and our goals, evolve over time. The prevailing theoretical framework for studying machine learning, namely probably approximately correct (PAC) learning, largely ignores time. As a consequence, existing strategies to address the dynamic nature of data and goals exhibit poor real-world performance. This paper develops a theoretical framewor…

    Submitted 30 January, 2025; v1 submitted 31 October, 2024; originally announced November 2024.

    Comments: Accepted to NeurIPS 2024

  10. arXiv:2408.16963  [pdf, other]

    stat.ME

    Nonparametric Density Estimation for Data Scattered on Irregular Spatial Domains: A Likelihood-Based Approach Using Bivariate Penalized Spline Smoothing

    Authors: Kunal Das, Shan Yu, Guannan Wang, Li Wang

    Abstract: Accurately estimating data density is crucial for making informed decisions and modeling in various fields. This paper presents a novel nonparametric density estimation procedure that utilizes bivariate penalized spline smoothing over triangulation for data scattered over irregular spatial domains. The approach is likelihood-based with a regularization term that addresses the roughness of the loga…

    Submitted 26 October, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

  11. arXiv:2408.01861  [pdf, other]

    cs.LG stat.ML

    Batch Active Learning in Gaussian Process Regression using Derivatives

    Authors: Hon Sum Alec Yu, Christoph Zimmer, Duy Nguyen-Tuong

    Abstract: We investigate the use of derivative information for Batch Active Learning in Gaussian Process regression models. The proposed approach employs the predictive covariance matrix for selection of data batches to exploit full correlation of samples. We theoretically analyse our proposed algorithm taking different optimality criteria into consideration and provide empirical comparisons highlighting th…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: 29 pages, 10 figures

  12. arXiv:2406.04690  [pdf, other]

    cs.LG stat.ML

    Higher-order Structure Based Anomaly Detection on Attributed Networks

    Authors: Xu Yuan, Na Zhou, Shuo Yu, Huafei Huang, Zhikui Chen, Feng Xia

    Abstract: Anomaly detection (such as telecom fraud detection and medical image detection) has attracted increasing attention. The complex interaction between multiple entities widely exists in the network, which can reflect specific human behavior patterns. Such patterns can be modeled by higher-order network structures, thus benefiting anomaly detection on attributed networks. However, due to…

    Submitted 7 June, 2024; originally announced June 2024.

  13. arXiv:2405.19978  [pdf, other]

    cs.LG stat.ML

    Domain Adaptation with Cauchy-Schwarz Divergence

    Authors: Wenzhe Yin, Shujian Yu, Yicong Lin, Jie Liu, Jan-Jakob Sonke, Efstratios Gavves

    Abstract: Domain adaptation aims to use training data from one or multiple source domains to learn a hypothesis that can be generalized to a different, but related, target domain. As such, having a reliable measure for evaluating the discrepancy of both marginal and conditional distributions is crucial. We introduce Cauchy-Schwarz (CS) divergence to the problem of unsupervised domain adaptation (UDA). The C…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted by UAI-24

  14. arXiv:2405.13535  [pdf, ps, other]

    cs.LG stat.ML

    Addressing the Inconsistency in Bayesian Deep Learning via Generalized Laplace Approximation

    Authors: Yinsong Chen, Samson S. Yu, Zhong Li, Chee Peng Lim

    Abstract: In recent years, inconsistency in Bayesian deep learning has attracted significant attention. Tempered or generalized posterior distributions are frequently employed as direct and effective solutions. Nonetheless, the underlying mechanisms and the effectiveness of generalized posteriors remain active research topics. In this work, we interpret posterior tempering as a correction for model misspeci…

    Submitted 30 June, 2025; v1 submitted 22 May, 2024; originally announced May 2024.

  15. arXiv:2404.17951  [pdf, other]

    cs.LG cs.IT stat.ML

    Cauchy-Schwarz Divergence Information Bottleneck for Regression

    Authors: Shujian Yu, Xi Yu, Sigurd Løkse, Robert Jenssen, Jose C. Principe

    Abstract: The information bottleneck (IB) approach is popular to improve the generalization, robustness and explainability of deep neural networks. Essentially, it aims to find a minimum sufficient representation $\mathbf{t}$ by striking a trade-off between a compression term $I(\mathbf{x};\mathbf{t})$ and a prediction term $I(y;\mathbf{t})$, where $I(\cdot;\cdot)$ refers to the mutual information (MI). MI…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: accepted by ICLR-24, project page: https://github.com/SJYuCNEL/Cauchy-Schwarz-Information-Bottleneck

  16. arXiv:2404.11579  [pdf, other]

    stat.ME

    Spatial Heterogeneous Additive Partial Linear Model: A Joint Approach of Bivariate Spline and Forest Lasso

    Authors: Xin Zhang, Shan Yu, Zhengyuan Zhu, Xin Wang

    Abstract: Identifying spatial heterogeneous patterns has attracted a surge of research interest in recent years, due to its important applications in various scientific and engineering fields. In practice the spatially heterogeneous components are often mixed with components which are spatially smooth, making the task of identifying the heterogeneous regions more challenging. In this paper, we develop an ef…

    Submitted 3 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  17. arXiv:2403.07185  [pdf, other]

    cs.LG stat.ML

    Uncertainty in Graph Neural Networks: A Survey

    Authors: Fangxin Wang, Yuqing Liu, Kay Liu, Yibo Wang, Sourav Medya, Philip S. Yu

    Abstract: Graph Neural Networks (GNNs) have been extensively used in various real-world applications. However, the predictive uncertainty of GNNs stemming from diverse sources such as inherent randomness in data and model training errors can lead to unstable and erroneous predictions. Therefore, identifying, quantifying, and utilizing uncertainty are essential to enhance the performance of the model for the…

    Submitted 8 March, 2025; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: 14 main pages, 4 figures, 1 table

    Journal ref: Transactions on Machine Learning Research (11/2024)

  18. arXiv:2403.06302  [pdf, other]

    stat.ML cs.LG

    Nonparametric Automatic Differentiation Variational Inference with Spline Approximation

    Authors: Yuda Shao, Shan Yu, Tianshu Feng

    Abstract: Automatic Differentiation Variational Inference (ADVI) is efficient in learning probabilistic models. Classic ADVI relies on the parametric approach to approximate the posterior. In this paper, we develop a spline-based nonparametric approximation approach that enables flexible posterior approximation for distributions with complicated structures, such as skewness, multimodality, and bounded suppo…

    Submitted 10 March, 2024; originally announced March 2024.

  19. arXiv:2403.05644  [pdf, other]

    stat.ME stat.AP

    TSSS: A Novel Triangulated Spherical Spline Smoothing for Surface-based Data

    Authors: Zhiling Gu, Shan Yu, Guannan Wang, Ming-Jun Lai, Li Wang

    Abstract: Surface-based data is commonly observed in diverse practical applications spanning various fields. In this paper, we introduce a novel nonparametric method to discover the underlying signals from data distributed on complex surface-based domains. Our approach involves a penalized spline estimator defined on a triangulation of surface patches, which enables effective signal extraction and recovery.…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 56 pages, 16 figures

    MSC Class: 62G05; 62G08

  20. arXiv:2401.09028  [pdf, other]

    stat.AP

    A Novel Interpretable Fusion Analytic Framework for Investigating Functional Brain Connectivity Differences in Cognitive Impairments

    Authors: Yeseul Jeon, Jeong-Jae Kim, SuMin Yu, Junggu Choi, Sanghoon Han

    Abstract: Functional magnetic resonance imaging (fMRI) data is characterized by its complexity and high-dimensionality, encompassing signals from various regions of interest (ROIs) that exhibit intricate correlations. Analyzing fMRI data directly proves challenging due to its intricate structure. Nevertheless, ROIs convey crucial information about brain activities through their connections, offering insig…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: text overlap with arXiv:2207.01581

  21. arXiv:2311.05339  [pdf, other]

    stat.ME

    An iterative algorithm for high-dimensional linear models with both sparse and non-sparse structures

    Authors: Shun Yu, Yuehan Yang

    Abstract: Numerous practical medical problems often involve data that possess a combination of both sparse and non-sparse structures. Traditional penalized regularization techniques, primarily designed for promoting sparsity, are inadequate to capture the optimal solutions in such scenarios. To address these challenges, this paper introduces a novel algorithm named Non-sparse Iteration (NSI). The NSI algor…

    Submitted 9 November, 2023; originally announced November 2023.

  22. arXiv:2306.15626  [pdf, other]

    cs.LG cs.AI cs.LO stat.ML

    LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

    Authors: Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar

    Abstract: Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute requirements. This has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an op…

    Submitted 27 October, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: Accepted to NeurIPS 2023 (Datasets and Benchmarks Track) as an oral presentation. Data, code, and models available at https://leandojo.org/

  23. arXiv:2305.03555  [pdf, other]

    cs.LG stat.ML

    Contrastive Graph Clustering in Curvature Spaces

    Authors: Li Sun, Feiyang Wang, Junda Ye, Hao Peng, Philip S. Yu

    Abstract: Graph clustering is a longstanding research topic, and has achieved remarkable success with the deep learning methods in recent years. Nevertheless, we observe that several important issues largely remain open. On the one hand, graph clustering from the geometric perspective is appealing but has rarely been touched before, as it lacks a promising space for geometric clustering. On the other hand,…

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: Accepted by IJCAI'23

  24. arXiv:2301.08970  [pdf, other]

    cs.LG cs.IT stat.ML

    The Conditional Cauchy-Schwarz Divergence with Applications to Time-Series Data and Sequential Decision Making

    Authors: Shujian Yu, Hongming Li, Sigurd Løkse, Robert Jenssen, José C. Príncipe

    Abstract: The Cauchy-Schwarz (CS) divergence was developed by Príncipe et al. in 2000. In this paper, we extend the classic CS divergence to quantify the closeness between two conditional distributions and show that the developed conditional CS divergence can be simply estimated by a kernel density estimator from given samples. We illustrate the advantages (e.g., rigorous faithfulness guarantee, lower compu…

    Submitted 16 March, 2025; v1 submitted 21 January, 2023; originally announced January 2023.

    Comments: manuscript is accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence. The code is available at https://github.com/SJYuCNEL/conditional_CS_divergence

  25. arXiv:2301.01642  [pdf, other]

    stat.ML cs.LG q-bio.NC

    CI-GNN: A Granger Causality-Inspired Graph Neural Network for Interpretable Brain Network-Based Psychiatric Diagnosis

    Authors: Kaizhong Zheng, Shujian Yu, Badong Chen

    Abstract: There is a recent trend to leverage the power of graph neural networks (GNNs) for brain-network based psychiatric diagnosis, which, in turn, also motivates an urgent need for psychiatrists to fully understand the decision behavior of the used GNNs. However, most of the existing GNN explainers are either post-hoc in which another interpretive model needs to be created to explain a well-trained GNN,…

    Submitted 28 January, 2024; v1 submitted 4 January, 2023; originally announced January 2023.

    Comments: Manuscript is accepted by Neural Networks. The source code and implementation details are freely available at GitHub repository (https://github.com/ZKZ-Brain/CI-GNN/). 45 pages, 14 figures

  26. arXiv:2209.12313  [pdf, other]

    cs.DS math.ST stat.ML

    Random graph matching at Otter's threshold via counting chandeliers

    Authors: Cheng Mao, Yihong Wu, Jiaming Xu, Sophie H. Yu

    Abstract: We propose an efficient algorithm for graph matching based on similarity scores constructed from counting a certain family of weighted trees rooted at each vertex. For two Erdős-Rényi graphs $\mathcal{G}(n,q)$ whose edges are correlated through a latent vertex correspondence, we show that this algorithm correctly matches all but a vanishing fraction of the vertices with high probability, provided…

    Submitted 13 February, 2023; v1 submitted 25 September, 2022; originally announced September 2022.

  27. A State Transition Model for Mobile Notifications via Survival Analysis

    Authors: Yiping Yuan, Jing Zhang, Shaunak Chatterjee, Shipeng Yu, Romer Rosales

    Abstract: Mobile notifications have become a major communication channel for social networking services to keep users informed and engaged. As more mobile applications push notifications to users, they constantly face decisions on what to send, when and how. A lack of research and methodology commonly leads to heuristic decision making. Many notifications arrive at an inappropriate moment or introduce too m…

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: 9 pages, 7 figures. Published in WSDM '19

    ACM Class: I.2.6

    Journal ref: WSDM 2019 Pages 123-131

  28. arXiv:2205.07426  [pdf, other]

    stat.ML cs.LG

    Optimal Randomized Approximations for Matrix based Renyi's Entropy

    Authors: Yuxin Dong, Tieliang Gong, Shujian Yu, Chen Li

    Abstract: The Matrix-based Renyi's entropy enables us to directly measure information quantities from given data without the costly probability density estimation of underlying distributions, thus has been widely adopted in numerous statistical learning and inference tasks. However, exactly calculating this new information quantity requires access to the eigenspectrum of a semi-positive definite (SPD) matri…

    Submitted 15 May, 2022; originally announced May 2022.

  29. arXiv:2203.05667  [pdf, other]

    cond-mat.mtrl-sci physics.atm-clus stat.CO

    3D Nanoscale Mapping of Short-Range Order in GeSn Alloys

    Authors: Shang Liu, Alejandra Cuervo Covian, Xiaoxin Wang, Cory T. Cline, Austin Akey, Weiling Dong, Shui-Qing Yu, Jifeng Liu

    Abstract: GeSn on Si has attracted much research interest due to its tunable direct bandgap for mid-infrared applications. Recently, short-range order (SRO) in GeSn alloys has been theoretically predicted, which profoundly impacts the band structure. However, characterizing SRO in GeSn is challenging. Guided by physics-informed Poisson statistical analyses of Kth-nearest neighbors (KNN) in atom probe tomogr…

    Submitted 10 March, 2022; originally announced March 2022.

  30. Computationally Efficient Approximations for Matrix-based Renyi's Entropy

    Authors: Tieliang Gong, Yuxin Dong, Shujian Yu, Bo Dong

    Abstract: The recently developed matrix-based Renyi's entropy enables measurement of information in data simply using the eigenspectrum of symmetric positive semi-definite (PSD) matrices in reproducing kernel Hilbert space, without estimation of the underlying data distribution. This intriguing property makes the new information measurement widely adopted in multiple statistical inference and learning tasks…

    Submitted 9 January, 2023; v1 submitted 27 December, 2021; originally announced December 2021.

  31. arXiv:2110.11816  [pdf, other]

    math.ST stat.ML

    Testing network correlation efficiently via counting trees

    Authors: Cheng Mao, Yihong Wu, Jiaming Xu, Sophie H. Yu

    Abstract: We propose a new procedure for testing whether two networks are edge-correlated through some latent vertex correspondence. The test statistic is based on counting the co-occurrences of signed trees for a family of non-isomorphic trees. When the two networks are Erdős-Rényi random graphs $\mathcal{G}(n,q)$ that are either independent or correlated with correlation coefficient $ρ$, our test runs in…

    Submitted 1 April, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

  32. arXiv:2110.06057  [pdf, other]

    cs.LG stat.ML

    Gated Information Bottleneck for Generalization in Sequential Environments

    Authors: Francesco Alesiani, Shujian Yu, Xi Yu

    Abstract: Deep neural networks suffer from poor generalization to unseen environments when the underlying data distribution is different from that in the training set. By learning minimum sufficient representations from training data, the information bottleneck (IB) approach has demonstrated its effectiveness to improve generalization in different AI applications. In this work, we propose a new neural netwo…

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: manuscript accepted by IEEE ICDM-21 (regular papers), code is available at https://github.com/falesiani/GIB

  33. arXiv:2110.05794  [pdf, other]

    cs.LG cs.IT stat.ML

    Information Theoretic Structured Generative Modeling

    Authors: Bo Hu, Shujian Yu, Jose C. Principe

    Abstract: Rényi's information provides a theoretical foundation for tractable and data-efficient non-parametric density estimation, based on pair-wise evaluations in a reproducing kernel Hilbert space (RKHS). This paper extends this framework to parametric probabilistic modeling, motivated by the fact that Rényi's information can be estimated in closed-form for Gaussian mixtures. Based on this special conne…

    Submitted 7 March, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

  34. arXiv:2109.04671  [pdf, other]

    stat.ME stat.AP stat.ML

    Interaction Models and Generalized Score Matching for Compositional Data

    Authors: Shiqing Yu, Mathias Drton, Ali Shojaie

    Abstract: Applications such as the analysis of microbiome data have led to renewed interest in statistical methods for compositional data, i.e., multivariate data in the form of probability vectors that contain relative proportions. In particular, there is considerable interest in modeling interactions among such relative proportions. To this end we propose a class of exponential family models that accommod…

    Submitted 10 September, 2021; originally announced September 2021.

    Comments: 41 pages, 19 figures

  35. arXiv:2108.03531  [pdf, other]

    cs.LG stat.ML

    Learning to Transfer with von Neumann Conditional Divergence

    Authors: Ammar Shaker, Shujian Yu, Daniel Oñoro-Rubio

    Abstract: The similarity of feature representations plays a pivotal role in the success of problems related to domain adaptation. Feature similarity includes both the invariance of marginal distributions and the closeness of conditional distributions given the desired response $y$ (e.g., class labels). Unfortunately, traditional methods always learn such features without fully taking into consideration the…

    Submitted 6 January, 2022; v1 submitted 7 August, 2021; originally announced August 2021.

    Comments: Accepted at AAAI2022

  36. arXiv:2108.03336  [pdf, other]

    stat.ME cs.LG cs.SI math.ST

    Estimating Graph Dimension with Cross-validated Eigenvalues

    Authors: Fan Chen, Sebastien Roch, Karl Rohe, Shuqi Yu

    Abstract: In applied multivariate statistics, estimating the number of latent dimensions or the number of clusters is a fundamental and recurring problem. One common diagnostic is the scree plot, which shows the largest eigenvalues of the data matrix; the user searches for a "gap" or "elbow" in the decreasing eigenvalues; unfortunately, these patterns can hide beneath the bias of the sample eigenvalues. Thi…

    Submitted 6 August, 2021; originally announced August 2021.

    Comments: 53 pages, 12 figures

  37. arXiv:2108.00819  [pdf, other]

    cs.LG cs.AI stat.ML

    Active Learning in Gaussian Process State Space Model

    Authors: Hon Sum Alec Yu, Dingling Yao, Christoph Zimmer, Marc Toussaint, Duy Nguyen-Tuong

    Abstract: We investigate active learning in Gaussian Process state-space models (GPSSM). Our problem is to actively steer the system through latent states by determining its inputs such that the underlying dynamics can be optimally learned by a GPSSM. In order that the most informative inputs are selected, we employ mutual information as our active learning criterion. In particular, we present two approache…

    Submitted 30 July, 2021; originally announced August 2021.

    Comments: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) 2021

  38. arXiv:2106.04255  [pdf, other]

    stat.CO

    Nonparametric Regression for 3D Point Cloud Learning

    Authors: Xinyi Li, Shan Yu, Yueying Wang, Guannan Wang, Ming-Jun Lai, Li Wang

    Abstract: Over the past two decades, we have seen an exponentially increased amount of point clouds collected with irregular shapes in various areas. Motivated by the importance of solid modeling for point clouds, we develop a novel and efficient smoothing tool based on multivariate splines over the tetrahedral partitions to extract the underlying signal and build up a 3D solid model from the point cloud. T…

    Submitted 18 February, 2023; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: 64 pages, 16 figures

    MSC Class: 62G05; 62G08

  39. Multivariate Spline Estimation and Inference for Image-On-Scalar Regression

    Authors: Shan Yu, Guannan Wang, Li Wang, Lijian Yang

    Abstract: Motivated by recent data analyses in biomedical imaging studies, we consider a class of image-on-scalar regression models for imaging responses and scalar predictors. We propose using flexible multivariate splines over triangulations to handle the irregular domain of the objects of interest on the images, as well as other characteristics of images. The proposed estimators of the coefficient functi…

    Submitted 2 June, 2021; originally announced June 2021.

  40. arXiv:2105.03058  [pdf, other]

    cs.LG stat.ML

    Error-Robust Multi-View Clustering: Progress, Challenges and Opportunities

    Authors: Mehrnaz Najafi, Lifang He, Philip S. Yu

    Abstract: With recent advances in data collection from multiple sources, multi-view data has received significant attention. In multi-view data, each view represents a different perspective of data. Since label information is often expensive to acquire, multi-view clustering has gained growing interest, which aims to obtain better clustering solution by exploiting complementary and consistent information ac…

    Submitted 7 May, 2021; originally announced May 2021.

  41. arXiv:2104.07295  [pdf, other]

    cs.LG cs.AI stat.ML

    Variational Co-embedding Learning for Attributed Network Clustering

    Authors: Shuiqiao Yang, Sunny Verma, Borui Cai, Jiaojiao Jiang, Kun Yu, Fang Chen, Shui Yu

    Abstract: Recent works for attributed network clustering utilize graph convolution to obtain node embeddings and simultaneously perform clustering assignments on the embedding space. It is effective since graph convolution combines the structural and attributive information for node embedding learning. However, a major limitation of such works is that the graph convolution only incorporates the attribute in…

    Submitted 15 April, 2021; originally announced April 2021.

    Comments: This manuscript is under review

  42. arXiv:2102.00082  [pdf, ps, other]

    math.ST cs.IT stat.ML

    Settling the Sharp Reconstruction Thresholds of Random Graph Matching

    Authors: Yihong Wu, Jiaming Xu, Sophie H. Yu

    Abstract: This paper studies the problem of recovering the hidden vertex correspondence between two edge-correlated random graphs. We focus on the Gaussian model where the two graphs are complete graphs with correlated Gaussian weights and the Erdős-Rényi model where the two graphs are subsampled from a common parent Erdős-Rényi graph $\mathcal{G}(n,p)$. For dense graphs with $p=n^{-o(1)}$, we prove that th…

    Submitted 16 February, 2022; v1 submitted 29 January, 2021; originally announced February 2021.

    MSC Class: 94A15; 62B10; 68Q87; 05C80; 05C60

  43. arXiv:2101.10160  [pdf, other]

    cs.LG cs.IT stat.ML

    Measuring Dependence with Matrix-based Entropy Functional

    Authors: Shujian Yu, Francesco Alesiani, Xi Yu, Robert Jenssen, Jose C. Principe

    Abstract: Measuring the dependence of data plays a central role in statistics and machine learning. In this work, we summarize and generalize the main idea of existing information-theoretic dependence measures into a higher-level perspective via Shearer's inequality. Based on our generalization, we then propose two measures, namely the matrix-based normalized total correlation ($T_α^*$) and the matrix-ba…

    Submitted 25 January, 2021; originally announced January 2021.

    Comments: Accepted at AAAI-21. An interpretable and differentiable dependence (or independence) measure that can be used to 1) train deep networks under covariate shift and non-Gaussian noise; 2) implement a deep deterministic information bottleneck; and 3) understand the learning dynamics of CNNs. Code available at https://bit.ly/AAAI-dependence

  44. arXiv:2011.01272   

    cs.LG stat.ML

    Modular-Relatedness for Continual Learning

    Authors: Ammar Shaker, Shujian Yu, Francesco Alesiani

    Abstract: In this paper, we propose a continual learning (CL) technique that is beneficial to sequential task learners by improving their retained accuracy and reducing catastrophic forgetting. The principal target of our approach is the automatic extraction of modular parts of the neural network and then estimating the relatedness between the tasks given these modular components. This technique is applicab…

    Submitted 17 January, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: We realized that one conclusion in the submission is erroneous and disconnected from the results shown in one theorem. We have decided to withdraw the current version to avoid a misleading conclusion.

  45. Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination

    Authors: Tao Zhang, Tianqing Zhu, Jing Li, Mengde Han, Wanlei Zhou, Philip S. Yu

    Abstract: A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair. While research is already underway to formalize a machine-learning concept of fairness and to design frameworks for building fair models with sacrifice in accuracy, most are geared toward either supervised or unsupervised learning. Yet two observations inspired us to wonder whether…

    Submitted 25 September, 2020; originally announced September 2020.

    Comments: This paper has been published in IEEE Transactions on Knowledge and Data Engineering

  46. arXiv:2009.11428  [pdf, other]

    stat.ME stat.ML

    Generalized Score Matching for General Domains

    Authors: Shiqing Yu, Mathias Drton, Ali Shojaie

    Abstract: Estimation of density functions supported on general domains arises when the data is naturally restricted to a proper subset of the real space. This problem is complicated by typically intractable normalizing constants. Score matching provides a powerful tool for estimating densities with such intractable normalizing constants, but as originally proposed is limited to densities on $\mathbb{R}^m$ a…

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: 50 pages, 14 figures

  47. arXiv:2009.06190  [pdf, other]

    cs.LG stat.ML

    Fairness Constraints in Semi-supervised Learning

    Authors: Tao Zhang, Tianqing Zhu, Mengde Han, Jing Li, Wanlei Zhou, Philip S. Yu

    Abstract: Fairness in machine learning has received considerable attention. However, most studies on fair learning focus on either supervised learning or unsupervised learning. Very few consider semi-supervised settings. Yet, in reality, most machine learning tasks rely on large datasets that contain both labeled and unlabeled data. One of the key issues with fair learning is the balance between fairness and ac…

    Submitted 14 September, 2020; originally announced September 2020.

  48. arXiv:2009.05618  [pdf, other]

    cs.LG stat.ML

    Learning an Interpretable Graph Structure in Multi-Task Learning

    Authors: Shujian Yu, Francesco Alesiani, Ammar Shaker, Wenzhe Yin

    Abstract: We present a novel methodology to jointly perform multi-task learning and infer the intrinsic relationships among tasks through an interpretable and sparse graph. Unlike existing multi-task learning methodologies, the graph structure is not assumed to be known a priori or estimated separately in a preprocessing step. Instead, our graph is learned simultaneously with model parameters of each task, thus it re…

    Submitted 11 September, 2020; originally announced September 2020.

    Comments: 11 pages, 7 figures

  49. arXiv:2009.05483  [pdf, other]

    cs.LG stat.ML

    Towards Interpretable Multi-Task Learning Using Bilevel Programming

    Authors: Francesco Alesiani, Shujian Yu, Ammar Shaker, Wenzhe Yin

    Abstract: Interpretable Multi-Task Learning can be expressed as learning a sparse graph of the task relationship based on the prediction performance of the learned models. Since many natural phenomena exhibit sparse structures, enforcing sparsity on learned models reveals the underlying task relationship. Moreover, different sparsification degrees from a fully connected graph uncover various types of struc…

    Submitted 11 September, 2020; originally announced September 2020.

    Comments: Manuscript accepted at ECML PKDD 2020

  50. arXiv:2009.03506  [pdf]

    cs.LG stat.ML

    High-throughput relation extraction algorithm development associating knowledge articles and electronic health records

    Authors: Yucong Lin, Keming Lu, Yulin Chen, Chuan Hong, Sheng Yu

    Abstract: Objective: Medical relations are the core components of medical knowledge graphs that are needed for healthcare artificial intelligence. However, the requirement of expert annotation by conventional algorithm development processes creates a major bottleneck for mining new relations. In this paper, we present Hi-RES, a framework for high-throughput relation extraction algorithm development. We also…

    Submitted 7 September, 2020; originally announced September 2020.