Showing 1–30 of 30 results for author: Cheng, Q

Searching in archive stat.
  1. arXiv:2501.11323  [pdf]

    cs.LG eess.SP physics.app-ph stat.ML

    Physics-Informed Machine Learning for Efficient Reconfigurable Intelligent Surface Design

    Authors: Zhen Zhang, Jun Hui Qiu, Jun Wei Zhang, Hui Dong Li, Dong Tang, Qiang Cheng, Wei Lin

    Abstract: Reconfigurable intelligent surface (RIS) is a two-dimensional periodic structure integrated with a large number of reflective elements, which can manipulate electromagnetic waves in a digital way, offering great potentials for wireless communication and radar detection applications. However, conventional RIS designs highly rely on extensive full-wave EM simulations that are extremely time-consumin…

    Submitted 20 January, 2025; originally announced January 2025.

  2. arXiv:2310.13178  [pdf, other]

    stat.ME

    Exact Inference for Common Odds Ratio in Meta-Analysis with Zero-Total-Event Studies

    Authors: Xiaolin Chen, Jerry Q Cheng, Lu Tian, Minge Xie

    Abstract: Stemming from the high profile publication of Nissen and Wolski (2007) and subsequent discussions with divergent views on how to handle observed zero-total-event studies, defined to be studies which observe zero events in both treatment and control arms, the research topic concerning the common odds ratio model with zero-total-event studies remains to be an unresolved problem in meta-analysis. In…

    Submitted 19 October, 2023; originally announced October 2023.

  3. arXiv:2301.01620  [pdf, ps, other]

    stat.AP

    Anonymous Pattern Molecular Fingerprint and its Applications on Property Identification

    Authors: Xue Liu, Qian Cheng, Dan Sun, Xing Li, Wei Wei, Zhiming Zheng

    Abstract: Molecular fingerprints are significant cheminformatics tools to map molecules into vectorial space according to their characteristics in diverse functional groups, atom sequences, and other topological structures. In this paper, we set out to investigate a novel molecular fingerprint Anonymous-FP that possesses abundant perception about the underlying interactions shaped in small, medium, a…

    Submitted 4 January, 2023; originally announced January 2023.

    Comments: 11 pages

  4. arXiv:2209.05438  [pdf, other]

    cs.CY cs.LG stat.AP stat.CO

    Alcohol Intake Differentiates AD and LATE: A Telltale Lifestyle from Two Large-Scale Datasets

    Authors: Xinxing Wu, Chong Peng, Peter T. Nelson, Qiang Cheng

    Abstract: Alzheimer's disease (AD), as a progressive brain disease, affects cognition, memory, and behavior. Similarly, limbic-predominant age-related TDP-43 encephalopathy (LATE) is a recently defined common neurodegenerative disease that mimics the clinical symptoms of AD. At present, the risk factors implicated in LATE and those distinguishing LATE from AD are largely unknown. We leveraged an integrated…

    Submitted 25 August, 2022; originally announced September 2022.

    Comments: 10 pages

    Journal ref: AMIA 2022 Annual Symposium (AMIA 2022)

  5. arXiv:2106.11880  [pdf, other]

    cs.LG stat.ML

    Dynamic Customer Embeddings for Financial Service Applications

    Authors: Nima Chitsazan, Samuel Sharpe, Dwipam Katariya, Qianyu Cheng, Karthik Rajasethupathy

    Abstract: As financial services (FS) companies have experienced drastic technology driven changes, the availability of new data streams provides the opportunity for more comprehensive customer understanding. We propose Dynamic Customer Embeddings (DCE), a framework that leverages customers' digital activity and a wide range of financial context to learn dense representations of customers in the FS industry.…

    Submitted 22 June, 2021; originally announced June 2021.

    Comments: ICML Workshop on Representation Learning for Finance and E-Commerce Applications

  6. arXiv:2106.02197  [pdf, other]

    cs.LG stat.ML

    Top-$k$ Regularization for Supervised Feature Selection

    Authors: Xinxing Wu, Qiang Cheng

    Abstract: Feature selection identifies subsets of informative features and reduces dimensions in the original feature space, helping provide insights into data generation or a variety of domain problems. Existing methods mainly depend on feature scoring functions or sparse regularizations; nonetheless, they have limited ability to reconcile the representativeness and inter-correlations of features. In this…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: 12 pages

  7. arXiv:2105.05373  [pdf, other]

    math.ST stat.ME stat.ML

    Estimation of population size based on capture recapture designs and evaluation of the estimation reliability

    Authors: Yue You, Mark van der Laan, Philip Collender, Qu Cheng, Alan Hubbard, Nicholas P Jewell, Zhiyue Tom Hu, Robin Mejia, Justin Remais

    Abstract: We propose a modern method to estimate population size based on capture-recapture designs of K samples. The observed data is formulated as a sample of n i.i.d. K-dimensional vectors of binary indicators, where the k-th component of each vector indicates the subject being caught by the k-th sample, such that only subjects with nonzero capture vectors are observed. The target quantity is the uncondi…

    Submitted 11 May, 2021; originally announced May 2021.

  8. arXiv:2102.10771  [pdf, ps, other]

    stat.ML cs.LG

    Divide-and-conquer methods for big data analysis

    Authors: Xueying Chen, Jerry Q. Cheng, Min-ge Xie

    Abstract: In the context of big data analysis, the divide-and-conquer methodology refers to a multiple-step process: first splitting a data set into several smaller ones; then analyzing each set separately; finally combining results from each analysis together. This approach is effective in handling large data sets that are unsuitable to be analyzed entirely by a single computer due to limits either from me…

    Submitted 21 February, 2021; originally announced February 2021.

  9. arXiv:2010.09416  [pdf, other]

    cs.LG stat.ML

    Algorithmic Stability and Generalization of an Unsupervised Feature Selection Algorithm

    Authors: Xinxing Wu, Qiang Cheng

    Abstract: Feature selection, as a vital dimension reduction technique, reduces data dimension by identifying an essential subset of input features, which can facilitate interpretable insights into learning and inference processes. Algorithmic stability is a key characteristic of an algorithm regarding its sensitivity to perturbations of input samples. In this paper, we propose an innovative unsupervised fea…

    Submitted 5 January, 2022; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: Published as a conference paper at NeurIPS 2021

  10. arXiv:2009.00399  [pdf, other]

    stat.ME

    Accounting for correlated horizontal pleiotropy in two-sample Mendelian randomization using correlated instrumental variants

    Authors: Qing Cheng, Baoluo Sun, Yingcun Xia, Jin Liu

    Abstract: Mendelian randomization (MR) is a powerful approach to examine the causal relationships between health risk factors and outcomes from observational studies. Due to the proliferation of genome-wide association studies (GWASs) and abundant fully accessible GWASs summary statistics, a variety of two-sample MR methods for summary data have been developed to either detect or account for horizontal plei…

    Submitted 1 September, 2020; originally announced September 2020.

  11. arXiv:2008.13429  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Structured Graph Learning for Clustering and Semi-supervised Classification

    Authors: Zhao Kang, Chong Peng, Qiang Cheng, Xinwang Liu, Xi Peng, Zenglin Xu, Ling Tian

    Abstract: Graphs have become increasingly popular in modeling structures and interactions in a wide variety of problems during the last decade. Graph-based clustering and semi-supervised classification techniques have shown impressive performance. This paper proposes a graph learning framework to preserve both the local and global structure of data. Specifically, our method uses the self-expressiveness of s…

    Submitted 31 August, 2020; originally announced August 2020.

    Comments: Appear in Pattern Recognition

  12. arXiv:2005.09229  [pdf, other]

    cs.LG stat.ML

    Two-Dimensional Semi-Nonnegative Matrix Factorization for Clustering

    Authors: Chong Peng, Zhilu Zhang, Zhao Kang, Chenglizhao Chen, Qiang Cheng

    Abstract: In this paper, we propose a new Semi-Nonnegative Matrix Factorization method for 2-dimensional (2D) data, named TS-NMF. It overcomes the drawback of existing methods that seriously damage the spatial information of the data by converting 2D data to vectors in a preprocessing step. In particular, projection matrices are sought under the guidance of building new data representations, such that the s…

    Submitted 19 May, 2020; originally announced May 2020.

  13. arXiv:2005.03228  [pdf, other]

    cs.LG cs.CV stat.ML

    Collective Loss Function for Positive and Unlabeled Learning

    Authors: Chenhao Xie, Qiao Cheng, Jiaqing Liang, Lihan Chen, Yanghua Xiao

    Abstract: People learn to discriminate between classes without explicit exposure to negative examples. On the contrary, traditional machine learning algorithms often rely on negative examples, otherwise the model would be prone to collapse and always-true predictions. Therefore, it is crucial to design the learning objective which leads the model to converge and to perform predictions unbiasedly without exp…

    Submitted 5 May, 2020; originally announced May 2020.

  14. arXiv:2001.01895  [pdf]

    cs.IR cs.CL stat.AP

    Machine-learning classifiers for logographic name matching in public health applications: approaches for incorporating phonetic, visual, and keystroke similarity in large-scale probabilistic record linkage

    Authors: Philip A. Collender, Zhiyue Tom Hu, Charles Li, Qu Cheng, Xintong Li, Yue You, Song Liang, Changhong Yang, Justin V. Remais

    Abstract: Approximate string-matching methods to account for complex variation in highly discriminatory text fields, such as personal names, can enhance probabilistic record linkage. However, discriminating between matching and non-matching strings is challenging for logographic scripts, where similarities in pronunciation, appearance, or keystroke sequence are not directly encoded in the string data. We le…

    Submitted 7 January, 2020; originally announced January 2020.

    Comments: 28 pages, 4 figures

  15. arXiv:1907.04150  [pdf, other]

    cs.LG cs.CV stat.ML

    Nonnegative Matrix Factorization with Local Similarity Learning

    Authors: Chong Peng, Zhao Kang, Chenglizhao Chen, Qiang Cheng

    Abstract: Existing nonnegative matrix factorization methods focus on learning global structure of the data to construct basis and coefficient matrices, which ignores the local structure that commonly exists among data. In this paper, we propose a new type of nonnegative matrix factorization method, which learns local similarity and clustering in a mutually enhancing way. The learned new representation is mo…

    Submitted 9 July, 2019; originally announced July 2019.

  16. arXiv:1906.02745  [pdf, other]

    eess.SP cs.LG stat.ML

    Automated Classification of Seizures against Nonseizures: A Deep Learning Approach

    Authors: Xinghua Yao, Qiang Cheng, Guo-Qiang Zhang

    Abstract: In current clinical practice, electroencephalograms (EEG) are reviewed and analyzed by well-trained neurologists to provide supports for therapeutic decisions. The way of manual reviewing is labor-intensive and error prone. Automatic and accurate seizure/nonseizure classification methods are needed. One major problem is that the EEG signals for seizure state and nonseizure state exhibit considerab…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: 12 pages, 8 figures. arXiv admin note: text overlap with arXiv:1903.09326

  17. arXiv:1904.07497  [pdf, other]

    cs.LG cs.CV stat.ML

    RES-PCA: A Scalable Approach to Recovering Low-rank Matrices

    Authors: Chong Peng, Chenglizhao Chen, Zhao Kang, Jianbo Li, Qiang Cheng

    Abstract: Robust principal component analysis (RPCA) has drawn significant attention due to its powerful capability in recovering low-rank matrices as well as successful applications in various real world problems. The current state-of-the-art algorithms usually need to solve singular value decomposition of large matrices, which generally has at least a quadratic or even cubic complexity. This drawback ha…

    Submitted 16 April, 2019; originally announced April 2019.

  18. arXiv:1904.07496  [pdf, other]

    cs.LG cs.CV stat.ML

    Discriminative Ridge Machine: A Classifier for High-Dimensional Data or Imbalanced Data

    Authors: Chong Peng, Qiang Cheng

    Abstract: We introduce a discriminative regression approach to supervised classification in this paper. It estimates a representation model while accounting for discriminativeness between classes, thereby enabling accurate derivation of categorical information. This new type of regression models extends existing models such as ridge, lasso, and group lasso through explicitly incorporating discriminative inf…

    Submitted 30 December, 2019; v1 submitted 16 April, 2019; originally announced April 2019.

  19. arXiv:1903.09326  [pdf, other]

    cs.LG stat.ML

    A Novel Independent RNN Approach to Classification of Seizures against Non-seizures

    Authors: Xinghua Yao, Qiang Cheng, Guo-Qiang Zhang

    Abstract: In current clinical practices, electroencephalograms (EEG) are reviewed and analyzed by trained neurologists to provide supports for therapeutic decisions. Manual reviews can be laborious and error prone. Automatic and accurate seizure/non-seizure classification methods are desirable. A critical challenge is that seizure morphologies exhibit considerable variabilities. In order to capture essentia…

    Submitted 21 March, 2019; originally announced March 2019.

    Comments: 10 pages, 2 figures, submitted to AMIA symposium 2019

  20. arXiv:1901.09970  [pdf, other]

    cs.LG math.GR stat.ML

    Lie Group Auto-Encoder

    Authors: Liyu Gong, Qiang Cheng

    Abstract: In this paper, we propose an auto-encoder based generative neural network model whose encoder compresses the inputs into vectors in the tangent space of a special Lie group manifold: upper triangular positive definite affine transform matrices (UTDATs). UTDATs are representations of Gaussian distributions and can straightforwardly generate Gaussian distributed samples. Therefore, the encoder is tr…

    Submitted 28 January, 2019; originally announced January 2019.

  21. arXiv:1812.06562  [pdf, other]

    cs.LG q-bio.NC stat.ML

    A Robust Deep Learning Approach for Automatic Classification of Seizures Against Non-seizures

    Authors: X. Yao, X. Li, Q. Ye, Y. Huang, Q. Cheng, G. -Q. Zhang

    Abstract: Identifying epileptic seizures through analysis of the electroencephalography (EEG) signal becomes a standard method for the diagnosis of epilepsy. Manual seizure identification on EEG by trained neurologists is time-consuming, labor-intensive and error-prone, and a reliable automatic seizure/non-seizure classification method is needed. One of the challenges in automatic seizure/non-seizure classi…

    Submitted 5 June, 2019; v1 submitted 16 December, 2018; originally announced December 2018.

    Comments: 13 pages, 10 figures, submitted to Biomedical Signal Processing and Control

  22. arXiv:1809.02709  [pdf, other]

    cs.LG cs.SI stat.ML

    Exploiting Edge Features in Graph Neural Networks

    Authors: Liyu Gong, Qiang Cheng

    Abstract: Edge features contain important information about graphs. However, current state-of-the-art neural network models designed for graph learning, e.g. graph convolutional networks (GCN) and graph attention networks (GAT), inadequately utilize edge features, especially multi-dimensional edge features. In this paper, we build a new framework for a family of new graph neural network models that can more s…

    Submitted 28 January, 2019; v1 submitted 7 September, 2018; originally announced September 2018.

  23. arXiv:1711.04258  [pdf, other]

    cs.LG cs.AI cs.CV cs.MM stat.ML

    Unified Spectral Clustering with Optimal Graph

    Authors: Zhao Kang, Chong Peng, Qiang Cheng, Zenglin Xu

    Abstract: Spectral clustering has found extensive use in many areas. Most traditional spectral clustering algorithms work in three separate steps: similarity graph construction; continuous labels learning; discretizing the learned labels by k-means clustering. Such common practice has two potential flaws, which may lead to severe information loss and performance degradation. First, predefined similarity gra…

    Submitted 12 November, 2017; originally announced November 2017.

    Comments: Accepted by AAAI 2018

  24. arXiv:1705.00678  [pdf, other]

    cs.LG cs.CV stat.ML

    Twin Learning for Similarity and Clustering: A Unified Kernel Approach

    Authors: Zhao Kang, Chong Peng, Qiang Cheng

    Abstract: Many similarity-based clustering methods work in two separate steps including similarity matrix computation and subsequent spectral clustering. However, similarity measurement is challenging because it is usually impacted by many factors, e.g., the choice of similarity metric, neighborhood size, scale of data, noise and outliers. Thus the learned similarity matrix is often not suitable, let alone…

    Submitted 2 May, 2017; v1 submitted 1 May, 2017; originally announced May 2017.

    Comments: Published in AAAI 2017

  25. arXiv:1602.07783  [pdf, other]

    cs.IR cs.AI stat.ML

    Top-N Recommendation with Novel Rank Approximation

    Authors: Zhao Kang, Qiang Cheng

    Abstract: The importance of accurate recommender systems has been widely recognized by academia and industry. However, the recommendation quality is still rather low. Recently, a linear sparse and low-rank representation of the user-item matrix has been applied to produce Top-N recommendations. This approach uses the nuclear norm as a convex relaxation for the rank function and has achieved better recommend…

    Submitted 26 February, 2016; v1 submitted 24 February, 2016; originally announced February 2016.

    Comments: SDM 2016. arXiv admin note: text overlap with arXiv:1601.04800

  26. arXiv:1601.04800  [pdf, other]

    cs.IR cs.AI cs.LG stat.ML

    Top-N Recommender System via Matrix Completion

    Authors: Zhao Kang, Chong Peng, Qiang Cheng

    Abstract: Top-N recommender systems have been investigated widely both in industry and academia. However, the recommendation quality is far from satisfactory. In this paper, we propose a simple yet promising algorithm. We fill the user-item matrix based on a low-rank assumption and simultaneously keep the original information. To do that, a nonconvex rank relaxation rather than the nuclear norm is adopted t…

    Submitted 18 January, 2016; originally announced January 2016.

    Comments: AAAI 2016

  27. arXiv:1511.05261  [pdf, other]

    cs.CV cs.LG math.NA stat.ML

    Robust PCA via Nonconvex Rank Approximation

    Authors: Zhao Kang, Chong Peng, Qiang Cheng

    Abstract: Numerous applications in data mining and machine learning require recovering a matrix of minimal rank. Robust principal component analysis (RPCA) is a general framework for handling this kind of problems. Nuclear norm based convex surrogate of the rank function in RPCA is widely investigated. Under certain assumptions, it can recover the underlying true low rank matrix with high probability. Howev…

    Submitted 16 November, 2015; originally announced November 2015.

    Comments: IEEE International Conference on Data Mining

  28. arXiv:1510.08971  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Robust Subspace Clustering via Tighter Rank Approximation

    Authors: Zhao Kang, Chong Peng, Qiang Cheng

    Abstract: Matrix rank minimization problem is in general NP-hard. The nuclear norm is used to substitute the rank function in many recent studies. Nevertheless, the nuclear norm approximation adds all singular values together and the approximation error may depend heavily on the magnitudes of singular values. This might restrict its capability in dealing with many practical problems. In this paper, an arcta…

    Submitted 30 October, 2015; originally announced October 2015.

    Comments: ACM CIKM 2015

  29. arXiv:1508.04467  [pdf, other]

    cs.CV cs.IT cs.LG math.NA stat.ML

    Robust Subspace Clustering via Smoothed Rank Approximation

    Authors: Zhao Kang, Chong Peng, Qiang Cheng

    Abstract: Matrix rank minimizing subject to affine constraints arises in many application areas, ranging from signal processing to machine learning. Nuclear norm is a convex relaxation for this problem which can recover the rank exactly under some restricted and theoretically interesting conditions. However, for many real-world applications, nuclear norm approximation to the rank function can only produce a…

    Submitted 18 August, 2015; originally announced August 2015.

    Comments: Journal, code is available

    Journal ref: IEEE Signal Processing Letters, 22(2015)2088-2092

  30. arXiv:1406.3521  [pdf, other]

    stat.ME math.ST

    Exact prior-free probabilistic inference on the heritability coefficient in a linear mixed model

    Authors: Qianshun Cheng, Xu Gao, Ryan Martin

    Abstract: Linear mixed-effect models with two variance components are often used when variability comes from two sources. In genetics applications, variation in observed traits can be attributed to biological and environmental effects, and the heritability coefficient is a fundamental quantity that measures the proportion of total variability due to the biological effect. We propose a new inferential model…

    Submitted 30 July, 2014; v1 submitted 13 June, 2014; originally announced June 2014.

    Comments: 15 pages, 1 table, 2 figures

    Journal ref: Electronic Journal of Statistics, volume 8, pages 3062-3076, 2014