Showing 1–50 of 58 results for author: Ng, K

  1. arXiv:2502.10958  [pdf, other]

    stat.ME

    Estimation of Treatment Effects based on Kernel Matching

    Authors: Chong Ding, Zheng Li, Hon Keung Tony Ng, Wei Gao

    Abstract: The treatment effect represents the average causal impact or outcome difference between treatment and control groups. Treatment effects can be estimated through social experiments, regression models, matching estimators, and instrumental variables. In this paper, we introduce a novel kernel-matching estimator for treatment effect estimation. This method is particularly beneficial in observat…

    Submitted 15 February, 2025; originally announced February 2025.

  2. arXiv:2410.17604  [pdf, other]

    stat.ME stat.ML

    Ranking of Multi-Response Experiment Treatments

    Authors: Miguel R. Pebes-Trujillo, Itamar Shenhar, Aravind Harikumar, Ittai Herrmann, Menachem Moshelion, Kee Woei Ng, Matan Gavish

    Abstract: We present a probabilistic ranking model to identify the optimal treatment in multiple-response experiments. In contemporary practice, treatments are applied over individuals with the goal of achieving multiple ideal properties on them simultaneously. However, often there are competing properties, and the optimality of one cannot be achieved without compromising the optimality of another. Typicall…

    Submitted 23 October, 2024; originally announced October 2024.

    MSC Class: 68T05; 62H10; 62H12; 62H30 ACM Class: I.2; I.5; G.3

  3. arXiv:2410.05757  [pdf, ps, other]

    stat.ML cs.LG stat.CO stat.ME

    Temperature Optimization for Bayesian Deep Learning

    Authors: Kenyon Ng, Chris van der Heide, Liam Hodgkinson, Susan Wei

    Abstract: The Cold Posterior Effect (CPE) is a phenomenon in Bayesian Deep Learning (BDL), where tempering the posterior to a cold temperature often improves the predictive performance of the posterior predictive distribution (PPD). Although the term `CPE' suggests colder temperatures are inherently better, the BDL community increasingly recognizes that this is not always the case. Despite this, there remai…

    Submitted 11 June, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

    Comments: 11 pages (+5 reference, +17 appendix). Accepted at UAI 2025

  4. arXiv:2410.05753  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Pathwise Gradient Variance Reduction with Control Variates in Variational Inference

    Authors: Kenyon Ng, Susan Wei

    Abstract: Variational inference in Bayesian deep learning often involves computing the gradient of an expectation that lacks a closed-form solution. In these cases, pathwise and score-function gradient estimators are the most common approaches. The pathwise estimator is often favoured for its substantially lower variance compared to the score-function estimator, which typically requires variance reduction t…

    Submitted 8 October, 2024; originally announced October 2024.

    Comments: 9 (+16 appendix) pages

  5. arXiv:2409.02372  [pdf, other]

    stat.ME

    A Principal Square Response Forward Regression Method for Dimension Reduction

    Authors: Zheng Li, Yunhao Wang, Wei Gao, Hon Keung Tony Ng

    Abstract: Dimension reduction techniques, such as Sufficient Dimension Reduction (SDR), are indispensable for analyzing high-dimensional datasets. This paper introduces a novel SDR method named Principal Square Response Forward Regression (PSRFR) for estimating the central subspace of the response variable Y, given the vector of predictor variables $\bm{X}$. We provide a computational algorithm for implemen…

    Submitted 3 September, 2024; originally announced September 2024.

  6. arXiv:2408.09054  [pdf, other]

    physics.soc-ph physics.app-ph stat.AP

    From Urban Clusters to Megaregions: Mapping Australia's Evolving Urban Regions

    Authors: M. K. M Ng, Z. Shabrina, S. Sarkar, H. Han, C. Pettit

    Abstract: This study employs percolation theory to investigate the hierarchical organisation of Australian urban centres through the connectivity of their road networks. The analysis demonstrates how discrete urban clusters have developed into integrated regional entities, delineating the pivotal distance thresholds that regulate these urban transitions. The study reveals the interconnections between dispar…

    Submitted 16 August, 2024; originally announced August 2024.

  7. arXiv:2405.17464  [pdf, other]

    cs.LG cs.AI stat.ML

    Data Valuation by Leveraging Global and Local Statistical Information

    Authors: Xiaoling Zhou, Ou Wu, Michael K. Ng, Hao Jiang

    Abstract: Data valuation has garnered increasing attention in recent years, given the critical role of high-quality data in various applications, particularly in machine learning tasks. There are diverse technical avenues to quantify the value of data within a corpus. While Shapley value-based methods are among the most widely used techniques in the literature due to their solid theoretical foundation, the…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 12 pages, 8 figures. arXiv admin note: text overlap with arXiv:2306.10577 by other authors

    ACM Class: I.2

  8. arXiv:2401.11263  [pdf, other]

    stat.ME stat.ML

    Estimating Heterogeneous Treatment Effects on Survival Outcomes Using Counterfactual Censoring Unbiased Transformations

    Authors: Shenbo Xu, Raluca Cobzaru, Stan N. Finkelstein, Roy E. Welsch, Kenney Ng, Zach Shahn

    Abstract: Methods for estimating heterogeneous treatment effects (HTE) from observational data have largely focused on continuous or binary outcomes, with less attention paid to survival outcomes and almost none to settings with competing risks. In this work, we develop censoring unbiased transformations (CUTs) for survival outcomes both with and without competing risks. After converting time-to-event outco…

    Submitted 27 September, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

  9. arXiv:2310.03758  [pdf, other]

    eess.SP cs.IT cs.LG stat.ML

    A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing

    Authors: Junren Chen, Jonathan Scarlett, Michael K. Ng, Zhaoqiang Liu

    Abstract: In generative compressed sensing (GCS), we want to recover a signal $\mathbf{x}^* \in \mathbb{R}^n$ from $m$ measurements ($m\ll n$) using a generative prior $\mathbf{x}^*\in G(\mathbb{B}_2^k(r))$, where $G$ is typically an $L$-Lipschitz continuous generative model and $\mathbb{B}_2^k(r)$ represents the radius-$r$ $\ell_2$-ball in $\mathbb{R}^k$. Under nonlinear measurements, most prior results ar…

    Submitted 9 October, 2023; v1 submitted 25 September, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  10. arXiv:2309.09032  [pdf, other]

    cs.IT cs.LG eess.SP stat.ML

    Solving Quadratic Systems with Full-Rank Matrices Using Sparse or Generative Priors

    Authors: Junren Chen, Michael K. Ng, Zhaoqiang Liu

    Abstract: The problem of recovering a signal $\boldsymbol x\in \mathbb{R}^n$ from a quadratic system $\{y_i=\boldsymbol x^\top\boldsymbol A_i\boldsymbol x,\ i=1,\ldots,m\}$ with full-rank matrices $\boldsymbol A_i$ frequently arises in applications such as unassigned distance geometry and sub-wavelength imaging. With i.i.d. standard Gaussian matrices $\boldsymbol A_i$, this paper addresses the high-dimensio…

    Submitted 29 October, 2024; v1 submitted 16 September, 2023; originally announced September 2023.

  11. arXiv:2308.16059  [pdf, other]

    stat.ML cs.IT cs.LG

    A Parameter-Free Two-Bit Covariance Estimator with Improved Operator Norm Error Rate

    Authors: Junren Chen, Michael K. Ng

    Abstract: A covariance matrix estimator using two bits per entry was recently developed by Dirksen, Maly and Rauhut [Annals of Statistics, 50(6), pp. 3538-3562]. The estimator achieves near minimax rate for general sub-Gaussian distributions, but also suffers from two downsides: theoretically, there is an essential gap on operator norm error between their estimator and sample covariance when the diagonal of…

    Submitted 10 November, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

    Comments: Major changes. In particular, we further adapt our method to online settings (see Section 5)

  12. arXiv:2305.02373  [pdf, other]

    stat.ME stat.ML

    Efficient estimation of weighted cumulative treatment effects by double/debiased machine learning

    Authors: Shenbo Xu, Bang Zheng, Bowen Su, Stan Finkelstein, Roy Welsch, Kenney Ng, Ioanna Tzoulaki, Zach Shahn

    Abstract: In empirical studies with time-to-event outcomes, investigators often leverage observational data to conduct causal inference on the effect of exposure when randomized controlled trial data is unavailable. Model misspecification and lack of overlap are common issues in observational studies, and they often lead to inconsistent and inefficient estimators of the average treatment effect. Estimators…

    Submitted 3 May, 2023; originally announced May 2023.

  13. arXiv:2302.11197  [pdf, ps, other]

    stat.ML cs.LG eess.SP

    Quantized Low-Rank Multivariate Regression with Random Dithering

    Authors: Junren Chen, Yueqi Wang, Michael K. Ng

    Abstract: Low-rank multivariate regression (LRMR) is an important statistical learning model that combines highly correlated tasks as a multiresponse regression problem with low-rank priori on the coefficient matrix. In this paper, we study quantized LRMR, a practical setting where the responses and/or the covariates are discretized to finite precision. We focus on the estimation of the underlying coefficie…

    Submitted 6 October, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: IEEE Transactions on Signal Processing (publication ready version)

  14. arXiv:2301.11272  [pdf, other]

    cs.CY stat.AP

    Location-based Activity Behavior Deviation Detection for Nursing Home using IoT Devices

    Authors: Billy Pik Lik Lau, Zann Koh, Yuren Zhou, Benny Kai Kiat Ng, Chau Yuen, Mui Lang Low

    Abstract: With the advancement of the Internet of Things (IoT) and pervasive computing applications, it provides a better opportunity to understand the behavior of the aging population. However, in a nursing home scenario, common sensors and techniques used to track an elderly living alone are not suitable. In this paper, we design a location-based tracking system for a four-story nursing home - The Salvatio…

    Submitted 25 January, 2023; originally announced January 2023.

    Comments: 12 pages

  15. arXiv:2212.14562  [pdf, ps, other]

    math.ST cs.IT stat.ML

    Quantizing Heavy-tailed Data in Statistical Estimation: (Near) Minimax Rates, Covariate Quantization, and Uniform Recovery

    Authors: Junren Chen, Michael K. Ng, Di Wang

    Abstract: This paper studies the quantization of heavy-tailed data in some fundamental statistical estimation problems, where the underlying distributions have bounded moments of some order. We propose to truncate and properly dither the data prior to a uniform quantization. Our major standpoint is that (near) minimax rates of estimation error are achievable merely from the quantized data produced by the pr…

    Submitted 26 July, 2023; v1 submitted 30 December, 2022; originally announced December 2022.

    Comments: Major changes

  16. arXiv:2208.08287  [pdf, ps, other]

    cs.LG stat.ML

    Noisy Nonnegative Tucker Decomposition with Sparse Factors and Missing Data

    Authors: Xiongjun Zhang, Michael K. Ng

    Abstract: Tensor decomposition is a powerful tool for extracting physically meaningful latent factors from multi-dimensional nonnegative data, and has attracted increasing interest in a variety of fields such as image processing, machine learning, and computer vision. In this paper, we propose a sparse nonnegative Tucker decomposition and completion method for the recovery of underlying nonnegative data under…

    Submitted 1 December, 2024; v1 submitted 17 August, 2022; originally announced August 2022.

  17. arXiv:2205.13827  [pdf, ps, other]

    stat.ML cs.LG

    Error Bound of Empirical $\ell_2$ Risk Minimization for Noisy Standard and Generalized Phase Retrieval Problems

    Authors: Junren Chen, Michael K. Ng

    Abstract: In this paper, we study the estimation performance of empirical $\ell_2$ risk minimization (ERM) in noisy (standard) phase retrieval (NPR) given by $y_k = |\alpha_k^*x_0|^2+\eta_k$, or noisy generalized phase retrieval (NGPR) formulated as $y_k = x_0^*A_kx_0 + \eta_k$, where $x_0\in\mathbb{K}^d$ is the desired signal, $n$ is the sample size, $\eta= (\eta_1,...,\eta_n)^\top$ is the noise vector. We establish new error…

    Submitted 28 June, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: 44 pages, 6 figures

  18. arXiv:2202.13157  [pdf, ps, other]

    stat.ML cs.LG eess.SP

    High Dimensional Statistical Estimation under Uniformly Dithered One-bit Quantization

    Authors: Junren Chen, Cheng-Long Wang, Michael K. Ng, Di Wang

    Abstract: In this paper, we propose a uniformly dithered 1-bit quantization scheme for high-dimensional statistical estimation. The scheme contains truncation, dithering, and quantization as typical steps. As canonical examples, the quantization scheme is applied to the estimation problems of sparse covariance matrix estimation, sparse linear regression (i.e., compressed sensing), and matrix completion. We…

    Submitted 20 January, 2023; v1 submitted 26 February, 2022; originally announced February 2022.

    Comments: We add lower bounds for 1-bit quantization of heavy-tailed data (Theorems 11, 14)

  19. arXiv:2109.00749  [pdf, other]

    cs.LG stat.ML

    Co-Separable Nonnegative Matrix Factorization

    Authors: Junjun Pan, Michael K. Ng

    Abstract: Nonnegative matrix factorization (NMF) is a popular model in the field of pattern recognition. It aims to find a low rank approximation for nonnegative data M by a product of two nonnegative matrices W and H. In general, NMF is NP-hard to solve, while it can be solved efficiently under the separability assumption, which requires that the columns of the factor matrix be equal to columns of the input matrix. In…

    Submitted 2 September, 2021; originally announced September 2021.

  20. arXiv:2106.06997  [pdf, other]

    cs.LG stat.ML

    Post-hoc loss-calibration for Bayesian neural networks

    Authors: Meet P. Vadera, Soumya Ghosh, Kenney Ng, Benjamin M. Marlin

    Abstract: Bayesian decision theory provides an elegant framework for acting optimally under uncertainty when tractable posterior distributions are available. Modern Bayesian models, however, typically involve intractable posteriors that are approximated with, potentially crude, surrogates. This difficulty has engendered loss-calibrated techniques that aim to learn posterior approximations that favor high-ut…

    Submitted 13 June, 2021; originally announced June 2021.

    Comments: Accepted to Conference on Uncertainty in AI (UAI) '21

  21. arXiv:2104.05449   

    cond-mat.mtrl-sci physics.data-an stat.AP

    Current Overview of Statistical Fiber Bundles Model and Its Application to Physics-based Reliability Analysis of Thin-film Dielectrics

    Authors: James U. Gleaton, David Han, James D. Lynch, Hon Keung Tony Ng, Fabrizio Ruggeri

    Abstract: In this paper, we present a critical overview of statistical fiber bundles models. We discuss relevant aspects, like assumptions and consequences stemming from models in the literature and propose new ones. This is accomplished by concentrating on both the physical and statistical aspects of a specific load-sharing example, the breakdown (BD) for circuits of capacitors and related dielectrics. For…

    Submitted 25 January, 2023; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: The majority of the materials in the paper has been published as a book

  22. arXiv:2103.10060  [pdf, other]

    cs.LG stat.ML

    Approximating Probability Distributions by using Wasserstein Generative Adversarial Networks

    Authors: Yihang Gao, Michael K. Ng, Mingjie Zhou

    Abstract: Studied here are Wasserstein generative adversarial networks (WGANs) with GroupSort neural networks as their discriminators. It is shown that the error bound of the approximation for the target distribution depends on the width and depth (capacity) of the generators and discriminators and the number of samples in training. A quantified generalization bound is established for the Wasserstein distan…

    Submitted 29 June, 2023; v1 submitted 18 March, 2021; originally announced March 2021.

    Comments: Accepted by SIAM Journal on Mathematics of Data Science (SIMODS)

    MSC Class: 68Q32; 68T15; 68W40

  23. The Study of Urban Residential's Public Space Activeness using Space-centric Approach

    Authors: Billy Pik Lik Lau, Benny Kai Kiat Ng, Chau Yuen, Bige Tuncer, Keng Hua Chong

    Abstract: With the advancement of the Internet of Things (IoT) and communication platform, large scale sensor deployment can be easily implemented in an urban city to collect various information. To date, there are only a handful of research studies about understanding the usage of urban public spaces. Leveraging IoT, various sensors have been deployed in an urban residential area to monitor and study publi…

    Submitted 11 January, 2021; v1 submitted 11 January, 2021; originally announced January 2021.

    Comments: Accepted at IEEE Internet of Things Journal 2021

  24. arXiv:2012.08784  [pdf, other]

    stat.ML cs.LG

    Tensor Completion by Multi-Rank via Unitary Transformation

    Authors: Guang-Jing Song, Michael K. Ng, Xiongjun Zhang

    Abstract: One of the key problems in tensor completion is the number of uniformly random sample entries required for recovery guarantee. The main aim of this paper is to study $n_1 \times n_2 \times n_3$ third-order tensor completion based on transformed tensor singular value decomposition, and provide a bound on the number of required sample entries. Our approach is to make use of the multi-rank of the und…

    Submitted 24 January, 2022; v1 submitted 16 December, 2020; originally announced December 2020.

  25. Urban Space Insights Extraction using Acoustic Histogram Information

    Authors: Nipun Wijerathne, Billy Pik Lik Lau, Benny Kai Kiat Ng, Chau Yuen

    Abstract: Urban data mining is a high-potential area that can enhance smart city services towards more sustainable development, especially in urban residential activity tracking. While existing human activity tracking systems have demonstrated the capability to unveil the hidden aspects of citizens' behavior, they often come with a high implementation cost and require a large co…

    Submitted 14 December, 2020; v1 submitted 10 December, 2020; originally announced December 2020.

    Comments: Accepted at IEEE Systems Journal

  26. arXiv:2009.03998  [pdf, other]

    cs.LG cs.CV stat.ML

    Tangent Space Based Alternating Projections for Nonnegative Low Rank Matrix Approximation

    Authors: Guangjing Song, Michael K. Ng, Tai-Xiang Jiang

    Abstract: In this paper, we develop a new alternating projection method to compute nonnegative low rank matrix approximation for nonnegative matrices. In the nonnegative low rank matrix approximation method, the projection onto the manifold of fixed rank matrices can be expensive as the singular value decomposition is required. We propose to use the tangent space of the point in the manifold to approximate…

    Submitted 2 September, 2020; originally announced September 2020.

  27. arXiv:2008.00748  [pdf, other]

    cs.LG eess.IV stat.ML

    Tensorizing GAN with High-Order Pooling for Alzheimer's Disease Assessment

    Authors: Wen Yu, Baiying Lei, Michael K. Ng, Albert C. Cheung, Yanyan Shen, Shuqiang Wang

    Abstract: It is of great significance to apply deep learning for the early diagnosis of Alzheimer's Disease (AD). In this work, a novel tensorizing GAN with high-order pooling is proposed to assess Mild Cognitive Impairment (MCI) and AD. By tensorizing a three-player cooperative game based framework, the proposed model can benefit from the structural information of the brain. By incorporating the high-order…

    Submitted 3 August, 2020; originally announced August 2020.

    Comments: 15 pages, 20 figures

  28. arXiv:2007.12355  [pdf, other]

    cs.LG stat.ML

    Dynamic Knowledge Distillation for Black-box Hypothesis Transfer Learning

    Authors: Yiqin Yu, Xu Min, Shiwan Zhao, Jing Mei, Fei Wang, Dongsheng Li, Kenney Ng, Shaochun Li

    Abstract: In real world applications like healthcare, it is usually difficult to build a machine learning prediction model that works universally well across different institutions. At the same time, the available model is often proprietary, i.e., neither the model parameter nor the data set used for model training is accessible. In consequence, leveraging the knowledge hidden in the available model (aka. t…

    Submitted 6 August, 2020; v1 submitted 24 July, 2020; originally announced July 2020.

    Comments: 7 pages, 2 figures

  29. arXiv:2007.10626  [pdf, other]

    stat.ML cs.CV cs.LG math.NA

    Sparse Nonnegative Tensor Factorization and Completion with Noisy Observations

    Authors: Xiongjun Zhang, Michael K. Ng

    Abstract: In this paper, we study the sparse nonnegative tensor factorization and completion problem from partial and noisy observations for third-order tensors. Because of sparsity and nonnegativity, the underlying tensor is decomposed into the tensor-tensor product of one sparse nonnegative tensor and one nonnegative tensor. We propose to minimize the sum of the maximum likelihood estimation for the obser…

    Submitted 20 October, 2021; v1 submitted 21 July, 2020; originally announced July 2020.

  30. arXiv:2006.11601  [pdf, other]

    cs.LG cs.CR cs.DC stat.ML

    Rethinking Privacy Preserving Deep Learning: How to Evaluate and Thwart Privacy Attacks

    Authors: Lixin Fan, Kam Woh Ng, Ce Ju, Tianyu Zhang, Chang Liu, Chee Seng Chan, Qiang Yang

    Abstract: This paper investigates capabilities of Privacy-Preserving Deep Learning (PPDL) mechanisms against various forms of privacy attacks. First, we propose to quantitatively measure the trade-off between model accuracy and privacy losses incurred by reconstruction, tracing and membership attacks. Second, we formulate reconstruction attacks as solving a noisy system of linear equations, and prove that a…

    Submitted 23 June, 2020; v1 submitted 20 June, 2020; originally announced June 2020.

    Comments: under review, 36 pages (updated Eq. 3 and Fig. 8)

  31. arXiv:2002.04401  [pdf, other]

    cs.SI cs.LG stat.AP

    Understanding Crowd Behaviors in a Social Event by Passive WiFi Sensing and Data Mining

    Authors: Yuren Zhou, Billy Pik Lik Lau, Zann Koh, Chau Yuen, Benny Kai Kiat Ng

    Abstract: Understanding crowd behaviors in a large social event is crucial for event management. Passive WiFi sensing, by collecting WiFi probe requests sent from mobile devices, provides a better way to monitor crowds compared with people counters and cameras in terms of free interference, larger coverage, lower cost, and more information on people's movement. In existing studies, however, not enough atten…

    Submitted 4 February, 2020; originally announced February 2020.

    Comments: This manuscript has been accepted by IEEE Internet of Things journal. Copyright (c) 2020 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected]

  32. arXiv:1911.04655  [pdf, other]

    cs.LG cs.IR stat.ML

    Hyper-Sphere Quantization: Communication-Efficient SGD for Federated Learning

    Authors: Xinyan Dai, Xiao Yan, Kaiwen Zhou, Han Yang, Kelvin K. W. Ng, James Cheng, Yu Fan

    Abstract: The high cost of communicating gradients is a major bottleneck for federated learning, as the bandwidth of the participating user devices is limited. Existing gradient compression algorithms are mainly designed for data centers with high-speed network and achieve $O(\sqrt{d} \log d)$ per-iteration communication cost at best, where $d$ is the size of the model. We propose hyper-sphere quantization…

    Submitted 25 November, 2019; v1 submitted 11 November, 2019; originally announced November 2019.

  33. arXiv:1910.09979  [pdf, other]

    stat.ML cs.LG

    Orthogonal Nonnegative Tucker Decomposition

    Authors: Junjun Pan, Michael K. Ng, Ye Liu, Xiongjun Zhang, Hong Yan

    Abstract: In this paper, we study the nonnegative tensor data and propose an orthogonal nonnegative Tucker decomposition (ONTD). We discuss some properties of ONTD and develop a convex relaxation algorithm of the augmented Lagrangian function to solve the optimization problem. The convergence of the algorithm is given. We employ ONTD on the image data sets from the real world applications including face rec…

    Submitted 27 October, 2019; v1 submitted 21 October, 2019; originally announced October 2019.

  34. arXiv:1909.10679  [pdf]

    q-fin.ST econ.EM stat.AP

    Structural Change Analysis of Active Cryptocurrency Market

    Authors: C. Y. Tan, Y. B. Koh, K. H. Ng, K. H. Ng

    Abstract: Structural Change Analysis of Active Cryptocurrency Market

    Submitted 23 September, 2019; originally announced September 2019.

    Comments: 18 pages, 6 figures and 3 tables

  35. arXiv:1907.01113  [pdf, other]

    cs.LG cs.CV stat.ML

    Robust Tensor Completion Using Transformed Tensor SVD

    Authors: Guangjing Song, Michael K. Ng, Xiongjun Zhang

    Abstract: In this paper, we study robust tensor completion by using transformed tensor singular value decomposition (SVD), which employs unitary transform matrices instead of discrete Fourier transform matrix that is used in the traditional tensor SVD. The main motivation is that a lower tubal rank tensor can be obtained by using other unitary transform matrices than that by using discrete Fourier transform…

    Submitted 1 July, 2019; originally announced July 2019.

  36. arXiv:1906.01167  [pdf, other]

    cs.CR cs.AI cs.LG stat.ML

    Towards Fair and Privacy-Preserving Federated Deep Models

    Authors: Lingjuan Lyu, Jiangshan Yu, Karthik Nandakumar, Yitong Li, Xingjun Ma, Jiong Jin, Han Yu, Kee Siong Ng

    Abstract: The current standalone deep learning framework tends to result in overfitting and low utility. This problem can be addressed by either a centralized framework that deploys a central server to train a global model on the joint data from all parties, or a distributed framework that leverages a parameter server to aggregate local model updates. Server-based solutions are prone to the problem of a sin…

    Submitted 19 May, 2020; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: Accepted for publication in TPDS

  37. arXiv:1904.11652  [pdf, other]

    cs.LG cs.HC stat.ML

    DPVis: Visual Analytics with Hidden Markov Models for Disease Progression Pathways

    Authors: Bum Chul Kwon, Vibha Anand, Kristen A Severson, Soumya Ghosh, Zhaonan Sun, Brigitte I Frohnert, Markus Lundgren, Kenney Ng

    Abstract: Clinical researchers use disease progression models to understand patient status and characterize progression patterns from longitudinal health records. One approach for disease progression modeling is to describe patient status using a small number of states that represent distinctive distributions over a set of observed measures. Hidden Markov models (HMMs) and their variants are a class of models…

    Submitted 9 April, 2020; v1 submitted 25 April, 2019; originally announced April 2019.

    Comments: to appear at IEEE Transactions on Visualization and Computer Graphics

  38. arXiv:1904.10126  [pdf]

    cs.CV cs.AI stat.ML

    Lung Nodule Classification using Deep Local-Global Networks

    Authors: Mundher Al-Shabi, Boon Leong Lan, Wai Yee Chan, Kwan-Hoong Ng, Maxine Tan

    Abstract: Purpose: Lung nodules have very diverse shapes and sizes, which makes classifying them as benign/malignant a challenging problem. In this paper, we propose a novel method to predict the malignancy of nodules that have the capability to analyze the shape and size of a nodule using a global feature extractor, as well as the density and structure of the nodule using a local feature extractor. Methods…

    Submitted 22 April, 2019; originally announced April 2019.

    Comments: Code and dataset available here https://github.com/mundher/local-global

  39. arXiv:1902.10393  [pdf, other]

    stat.ME quant-ph

    Using prior expansions for prior-data conflict checking

    Authors: David J. Nott, Max Seah, Luai Al-Labadi, Michael Evans, Hui Khoon Ng, Berthold-Georg Englert

    Abstract: Any Bayesian analysis involves combining information represented through different model components, and when different sources of information are in conflict it is important to detect this. Here we consider checking for prior-data conflict in Bayesian models by expanding the prior used for the analysis into a larger family of priors, and considering a marginal likelihood score statistic for the e…

    Submitted 12 March, 2020; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: Accepted version, to appear in Bayesian Analysis

  40. arXiv:1901.08551  [pdf]

    cs.LG stat.ML

    A Universal Logic Operator for Interpretable Deep Convolution Networks

    Authors: KamWoh Ng, Lixin Fan, Chee Seng Chan

    Abstract: Explaining neural network computation in terms of probabilistic/fuzzy logical operations has attracted much attention due to its simplicity and high interpretability. Different choices of logical operators such as AND, OR and XOR give rise to another dimension for network optimization, and in this paper, we study the open problem of learning a universal logical operator without prescribing to any…

    Submitted 20 January, 2019; originally announced January 2019.

    Comments: In AAAI-19 Workshop on Network Interpretability for Deep Learning

  41. arXiv:1811.06094  [pdf, other]

    stat.ML cs.LG

    Unsupervised learning with contrastive latent variable models

    Authors: Kristen Severson, Soumya Ghosh, Kenney Ng

    Abstract: In unsupervised learning, dimensionality reduction is an important tool for data exploration and visualization. Because these aims are typically open-ended, it can be useful to frame the problem as looking for patterns that are enriched in one dataset relative to another. These pairs of datasets occur commonly, for instance a population of interest vs. control or signal vs. signal free recordings.…

    Submitted 14 November, 2018; originally announced November 2018.

  42. arXiv:1811.01506  [pdf, other]

    cs.LG stat.ML

    Theoretical and Experimental Analysis on the Generalizability of Distribution Regression Network

    Authors: Connie Kou, Hwee Kuan Lee, Jorge Sanz, Teck Khim Ng

    Abstract: There is emerging interest in performing regression between distributions. In contrast to prediction on single instances, these machine learning methods can be useful for population-based studies or on problems that are inherently statistical in nature. The recently proposed distribution regression network (DRN) has shown superior performance for the distribution-to-distribution regression task co…

    Submitted 31 May, 2019; v1 submitted 4 November, 2018; originally announced November 2018.

  43. A flexible sequential Monte Carlo algorithm for parametric constrained regression

    Authors: Kenyon Ng, Berwin A. Turlach, Kevin Murray

    Abstract: An algorithm is proposed that enables the imposition of shape constraints on regression curves, without requiring the constraints to be written as closed-form expressions, nor assuming the functional form of the loss function. This algorithm is based on Sequential Monte Carlo-Simulated Annealing and only relies on an indicator function that assesses whether or not the constraints are fulfilled, th…

    Submitted 1 April, 2019; v1 submitted 2 October, 2018; originally announced October 2018.

    Comments: Typo corrections. Code available on https://github.com/weiyaw/blackbox

    Journal ref: Computational Statistics & Data Analysis 138 (2019) 13-26

  44. arXiv:1804.04775  [pdf, other]

    cs.LG stat.ML

    A Compact Network Learning Model for Distribution Regression

    Authors: Connie Kou, Hwee Kuan Lee, Teck Khim Ng

    Abstract: Despite the superior performance of deep learning in many applications, challenges remain in the area of regression on function spaces. In particular, neural networks are unable to encode function inputs compactly as each node encodes just a real value. We propose a novel idea to address this shortcoming: to encode an entire function in a single network node. To that end, we design a compact netwo…

    Submitted 10 July, 2018; v1 submitted 12 April, 2018; originally announced April 2018.

  45. arXiv:1804.02655  [pdf, ps, other]

    stat.CO

    Efficient Computational Algorithm for Optimal Continuous Experimental Designs

    Authors: Jiangtao Duan, Wei Gao, Hon Keung Tony Ng

    Abstract: A simple yet efficient computational algorithm for computing the continuous optimal experimental design for linear models is proposed. An alternative proof of the monotonic convergence for the $D$-optimal criterion on continuous design spaces is provided. We further show that the proposed algorithm converges to the $D$-optimal design. We also provide an algorithm for the $A$-optimality and conjecture th…

    Submitted 8 April, 2018; originally announced April 2018.

  46. arXiv:1804.01957  [pdf, ps, other]

    stat.AP

    A Class of Skewed Distributions with Applications in Environmental Data

    Authors: Indranil Ghosh, Hon Keung Tony Ng

    Abstract: In environmental studies, many data are typically skewed and it is desired to have a flexible statistical model for this kind of data. In this paper, we study a class of skewed distributions by invoking arguments as described by Ferreira and Steel (2006, Journal of the American Statistical Association, 101: 823--829). In particular, we consider using the logistic kernel to derive a class of univar…

    Submitted 5 April, 2018; originally announced April 2018.

    MSC Class: 60E; 62F

  47. arXiv:1802.09933  [pdf, ps, other]

    stat.ML cs.DS cs.LG math.OC

    Guaranteed Sufficient Decrease for Stochastic Variance Reduced Gradient Optimization

    Authors: Fanhua Shang, Yuanyuan Liu, Kaiwen Zhou, James Cheng, Kelvin K. W. Ng, Yuichi Yoshida

    Abstract: In this paper, we propose a novel sufficient decrease technique for stochastic variance reduced gradient descent methods such as SVRG and SAGA. In order to make sufficient decrease for stochastic optimization, we design a new sufficient decrease criterion, which yields sufficient decrease versions of stochastic variance reduction algorithms such as SVRG-SD and SAGA-SD as a byproduct. We introduce…

    Submitted 25 February, 2018; originally announced February 2018.

    Comments: 24 pages, 10 figures, AISTATS 2018. arXiv admin note: text overlap with arXiv:1703.06807

  48. arXiv:1802.06476  [pdf, other]

    cs.LG cs.AI stat.ML

    Simultaneous Modeling of Multiple Complications for Risk Profiling in Diabetes Care

    Authors: Bin Liu, Ying Li, Soumya Ghosh, Zhaonan Sun, Kenney Ng, Jianying Hu

    Abstract: Type 2 diabetes mellitus (T2DM) is a chronic disease that often results in multiple complications. Risk prediction and profiling of T2DM complications is critical for healthcare professionals to design personalized treatment plans for patients in diabetes care for improved outcomes. In this paper, we study the risk of developing complications after the initial T2DM diagnosis from longitudinal pati…

    Submitted 18 February, 2018; originally announced February 2018.

    Journal ref: IEEE Transactions on Knowledge and Data Engineering, 2019

  49. Sensor Fusion for Public Space Utilization Monitoring in a Smart City

    Authors: Billy Pik Lik Lau, Nipun Wijerathne, Benny Kai Kiat Ng, Chau Yuen

    Abstract: Public space utilization is crucial for urban developers to understand how efficiently a place is being occupied in order to improve existing or future infrastructures. In a smart cities approach, implementing public space monitoring with Internet-of-Things (IoT) sensors appears to be a viable solution. However, the choice of sensors is often a challenging problem and is often linked with scalability, cover…

    Submitted 5 October, 2017; v1 submitted 14 September, 2017; originally announced October 2017.

  50. arXiv:1708.00601  [pdf, other]

    cs.LG cs.CV cs.IT stat.ML

    Exact Tensor Completion from Sparsely Corrupted Observations via Convex Optimization

    Authors: Jonathan Q. Jiang, Michael K. Ng

    Abstract: This paper conducts a rigorous analysis for provable estimation of multidimensional arrays, in particular third-order tensors, from a random subset of its corrupted entries. Our study rests heavily on a recently proposed tensor algebraic framework in which we can obtain tensor singular value decomposition (t-SVD) that is similar to the SVD for matrices, and define a new notion of tensor rank refer…

    Submitted 2 August, 2017; originally announced August 2017.

    Comments: 36 pages, 9 figures