Showing 1–46 of 46 results for author: Wang, N

  1. arXiv:2506.05583  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Conformal Prediction Adaptive to Unknown Subpopulation Shifts

    Authors: Nien-Shao Wang, Duygu Nur Yaldiz, Yavuz Faruk Bakman, Sai Praneeth Karimireddy

    Abstract: Conformal prediction is widely used to equip black-box machine learning models with uncertainty quantification enjoying formal coverage guarantees. However, these guarantees typically break down in the presence of distribution shifts, where the data distribution at test time differs from the training (or calibration-time) distribution. In this work, we address subpopulation shifts, where the test…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: 20 pages, 6 figures, 5 tables, submitted to NeurIPS 2025

  2. arXiv:2504.13908  [pdf]

    cs.HC cs.AI stat.AP

    AI-Assisted Conversational Interviewing: Effects on Data Quality and User Experience

    Authors: Soubhik Barari, Jarret Angbazo, Natalie Wang, Leah M. Christian, Elizabeth Dean, Zoe Slowinski, Brandon Sepulvado

    Abstract: Standardized surveys scale efficiently but sacrifice depth, while conversational interviews improve response quality at the cost of scalability and consistency. This study bridges the gap between these methods by introducing a framework for AI-assisted conversational interviewing. To evaluate this framework, we conducted a web survey experiment where 1,800 participants were randomly assigned to te…

    Submitted 9 April, 2025; originally announced April 2025.

  3. arXiv:2411.17688  [pdf, other]

    stat.AP

    Cornering in the Water: An Investigation of Dolphin Swimming Performance

    Authors: Mingkai Xia, Junhan Zhang, Ningshan Wang, Gabriel Antoniak, Nicole West, Ding Zhang, Kenneth Alex Shorter

    Abstract: This article provides new insights into dolphin maneuver strategies in lap swimming tasks. However, most existing research focuses on straight-line swimming, leaving the study of dolphins' cornering strategies an open area. Challenges for directly analyzing dolphins' turning behavior include difficulties in motion tracking underwater and the inability to directly measure the propulsive forces. This p…

    Submitted 26 November, 2024; originally announced November 2024.

    Comments: 11 pages, 13 figures

  4. arXiv:2408.01575  [pdf, other]

    cs.LG physics.geo-ph stat.ML

    Deep Learning Framework for History Matching CO2 Storage with 4D Seismic and Monitoring Well Data

    Authors: Nanzhe Wang, Louis J. Durlofsky

    Abstract: Geological carbon storage entails the injection of megatonnes of supercritical CO2 into subsurface formations. The properties of these formations are usually highly uncertain, which makes design and optimization of large-scale storage operations challenging. In this paper we introduce a history matching strategy that enables the calibration of formation properties based on early-time observations.…

    Submitted 21 January, 2025; v1 submitted 2 August, 2024; originally announced August 2024.

    Comments: 52 pages, 21 figures

  5. arXiv:2407.13118  [pdf, other]

    q-bio.NC stat.CO

    Evaluating the evolution and inter-individual variability of infant functional module development from 0 to 5 years old

    Authors: Lingbin Bian, Nizhuan Wang, Yuanning Li, Adeel Razi, Qian Wang, Han Zhang, Dinggang Shen, the UNC/UMN Baby Connectome Project Consortium

    Abstract: The segregation and integration of infant brain networks undergo tremendous changes due to the rapid development of brain function and organization. Traditional methods for estimating brain modularity usually rely on group-averaged functional connectivity (FC), often overlooking individual variability. To address this, we introduce a novel approach utilizing Bayesian modeling to analyze the dynami…

    Submitted 17 July, 2024; originally announced July 2024.

  6. arXiv:2404.04800  [pdf, other]

    cs.LG cs.CV stat.ML

    Coordinated Sparse Recovery of Label Noise

    Authors: Yukun Yang, Naihao Wang, Haixin Yang, Ruirui Li

    Abstract: Label noise is a common issue in real-world datasets that inevitably impacts the generalization of models. This study focuses on robust classification tasks where the label noise is instance-dependent. Estimating the transition matrix accurately in this task is challenging, and methods based on sample selection often exhibit confirmation bias to varying degrees. Sparse over-parameterized training…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: Pre-print prior to submission to journal

  7. arXiv:2306.12125  [pdf, other]

    stat.ME

    High-dimensional Tensor Response Regression using the t-Distribution

    Authors: Ning Wang, Xin Zhang, Qing Mai

    Abstract: In recent years, promising statistical modeling approaches to tensor data analysis have been rapidly developed. Traditional multivariate analysis tools, such as multivariate regression and discriminant analysis, are generalized from modeling random vectors and matrices to higher-order random tensors. One of the biggest challenges to statistical tensor models is the non-Gaussian nature of many real…

    Submitted 21 June, 2023; originally announced June 2023.

  8. arXiv:2301.07386  [pdf, other]

    q-bio.NC stat.AP

    Hierarchical Bayesian inference for community detection and connectivity of functional brain networks

    Authors: Lingbin Bian, Nizhuan Wang, Leonardo Novelli, Jonathan Keith, Adeel Razi

    Abstract: Many functional magnetic resonance imaging (fMRI) studies rely on estimates of hierarchically organised brain networks whose segregation and integration reflect the dynamic transitions of latent cognitive states. However, most existing methods for estimating the community structure of networks from both individual and group-level analysis neglect the variability between subjects and lack validatio…

    Submitted 26 May, 2024; v1 submitted 18 January, 2023; originally announced January 2023.

  9. arXiv:2210.01019  [pdf, other]

    stat.ML cs.LG

    Plateau in Monotonic Linear Interpolation -- A "Biased" View of Loss Landscape for Deep Networks

    Authors: Xiang Wang, Annie N. Wang, Mo Zhou, Rong Ge

    Abstract: Monotonic linear interpolation (MLI) - on the line connecting a random initialization with the minimizer it converges to, the loss and accuracy are monotonic - is a phenomenon that is commonly observed in the training of neural networks. Such a phenomenon may seem to suggest that optimization of neural networks is easy. In this paper, we show that the MLI property is not necessarily related to the…

    Submitted 14 February, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: ICLR 2023

  10. arXiv:2201.00698  [pdf]

    cs.LG cs.AI stat.ML

    Deep-learning-based upscaling method for geologic models via theory-guided convolutional neural network

    Authors: Nanzhe Wang, Qinzhuo Liao, Haibin Chang, Dongxiao Zhang

    Abstract: Large-scale or high-resolution geologic models usually comprise a huge number of grid blocks, which can be computationally demanding and time-consuming to solve with numerical simulators. Therefore, it is advantageous to upscale geologic models (e.g., hydraulic conductivity) from fine-scale (high-resolution grids) to coarse-scale systems. Numerical upscaling methods have been proven to be effectiv…

    Submitted 31 December, 2021; originally announced January 2022.

    Comments: 37 pages, 21 pages

  11. arXiv:2107.04855  [pdf, ps, other]

    cs.LG stat.ML

    Kernel Mean Estimation by Marginalized Corrupted Distributions

    Authors: Xiaobo Xia, Shuo Shan, Mingming Gong, Nannan Wang, Fei Gao, Haikun Wei, Tongliang Liu

    Abstract: Estimating the kernel mean in a reproducing kernel Hilbert space is a critical component in many kernel learning algorithms. Given a finite sample, the standard estimate of the target kernel mean is the empirical average. Previous works have shown that better estimators can be constructed by shrinkage methods. In this work, we propose to corrupt data examples with noise from known distributions an…

    Submitted 10 July, 2021; originally announced July 2021.

  12. arXiv:2103.04387  [pdf, other]

    cs.LG stat.ML

    CORe: Capitalizing On Rewards in Bandit Exploration

    Authors: Nan Wang, Branislav Kveton, Maryam Karimzadehgan

    Abstract: We propose a bandit algorithm that explores purely by randomizing its past observations. In particular, the sufficient optimism in the mean reward estimates is achieved by exploiting the variance in the past observed rewards. We name the algorithm Capitalizing On Rewards (CORe). The algorithm is general and can be easily applied to different bandit settings. The main benefit of CORe is that its ex…

    Submitted 7 March, 2021; originally announced March 2021.

  13. arXiv:2011.14990  [pdf, other]

    q-bio.NC stat.ME

    Multiscale Comparative Connectomics

    Authors: Vivek Gopalakrishnan, Jaewon Chung, Eric Bridgeford, Benjamin D. Pedigo, Jesús Arroyo, Lucy Upchurch, G. Allan Johnson, Nian Wang, Youngser Park, Carey E. Priebe, Joshua T. Vogelstein

    Abstract: The connectome, a map of the structural and/or functional connections in the brain, provides a complex representation of the neurobiological phenotypes on which it supervenes. This information-rich data modality has the potential to transform our understanding of the relationship between patterns in brain connectivity and neurological processes, disorders, and diseases. However, existing computati…

    Submitted 2 December, 2024; v1 submitted 30 November, 2020; originally announced November 2020.

  14. arXiv:2011.11981  [pdf]

    cs.LG math.NA physics.comp-ph stat.ML

    Deep-learning based discovery of partial differential equations in integral form from sparse and noisy data

    Authors: Hao Xu, Dongxiao Zhang, Nanzhe Wang

    Abstract: Data-driven discovery of partial differential equations (PDEs) has attracted increasing attention in recent years. Although significant progress has been made, certain unresolved issues remain. For example, for PDEs with high-order derivatives, the performance of existing methods is unsatisfactory, especially when the data are sparse and noisy. It is also difficult to discover heterogeneous parame…

    Submitted 24 November, 2020; originally announced November 2020.

    Journal ref: Journal of Computational Physics, 445 (2021), 110592

  15. Theory-guided Auto-Encoder for Surrogate Construction and Inverse Modeling

    Authors: Nanzhe Wang, Haibin Chang, Dongxiao Zhang

    Abstract: A Theory-guided Auto-Encoder (TgAE) framework is proposed for surrogate construction and is further used for uncertainty quantification and inverse modeling tasks. The framework is built based on the Auto-Encoder (or Encoder-Decoder) architecture of convolutional neural network (CNN) via a theory-guided training process. In order to achieve the theory-guided training, the governing equations of th…

    Submitted 17 November, 2020; originally announced November 2020.

    Journal ref: Comput. Methods Appl. Mech. Engrg., 385 (2021), 114037

  16. arXiv:2008.10159  [pdf, ps, other]

    cs.LG math.OC stat.ML

    A Lagrangian Dual-based Theory-guided Deep Neural Network

    Authors: Miao Rong, Dongxiao Zhang, Nanzhe Wang

    Abstract: The theory-guided neural network (TgNN) is a kind of method which improves the effectiveness and efficiency of neural network architectures by incorporating scientific knowledge or physical information. Despite its great success, the theory-guided (deep) neural network possesses certain limits when maintaining a tradeoff between training data and domain knowledge during the training process. In th…

    Submitted 23 August, 2020; originally announced August 2020.

    Comments: 12 pages, 10 figures

    Journal ref: Complex & Intelligent Systems, 2022

  17. arXiv:2007.15580  [pdf]

    eess.SP cs.LG math.OC physics.comp-ph stat.ML

    Deep-Learning based Inverse Modeling Approaches: A Subsurface Flow Example

    Authors: Nanzhe Wang, Haibin Chang, Dongxiao Zhang

    Abstract: Deep-learning has achieved good performance and shown great potential for solving forward and inverse problems. In this work, two categories of innovative deep-learning based inverse modeling methods are proposed and compared. The first category is deep-learning surrogate-based inversion methods, in which the Theory-guided Neural Network (TgNN) is constructed as a deep-learning surrogate for probl…

    Submitted 28 July, 2020; originally announced July 2020.

    Comments: 53 pages, 22 figures, 7 tables

    Journal ref: Journal of Geophysical Research: Solid Earth, e2020JB020549, 2020

  18. Space-time clustering of flash floods in a changing climate (China, 1950-2015)

    Authors: Nan Wang, Luigi Lombardo, Marj Tonini, Weiming Cheng, Liang Guo, Junnan Xiong

    Abstract: The persistence over space and time of flash flood disasters -- flash floods that have caused either economic or life losses, or both -- is a diagnostic measure of areas subjected to hydrological risk. The concept of persistence can be assessed via clustering analyses, performed here to analyse the national inventory of flash flood disasters in China that occurred in the period 1950-2015. Specifical…

    Submitted 23 June, 2020; originally announced June 2020.

  19. arXiv:2006.09978  [pdf, other]

    cs.IR cs.LG stat.ML

    Directional Multivariate Ranking

    Authors: Nan Wang, Hongning Wang

    Abstract: User-provided multi-aspect evaluations manifest users' detailed feedback on the recommended items and enable fine-grained understanding of their preferences. Extensive studies have shown that modeling such data greatly improves the effectiveness and explainability of the recommendations. However, as ranking is essential in recommendation, there is no principled solution yet for collectively genera…

    Submitted 9 June, 2020; originally announced June 2020.

    Comments: Accepted as a full research paper in KDD'20

  20. arXiv:2006.07836  [pdf, other]

    cs.LG stat.ML

    Part-dependent Label Noise: Towards Instance-dependent Label Noise

    Authors: Xiaobo Xia, Tongliang Liu, Bo Han, Nannan Wang, Mingming Gong, Haifeng Liu, Gang Niu, Dacheng Tao, Masashi Sugiyama

    Abstract: Learning with instance-dependent label noise is challenging, because it is hard to model such real-world noise. Note that there is psychological and physiological evidence showing that we humans perceive instances by decomposing them into parts. Annotators are therefore more likely to annotate instances based on the parts rather than the whole instances, where a wrong mapping from p…

    Submitted 2 December, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

  21. arXiv:2006.07831  [pdf, other]

    cs.LG stat.ML

    Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels

    Authors: Songhua Wu, Xiaobo Xia, Tongliang Liu, Bo Han, Mingming Gong, Nannan Wang, Haifeng Liu, Gang Niu

    Abstract: Learning with noisy labels has attracted a lot of attention in recent years, where the mainstream approaches are in pointwise manners. Meanwhile, pairwise manners have shown great potential in supervised metric learning and unsupervised contrastive learning. Thus, a natural question is raised: does learning in a pairwise manner mitigate label noise? To give an affirmative answer, in this paper, we…

    Submitted 17 June, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

  22. arXiv:2005.04139  [pdf, other]

    stat.ML cs.LG stat.AP

    The scalable Birth-Death MCMC Algorithm for Mixed Graphical Model Learning with Application to Genomic Data Integration

    Authors: Nanwei Wang, Laurent Briollais, Helene Massam

    Abstract: Recent advances in biological research have seen the emergence of high-throughput technologies with numerous applications that allow the study of biological mechanisms at an unprecedented depth and scale. A large amount of genomic data is now distributed through consortia like The Cancer Genome Atlas (TCGA), where specific types of biological information on specific type of tissue or cell are avai…

    Submitted 8 May, 2020; originally announced May 2020.

  23. arXiv:2004.13560  [pdf]

    eess.SP cs.LG physics.comp-ph stat.ML

    Efficient Uncertainty Quantification for Dynamic Subsurface Flow with Surrogate by Theory-guided Neural Network

    Authors: Nanzhe Wang, Haibin Chang, Dongxiao Zhang

    Abstract: Subsurface flow problems usually involve some degree of uncertainty. Consequently, uncertainty quantification is commonly necessary for subsurface flow prediction. In this work, we propose a methodology for efficient uncertainty quantification for dynamic subsurface flow with a surrogate constructed by the Theory-guided Neural Network (TgNN). The TgNN here is specially designed for problems with s…

    Submitted 25 April, 2020; originally announced April 2020.

  24. arXiv:2004.01328  [pdf, other]

    stat.ME

    Penalized composite likelihood for colored graphical Gaussian models

    Authors: Qiong Li, Xiaoying Sun, Nanwei Wang

    Abstract: This paper proposes a penalized composite likelihood method for model selection in colored graphical Gaussian models. The method provides a sparse and symmetry-constrained estimator of the precision matrix, and thus conducts model selection and precision matrix estimation simultaneously. In particular, the method uses penalty terms to constrain the elements of the precision matrix, which enables u…

    Submitted 2 April, 2020; originally announced April 2020.

    MSC Class: 62H12 (Primary); 62F15 (Secondary)

  25. arXiv:2002.06508  [pdf, other]

    cs.LG stat.ML

    Multi-Class Classification from Noisy-Similarity-Labeled Data

    Authors: Songhua Wu, Xiaobo Xia, Tongliang Liu, Bo Han, Mingming Gong, Nannan Wang, Haifeng Liu, Gang Niu

    Abstract: A similarity label indicates whether two instances belong to the same class while a class label shows the class of the instance. Without class labels, a multi-class classifier could be learned from similarity-labeled pairwise data by meta classification learning. However, since the similarity label is less informative than the class label, it is more likely to be noisy. Deep neural networks can ea…

    Submitted 16 February, 2020; originally announced February 2020.

  26. Deep Learning of Subsurface Flow via Theory-guided Neural Network

    Authors: Nanzhe Wang, Dongxiao Zhang, Haibin Chang, Heng Li

    Abstract: Active research is currently being performed to incorporate the wealth of scientific knowledge into data-driven approaches (e.g., neural networks) in order to improve the latter's effectiveness. In this study, the Theory-guided Neural Network (TgNN) is proposed for deep learning of subsurface flow. In the TgNN, as supervised learning, the neural network is trained with available observations or…

    Submitted 24 October, 2019; originally announced November 2019.

    Journal ref: Journal of Hydrology, 2020, 584, 124700

  27. arXiv:1907.09693  [pdf, other]

    cs.LG cs.CR cs.DB stat.ML

    A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection

    Authors: Qinbin Li, Zeyi Wen, Zhaomin Wu, Sixu Hu, Naibo Wang, Yuan Li, Xu Liu, Bingsheng He

    Abstract: Federated learning has been a hot research topic in enabling the collaborative training of machine learning models among different organizations under the privacy restrictions. As researchers try to support more machine learning models with different privacy-preserving approaches, there is a requirement in developing systems and infrastructures to ease the development of various federated learning…

    Submitted 4 December, 2021; v1 submitted 23 July, 2019; originally announced July 2019.

    Comments: Accepted to IEEE Transactions on Knowledge and Data Engineering (TKDE)

  28. arXiv:1906.02037  [pdf, other]

    cs.IR cs.LG stat.ML

    The FacT: Taming Latent Factor Models for Explainability with Factorization Trees

    Authors: Yiyi Tao, Yiling Jia, Nan Wang, Hongning Wang

    Abstract: Latent factor models have achieved great success in personalized recommendations, but they are also notoriously difficult to explain. In this work, we integrate regression trees to guide the learning of latent factor models for recommendation, and use the learnt tree structure to explain the resulting latent factors. Specifically, we build regression trees on users and items respectively with user…

    Submitted 3 June, 2019; originally announced June 2019.

    Comments: In proceedings of SIGIR'19

  29. arXiv:1906.00189  [pdf, other]

    cs.LG stat.ML

    Are Anchor Points Really Indispensable in Label-Noise Learning?

    Authors: Xiaobo Xia, Tongliang Liu, Nannan Wang, Bo Han, Chen Gong, Gang Niu, Masashi Sugiyama

    Abstract: In label-noise learning, the noise transition matrix, denoting the probabilities that clean labels flip into noisy labels, plays a central role in building statistically consistent classifiers. Existing theories have shown that the transition matrix can be learned by exploiting anchor points (i.e., data points that belong to a specific class almost surely). However, when the…

    Submitted 16 December, 2019; v1 submitted 1 June, 2019; originally announced June 2019.

    Comments: Accepted by NeurIPS 2019

  30. arXiv:1905.05987  [pdf, ps, other]

    cs.LG stat.ML

    EasiCS: the objective and fine-grained classification method of cervical spondylosis dysfunction

    Authors: Nana Wang, Li Cui, Xi Huang, Yingcong Xiang, Jing Xiao, Yi Rao

    Abstract: Precise diagnosis is of great significance in developing precise treatment plans to restore neck function and reduce the burden posed by cervical spondylosis (CS). However, the currently available neck function assessment methods are subjective and coarse-grained. In this paper, based on the relationship among CS, cervical structure, cervical vertebra function, and surface electromyography (s…

    Submitted 15 May, 2019; originally announced May 2019.

  31. arXiv:1901.06588  [pdf, other]

    cs.LG stat.ML

    Accumulation Bit-Width Scaling For Ultra-Low Precision Training Of Deep Networks

    Authors: Charbel Sakr, Naigang Wang, Chia-Yu Chen, Jungwook Choi, Ankur Agrawal, Naresh Shanbhag, Kailash Gopalakrishnan

    Abstract: Efforts to reduce the numerical precision of computations in deep learning training have yielded systems that aggressively quantize weights and activations, yet employ wide high-precision accumulators for partial sums in inner-product operations to preserve the quality of convergence. The absence of any framework to analyze the precision requirements of partial sum accumulations results in conserv…

    Submitted 19 January, 2019; originally announced January 2019.

    Comments: Published as a conference paper in ICLR 2019

  32. arXiv:1901.06247  [pdf, other]

    cs.LG stat.ML

    Micro- and Macro-Level Churn Analysis of Large-Scale Mobile Games

    Authors: Xi Liu, Muhe Xie, Xidao Wen, Rui Chen, Yong Ge, Nick Duffield, Na Wang

    Abstract: As mobile devices become more and more popular, mobile gaming has emerged as a promising market with billion-dollar revenues. A variety of mobile game platforms and services have been developed around the world. A critical challenge for these platforms and services is to understand the churn behavior in mobile games, which usually involves churn at micro level (between an app and a specific user)…

    Submitted 14 January, 2019; originally announced January 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1808.06573

  33. arXiv:1812.08011  [pdf, other]

    cs.LG stat.ML

    Training Deep Neural Networks with 8-bit Floating Point Numbers

    Authors: Naigang Wang, Jungwook Choi, Daniel Brand, Chia-Yu Chen, Kailash Gopalakrishnan

    Abstract: The state-of-the-art hardware platforms for training Deep Neural Networks (DNNs) are moving from traditional single precision (32-bit) computations towards 16 bits of precision -- in large part due to the high energy efficiency and smaller bit storage associated with using reduced-precision representations. However, unlike inference, training with numbers represented with less than 16 bits has bee…

    Submitted 19 December, 2018; originally announced December 2018.

    Comments: NeurIPS 2018 (12 pages)

  34. arXiv:1812.04912  [pdf, ps, other]

    cs.LG stat.ML

    EasiCSDeep: A deep learning model for Cervical Spondylosis Identification using surface electromyography signal

    Authors: Nana Wang, Li Cui, Xi Huang, Yingcong Xiang, Jing Xiao

    Abstract: Cervical spondylosis (CS) is a common chronic disease that affects up to two-thirds of the population and poses a serious burden on individuals and society. The early identification has significant value in improving cure rate and reducing costs. However, the pathology is complex, and the mild symptoms increase the difficulty of the diagnosis, especially in the early stage. Besides, the time-consu…

    Submitted 12 December, 2018; originally announced December 2018.

  35. arXiv:1812.04287  [pdf, other]

    cs.LG cs.CV stat.ML

    Deep Density-based Image Clustering

    Authors: Yazhou Ren, Ni Wang, Mingxia Li, Zenglin Xu

    Abstract: Recently, deep clustering, which is able to perform feature learning that favors clustering tasks via deep neural networks, has achieved remarkable performance in image clustering applications. However, the existing deep clustering algorithms generally need the number of clusters in advance, which is usually unknown in real-world tasks. In addition, the initial cluster centers in the learned featu…

    Submitted 11 December, 2018; originally announced December 2018.

  36. arXiv:1808.06573  [pdf, other]

    cs.LG stat.ML

    A Semi-Supervised and Inductive Embedding Model for Churn Prediction of Large-Scale Mobile Games

    Authors: Xi Liu, Muhe Xie, Xidao Wen, Rui Chen, Yong Ge, Nick Duffield, Na Wang

    Abstract: Mobile gaming has emerged as a promising market with billion-dollar revenues. A variety of mobile game platforms and services have been developed around the world. One critical challenge for these platforms and services is to understand user churn behavior in mobile games. Accurate churn prediction will benefit many stakeholders such as game developers, advertisers, and platform operators. In this…

    Submitted 10 October, 2018; v1 submitted 20 August, 2018; originally announced August 2018.

    Comments: to appear in ICDM 2018

  37. arXiv:1808.03388  [pdf]

    cs.ET cs.LG stat.ML

    Code-division multiplexed resistive pulse sensor networks for spatio-temporal detection of particles in microfluidic devices

    Authors: Ningquan Wang, Ruxiu Liu, Roozbeh Khodambashi, Norh Asmare, A. Fatih Sarioglu

    Abstract: Spatial separation of suspended particles based on contrast in their physical or chemical properties forms the basis of various biological assays performed on lab-on-a-chip devices. To electronically acquire this information, we have recently introduced a microfluidic sensing platform, called Microfluidic CODES, which combines the resistive pulse sensing with the code division multiple access in mu…

    Submitted 9 August, 2018; originally announced August 2018.

    Comments: 2017 IEEE 30th International Conference on Micro Electro Mechanical Systems (MEMS)

    MSC Class: I.2.9

  38. arXiv:1505.00853  [pdf, other]

    cs.LG cs.CV stat.ML

    Empirical Evaluation of Rectified Activations in Convolutional Network

    Authors: Bing Xu, Naiyan Wang, Tianqi Chen, Mu Li

    Abstract: In this paper we investigate the performance of different types of rectified activation functions in convolutional neural networks: standard rectified linear unit (ReLU), leaky rectified linear unit (Leaky ReLU), parametric rectified linear unit (PReLU) and a new randomized leaky rectified linear unit (RReLU). We evaluate these activation functions on a standard image classification task. Our experim…

    Submitted 27 November, 2015; v1 submitted 4 May, 2015; originally announced May 2015.

  39. arXiv:1504.05434  [pdf, other]

    stat.ML

    A local approach to estimation in discrete loglinear models

    Authors: Helene Massam, Nanwei Wang

    Abstract: We consider two connected aspects of maximum likelihood estimation of the parameter for high-dimensional discrete graphical models: the existence of the maximum likelihood estimate (mle) and its computation. When the data is sparse, there are many zeros in the contingency table and the maximum likelihood estimate of the parameter may not exist. Fienberg and Rinaldo (2012) have shown that the mle…

    Submitted 21 April, 2015; originally announced April 2015.

    Comments: 36 pages, 1 figure and 3 tables

    MSC Class: 62H17; 62M40

  40. arXiv:1409.2944  [pdf, other]

    cs.LG cs.CL cs.IR cs.NE stat.ML

    Collaborative Deep Learning for Recommender Systems

    Authors: Hao Wang, Naiyan Wang, Dit-Yan Yeung

    Abstract: Collaborative filtering (CF) is a successful approach commonly used by many recommender systems. Conventional CF-based methods use the ratings given to items by users as the sole source of information for learning to make recommendation. However, the ratings are often very sparse in many applications, causing CF-based methods to degrade significantly in their recommendation performance. To address…

    Submitted 18 June, 2015; v1 submitted 9 September, 2014; originally announced September 2014.

  41. arXiv:1406.7349  [pdf]

    stat.ML q-bio.QM

    Convex Analysis of Mixtures for Separating Non-negative Well-grounded Sources

    Authors: Yitan Zhu, Niya Wang, David J. Miller, Yue Wang

    Abstract: Blind Source Separation (BSS) has proven to be a powerful tool for the analysis of composite patterns in engineering and science. We introduce Convex Analysis of Mixtures (CAM) for separating non-negative well-grounded sources, which learns the mixing matrix by identifying the lateral edges of the convex data scatter plot. We prove a sufficient and necessary condition for identifying the mixing ma…

    Submitted 10 December, 2015; v1 submitted 27 June, 2014; originally announced June 2014.

    Comments: 15 pages, 9 figures, 2 tables

  42. arXiv:1401.5900  [pdf, ps, other]

    cs.NE cs.LG stat.ML

    Gaussian-binary Restricted Boltzmann Machines on Modeling Natural Image Statistics

    Authors: Nan Wang, Jan Melchior, Laurenz Wiskott

    Abstract: We present a theoretical analysis of Gaussian-binary restricted Boltzmann machines (GRBMs) from the perspective of density models. The key aspect of this analysis is to show that GRBMs can be formulated as a constrained mixture of Gaussians, which gives a much better insight into the model's capabilities and limitations. We show that GRBMs are capable of learning meaningful features both in a two-…

    Submitted 23 January, 2014; originally announced January 2014.

    Comments: Current version is only an early manuscript and is subject to further change

    Journal ref: PLoS ONE 12(2): e0171015 (2017)

  43. arXiv:1310.7033  [pdf]

    stat.ML q-bio.GN q-bio.QM stat.AP

    A feasible roadmap for unsupervised deconvolution of two-source mixed gene expressions

    Authors: Niya Wang, Eric P. Hoffman, Robert Clarke, Zhen Zhang, David M. Herrington, Ie-Ming Shih, Douglas A. Levine, Guoqiang Yu, Jianhua Xuan, Yue Wang

    Abstract: Tissue heterogeneity is a major confounding factor in studying individual populations that cannot be resolved directly by global profiling. Experimental solutions to mitigate tissue heterogeneity are expensive, time consuming, inapplicable to existing data, and may alter the original gene expression patterns. Here we ask whether it is possible to deconvolute two-source mixed expressions (estimatin…

    Submitted 25 October, 2013; originally announced October 2013.

    Comments: 5 pages, 5 figures, 3 tables

  44. arXiv:1310.5666  [pdf, ps, other]

    stat.ML

    Distributed parameter estimation of discrete hierarchical models via marginal likelihoods

    Authors: Helene Massam, Nanwei Wang

    Abstract: We consider discrete graphical models Markov with respect to a graph $G$ and propose two distributed marginal methods to estimate the maximum likelihood estimate of the canonical parameter of the model. Both methods are based on a relaxation of the marginal likelihood obtained by considering the density of the variables represented by a vertex $v$ of $G$ and a neighborhood. The two methods differ…

    Submitted 21 October, 2013; originally announced October 2013.

    Comments: 21 pages, 7 figures

    MSC Class: 62H17 (Primary); 62M40

  45. Unsupervised deconvolution of dynamic imaging reveals intratumor vascular heterogeneity

    Authors: Li Chen, Peter L. Choyke, Niya Wang, Robert Clarke, Zaver M. Bhujwalla, Elizabeth M. C. Hillman, Yue Wang

    Abstract: Intratumor heterogeneity is often manifested by vascular compartments with distinct pharmacokinetics that cannot be resolved directly by in vivo dynamic imaging. We developed tissue-specific compartment modeling (TSCM), an unsupervised computational method of deconvolving dynamic imaging series from heterogeneous tumors that can improve vascular phenotyping in many biological contexts. Applying TS…

    Submitted 4 September, 2014; v1 submitted 14 June, 2013; originally announced June 2013.

    Comments: Content: main manuscript, 31 pages

  46. arXiv:1210.2464  [pdf, other]

    stat.ME

    On The Degrees of Freedom of Reduced-rank Estimators in Multivariate Regression

    Authors: Ashin Mukherjee, Kun Chen, Naisyin Wang, Ji Zhu

    Abstract: In this paper we study the effective degrees of freedom of a general class of reduced rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation (SURE). We derive a finite-sample exact unbiased estimator that admits a closed-form expression in terms of the singular values or thresholded singular values of the least squares solution and hence readily computable…

    Submitted 19 April, 2013; v1 submitted 8 October, 2012; originally announced October 2012.

    Comments: 29 pages, 3 figures