Showing 1–50 of 70 results for author: Xie, J

  1. arXiv:2507.07041  [pdf, ps, other]

    stat.ME cs.LG stat.ML

    Non-Asymptotic Analysis of Online Local Private Learning with SGD

    Authors: Enze Shi, Jinhan Xie, Bei Jiang, Linglong Kong, Xuming He

    Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) has been widely used for solving optimization problems with privacy guarantees in machine learning and statistics. Despite this, a systematic non-asymptotic convergence analysis for DP-SGD, particularly in the context of online problems and local differential privacy (LDP) models, remains largely elusive. Existing non-asymptotic analyses…

    Submitted 9 July, 2025; originally announced July 2025.

  2. arXiv:2506.17657  [pdf]

    stat.AP cs.CY

    Research on the recommendation framework of foreign enterprises from the perspective of multidimensional proximity

    Authors: Guoqiang Liang, Jiarui Xie, Mengxuan Li, Shuo Zhang

    Abstract: As global economic integration progresses, foreign-funded enterprises play an increasingly crucial role in fostering local economic growth and enhancing industrial development. However, few studies have addressed this aspect in recent years. This study utilizes the multidimensional proximity theory to thoroughly examine the criteria for selecting high-quality foreign-funded compani…

    Submitted 21 June, 2025; originally announced June 2025.

  3. arXiv:2506.12334  [pdf, ps, other]

    stat.ME math.ST

    A Generalized Framework for Approximate Co-Sufficient Sampling

    Authors: Jie Xie, Dongming Huang

    Abstract: Approximate co-sufficient sampling (aCSS) offers a principled route to hypothesis testing when null distributions are unknown, yet current implementations are confined to maximum likelihood estimators with smooth or linear regularization and provide little theoretical insight into power. We present a generalized framework that widens the scope of the aCSS method to embrace nonlinear regularization…

    Submitted 13 June, 2025; originally announced June 2025.

  4. arXiv:2506.05116  [pdf, ps, other]

    stat.ME econ.EM math.ST

    The Spurious Factor Dilemma: Robust Inference in Heavy-Tailed Elliptical Factor Models

    Authors: Jiang Hu, Jiahui Xie, Yangchun Zhang, Wang Zhou

    Abstract: Factor models are essential tools for analyzing high-dimensional data, particularly in economics and finance. However, standard methods for determining the number of factors often overestimate the true number when data exhibit heavy-tailed randomness, misinterpreting noise-induced outliers as genuine factors. This paper addresses this challenge within the framework of Elliptical Factor Models (EFM…

    Submitted 5 June, 2025; originally announced June 2025.

  5. arXiv:2505.14725  [pdf, ps, other]

    q-bio.GN cs.LG stat.AP

    HR-VILAGE-3K3M: A Human Respiratory Viral Immunization Longitudinal Gene Expression Dataset for Systems Immunity

    Authors: Xuejun Sun, Yiran Song, Xiaochen Zhou, Ruilie Cai, Yu Zhang, Xinyi Li, Rui Peng, Jialiu Xie, Yuanyuan Yan, Muyao Tang, Prem Lakshmanane, Baiming Zou, James S. Hagood, Raymond J. Pickles, Didong Li, Fei Zou, Xiaojing Zheng

    Abstract: Respiratory viral infections pose a global health burden, yet the cellular immune responses driving protection or pathology remain unclear. Natural infection cohorts often lack pre-exposure baseline data and structured temporal sampling. In contrast, inoculation and vaccination trials generate insightful longitudinal transcriptomic data. However, the scattering of these datasets across platforms,…

    Submitted 19 May, 2025; originally announced May 2025.

  6. arXiv:2505.08227  [pdf, ps, other]

    stat.ME

    Online differentially private inference in stochastic gradient descent

    Authors: Jinhan Xie, Enze Shi, Bei Jiang, Linglong Kong, Xuming He

    Abstract: We propose a general privacy-preserving optimization-based framework for real-time environments without requiring trusted data curators. In particular, we introduce a noisy stochastic gradient descent algorithm for online statistical inference with streaming data under local differential privacy constraints. Unlike existing methods that either disregard privacy protection or require full access to…

    Submitted 9 June, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

  7. arXiv:2504.06685  [pdf, other]

    stat.ME

    Testing Multivariate Conditional Independence Using Exchangeable Sampling and Sufficient Statistics

    Authors: Xiaotong Lin, Jie Xie, Fangqiao Tian, Dongming Huang

    Abstract: We consider testing multivariate conditional independence between a response Y and a covariate vector X given additional variables Z. We introduce the Multivariate Sufficient Statistic Conditional Randomization Test (MS-CRT), which generates exchangeable copies of X by conditioning on sufficient statistics of P(X|Z). MS-CRT requires no modelling assumption on Y and accommodates any test statistics…

    Submitted 9 April, 2025; originally announced April 2025.

    Comments: arXiv admin note: text overlap with arXiv:2312.01815

  8. arXiv:2503.15210  [pdf, other]

    stat.ML cs.LG

    Online federated learning framework for classification

    Authors: Wenxing Guo, Jinhan Xie, Jianya Lu, Bei Jiang, Hongsheng Dai, Linglong Kong

    Abstract: In this paper, we develop a novel online federated learning framework for classification, designed to handle streaming data from multiple clients while ensuring data privacy and computational efficiency. Our method leverages the generalized distance-weighted discriminant technique, making it robust to both homogeneous and heterogeneous data distributions across clients. In particular, we develop a…

    Submitted 19 March, 2025; originally announced March 2025.

  9. arXiv:2503.03634  [pdf, other]

    stat.ML cs.LG stat.ME

    Feature Matching Intervention: Leveraging Observational Data for Causal Representation Learning

    Authors: Haoze Li, Jun Xie

    Abstract: A major challenge in causal discovery from observational data is the absence of perfect interventions, making it difficult to distinguish causal features from spurious ones. We propose an innovative approach, Feature Matching Intervention (FMI), which uses a matching procedure to mimic perfect interventions. We define causal latent graphs, extending structural causal models to latent feature space…

    Submitted 5 March, 2025; originally announced March 2025.

  10. arXiv:2502.01567  [pdf, ps, other]

    cs.CL cs.LG stat.ML

    Latent Thought Models with Variational Bayes Inference-Time Computation

    Authors: Deqian Kong, Minglu Zhao, Dehong Xu, Bo Pang, Shu Wang, Edouardo Honig, Zhangzhang Si, Chuan Li, Jianwen Xie, Sirui Xie, Ying Nian Wu

    Abstract: We propose a novel class of language models, Latent Thought Models (LTMs), which incorporate explicit latent thought vectors that follow an explicit prior model in latent space. These latent thought vectors guide the autoregressive generation of ground tokens through a Transformer decoder. Training employs a dual-rate optimization process within the classical variational Bayes framework: fast lear…

    Submitted 6 June, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

  11. arXiv:2412.06233  [pdf, other]

    stat.ML cs.LG

    Representational Transfer Learning for Matrix Completion

    Authors: Yong He, Zeyu Li, Dong Liu, Kangxiang Qin, Jiahui Xie

    Abstract: We propose to transfer representational knowledge from multiple sources to a target noisy matrix completion task by aggregating singular subspaces information. Under our representational similarity framework, we first integrate linear representation information by solving a two-way principal component analysis problem based on a properly debiased matrix-valued dataset. After acquiring better colum…

    Submitted 9 December, 2024; originally announced December 2024.

  12. arXiv:2412.05174  [pdf, other]

    eess.SP stat.ME

    Compound Gaussian Radar Clutter Model With Positive Tempered Alpha-Stable Texture

    Authors: Xingxing Liao, Junhao Xie, Jie Zhou

    Abstract: The compound Gaussian (CG) family of distributions has achieved great success in modeling sea clutter. This work develops a flexible-tailed CG model to improve generality in clutter modeling, by introducing the positive tempered $α$-stable (PT$α$S) distribution to model clutter texture. The PT$α$S distribution exhibits widely tunable tails by tempering the heavy tails of the positive $α$-stable (P…

    Submitted 6 December, 2024; originally announced December 2024.

    Comments: 7 pages, 4 figures, 2 tables

  13. arXiv:2409.03845  [pdf, other]

    cs.LG stat.ML

    Latent Space Energy-based Neural ODEs

    Authors: Sheng Cheng, Deqian Kong, Jianwen Xie, Kookjin Lee, Ying Nian Wu, Yezhou Yang

    Abstract: This paper introduces novel deep dynamical models designed to represent continuous-time sequences. Our approach employs a neural emission model to generate each data point in the time series through a non-linear transformation of a latent state vector. The evolution of these latent states is implicitly defined by a neural ordinary differential equation (ODE), with the initial state drawn from an i…

    Submitted 5 February, 2025; v1 submitted 5 September, 2024; originally announced September 2024.

  14. arXiv:2409.03618  [pdf, other]

    stat.ML cs.LG

    DART2: a robust multiple testing method to smartly leverage helpful or misleading ancillary information

    Authors: Xuechan Li, Jichun Xie

    Abstract: In many applications of multiple testing, ancillary information is available, reflecting the hypothesis null or alternative status. Several methods have been developed to leverage this ancillary information to enhance testing power, typically requiring that the ancillary information be helpful enough to ensure favorable performance. In this paper, we develop a robust and effective distance-assisted mul…

    Submitted 5 September, 2024; originally announced September 2024.

    Comments: 26 pages, 6 figures

  15. arXiv:2405.16730  [pdf, other]

    cs.LG cs.AI stat.AP

    Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space

    Authors: Peiyu Yu, Dinghuai Zhang, Hengzhi He, Xiaojian Ma, Ruiyao Miao, Yifan Lu, Yasi Zhang, Deqian Kong, Ruiqi Gao, Jianwen Xie, Guang Cheng, Ying Nian Wu

    Abstract: Offline Black-Box Optimization (BBO) aims at optimizing a black-box function using the knowledge from a pre-collected offline dataset of function values and corresponding input designs. However, the high-dimensional and highly-multimodal input design space of the black-box function poses inherent challenges for most existing methods that model and operate directly upon input designs. These issues inclu…

    Submitted 26 May, 2024; originally announced May 2024.

  16. arXiv:2310.12667  [pdf, other]

    stat.ML cs.LG

    STANLEY: Stochastic Gradient Anisotropic Langevin Dynamics for Learning Energy-Based Models

    Authors: Belhal Karimi, Jianwen Xie, Ping Li

    Abstract: We propose in this paper, STANLEY, a STochastic gradient ANisotropic LangEvin dYnamics, for sampling high dimensional data. With the growing efficacy and potential of Energy-Based modeling, also known as non-normalized probabilistic modeling, for modeling a generative process of different natures of high dimensional data observations, we present an end-to-end learning algorithm for Energy-Based mo…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: text overlap with arXiv:1207.5938 by other authors

  17. arXiv:2310.03253  [pdf, other]

    cs.LG q-bio.BM stat.ML

    Molecule Design by Latent Prompt Transformer

    Authors: Deqian Kong, Yuhao Huang, Jianwen Xie, Ying Nian Wu

    Abstract: This paper proposes a latent prompt Transformer model for solving challenging optimization problems such as molecule design, where the goal is to find molecules with optimal values of a target chemical or biological property that can be computed by an existing software. Our proposed model consists of three components. (1) A latent vector whose prior distribution is modeled by a Unet transformation…

    Submitted 5 February, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

  18. arXiv:2309.05153  [pdf, other]

    stat.ML cs.LG

    Learning Energy-Based Models by Cooperative Diffusion Recovery Likelihood

    Authors: Yaxuan Zhu, Jianwen Xie, Yingnian Wu, Ruiqi Gao

    Abstract: Training energy-based models (EBMs) on high-dimensional data can be both challenging and time-consuming, and there exists a noticeable gap in sample quality between EBMs and other generative frameworks like GANs and diffusion models. To close this gap, inspired by the recent efforts of learning EBMs by maximizing diffusion recovery likelihood (DRL), we propose cooperative diffusion recovery likeli…

    Submitted 10 November, 2024; v1 submitted 10 September, 2023; originally announced September 2023.

  19. arXiv:2304.07918  [pdf, other]

    cs.CV stat.ML

    Likelihood-Based Generative Radiance Field with Latent Space Energy-Based Model for 3D-Aware Disentangled Image Representation

    Authors: Yaxuan Zhu, Jianwen Xie, Ping Li

    Abstract: We propose the NeRF-LEBM, a likelihood-based top-down 3D-aware 2D image generative model that incorporates 3D representation via Neural Radiance Fields (NeRF) and 2D imaging process via differentiable volume rendering. The model represents an image as a rendering process from 3D object to 2D image and is conditioned on some latent variables that account for object characteristics and are assumed t…

    Submitted 16 April, 2023; originally announced April 2023.

  20. arXiv:2303.05389  [pdf]

    cs.CL cs.AI cs.SI stat.AP

    Depression Detection Using Digital Traces on Social Media: A Knowledge-aware Deep Learning Approach

    Authors: Wenli Zhang, Jiaheng Xie, Zhu Zhang, Xiang Liu

    Abstract: Depression is a common disease worldwide. It is difficult to diagnose and continues to be underdiagnosed. Because depressed patients constantly share their symptoms, major life events, and treatments on social media, researchers are turning to user-generated digital traces on social media for depression detection. Such methods have distinct advantages in combating depression because they can facil…

    Submitted 1 August, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: Presented at INFORMS 2022 Data Science Workshop

    MSC Class: H.4.m ACM Class: K.5

  21. arXiv:2303.03532  [pdf, ps, other]

    stat.ME

    Extreme eigenvalues of sample covariance matrices under generalized elliptical models with applications

    Authors: Xiucai Ding, Jiahui Xie, Long Yu, Wang Zhou

    Abstract: We consider the extreme eigenvalues of the sample covariance matrix $Q=YY^*$ under the generalized elliptical model that $Y=Σ^{1/2}XD.$ Here $Σ$ is a bounded $p \times p$ positive definite deterministic matrix representing the population covariance structure, $X$ is a $p \times n$ random matrix containing either independent columns sampled from the unit sphere in $\mathbb{R}^p$ or i.i.d. centered…

    Submitted 19 April, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: 90 pages, 6 figures, some typos are corrected

  22. arXiv:2302.02457   

    stat.ME

    Scalable inference in functional linear regression with streaming data

    Authors: Jinhan Xie, Enze Shi, Peijun Sang, Zuofeng Shang, Bei Jiang, Linglong Kong

    Abstract: Traditional static functional data analysis is facing new challenges due to streaming data, where data constantly flow in. A major challenge is that storing such an ever-increasing amount of data in memory is nearly impossible. In addition, existing inferential tools in online learning are mainly developed for finite-dimensional problems, while inference methods for functional data are focused on…

    Submitted 10 October, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: Due to the request of one of the co-authors, we tentatively withdrew the manuscript

  23. arXiv:2301.09300  [pdf, other]

    stat.ML cs.LG

    A Tale of Two Latent Flows: Learning Latent Space Normalizing Flow with Short-run Langevin Flow for Approximate Inference

    Authors: Jianwen Xie, Yaxuan Zhu, Yifei Xu, Dingcheng Li, Ping Li

    Abstract: We study a normalizing flow in the latent space of a top-down generator model, in which the normalizing flow model plays the role of the informative prior model of the generator. We propose to jointly learn the latent space normalizing flow prior model and the top-down generator model by a Markov chain Monte Carlo (MCMC)-based maximum likelihood algorithm, where a short-run Langevin sampling from…

    Submitted 23 January, 2023; originally announced January 2023.

    Comments: The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI) 2023

  24. arXiv:2211.13748  [pdf, other]

    cs.CY cs.SI stat.AP

    How We Express Ourselves Freely: Censorship, Self-censorship, and Anti-censorship on a Chinese Social Media

    Authors: Xiang Chen, Jiamu Xie, Zixin Wang, Bohui Shen, Zhixuan Zhou

    Abstract: Censorship, anti-censorship, and self-censorship in an authoritarian regime have been extensively studied, yet the relationship between these intertwined factors is not well understood. In this paper, we report results of a large-scale survey study (N = 526) with Sina Weibo users toward bridging this research gap. Through descriptive statistics, correlation analysis, and regression analysis, we un…

    Submitted 24 November, 2022; originally announced November 2022.

    Comments: Accepted at iConference 2023

  25. arXiv:2210.01342  [pdf, other]

    stat.ME math.ST

    Estimating heterogeneous treatment effects versus building individualized treatment rules: Connection and disconnection

    Authors: Zhongyuan Chen, Jun Xie

    Abstract: Estimating heterogeneous treatment effects is a well-studied topic in the statistics literature. More recently, it has regained attention due to an increasing need for precision medicine as well as the increased use of state-of-the-art machine learning methods in the estimation. Furthermore, estimating heterogeneous treatment effects is directly related to building an individualized treatment rule, wh…

    Submitted 3 October, 2022; originally announced October 2022.

  26. arXiv:2210.00937  [pdf, ps, other]

    stat.ME

    Inference on High-dimensional Single-index Models with Streaming Data

    Authors: Dongxiao Han, Jinhan Xie, Jin Liu, Liuquan Sun, Jian Huang, Bei Jiang, Linglong Kong

    Abstract: Traditional statistical methods are faced with new challenges due to streaming data. The major challenge is the rapidly growing volume and velocity of data, which makes storing such huge datasets in memory impossible. The paper presents an online inference framework for regression parameters in high-dimensional semiparametric single-index models with unknown link functions. The proposed online pro…

    Submitted 3 October, 2022; originally announced October 2022.

    Comments: 38 pages, 2 figures

  27. arXiv:2205.06924  [pdf, other]

    stat.ML cs.LG

    A Tale of Two Flows: Cooperative Learning of Langevin Flow and Normalizing Flow Toward Energy-Based Model

    Authors: Jianwen Xie, Yaxuan Zhu, Jun Li, Ping Li

    Abstract: This paper studies the cooperative learning of two generative flow models, in which the two models are iteratively updated based on the jointly synthesized examples. The first flow model is a normalizing flow that transforms an initial simple density to a target density by applying a sequence of invertible transformations. The second flow model is a Langevin flow that runs finite steps of gradient…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: 23 pages

    Journal ref: ICLR 2022

  28. arXiv:2204.04567  [pdf, other]

    cs.CV cs.LG stat.ML

    Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification

    Authors: Jiangtao Xie, Fei Long, Jiaming Lv, Qilong Wang, Peihua Li

    Abstract: Few-shot classification is a challenging problem as only very few training examples are given for each new task. One of the effective research lines to address this challenge focuses on learning deep representations driven by a similarity measure between a query image and few support images of some class. Statistically, this amounts to measuring the dependency of image features, viewed as random vec…

    Submitted 9 April, 2022; originally announced April 2022.

    Comments: Accepted to CVPR 2022 as an oral presentation. Equal contribution from first two authors

  29. arXiv:2110.01873  [pdf, other]

    econ.EM stat.AP

    A New Multivariate Predictive Model for Stock Returns

    Authors: Jianying Xie

    Abstract: One of the most important studies in finance is to find out whether stock returns could be predicted. This research aims to create a new multivariate model, which includes dividend yield, earnings-to-price ratio, book-to-market ratio as well as consumption-wealth ratio as explanatory variables, for future stock returns predictions. The new multivariate model will be assessed for its forecasting pe…

    Submitted 5 October, 2021; originally announced October 2021.

  30. arXiv:2108.04364  [pdf, other]

    stat.ME

    Data-guided Treatment Recommendation with Feature Scores

    Authors: Zhongyuan Chen, Ziyi Wang, Qifan Song, Jun Xie

    Abstract: Despite the availability of large amounts of genomics data, medical treatment recommendations have not successfully used them. In this paper, we consider the utility of high dimensional genomic-clinical data and nonparametric methods for making cancer treatment recommendations. This builds upon the framework of the individualized treatment rule [Qian and Murphy 2011] but we aim to overcome their m…

    Submitted 19 February, 2022; v1 submitted 9 August, 2021; originally announced August 2021.

  31. arXiv:2103.11085  [pdf, other]

    stat.ME

    Distance Assisted Recursive Testing

    Authors: Xuechan Li, Anthony Sung, Jichun Xie

    Abstract: In many applications, a large number of features are collected with the goal to identify a few important ones. Sometimes, these features lie in a metric space with a known distance matrix, which partially reflects their co-importance pattern. Proper use of the distance matrix will boost the power of identifying important features. Hence, we develop a new multiple testing framework named the Distan…

    Submitted 24 September, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

  32. arXiv:2012.14936  [pdf, other]

    stat.ML cs.LG

    Learning Energy-Based Model with Variational Auto-Encoder as Amortized Sampler

    Authors: Jianwen Xie, Zilong Zheng, Ping Li

    Abstract: Due to the intractable partition function, training energy-based models (EBMs) by maximum likelihood requires Markov chain Monte Carlo (MCMC) sampling to approximate the gradient of the Kullback-Leibler divergence between data and model distributions. However, it is non-trivial to sample from an EBM because of the difficulty of mixing between modes. In this paper, we propose to learn a variational…

    Submitted 23 December, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

    Journal ref: Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI), pp10441--10451, 2021

  33. DS-UI: Dual-Supervised Mixture of Gaussian Mixture Models for Uncertainty Inference

    Authors: Jiyang Xie, Zhanyu Ma, Jing-Hao Xue, Guoqiang Zhang, Jun Guo

    Abstract: This paper proposes a dual-supervised uncertainty inference (DS-UI) framework for improving Bayesian estimation-based uncertainty inference (UI) in deep neural network (DNN)-based image recognition. In the DS-UI, we combine the classifier of a DNN, i.e., the last fully-connected (FC) layer, with a mixture of Gaussian mixture models (MoGMM) to obtain an MoGMM-FC layer. Unlike existing UI methods fo…

    Submitted 17 November, 2020; originally announced November 2020.

  34. Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization

    Authors: Jiyang Xie, Zhanyu Ma, Jianjun Lei, Guoqiang Zhang, Jing-Hao Xue, Zheng-Hua Tan, Jun Guo

    Abstract: Due to lack of data, overfitting ubiquitously exists in real-world applications of deep neural networks (DNNs). We propose advanced dropout, a model-free methodology, to mitigate overfitting and improve the performance of DNNs. The advanced dropout technique applies a model-free and easily implemented distribution with parametric prior, and adaptively adjusts dropout rate. Specifically, the distri…

    Submitted 10 August, 2021; v1 submitted 11 October, 2020; originally announced October 2020.

    Comments: Accepted by IEEE TPAMI, 2021

  35. arXiv:2008.09763  [pdf, other]

    cs.LG cs.CE stat.ML

    Variational Autoencoder for Anti-Cancer Drug Response Prediction

    Authors: Hongyuan Dong, Jiaqing Xie, Zhi Jing, Dexin Ren

    Abstract: Cancer is a primary cause of human death, but discovering drugs and tailoring cancer therapies are expensive and time-consuming. We seek to facilitate the discovery of new drugs and treatment strategies for cancer using variational autoencoders (VAEs) and multi-layer perceptrons (MLPs) to predict anti-cancer drug responses. Our model takes as input gene expression data of cancer cell lines and ant…

    Submitted 15 April, 2021; v1 submitted 22 August, 2020; originally announced August 2020.

  36. arXiv:2007.00426  [pdf]

    cs.GT cs.LG stat.ML

    Dynamic Bidding Strategies with Multivariate Feedback Control for Multiple Goals in Display Advertising

    Authors: Michael Tashman, Jiayi Xie, John Hoffman, Lee Winikor, Rouzbeh Gerami

    Abstract: Real-Time Bidding (RTB) display advertising is a method for purchasing display advertising inventory in auctions that occur within milliseconds. The performance of RTB campaigns is generally measured with a series of Key Performance Indicators (KPIs) - measurements used to ensure that the campaign is cost-effective and that it is purchasing valuable inventory. While an RTB campaign should ideally…

    Submitted 1 June, 2020; originally announced July 2020.

  37. arXiv:2006.10259  [pdf, other]

    q-bio.NC cs.LG stat.ML

    On Path Integration of Grid Cells: Group Representation and Isotropic Scaling

    Authors: Ruiqi Gao, Jianwen Xie, Xue-Xin Wei, Song-Chun Zhu, Ying Nian Wu

    Abstract: Understanding how grid cells perform path integration calculations remains a fundamental problem. In this paper, we conduct theoretical analysis of a general representation model of path integration by grid cells, where the 2D self-position is encoded as a higher dimensional vector, and the 2D self-motion is represented by a general transformation of the vector. We identify two conditions on the t…

    Submitted 3 November, 2021; v1 submitted 17 June, 2020; originally announced June 2020.

  38. arXiv:2005.11442  [pdf, other]

    cs.LG stat.ML

    Active Learning for Skewed Data Sets

    Authors: Abbas Kazerouni, Qi Zhao, Jing Xie, Sandeep Tata, Marc Najork

    Abstract: Consider a sequential active learning problem where, at each round, an agent selects a batch of unlabeled data points, queries their labels and updates a binary classifier. While there exists a rich body of work on active learning in this general form, in this paper, we focus on problems with two distinguishing characteristics: severe class imbalance (skew) and small amounts of initial training da…

    Submitted 22 May, 2020; originally announced May 2020.

  39. arXiv:2003.04575  [pdf, other]

    cs.LG stat.ML

    GPCA: A Probabilistic Framework for Gaussian Process Embedded Channel Attention

    Authors: Jiyang Xie, Dongliang Chang, Zhanyu Ma, Guoqiang Zhang, Jun Guo

    Abstract: Channel attention mechanisms have been commonly applied in many visual tasks for effective performance improvement. It is able to reinforce the informative channels as well as to suppress the useless channels. Recently, different channel attention modules have been proposed and implemented in various ways. Generally speaking, they are mainly based on convolution and pooling operations. In this pap…

    Submitted 10 August, 2021; v1 submitted 10 March, 2020; originally announced March 2020.

    Comments: Accepted by IEEE TPAMI, 2021

  40. arXiv:2001.06587  [pdf, ps, other]

    cs.LG cs.GT stat.ML

    Scalable Bid Landscape Forecasting in Real-time Bidding

    Authors: Aritra Ghosh, Saayan Mitra, Somdeb Sarkhel, Jason Xie, Gang Wu, Viswanathan Swaminathan

    Abstract: In programmatic advertising, ad slots are usually sold using second-price (SP) auctions in real-time. The highest bidding advertiser wins but pays only the second-highest bid (known as the winning price). In SP, for a single item, the dominant strategy of each bidder is to bid the true value from the bidder's perspective. However, in a practical setting, with budget constraints, bidding the true v…

    Submitted 17 January, 2020; originally announced January 2020.

    Comments: Appeared in ECML-PKDD 2019

  41. arXiv:1911.11374  [pdf, other]

    stat.ML cs.LG

    Representation Learning: A Statistical Perspective

    Authors: Jianwen Xie, Ruiqi Gao, Erik Nijkamp, Song-Chun Zhu, Ying Nian Wu

    Abstract: Learning representations of data is an important problem in statistics and machine learning. While the origin of learning representations can be traced back to factor analysis and multidimensional scaling in statistics, it has become a central theme in deep learning with important applications in computer vision and computational neuroscience. In this article, we review recent advances in learning…

    Submitted 26 November, 2019; originally announced November 2019.

    Journal ref: Annual Review of Statistics and Its Application 2020

  42. arXiv:1910.09396  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Efficient Projection-Free Online Methods with Stochastic Recursive Gradient

    Authors: Jiahao Xie, Zebang Shen, Chao Zhang, Boyu Wang, Hui Qian

    Abstract: This paper focuses on projection-free methods for solving smooth Online Convex Optimization (OCO) problems. Existing projection-free methods either achieve suboptimal regret bounds or have high per-iteration computational costs. To fill this gap, two efficient projection-free online methods called ORGFW and MORGFW are proposed for solving stochastic and adversarial OCO problems, respectively. By e…

    Submitted 23 October, 2019; v1 submitted 21 October, 2019; originally announced October 2019.

    Comments: 15 pages, 3 figures

  43. arXiv:1910.09223  [pdf, ps, other]

    cs.LG stat.ML

    Aggregated Gradient Langevin Dynamics

    Authors: Chao Zhang, Jiahao Xie, Zebang Shen, Peilin Zhao, Tengfei Zhou, Hui Qian

    Abstract: In this paper, we explore a general Aggregated Gradient Langevin Dynamics framework (AGLD) for the Markov Chain Monte Carlo (MCMC) sampling. We investigate the nonasymptotic convergence of AGLD with a unified analysis for different data accessing (e.g. random access, cyclic access and random reshuffle) and snapshot updating strategies, under convex and nonconvex settings respectively. It is the fi…

    Submitted 21 October, 2019; originally announced October 2019.

  44. arXiv:1910.02660  [pdf, other]

    cs.LG stat.ML

    Deep Kernel Learning via Random Fourier Features

    Authors: Jiaxuan Xie, Fanghui Liu, Kaijie Wang, Xiaolin Huang

    Abstract: Kernel learning methods are among the most effective learning methods and have been vigorously studied in the past decades. However, when tackling complicated tasks, classical kernel methods are not flexible or "rich" enough to describe the data and hence cannot yield satisfactory performance. In this paper, via Random Fourier Features (RFF), we successfully incorporate the deep architectu…

    Submitted 7 October, 2019; originally announced October 2019.

  45. arXiv:1907.04433  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

    Authors: Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

    Abstract: We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating). These toolkits provide state-of-the-art pre-trained models, training scripts, and training logs, to facilitate rapid prototyping and promote reproducible research. We also provide modular APIs with flexible building blocks to enable efficient customiza…

    Submitted 12 February, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

    Journal ref: Journal of Machine Learning Research 21 (2020) 1-7

  46. arXiv:1906.07757  [pdf]

    stat.ME

    TEAM: A Multiple Testing Algorithm on the Aggregation Tree for Flow Cytometry Analysis

    Authors: John Pura, Xuechan Li, Cliburn Chan, Jichun Xie

    Abstract: In immunology studies, flow cytometry is a commonly used multivariate single-cell assay. One key goal in flow cytometry analysis is to pinpoint the immune cells responsive to certain stimuli. Statistically, this problem can be translated into comparing two protein expression probability density functions (PDFs) before and after the stimulus; the goal is to pinpoint the regions where these two PDFs…

    Submitted 26 March, 2021; v1 submitted 18 June, 2019; originally announced June 2019.

  47. arXiv:1904.12590  [pdf, other]

    stat.AP physics.ao-ph

    Assimilation of semi-qualitative sea ice thickness data with the EnKF-SQ

    Authors: Abhishek Shah, Laurent Bertino, Francois Counillon, Mohamad El Gharamti, Jiping Xie

    Abstract: A newly introduced stochastic data assimilation method, the Ensemble Kalman Filter Semi-Qualitative (EnKF-SQ) is applied to a realistic coupled ice-ocean model of the Arctic, the TOPAZ4 configuration, in a twin experiment framework. The method is shown to add value to range-limited thin ice thickness measurements, as obtained from passive microwave remote sensing, with respect to more trivial solu…

    Submitted 29 April, 2019; originally announced April 2019.

    Comments: 24 pages, 11 figures, research article

  48. arXiv:1904.06836  [pdf, ps, other]

    cs.CV cs.AI stat.ML

    Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization

    Authors: Qilong Wang, Jiangtao Xie, Wangmeng Zuo, Lei Zhang, Peihua Li

    Abstract: Compared with global average pooling in existing deep convolutional neural networks (CNNs), global covariance pooling can capture richer statistics of deep features, having potential for improving representation and generalization abilities of deep CNNs. However, integration of global covariance pooling into deep CNNs brings two challenges: (1) robust covariance estimation given deep features of h…

    Submitted 10 August, 2020; v1 submitted 15 April, 2019; originally announced April 2019.

    Comments: Accepted to IEEE TPAMI. Code is at http://peihuali.org/MPN-COV/

  49. arXiv:1904.05453  [pdf, other]

    cs.LG stat.ML

    Energy-Based Continuous Inverse Optimal Control

    Authors: Yifei Xu, Jianwen Xie, Tianyang Zhao, Chris Baker, Yibiao Zhao, Ying Nian Wu

    Abstract: The problem of continuous inverse optimal control (over finite time horizon) is to learn the unknown cost function over the sequence of continuous control variables from expert demonstrations. In this article, we study this fundamental problem in the framework of energy-based model, where the observed expert trajectories are assumed to be random samples from a probability density function defined…

    Submitted 18 April, 2022; v1 submitted 10 April, 2019; originally announced April 2019.

  50. arXiv:1902.03871  [pdf, other]

    cs.NE cs.CV cs.LG stat.ML

    Learning V1 Simple Cells with Vector Representation of Local Content and Matrix Representation of Local Motion

    Authors: Ruiqi Gao, Jianwen Xie, Siyuan Huang, Yufan Ren, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper proposes a representational model for image pairs such as consecutive video frames that are related by local pixel displacements, in the hope that the model may shed light on motion perception in primary visual cortex (V1). The model couples the following two components: (1) the vector representations of local contents of images and (2) the matrix representations of local pixel displace…

    Submitted 5 April, 2022; v1 submitted 24 January, 2019; originally announced February 2019.