Showing 1–50 of 83 results for author: Guo, J

Searching in archive stat.
  1. arXiv:2506.06644  [pdf, ps, other]

    cs.LG stat.ML

    Spark Transformer: Reactivating Sparsity in FFN and Attention

    Authors: Chong You, Kan Wu, Zhipeng Jia, Lin Chen, Srinadh Bhojanapalli, Jiaxian Guo, Utku Evci, Jan Wassenberg, Praneeth Netrapalli, Jeremiah J. Willcock, Suvinay Subramanian, Felix Chern, Alek Andreev, Shreya Pathak, Felix Yu, Prateek Jain, David E. Culler, Henry M. Levy, Sanjiv Kumar

    Abstract: The discovery of the lazy neuron phenomenon in trained Transformers, where the vast majority of neurons in their feed-forward networks (FFN) are inactive for each token, has spurred tremendous interests in activation sparsity for enhancing large model efficiency. While notable progress has been made in translating such sparsity to wall-time benefits, modern Transformers have moved away from the Re… ▽ More

    Submitted 6 June, 2025; originally announced June 2025.

  2. arXiv:2506.02413  [pdf, ps, other]

    stat.ML cs.LG

    Tensor State Space-based Dynamic Multilayer Network Modeling

    Authors: Tian Lan, Jie Guo, Chen Zhang

    Abstract: Understanding the complex interactions within dynamic multilayer networks is critical for advancements in various scientific domains. Existing models often fail to capture such networks' temporal and cross-layer dynamics. This paper introduces a novel Tensor State Space Model for Dynamic Multilayer Networks (TSSDMN), utilizing a latent space model framework. TSSDMN employs a symmetric Tucker decom… ▽ More

    Submitted 2 June, 2025; originally announced June 2025.

  3. arXiv:2505.12225  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Reward Inside the Model: A Lightweight Hidden-State Reward Model for LLM's Best-of-N sampling

    Authors: Jizhou Guo, Zhaomin Wu, Philip S. Yu

    Abstract: High-quality reward models are crucial for unlocking the reasoning potential of large language models (LLMs), with best-of-N voting demonstrating significant performance gains. However, current reward models, which typically operate on the textual output of LLMs, are computationally expensive and parameter-heavy, limiting their real-world applications. We introduce the Efficient Linear Hidden Stat… ▽ More

    Submitted 18 May, 2025; originally announced May 2025.

  4. arXiv:2505.01859  [pdf, other]

    stat.ML cs.LG stat.CO

    Bayesian learning of the optimal action-value function in a Markov decision process

    Authors: Jiaqi Guo, Chon Wai Ho, Sumeetpal S. Singh

    Abstract: The Markov Decision Process (MDP) is a popular framework for sequential decision-making problems, and uncertainty quantification is an essential component of it to learn optimal decision-making strategies. In particular, a Bayesian framework is used to maintain beliefs about the optimal decisions and the unknown ingredients of the model, which are also to be learned from the data, such as the rewa… ▽ More

    Submitted 3 May, 2025; originally announced May 2025.

    Comments: 66 pages

  5. arXiv:2504.20471  [pdf, other]

    cs.LG cs.AI stat.ME

    The Estimation of Continual Causal Effect for Dataset Shifting Streams

    Authors: Baining Chen, Yiming Zhang, Yuqiao Han, Ruyue Zhang, Ruihuan Du, Zhishuo Zhou, Zhengdan Zhu, Xun Liu, Jiecheng Guo

    Abstract: Causal effect estimation has been widely used in marketing optimization. The framework of an uplift model followed by a constrained optimization algorithm is popular in practice. To enhance performance in the online environment, the framework needs to be improved to address the complexities caused by temporal dataset shift. This paper focuses on capturing the dataset shift from user behavior and d… ▽ More

    Submitted 29 April, 2025; originally announced April 2025.

  6. arXiv:2503.01924  [pdf, other]

    cs.LG cs.AI stat.ML

    TAET: Two-Stage Adversarial Equalization Training on Long-Tailed Distributions

    Authors: Wang YuHang, Junkang Guo, Aolei Liu, Kaihao Wang, Zaitong Wu, Zhenyu Liu, Wenfei Yin, Jian Liu

    Abstract: Adversarial robustness is a critical challenge in deploying deep neural networks for real-world applications. While adversarial training is a widely recognized defense strategy, most existing studies focus on balanced datasets, overlooking the prevalence of long-tailed distributions in real-world data, which significantly complicates robustness. This paper provides a comprehensive analysis of adve… ▽ More

    Submitted 21 March, 2025; v1 submitted 2 March, 2025; originally announced March 2025.

    Comments: Text: 8 pages of main content, 5 pages of appendices have been accepted by CVPR2025

    Journal ref: Computer Vision and Pattern Recognition 2025

  7. arXiv:2407.16134  [pdf, other]

    cs.LG math.ST stat.ML

    Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data

    Authors: Hengyu Fu, Zehao Dou, Jiawei Guo, Mengdi Wang, Minshuo Chen

    Abstract: Diffusion Transformer, the backbone of Sora for video generation, successfully scales the capacity of diffusion models, pioneering new avenues for high-fidelity sequential data generation. Unlike static data such as images, sequential data consists of consecutive data frames indexed by time, exhibiting rich spatial and temporal dependencies. These dependencies represent the underlying dynamic mode… ▽ More

    Submitted 4 February, 2025; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: 56 pages, 13 figures

  8. arXiv:2406.19535  [pdf, other]

    stat.ME

    Modeling trajectories using functional linear differential equations

    Authors: Julia Wrobel, Britton Sauerbrei, Erik A. Kirk, Jian-Zhong Guo, Adam Hantman, Jeff Goldsmith

    Abstract: We are motivated by a study that seeks to better understand the dynamic relationship between muscle activation and paw position during locomotion. For each gait cycle in this experiment, activation in the biceps and triceps is measured continuously and in parallel with paw position as a mouse trotted on a treadmill. We propose an innovative general regression method that draws from both ordinary d… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  9. arXiv:2401.17585  [pdf, other]

    cs.CL cs.AI cs.LG stat.ME

    Propagation and Pitfalls: Reasoning-based Assessment of Knowledge Editing through Counterfactual Tasks

    Authors: Wenyue Hua, Jiang Guo, Mingwen Dong, Henghui Zhu, Patrick Ng, Zhiguo Wang

    Abstract: Current approaches of knowledge editing struggle to effectively propagate updates to interconnected facts. In this work, we delve into the barriers that hinder the appropriate propagation of updated knowledge within these models for accurate reasoning. To support our analysis, we introduce a novel reasoning-based benchmark -- ReCoE (Reasoning-based Counterfactual Editing dataset) -- which covers s… ▽ More

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 22 pages, 14 figures, 5 tables

  10. arXiv:2309.01935  [pdf]

    stat.AP

    The impact of electronic health records (EHR) data continuity on prediction model fairness and racial-ethnic disparities

    Authors: Yu Huang, Jingchuan Guo, Zhaoyi Chen, Jie Xu, William T Donahoo, Olveen Carasquillo, Hrushyang Adloori, Jiang Bian, Elizabeth A Shenkman

    Abstract: Electronic health records (EHR) data have considerable variability in data completeness across sites and patients. Lack of "EHR data-continuity" or "EHR data-discontinuity", defined as "having medical information recorded outside the reach of an EHR system" can lead to a substantial amount of information bias. The objective of this study was to comprehensively evaluate (1) how EHR data-discontinui… ▽ More

    Submitted 4 September, 2023; originally announced September 2023.

  11. arXiv:2307.02884  [pdf, ps, other]

    cs.LG stat.ML

    Sample-Efficient Learning of POMDPs with Multiple Observations In Hindsight

    Authors: Jiacheng Guo, Minshuo Chen, Huan Wang, Caiming Xiong, Mengdi Wang, Yu Bai

    Abstract: This paper studies the sample-efficiency of learning in Partially Observable Markov Decision Processes (POMDPs), a challenging problem in reinforcement learning that is known to be exponentially hard in the worst-case. Motivated by real-world settings such as loading in game playing, we propose an enhanced feedback model called "multiple observations in hindsight", where after each episode of in… ▽ More

    Submitted 6 July, 2023; originally announced July 2023.

  12. MLE for the parameters of bivariate interval-valued models

    Authors: S. Yaser Samadi, L. Billard, Jiin-Huarng Guo, Wei Xu

    Abstract: With contemporary data sets becoming too large to analyze the data directly, various forms of aggregated data are becoming common. The original individual data are points, but after aggregation, the observations are interval-valued (e.g.). While some researchers simply analyze the set of averages of the observations by aggregated class, it is easily established that approach ignores much of the in… ▽ More

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: Will appear in ADAC

    Journal ref: Advances in Data Analysis and Classification, 2023

  13. arXiv:2212.09458  [pdf, other]

    cs.LG cs.AI stat.ML

    Exploring Optimal Substructure for Out-of-distribution Generalization via Feature-targeted Model Pruning

    Authors: Yingchun Wang, Jingcai Guo, Song Guo, Weizhan Zhang, Jie Zhang

    Abstract: Recent studies show that even highly biased dense networks contain an unbiased substructure that can achieve better out-of-distribution (OOD) generalization than the original model. Existing works usually search the invariant subnetwork using modular risk minimization (MRM) with out-domain data. Such a paradigm may bring about two potential weaknesses: 1) Unfairness, due to the insufficient observ… ▽ More

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: 9 pages; 2 figures

    ACM Class: I.2.6

  14. arXiv:2210.05218  [pdf, other]

    stat.ME

    A Latent Logistic Regression Model with Graph Data

    Authors: Haixiang Zhang, Yingjun Deng, Alan J. X. Guo, Qing-Hu Hou, Ou Wu

    Abstract: Recently, graph (network) data is an emerging research area in artificial intelligence, machine learning and statistics. In this work, we are interested in whether node's labels (people's responses) are affected by their neighbor's features (friends' characteristics). We propose a novel latent logistic regression model to describe the network dependence with binary responses. The key advantage of… ▽ More

    Submitted 11 October, 2022; originally announced October 2022.

  15. arXiv:2209.14846  [pdf, other]

    stat.ME

    Modeling and Learning on High-Dimensional Matrix-Variate Sequences

    Authors: Xu Zhang, Catherine C. Liu, Jianhua Guo, K. C. Yuen, A. H. Welsh

    Abstract: We propose a new matrix factor model, named RaDFaM, which is strictly derived based on the general rank decomposition and assumes a structure of a high-dimensional vector factor model for each basis vector. RaDFaM contributes a novel class of low-rank latent structure that makes tradeoff between signal intensity and dimension reduction from the perspective of tensor subspace. Based on the intrinsi… ▽ More

    Submitted 12 February, 2024; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: 33 pages, 12 figures

  16. arXiv:2206.02508  [pdf, other]

    stat.ME

    Tucker tensor factor models: matricization and mode-wise PCA estimation

    Authors: Xu Zhang, Guodong Li, Catherine C. Liu, Jianhua Guo

    Abstract: High-dimensional, higher-order tensor data are gaining prominence in a variety of fields, including but not limited to computer vision and network analysis. Tensor factor models, induced from noisy versions of tensor decompositions or factorizations, are natural potent instruments to study a collection of tensor-variate objects that may be dependent or independent. However, it is still in the earl… ▽ More

    Submitted 12 December, 2024; v1 submitted 6 June, 2022; originally announced June 2022.

  17. arXiv:2203.10975  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    GCF: Generalized Causal Forest for Heterogeneous Treatment Effect Estimation in Online Marketplace

    Authors: Shu Wan, Chen Zheng, Zhonggen Sun, Mengfan Xu, Xiaoqing Yang, Hongtu Zhu, Jiecheng Guo

    Abstract: Uplift modeling is a rapidly growing approach that utilizes causal inference and machine learning methods to directly estimate the heterogeneous treatment effects, which has been widely applied to various online marketplaces to assist large-scale decision-making in recent years. The existing popular models, like causal forest (CF), are limited to either discrete treatments or posing parametric ass… ▽ More

    Submitted 23 September, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

  18. arXiv:2203.05646  [pdf, ps, other]

    stat.ML cs.LG

    Koopman Methods for Estimation of Animal Motions over Unknown Submanifolds

    Authors: Nathan Powell, Bowei Liu, Jia Guo, Sai Tej Parachuri, Andrew J. Kurdila

    Abstract: This paper introduces a data-dependent approximation of the forward kinematics map for certain types of animal motion models. It is assumed that motions are supported on a low-dimensional, unknown configuration manifold $Q$ that is regularly embedded in high dimensional Euclidean space $X:=\mathbb{R}^d$. This paper introduces a method to estimate forward kinematics from the unknown configuration s… ▽ More

    Submitted 6 June, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

  19. arXiv:2201.12609  [pdf, other]

    cs.RO cs.LG stat.ML

    ApolloRL: a Reinforcement Learning Platform for Autonomous Driving

    Authors: Fei Gao, Peng Geng, Jiaqi Guo, Yuan Liu, Dingfeng Guo, Yabo Su, Jie Zhou, Xiao Wei, Jin Li, Xu Liu

    Abstract: We introduce ApolloRL, an open platform for research in reinforcement learning for autonomous driving. The platform provides a complete closed-loop pipeline with training, simulation, and evaluation components. It comes with 300 hours of real-world data in driving scenarios and popular baselines such as Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) agents. We elaborate in this pap… ▽ More

    Submitted 29 January, 2022; originally announced January 2022.

  20. arXiv:2111.01301  [pdf, other]

    math.ST econ.EM stat.ML

    Asymptotic in a class of network models with an increasing sub-Gamma degree sequence

    Authors: Jing Luo, Haoyu Wei, Xiaoyu Lei, Jiaxin Guo

    Abstract: For the differential privacy under the sub-Gamma noise, we derive the asymptotic properties of a class of network models with binary values with a general link function. In this paper, we release the degree sequences of the binary networks under a general noisy mechanism with the discrete Laplace mechanism as a special case. We establish the asymptotic result including both consistency and asympto… ▽ More

    Submitted 10 November, 2023; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: arXiv admin note: text overlap with arXiv:2002.12733 by other authors

    MSC Class: 62E20; 62F12

  21. arXiv:2012.12196  [pdf, ps, other]

    math.ST stat.AP

    Identifiability of Bifactor Models

    Authors: Guanhua Fang, Xin Xu, Jinxin Guo, Zhiliang Ying, Susu Zhang

    Abstract: The bifactor model and its extensions are multidimensional latent variable models, under which each item measures up to one subdimension on top of the primary dimension(s). Despite their wide applications to educational and psychological assessments, this type of multidimensional latent variable models may suffer from non-identifiability, which can further lead to inconsistent parameter estimation… ▽ More

    Submitted 22 December, 2020; originally announced December 2020.

    Comments: 89 pages

  22. arXiv:2012.08371  [pdf, other]

    math.ST stat.ME

    Limiting laws and consistent estimation criteria for fixed and diverging number of spiked eigenvalues

    Authors: Jianwei Hu, Jingfei Zhang, Jianhua Guo, Ji Zhu

    Abstract: In this paper, we study limiting laws and consistent estimation criteria for the extreme eigenvalues in a spiked covariance model of dimension $p$. Firstly, for fixed $p$, we propose a generalized estimation criterion that can consistently estimate, $k$, the number of spiked eigenvalues. Compared with the existing literature, we show that consistency can be achieved under weaker conditions on the… ▽ More

    Submitted 17 November, 2023; v1 submitted 15 December, 2020; originally announced December 2020.

  23. DS-UI: Dual-Supervised Mixture of Gaussian Mixture Models for Uncertainty Inference

    Authors: Jiyang Xie, Zhanyu Ma, Jing-Hao Xue, Guoqiang Zhang, Jun Guo

    Abstract: This paper proposes a dual-supervised uncertainty inference (DS-UI) framework for improving Bayesian estimation-based uncertainty inference (UI) in deep neural network (DNN)-based image recognition. In the DS-UI, we combine the classifier of a DNN, i.e., the last fully-connected (FC) layer, with a mixture of Gaussian mixture models (MoGMM) to obtain an MoGMM-FC layer. Unlike existing UI methods fo… ▽ More

    Submitted 17 November, 2020; originally announced November 2020.

  24. arXiv:2011.00647  [pdf, other]

    stat.ME math.ST stat.ML

    Fast Network Community Detection with Profile-Pseudo Likelihood Methods

    Authors: Jiangzhou Wang, Jingfei Zhang, Binghui Liu, Ji Zhu, Jianhua Guo

    Abstract: The stochastic block model is one of the most studied network models for community detection. It is well-known that most algorithms proposed for fitting the stochastic block model likelihood function cannot scale to large-scale networks. One prominent work that overcomes this computational challenge is Amini et al.(2013), which proposed a fast pseudo-likelihood approach for fitting stochastic bloc… ▽ More

    Submitted 29 August, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

  25. Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization

    Authors: Jiyang Xie, Zhanyu Ma, Jianjun Lei, Guoqiang Zhang, Jing-Hao Xue, Zheng-Hua Tan, Jun Guo

    Abstract: Due to lack of data, overfitting ubiquitously exists in real-world applications of deep neural networks (DNNs). We propose advanced dropout, a model-free methodology, to mitigate overfitting and improve the performance of DNNs. The advanced dropout technique applies a model-free and easily implemented distribution with parametric prior, and adaptively adjusts dropout rate. Specifically, the distri… ▽ More

    Submitted 10 August, 2021; v1 submitted 11 October, 2020; originally announced October 2020.

    Comments: Accepted by IEEE TPAMI, 2021

  26. arXiv:2009.10645  [pdf, other]

    stat.ML cs.LG stat.AP

    Partially Observable Online Change Detection via Smooth-Sparse Decomposition

    Authors: Jie Guo, Hao Yan, Chen Zhang, Steven Hoi

    Abstract: We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities. On the one hand, the detection scheme should be able to deal with partially observable data and meanwhile have efficient detection power for sparse changes. On the other, the scheme should be able… ▽ More

    Submitted 22 September, 2020; originally announced September 2020.

    Comments: 48 pages

  27. arXiv:2007.02394  [pdf, other]

    cs.LG cs.CV stat.ML

    Meta-Semi: A Meta-learning Approach for Semi-supervised Learning

    Authors: Yulin Wang, Jiayi Guo, Shiji Song, Gao Huang

    Abstract: Deep learning based semi-supervised learning (SSL) algorithms have led to promising results in recent years. However, they tend to introduce multiple tunable hyper-parameters, making them less practical in real SSL scenarios where the labeled data is scarce for extensive hyper-parameter search. In this paper, we propose a novel meta-learning based SSL algorithm (Meta-Semi) that requires tuning onl… ▽ More

    Submitted 7 September, 2021; v1 submitted 5 July, 2020; originally announced July 2020.

    Comments: This work has been submitted to the IEEE for possible publication

  28. arXiv:2007.01488  [pdf, other]

    cs.LG cs.CL stat.ML

    On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation

    Authors: Jianing Li, Yanyan Lan, Jiafeng Guo, Xueqi Cheng

    Abstract: The goal of text generation models is to fit the underlying real probability distribution of text. For performance evaluation, quality and diversity metrics are usually applied. However, it is still not clear to what extent the quality-diversity evaluation can reflect the distribution-fitting goal. In this paper, we try to reveal such relation in a theoretical approach. We prove that under certain… ▽ More

    Submitted 18 August, 2020; v1 submitted 3 July, 2020; originally announced July 2020.

    Comments: 16 pages, 7 figures. ICML2020 Final Submission

  29. arXiv:2007.01231  [pdf, other]

    cs.LG cs.SE stat.ML

    Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs

    Authors: Kian Ahrabian, Daniel Tarlow, Hehuimin Cheng, Jin L. C. Guo

    Abstract: We present a multi-relational temporal Knowledge Graph based on the daily interactions between artifacts in GitHub, one of the largest social coding platforms. Such representation enables posing many user-activity and project management questions as link prediction and time queries over the knowledge graph. In particular, we introduce two new datasets for i) interpolated time-conditioned link pred… ▽ More

    Submitted 12 July, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

    Comments: 11 pages, 1 figure. 37th International Conference on Machine Learning (ICML 2020) - Workshop on Graph Representation Learning and Beyond

  30. arXiv:2005.10696  [pdf, other]

    cs.LG cs.AI stat.ML

    Novel Policy Seeking with Constrained Optimization

    Authors: Hao Sun, Zhenghao Peng, Bo Dai, Jian Guo, Dahua Lin, Bolei Zhou

    Abstract: In problem-solving, we humans can come up with multiple novel solutions to the same problem. However, reinforcement learning algorithms can only produce a set of monotonous policies that maximize the cumulative reward but lack diversity and novelty. In this work, we address the problem of generating novel policies in reinforcement learning tasks. Instead of following the multi-objective framework… ▽ More

    Submitted 29 October, 2022; v1 submitted 21 May, 2020; originally announced May 2020.

  31. arXiv:2003.04575  [pdf, other]

    cs.LG stat.ML

    GPCA: A Probabilistic Framework for Gaussian Process Embedded Channel Attention

    Authors: Jiyang Xie, Dongliang Chang, Zhanyu Ma, Guoqiang Zhang, Jun Guo

    Abstract: Channel attention mechanisms have been commonly applied in many visual tasks for effective performance improvement. It is able to reinforce the informative channels as well as to suppress the useless channels. Recently, different channel attention modules have been proposed and implemented in various ways. Generally speaking, they are mainly based on convolution and pooling operations. In this pap… ▽ More

    Submitted 10 August, 2021; v1 submitted 10 March, 2020; originally announced March 2020.

    Comments: Accepted by IEEE TPAMI, 2021

  32. arXiv:2002.08021  [pdf]

    stat.AP cs.LG econ.EM

    Seasonal and Trend Forecasting of Tourist Arrivals: An Adaptive Multiscale Ensemble Learning Approach

    Authors: Shaolong Sun, Dan Bi, Ju-e Guo, Shouyang Wang

    Abstract: The accurate seasonal and trend forecasting of tourist arrivals is a very challenging task. In the view of the importance of seasonal and trend forecasting of tourist arrivals, and limited research work paid attention to these previously. In this study, a new adaptive multiscale ensemble (AME) learning approach incorporating variational mode decomposition (VMD) and least square support vector regr… ▽ More

    Submitted 10 March, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

  33. arXiv:2002.07964  [pdf]

    stat.AP cs.LG econ.EM

    Tourism Demand Forecasting: An Ensemble Deep Learning Approach

    Authors: Shaolong Sun, Yanzhao Li, Ju-e Guo, Shouyang Wang

    Abstract: The availability of tourism-related big data increases the potential to improve the accuracy of tourism demand forecasting, but presents significant challenges for forecasting, including curse of dimensionality and high model complexity. A novel bagging-based multivariate ensemble deep learning approach integrating stacked autoencoders and kernel-based extreme learning machines (B-SAKE) is propose… ▽ More

    Submitted 16 January, 2021; v1 submitted 18 February, 2020; originally announced February 2020.

  34. arXiv:2001.11355  [pdf, other]

    cs.LG eess.SY stat.ML

    Constructing Deep Neural Networks with a Priori Knowledge of Wireless Tasks

    Authors: Jia Guo, Chenyang Yang

    Abstract: Deep neural networks (DNNs) have been employed for designing wireless systems in many aspects, say transceiver design, resource optimization, and information prediction. Existing works either use the fully-connected DNN or the DNNs with particular architectures developed in other domains. While generating labels for supervised learning and gathering training samples are time-consuming or cost-proh… ▽ More

    Submitted 29 January, 2020; originally announced January 2020.

    Comments: 30 pages, 9 figures. arXiv admin note: text overlap with arXiv:1910.13728

  35. arXiv:1912.13256  [pdf, other]

    cs.LG cs.CV stat.ML

    Scalable NAS with Factorizable Architectural Parameters

    Authors: Lanfei Wang, Lingxi Xie, Tianyi Zhang, Jun Guo, Qi Tian

    Abstract: Neural Architecture Search (NAS) is an emerging topic in machine learning and computer vision. The fundamental ideology of NAS is using an automatic mechanism to replace manual designs for exploring powerful network architectures. One of the key factors of NAS is to scale-up the search space, e.g., increasing the number of operators, so that more possibilities are covered, but existing search algo… ▽ More

    Submitted 22 September, 2020; v1 submitted 31 December, 2019; originally announced December 2019.

    Comments: 10 pages, 4 figures

  36. arXiv:1912.04838  [pdf, other]

    cs.CV cs.LG stat.ML

    Scalability in Perception for Autonomous Driving: Waymo Open Dataset

    Authors: Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Sheng Zhao, Shuyang Cheng, Yu Zhang, Jonathon Shlens, Zhifeng Chen, Dragomir Anguelov

    Abstract: The research community has increasing interest in autonomous driving research, despite the resource intensity of obtaining representative real world data. Existing self-driving datasets are limited in the scale and variation of the environments they capture, even though generalization within and between operating regions is crucial to the overall viability of the technology. In an effort to help a… ▽ More

    Submitted 12 May, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: CVPR 2020

  37. arXiv:1911.08717  [pdf, other]

    cs.LG stat.ML

    Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation

    Authors: Junliang Guo, Xu Tan, Linli Xu, Tao Qin, Enhong Chen, Tie-Yan Liu

    Abstract: Non-autoregressive translation (NAT) models remove the dependence on previous target tokens and generate all target tokens in parallel, resulting in significant inference speedup but at the cost of inferior translation accuracy compared to autoregressive translation (AT) models. Considering that AT models have higher accuracy and are easier to train than NAT models, and both of them share the same… ▽ More

    Submitted 21 November, 2019; v1 submitted 20 November, 2019; originally announced November 2019.

    Comments: AAAI 2020

  38. arXiv:1911.00658  [pdf, ps, other]

    stat.ML cs.LG eess.SP

    Global Adaptive Generative Adjustment

    Authors: Bin Wang, Xiaofei Wang, Jianhua Guo

    Abstract: Many traditional signal recovery approaches can behave well basing on the penalized likelihood. However, they have to meet with the difficulty in the selection of hyperparameters or tuning parameters in the penalties. In this article, we propose a global adaptive generative adjustment (GAGA) algorithm for signal recovery, in which multiple hyperparameters are automatically learned and alternatively… ▽ More

    Submitted 16 November, 2022; v1 submitted 2 November, 2019; originally announced November 2019.

  39. arXiv:1911.00623  [pdf, other]

    cs.LG cs.DC stat.ML

    On-Device Machine Learning: An Algorithms and Learning Theory Perspective

    Authors: Sauptik Dhar, Junyao Guo, Jiayi Liu, Samarth Tripathi, Unmesh Kurup, Mohak Shah

    Abstract: The predominant paradigm for using machine learning models on a device is to train a model in the cloud and perform inference using the trained model on the device. However, with increasing number of smart devices and improved hardware, there is interest in performing model training on the device. Given this surge in interest, a comprehensive survey of the field from a device-agnostic perspective… ▽ More

    Submitted 24 July, 2020; v1 submitted 1 November, 2019; originally announced November 2019.

    Comments: Edge Learning, TinyML, Resource Constrained Machine Learning, Deep learning on device, Statistical Learning Theory, 45 pages survey

  40. arXiv:1910.14056  [pdf, other]

    cs.LG stat.ML

    Unsupervised Star Galaxy Classification with Cascade Variational Auto-Encoder

    Authors: Hao Sun, Jiadong Guo, Edward J. Kim, Robert J. Brunner

    Abstract: The increasing amount of data in astronomy provides great challenges for machine learning research. Previously, supervised learning methods achieved satisfactory recognition accuracy for the star-galaxy classification task, based on manually labeled data set. In this work, we propose a novel unsupervised approach for the star-galaxy recognition task, namely Cascade Variational Auto-Encoder (CasVAE… ▽ More

    Submitted 30 October, 2019; originally announced October 2019.

  41. arXiv:1910.13728  [pdf, other]

    cs.LG eess.SY stat.ML

    Structure of Deep Neural Networks with a Priori Information in Wireless Tasks

    Authors: Jia Guo, Chenyang Yang

    Abstract: Deep neural networks (DNNs) have been employed for designing wireless networks in many aspects, such as transceiver optimization, resource allocation, and information prediction. Existing works either use fully-connected DNN or the DNNs with specific structures that are designed in other domains. In this paper, we show that a priori information widely existed in wireless tasks is permutation invar… ▽ More

    Submitted 6 November, 2019; v1 submitted 30 October, 2019; originally announced October 2019.

    Comments: 6 pages, 2 figures, Submitted to ICC 2020

  42. arXiv:1910.08892  [pdf, other]

    stat.ME

    Bayesian Symbolic Regression

    Authors: Ying Jin, Weilin Fu, Jian Kang, Jiadong Guo, Jian Guo

    Abstract: Interpretability is crucial for machine learning in many scenarios such as quantitative finance, banking, healthcare, etc. Symbolic regression (SR) is a classic interpretable machine learning method by bridging X and Y using mathematical expressions composed of some basic functions. However, the search space of all possible expressions grows exponentially with the length of the expression, making… ▽ More

    Submitted 15 January, 2020; v1 submitted 20 October, 2019; originally announced October 2019.

  43. arXiv:1910.02672  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Multi-label Detection and Classification of Red Blood Cells in Microscopic Images

    Authors: Wei Qiu, Jiaming Guo, Xiang Li, Mengjia Xu, Mo Zhang, Ning Guo, Quanzheng Li

    Abstract: Cell detection and cell type classification from biomedical images play an important role in high-throughput imaging and various clinical applications. While classification of single-cell samples can be performed with standard computer vision and machine learning methods, analysis of multi-label samples (regions containing congregating cells) is more challenging, as separation of individual cells ca…

    Submitted 14 December, 2019; v1 submitted 7 October, 2019; originally announced October 2019.

    Comments: Wei Qiu, Jiaming Guo and Xiang Li contributed equally

  44. Predicting Alzheimer's Disease by Hierarchical Graph Convolution from Positron Emission Tomography Imaging

    Authors: Jiaming Guo, Wei Qiu, Xiang Li, Xuandong Zhao, Ning Guo, Quanzheng Li

    Abstract: Imaging-based early diagnosis of Alzheimer's Disease (AD) has become an effective approach, especially by using nuclear medicine imaging techniques such as Positron Emission Tomography (PET). Various studies have found that PET images can be better modeled as signals (e.g. uptake of florbetapir) defined on a network (non-Euclidean) structure which is governed by its underlying graph pat…

    Submitted 30 September, 2019; originally announced October 2019.

    Comments: Jiaming Guo, Wei Qiu and Xiang Li contribute equally to this work

  45. arXiv:1909.10710  [pdf, other]

    stat.ME

    Estimating Number of Factors by Adjusted Eigenvalues Thresholding

    Authors: Jianqing Fan, Jianhua Guo, Shurong Zheng

    Abstract: Determining the number of common factors is an important and practical topic in high dimensional factor models. The existing literature is mainly based on the eigenvalues of the covariance matrix. Due to the incomparability of the eigenvalues of the covariance matrix caused by heterogeneous scales of observed variables, it is very difficult to give an accurate relationship between these eigenval…

    Submitted 24 September, 2019; originally announced September 2019.

    Comments: 35 pages; 4 figures

  46. arXiv:1909.04266  [pdf, other]

    cs.IR cs.LG stat.ML

    Wasserstein Collaborative Filtering for Item Cold-start Recommendation

    Authors: Yitong Meng, Guangyong Chen, Benben Liao, Jun Guo, Weiwen Liu

    Abstract: The item cold-start problem seriously limits the recommendation performance of Collaborative Filtering (CF) methods when new items have either no or very few interactions. To solve this issue, many modern Internet applications propose to predict a new item's interaction from the possessing contents. However, it is difficult to design and learn a map between the item's interaction history and…

    Submitted 9 September, 2019; originally announced September 2019.

  47. arXiv:1909.04239  [pdf, other]

    cs.IR cs.AI cs.LG stat.ML

    PMD: An Optimal Transportation-based User Distance for Recommender Systems

    Authors: Yitong Meng, Xinyan Dai, Xiao Yan, James Cheng, Weiwen Liu, Benben Liao, Jun Guo, Guangyong Chen

    Abstract: Collaborative filtering, a widely-used recommendation technique, predicts a user's preference by aggregating the ratings from similar users. As a result, these measures cannot fully utilize the rating information and are not suitable for real-world sparse data. To solve these issues, we propose a novel user distance measure named Preference Mover's Distance (PMD) which makes full use of all rating…

    Submitted 10 December, 2019; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: This paper is accepted by European Conference on Information Retrieval (ECIR 2020)

  48. arXiv:1908.05474  [pdf, other]

    cs.LG stat.ML

    Adaptive Regularization of Labels

    Authors: Qianggang Ding, Sifan Wu, Hao Sun, Jiadong Guo, Shu-Tao Xia

    Abstract: Recently, a variety of regularization techniques have been widely applied in deep neural networks, such as dropout, batch normalization, data augmentation, and so on. These methods mainly focus on the regularization of weight parameters to prevent overfitting effectively. In addition, label regularization techniques such as label smoothing and label disturbance have also been proposed with the mot…

    Submitted 15 August, 2019; originally announced August 2019.

  49. arXiv:1907.04433  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

    Authors: Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

    Abstract: We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating). These toolkits provide state-of-the-art pre-trained models, training scripts, and training logs, to facilitate rapid prototyping and promote reproducible research. We also provide modular APIs with flexible building blocks to enable efficient customiza…

    Submitted 12 February, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

    Journal ref: Journal of Machine Learning Research 21 (2020) 1-7

  50. arXiv:1906.04411  [pdf, other]

    cs.LG cs.CR stat.ML

    Evolutionary Trigger Set Generation for DNN Black-Box Watermarking

    Authors: Jia Guo, Miodrag Potkonjak

    Abstract: The commercialization of deep learning creates a compelling need for intellectual property (IP) protection. Deep neural network (DNN) watermarking has been proposed as a promising tool to help model owners prove ownership and fight piracy. A popular approach to watermarking is to train a DNN to recognize images with certain trigger patterns. In this paper, we propose a novel evolutionary…

    Submitted 13 February, 2021; v1 submitted 11 June, 2019; originally announced June 2019.