-
NeuroPMD: Neural Fields for Density Estimation on Product Manifolds
Authors:
William Consagra,
Zhiling Gu,
Zhengwu Zhang
Abstract:
We propose a novel deep neural network methodology for density estimation on product Riemannian manifold domains. In our approach, the network directly parameterizes the unknown density function and is trained using a penalized maximum likelihood framework, with a penalty term formed using manifold differential operators. The network architecture and estimation algorithm are carefully designed to…
▽ More
We propose a novel deep neural network methodology for density estimation on product Riemannian manifold domains. In our approach, the network directly parameterizes the unknown density function and is trained using a penalized maximum likelihood framework, with a penalty term formed using manifold differential operators. The network architecture and estimation algorithm are carefully designed to handle the challenges of high-dimensional product manifold domains, effectively mitigating the curse of dimensionality that limits traditional kernel and basis expansion estimators, as well as overcoming the convergence issues encountered by non-specialized neural network methods. Extensive simulations and a real-world application to brain structural connectivity data highlight the clear advantages of our method over the competing alternatives.
△ Less
Submitted 6 January, 2025;
originally announced January 2025.
-
Zeroth-Order Stochastic Mirror Descent Algorithms for Minimax Excess Risk Optimization
Authors:
Zhihao Gu,
Zi Xu
Abstract:
The minimax excess risk optimization (MERO) problem is a new variation of the traditional distributionally robust optimization (DRO) problem, which achieves uniformly low regret across all test distributions under suitable conditions. In this paper, we propose a zeroth-order stochastic mirror descent (ZO-SMD) algorithm available for both smooth and non-smooth MERO to estimate the minimal risk of e…
▽ More
The minimax excess risk optimization (MERO) problem is a new variation of the traditional distributionally robust optimization (DRO) problem, which achieves uniformly low regret across all test distributions under suitable conditions. In this paper, we propose a zeroth-order stochastic mirror descent (ZO-SMD) algorithm available for both smooth and non-smooth MERO to estimate the minimal risk of each distrbution, and finally solve MERO as (non-)smooth stochastic convex-concave (linear) minimax optimization problems. The proposed algorithm is proved to converge at optimal convergence rates of $\mathcal{O}\left(1/\sqrt{t}\right)$ on the estimate of $R_i^*$ and $\mathcal{O}\left(1/\sqrt{t}\right)$ on the optimization error of both smooth and non-smooth MERO. Numerical results show the efficiency of the proposed algorithm.
△ Less
Submitted 22 August, 2024;
originally announced August 2024.
-
TSSS: A Novel Triangulated Spherical Spline Smoothing for Surface-based Data
Authors:
Zhiling Gu,
Shan Yu,
Guannan Wang,
Ming-Jun Lai,
Li Wang
Abstract:
Surface-based data is commonly observed in diverse practical applications spanning various fields. In this paper, we introduce a novel nonparametric method to discover the underlying signals from data distributed on complex surface-based domains. Our approach involves a penalized spline estimator defined on a triangulation of surface patches, which enables effective signal extraction and recovery.…
▽ More
Surface-based data is commonly observed in diverse practical applications spanning various fields. In this paper, we introduce a novel nonparametric method to discover the underlying signals from data distributed on complex surface-based domains. Our approach involves a penalized spline estimator defined on a triangulation of surface patches, which enables effective signal extraction and recovery. The proposed method offers several advantages over existing methods, including superior handling of "leakage" or "boundary effects" over complex domains, enhanced computational efficiency, and potential applications in analyzing sparse and irregularly distributed data on complex objects. We provide rigorous theoretical guarantees for the proposed method, including convergence rates of the estimator in both the $L_2$ and supremum norms, as well as the asymptotic normality of the estimator. We also demonstrate that the convergence rates achieved by our estimation method are optimal within the framework of nonparametric estimation. Furthermore, we introduce a bootstrap method to quantify the uncertainty associated with the proposed estimators accurately. The superior performance of the proposed method is demonstrated through simulation experiments and data applications on cortical surface functional magnetic resonance imaging data and oceanic near-surface atmospheric data.
△ Less
Submitted 8 March, 2024;
originally announced March 2024.
-
Applying Machine Learning to Understand Water Security and Water Access Inequality in Underserved Colonia Communities
Authors:
Zhining Gu,
Wenwen Li,
Michael Hanemann,
Yushiou Tsai,
Amber Wutich,
Paul Westerhoff,
Laura Landes,
Anais D. Roque,
Madeleine Zheng,
Carmen A. Velasco,
Sarah Porter
Abstract:
This paper explores the application of machine learning to enhance our understanding of water accessibility issues in underserved communities called Colonias located along the northern part of the United States - Mexico border. We analyzed more than 2000 such communities using data from the Rural Community Assistance Partnership (RCAP) and applied hierarchical clustering and the adaptive affinity…
▽ More
This paper explores the application of machine learning to enhance our understanding of water accessibility issues in underserved communities called Colonias located along the northern part of the United States - Mexico border. We analyzed more than 2000 such communities using data from the Rural Community Assistance Partnership (RCAP) and applied hierarchical clustering and the adaptive affinity propagation algorithm to automatically group Colonias into clusters with different water access conditions. The Gower distance was introduced to make the algorithm capable of processing complex datasets containing both categorical and numerical attributes. To better understand and explain the clustering results derived from the machine learning process, we further applied a decision tree analysis algorithm to associate the input data with the derived clusters, to identify and rank the importance of factors that characterize different water access conditions in each cluster. Our results complement experts' priority rankings of water infrastructure needs, providing a more in-depth view of the water insecurity challenges that the Colonias suffer from. As an automated and reproducible workflow combining a series of tools, the proposed machine learning pipeline represents an operationalized solution for conducting data-driven analysis to understand water access inequality. This pipeline can be adapted to analyze different datasets and decision scenarios.
△ Less
Submitted 29 March, 2023;
originally announced March 2023.
-
STSC-SNN: Spatio-Temporal Synaptic Connection with Temporal Convolution and Attention for Spiking Neural Networks
Authors:
Chengting Yu,
Zheming Gu,
Da Li,
Gaoang Wang,
Aili Wang,
Erping Li
Abstract:
Spiking Neural Networks (SNNs), as one of the algorithmic models in neuromorphic computing, have gained a great deal of research attention owing to temporal information processing capability, low power consumption, and high biological plausibility. The potential to efficiently extract spatio-temporal features makes it suitable for processing the event streams. However, existing synaptic structures…
▽ More
Spiking Neural Networks (SNNs), as one of the algorithmic models in neuromorphic computing, have gained a great deal of research attention owing to temporal information processing capability, low power consumption, and high biological plausibility. The potential to efficiently extract spatio-temporal features makes it suitable for processing the event streams. However, existing synaptic structures in SNNs are almost full-connections or spatial 2D convolution, neither of which can extract temporal dependencies adequately. In this work, we take inspiration from biological synapses and propose a spatio-temporal synaptic connection SNN (STSC-SNN) model, to enhance the spatio-temporal receptive fields of synaptic connections, thereby establishing temporal dependencies across layers. Concretely, we incorporate temporal convolution and attention mechanisms to implement synaptic filtering and gating functions. We show that endowing synaptic models with temporal dependencies can improve the performance of SNNs on classification tasks. In addition, we investigate the impact of performance vias varied spatial-temporal receptive fields and reevaluate the temporal modules in SNNs. Our approach is tested on neuromorphic datasets, including DVS128 Gesture (gesture recognition), N-MNIST, CIFAR10-DVS (image classification), and SHD (speech digit recognition). The results show that the proposed model outperforms the state-of-the-art accuracy on nearly all datasets.
△ Less
Submitted 11 October, 2022;
originally announced October 2022.
-
Joint Modeling of An Outcome Variable and Integrated Omic Datasets Using GLM-PO2PLS
Authors:
Zhujie Gu,
Said el Bouhaddani,
Jeanine Houwing-Duistermaat,
Hae-Won Uh
Abstract:
In many studies of human diseases, multiple omic datasets are measured. Typically, these omic datasets are studied one by one with the disease, thus the relationship between omics are overlooked. Modeling the joint part of multiple omics and its association to the outcome disease will provide insights into the complex molecular base of the disease. In this article, we extend dimension reduction me…
▽ More
In many studies of human diseases, multiple omic datasets are measured. Typically, these omic datasets are studied one by one with the disease, thus the relationship between omics are overlooked. Modeling the joint part of multiple omics and its association to the outcome disease will provide insights into the complex molecular base of the disease. In this article, we extend dimension reduction methods which model the joint part of omics to a novel method that jointly models an outcome variable with omics. We establish the model identifiability and develop EM algorithms to obtain maximum likelihood estimators of the parameters for normally and Bernoulli distributed outcomes. Test statistics are proposed to infer the association between the outcome and omics, and their asymptotic distributions are derived. Extensive simulation studies are conducted to evaluate the proposed model. The model is illustrated by a Down syndrome study where Down syndrome and two omics - methylation and glycomics - are jointly modeled.
△ Less
Submitted 31 August, 2022;
originally announced September 2022.
-
Using Undersampling with Ensemble Learning to Identify Factors Contributing to Preterm Birth
Authors:
Shi Dong,
Zlatan Feric,
Guangyu Li,
Chieh Wu,
April Z. Gu,
Jennifer Dy,
John Meeker,
Ingrid Y. Padilla,
Jose Cordero,
Carmen Velez Vega,
Zaira Rosario,
Akram Alshawabkeh,
David Kaeli
Abstract:
In this paper, we propose Ensemble Learning models to identify factors contributing to preterm birth. Our work leverages a rich dataset collected by a NIEHS P42 Center that is trying to identify the dominant factors responsible for the high rate of premature births in northern Puerto Rico. We investigate analytical models addressing two major challenges present in the dataset: 1) the significant a…
▽ More
In this paper, we propose Ensemble Learning models to identify factors contributing to preterm birth. Our work leverages a rich dataset collected by a NIEHS P42 Center that is trying to identify the dominant factors responsible for the high rate of premature births in northern Puerto Rico. We investigate analytical models addressing two major challenges present in the dataset: 1) the significant amount of incomplete data in the dataset, and 2) class imbalance in the dataset. First, we leverage and compare two types of missing data imputation methods: 1) mean-based and 2) similarity-based, increasing the completeness of this dataset. Second, we propose a feature selection and evaluation model based on using undersampling with Ensemble Learning to address class imbalance present in the dataset. We leverage and compare multiple Ensemble Feature selection methods, including Complete Linear Aggregation (CLA), Weighted Mean Aggregation (WMA), Feature Occurrence Frequency (OFA), and Classification Accuracy Based Aggregation (CAA). To further address missing data present in each feature, we propose two novel methods: 1) Missing Data Rate and Accuracy Based Aggregation (MAA), and 2) Entropy and Accuracy Based Aggregation (EAA). Both proposed models balance the degree of data variance introduced by the missing data handling during the feature selection process while maintaining model performance. Our results show a 42\% improvement in sensitivity versus fallout over previous state-of-the-art methods.
△ Less
Submitted 23 September, 2020;
originally announced September 2020.
-
EI-MTD:Moving Target Defense for Edge Intelligence against Adversarial Attacks
Authors:
Yaguan Qian,
Qiqi Shao,
Jiamin Wang,
Xiang Lin,
Yankai Guo,
Zhaoquan Gu,
Bin Wang,
Chunming Wu
Abstract:
With the boom of edge intelligence, its vulnerability to adversarial attacks becomes an urgent problem. The so-called adversarial example can fool a deep learning model on the edge node to misclassify. Due to the property of transferability, the adversary can easily make a black-box attack using a local substitute model. Nevertheless, the limitation of resource of edge nodes cannot afford a compli…
▽ More
With the boom of edge intelligence, its vulnerability to adversarial attacks becomes an urgent problem. The so-called adversarial example can fool a deep learning model on the edge node to misclassify. Due to the property of transferability, the adversary can easily make a black-box attack using a local substitute model. Nevertheless, the limitation of resource of edge nodes cannot afford a complicated defense mechanism as doing on the cloud data center. To overcome the challenge, we propose a dynamic defense mechanism, namely EI-MTD. It first obtains robust member models with small size through differential knowledge distillation from a complicated teacher model on the cloud data center. Then, a dynamic scheduling policy based on a Bayesian Stackelberg game is applied to the choice of a target model for service. This dynamic defense can prohibit the adversary from selecting an optimal substitute model for black-box attacks. Our experimental result shows that this dynamic scheduling can effectively protect edge intelligence against adversarial attacks under the black-box setting.
△ Less
Submitted 24 November, 2020; v1 submitted 19 September, 2020;
originally announced September 2020.
-
TEAM: We Need More Powerful Adversarial Examples for DNNs
Authors:
Yaguan Qian,
Ximin Zhang,
Bin Wang,
Wei Li,
Zhaoquan Gu,
Haijiang Wang,
Wassim Swaileh
Abstract:
Although deep neural networks (DNNs) have achieved success in many application fields, it is still vulnerable to imperceptible adversarial examples that can lead to misclassification of DNNs easily. To overcome this challenge, many defensive methods are proposed. Indeed, a powerful adversarial example is a key benchmark to measure these defensive mechanisms. In this paper, we propose a novel metho…
▽ More
Although deep neural networks (DNNs) have achieved success in many application fields, it is still vulnerable to imperceptible adversarial examples that can lead to misclassification of DNNs easily. To overcome this challenge, many defensive methods are proposed. Indeed, a powerful adversarial example is a key benchmark to measure these defensive mechanisms. In this paper, we propose a novel method (TEAM, Taylor Expansion-Based Adversarial Methods) to generate more powerful adversarial examples than previous methods. The main idea is to craft adversarial examples by minimizing the confidence of the ground-truth class under untargeted attacks or maximizing the confidence of the target class under targeted attacks. Specifically, we define the new objective functions that approximate DNNs by using the second-order Taylor expansion within a tiny neighborhood of the input. Then the Lagrangian multiplier method is used to obtain the optimize perturbations for these objective functions. To decrease the amount of computation, we further introduce the Gauss-Newton (GN) method to speed it up. Finally, the experimental result shows that our method can reliably produce adversarial examples with 100% attack success rate (ASR) while only by smaller perturbations. In addition, the adversarial example generated with our method can defeat defensive distillation based on gradient masking.
△ Less
Submitted 9 August, 2020; v1 submitted 31 July, 2020;
originally announced July 2020.
-
Comparing and Integrating US COVID-19 Data from Multiple Sources with Anomaly Detection and Repairing
Authors:
Guannan Wang,
Zhiling Gu,
Xinyi Li,
Shan Yu,
Myungjin Kim,
Yueying Wang,
Lei Gao,
Li Wang
Abstract:
Over the past few months, the outbreak of Coronavirus disease (COVID-19) has been expanding over the world. A reliable and accurate dataset of the cases is vital for scientists to conduct related research and for policy-makers to make better decisions. We collect the United States COVID-19 daily reported data from four open sources: the New York Times, the COVID-19 Data Repository by Johns Hopkins…
▽ More
Over the past few months, the outbreak of Coronavirus disease (COVID-19) has been expanding over the world. A reliable and accurate dataset of the cases is vital for scientists to conduct related research and for policy-makers to make better decisions. We collect the United States COVID-19 daily reported data from four open sources: the New York Times, the COVID-19 Data Repository by Johns Hopkins University, the COVID Tracking Project at the Atlantic, and the USAFacts, then compare the similarities and differences among them. To obtain reliable data for further analysis, we first examine the cyclical pattern and the following anomalies, which frequently occur in the reported cases: (1) the order dependencies violation, (2) the point or period anomalies, and (3) the issue of reporting delay. To address these detected issues, we propose the corresponding repairing methods and procedures if corrections are necessary. In addition, we integrate the COVID-19 reported cases with the county-level auxiliary information of the local features from official sources, such as health infrastructure, demographic, socioeconomic, and environmental information, which are also essential for understanding the spread of the virus.
△ Less
Submitted 28 November, 2020; v1 submitted 1 June, 2020;
originally announced June 2020.
-
Spatiotemporal Dynamics, Nowcasting and Forecasting of COVID-19 in the United States
Authors:
Li Wang,
Guannan Wang,
Lei Gao,
Xinyi Li,
Shan Yu,
Myungjin Kim,
Yueying Wang,
Zhiling Gu
Abstract:
Epidemic modeling is an essential tool to understand the spread of the novel coronavirus and ultimately assist in disease prevention, policymaking, and resource allocation. In this article, we establish a state of the art interface between classic mathematical and statistical models and propose a novel space-time epidemic modeling framework to study the spatial-temporal pattern in the spread of in…
▽ More
Epidemic modeling is an essential tool to understand the spread of the novel coronavirus and ultimately assist in disease prevention, policymaking, and resource allocation. In this article, we establish a state of the art interface between classic mathematical and statistical models and propose a novel space-time epidemic modeling framework to study the spatial-temporal pattern in the spread of infectious disease. We propose a quasi-likelihood approach via the penalized spline approximation and alternatively reweighted least-squares technique to estimate the model. Furthermore, we provide a short-term and long-term county-level prediction of the infected/death count for the U.S. by accounting for the control measures, health service resources, and other local features. Utilizing spatiotemporal analysis, our proposed model enhances the dynamics of the epidemiological mechanism and dissects the spatiotemporal structure of the spreading disease. To assess the uncertainty associated with the prediction, we develop a projection band based on the envelope of the bootstrap forecast paths. The performance of the proposed method is evaluated by a simulation study. We apply the proposed method to model and forecast the spread of COVID-19 at both county and state levels in the United States.
△ Less
Submitted 15 December, 2020; v1 submitted 29 April, 2020;
originally announced April 2020.
-
Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks
Authors:
Changyang She,
Rui Dong,
Zhouyou Gu,
Zhanwei Hou,
Yonghui Li,
Wibowo Hardjawana,
Chenyang Yang,
Lingyang Song,
Branka Vucetic
Abstract:
In the future 6th generation networks, ultra-reliable and low-latency communications (URLLC) will lay the foundation for emerging mission-critical applications that have stringent requirements on end-to-end delay and reliability. Existing works on URLLC are mainly based on theoretical models and assumptions. The model-based solutions provide useful insights, but cannot be directly implemented in p…
▽ More
In the future 6th generation networks, ultra-reliable and low-latency communications (URLLC) will lay the foundation for emerging mission-critical applications that have stringent requirements on end-to-end delay and reliability. Existing works on URLLC are mainly based on theoretical models and assumptions. The model-based solutions provide useful insights, but cannot be directly implemented in practice. In this article, we first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC, and discuss some open problems of these methods. To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC. The basic idea is to merge theoretical models and real-world data in analyzing the latency and reliability and training deep neural networks (DNNs). Deep transfer learning is adopted in the architecture to fine-tune the pre-trained DNNs in non-stationary networks. Further considering that the computing capacity at each user and each mobile edge computing server is limited, federated learning is applied to improve the learning efficiency. Finally, we provide some experimental and simulation results and discuss some future directions.
△ Less
Submitted 22 February, 2020;
originally announced February 2020.
-
Adversary A3C for Robust Reinforcement Learning
Authors:
Zhaoyuan Gu,
Zhenzhong Jia,
Howie Choset
Abstract:
Asynchronous Advantage Actor Critic (A3C) is an effective Reinforcement Learning (RL) algorithm for a wide range of tasks, such as Atari games and robot control. The agent learns policies and value function through trial-and-error interactions with the environment until converging to an optimal policy. Robustness and stability are critical in RL; however, neural network can be vulnerable to noise…
▽ More
Asynchronous Advantage Actor Critic (A3C) is an effective Reinforcement Learning (RL) algorithm for a wide range of tasks, such as Atari games and robot control. The agent learns policies and value function through trial-and-error interactions with the environment until converging to an optimal policy. Robustness and stability are critical in RL; however, neural network can be vulnerable to noise from unexpected sources and is not likely to withstand very slight disturbances. We note that agents generated from mild environment using A3C are not able to handle challenging environments. Learning from adversarial examples, we proposed an algorithm called Adversary Robust A3C (AR-A3C) to improve the agent's performance under noisy environments. In this algorithm, an adversarial agent is introduced to the learning process to make it more robust against adversarial disturbances, thereby making it more adaptive to noisy environments. Both simulations and real-world experiments are carried out to illustrate the stability of the proposed algorithm. The AR-A3C algorithm outperforms A3C in both clean and noisy environments.
△ Less
Submitted 1 December, 2019;
originally announced December 2019.
-
Enhancing Adversarial Example Transferability with an Intermediate Level Attack
Authors:
Qian Huang,
Isay Katsman,
Horace He,
Zeqi Gu,
Serge Belongie,
Ser-Nam Lim
Abstract:
Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models. Adversarial examples often exhibit black-box transfer, meaning that adversarial examples for one model can fool another model. However, adversarial examples are typically overfit to exploit the particular architecture and feature representation of a source model, resulting in sub-optimal black-…
▽ More
Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models. Adversarial examples often exhibit black-box transfer, meaning that adversarial examples for one model can fool another model. However, adversarial examples are typically overfit to exploit the particular architecture and feature representation of a source model, resulting in sub-optimal black-box transfer attacks to other target models. We introduce the Intermediate Level Attack (ILA), which attempts to fine-tune an existing adversarial example for greater black-box transferability by increasing its perturbation on a pre-specified layer of the source model, improving upon state-of-the-art methods. We show that we can select a layer of the source model to perturb without any knowledge of the target models while achieving high transferability. Additionally, we provide some explanatory insights regarding our method and the effect of optimizing for adversarial examples using intermediate feature maps. Our code is available at https://github.com/CUVL/Intermediate-Level-Attack.
△ Less
Submitted 27 February, 2020; v1 submitted 23 July, 2019;
originally announced July 2019.
-
Yelp Food Identification via Image Feature Extraction and Classification
Authors:
Fanbo Sun,
Zhixiang Gu,
Bo Feng
Abstract:
Yelp has been one of the most popular local service search engine in US since 2004. It is powered by crowd-sourced text reviews and photo reviews. Restaurant customers and business owners upload photo images to Yelp, including reviewing or advertising either food, drinks, or inside and outside decorations. It is obviously not so effective that labels for food photos rely on human editors, which is…
▽ More
Yelp has been one of the most popular local service search engine in US since 2004. It is powered by crowd-sourced text reviews and photo reviews. Restaurant customers and business owners upload photo images to Yelp, including reviewing or advertising either food, drinks, or inside and outside decorations. It is obviously not so effective that labels for food photos rely on human editors, which is an issue should be addressed by innovative machine learning approaches. In this paper, we present a simple but effective approach which can identify up to ten kinds of food via raw photos from the challenge dataset. We use 1) image pre-processing techniques, including filtering and image augmentation, 2) feature extraction via convolutional neural networks (CNN), and 3) three ways of classification algorithms. Then, we illustrate the classification accuracy by tuning parameters for augmentations, CNN, and classification. Our experimental results show this simple but effective approach to identify up to 10 food types from images.
△ Less
Submitted 11 February, 2019;
originally announced February 2019.
-
Intermediate Level Adversarial Attack for Enhanced Transferability
Authors:
Qian Huang,
Zeqi Gu,
Isay Katsman,
Horace He,
Pian Pawakapan,
Zhiqiu Lin,
Serge Belongie,
Ser-Nam Lim
Abstract:
Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models. Adversarial examples often exhibit black-box transfer, meaning that adversarial examples for one model can fool another model. However, adversarial examples may be overfit to exploit the particular architecture and feature representation of a source model, resulting in sub-optimal black-box tra…
▽ More
Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models. Adversarial examples often exhibit black-box transfer, meaning that adversarial examples for one model can fool another model. However, adversarial examples may be overfit to exploit the particular architecture and feature representation of a source model, resulting in sub-optimal black-box transfer attacks to other target models. This leads us to introduce the Intermediate Level Attack (ILA), which attempts to fine-tune an existing adversarial example for greater black-box transferability by increasing its perturbation on a pre-specified layer of the source model. We show that our method can effectively achieve this goal and that we can decide a nearly-optimal layer of the source model to perturb without any knowledge of the target models.
△ Less
Submitted 20 November, 2018;
originally announced November 2018.