-
Statistical Query Hardness of Multiclass Linear Classification with Random Classification Noise
Authors:
Ilias Diakonikolas,
Mingchen Ma,
Lisheng Ren,
Christos Tzamos
Abstract:
We study the task of Multiclass Linear Classification (MLC) in the distribution-free PAC model with Random Classification Noise (RCN). Specifically, the learner is given a set of labeled examples $(x, y)$, where $x$ is drawn from an unknown distribution on $\mathbb{R}^d$ and the labels are generated by a multiclass linear classifier corrupted with RCN. That is, the label $y$ is flipped from $i$ to $j$ with probability $H_{ij}$ according to a known noise matrix $H$ with non-negative separation $\sigma := \min_{i \neq j} (H_{ii}-H_{ij})$. The goal is to compute a hypothesis with small 0-1 error. For the special case of two labels, prior work has given polynomial-time algorithms achieving the optimal error. Surprisingly, little is known about the complexity of this task even for three labels. As our main contribution, we show that the complexity of MLC with RCN becomes drastically different in the presence of three or more labels. Specifically, we prove super-polynomial Statistical Query (SQ) lower bounds for this problem. In more detail, even for three labels and constant separation, we give a super-polynomial lower bound on the complexity of any SQ algorithm achieving optimal error. For a larger number of labels and smaller separation, we show a super-polynomial SQ lower bound even for the weaker goal of achieving any constant factor approximation to the optimal loss or even beating the trivial hypothesis.
Submitted 16 February, 2025;
originally announced February 2025.
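The noise model described in the abstract above can be illustrated with a minimal sketch. It assumes that row $i$ of $H$ is the distribution of the observed label given clean label $i$; the 3-label matrix and the function names `separation` and `corrupt_label` are hypothetical and chosen only for illustration, not taken from the paper.

```python
import random

def separation(H):
    """Return sigma = min over i != j of H[i][i] - H[i][j]."""
    k = len(H)
    return min(H[i][i] - H[i][j] for i in range(k) for j in range(k) if i != j)

def corrupt_label(i, H, rng):
    """Draw an observed label j with probability H[i][j], given clean label i."""
    return rng.choices(range(len(H)), weights=H[i], k=1)[0]

# Hypothetical noise matrix for three labels; each row sums to 1.
H = [
    [0.6, 0.2, 0.2],
    [0.1, 0.7, 0.2],
    [0.3, 0.1, 0.6],
]

sigma = separation(H)  # smallest gap between a diagonal entry and an off-diagonal entry in its row
rng = random.Random(0)
noisy = [corrupt_label(0, H, rng) for _ in range(1000)]  # noisy copies of clean label 0
```

With this matrix the smallest gap comes from the third row (0.6 against 0.3), so the separation is positive and the clean label stays the single most likely observed label in every row.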
-
Causally-Aware Unsupervised Feature Selection Learning
Authors:
Zongxin Shen,
Yanyong Huang,
Dongjie Wang,
Minbo Ma,
Fengmao Lv,
Tianrui Li
Abstract:
Unsupervised feature selection (UFS) has recently gained attention for its effectiveness in processing unlabeled high-dimensional data. However, existing methods overlook the intrinsic causal mechanisms within the data, resulting in the selection of irrelevant features and poor interpretability. Additionally, previous graph-based methods fail to account for the differing impacts of non-causal and causal features in constructing the similarity graph, which leads to false links in the generated graph. To address these issues, a novel UFS method, called Causally-Aware UnSupErvised Feature Selection learning (CAUSE-FS), is proposed. CAUSE-FS introduces a novel causal regularizer that reweights samples to balance the confounding distribution of each treatment feature. This regularizer is subsequently integrated into a generalized unsupervised spectral regression model to mitigate spurious associations between features and clustering labels, thus achieving causal feature selection. Furthermore, CAUSE-FS employs causality-guided hierarchical clustering to partition features with varying causal contributions into multiple granularities. By integrating similarity graphs learned adaptively at different granularities, CAUSE-FS increases the importance of causal features when constructing the fused similarity graph to capture the reliable local structure of data. Extensive experimental results demonstrate the superiority of CAUSE-FS over state-of-the-art methods, with its interpretability further validated through feature visualization.
Submitted 25 January, 2025; v1 submitted 16 October, 2024;
originally announced October 2024.
-
Momentum Dynamics in Competitive Sports: A Multi-Model Analysis Using TOPSIS and Logistic Regression
Authors:
Mingpu Ma
Abstract:
This paper explores the concept of "momentum" in sports competitions through the use of the TOPSIS model and 0-1 logistic regression model. First, the TOPSIS model is employed to evaluate the performance of two tennis players, with visualizations used to analyze the situation's evolution at every moment in the match, explaining how "momentum" manifests in sports. Then, the 0-1 logistic regression model is utilized to verify the impact of "momentum" on match outcomes, demonstrating that fluctuations in player performance and the successive occurrence of successes are not random. Additionally, this paper examines the indicators that influence the reversal of game situations by analyzing key match data and testing the accuracy of the models with match data. The findings show that the model accurately explains the conditions during matches and can be generalized to other sports competitions. Finally, the strengths, weaknesses, and potential future improvements of the model are discussed.
Submitted 4 September, 2024;
originally announced September 2024.
-
A Probabilistic Neural Twin for Treatment Planning in Peripheral Pulmonary Artery Stenosis
Authors:
John D. Lee,
Jakob Richter,
Martin R. Pfaller,
Jason M. Szafron,
Karthik Menon,
Andrea Zanoni,
Michael R. Ma,
Jeffrey A. Feinstein,
Jacqueline Kreutzer,
Alison L. Marsden,
Daniele E. Schiavazzi
Abstract:
The substantial computational cost of high-fidelity models in numerical hemodynamics has, so far, relegated their use mainly to offline treatment planning. New breakthroughs in data-driven architectures and optimization techniques for fast surrogate modeling provide an exciting opportunity to overcome these limitations, enabling the use of such technology for time-critical decisions. We discuss an application to the repair of multiple stenoses in peripheral pulmonary artery disease through either transcatheter pulmonary artery rehabilitation or surgery, where it is of interest to achieve desired pressures and flows at specific locations in the pulmonary artery tree, while minimizing the risk for the patient. Since different degrees of success can be achieved in practice during treatment, we formulate the problem in probability, and solve it through a sample-based approach. We propose a new offline-online pipeline for probabilistic real-time treatment planning which combines offline assimilation of boundary conditions, model reduction, and training dataset generation with online estimation of marginal probabilities, possibly conditioned on the degree of augmentation observed in already repaired lesions. Moreover, we propose a new approach for the parametrization of arbitrarily shaped vascular repairs through iterative corrections of a zero-dimensional approximant. We demonstrate this pipeline for a diseased model of the pulmonary artery tree available through the Vascular Model Repository.
Submitted 1 December, 2023;
originally announced December 2023.
-
Theoretical analysis of deep neural networks for temporally dependent observations
Authors:
Mingliang Ma,
Abolfazl Safikhani
Abstract:
Deep neural networks are powerful tools to model observations over time with non-linear patterns. Despite the widespread use of neural networks in such settings, most theoretical developments of deep neural networks are under the assumption of independent observations, and theoretical results for temporally dependent observations are scarce. To bridge this gap, we study theoretical properties of deep neural networks on modeling non-linear time series data. Specifically, non-asymptotic bounds for the prediction error of (sparse) feed-forward neural networks with ReLU activation functions are established under mixing-type assumptions. These assumptions are mild enough to cover a wide range of time series models, including auto-regressive models. Compared to independent observations, the established convergence rates have additional logarithmic factors to compensate for the additional complexity due to dependence among data points. The theoretical results are supported via various numerical simulation settings as well as an application to a macroeconomic data set.
Submitted 20 October, 2022;
originally announced October 2022.
-
Accurate estimation of dynamical quantities for nonequilibrium nanoscale system
Authors:
Zhi Xu,
Han Li,
Ming Ma
Abstract:
Fluctuations of dynamical quantities are fundamental and inevitable. For the booming research in nanotechnology, huge relative fluctuation comes with the reduction of system size, leading to large uncertainty for the estimates of dynamical quantities. Thus, increasing statistical efficiency, i.e., reducing the number of samples required to achieve a given accuracy, is of great significance for accurate estimation. Here we propose a theory as a fundamental solution for such problem by constructing auxiliary path for each real path. The states on auxiliary paths constitute canonical ensemble and share the same macroscopic properties with the initial states of the real path. By implementing the theory in molecular dynamics simulations, we obtain a nanoscale Couette flow field with an accuracy of 0.2 μm/s with relative standard error < 0.1. The required number of samples is reduced by 12 orders compared to conventional method. The predicted thermolubric behavior of water sliding on a self-assembled surface is directly validated by experiment under the same velocity. As the theory only assumes the system is initially in thermal equilibrium then driven from that equilibrium by an external perturbation, we believe it could serve as a general approach for extracting the accurate estimate of dynamical quantities from large fluctuations to provide insights on atomic level under experimental conditions, and benefit the studies on mass transport across (biological) nanochannels and fluid film lubrication of nanometer thickness.
Submitted 9 July, 2022;
originally announced July 2022.
-
AEGCN: An Autoencoder-Constrained Graph Convolutional Network
Authors:
Mingyuan Ma,
Sen Na,
Hongyu Wang
Abstract:
We propose a novel neural network architecture, called autoencoder-constrained graph convolutional network, to solve the node classification task on graph domains. As suggested by its name, the core of this model is a convolutional network operating directly on graphs, whose hidden layers are constrained by an autoencoder. Compared with vanilla graph convolutional networks, the autoencoder step is added to reduce the information loss brought by Laplacian smoothing. We consider applying our model to both homogeneous graphs and heterogeneous graphs. For homogeneous graphs, the autoencoder approximates the adjacency matrix of the input graph by taking hidden layer representations as encoder and another one-layer graph convolutional network as decoder. For heterogeneous graphs, since there are multiple adjacency matrices corresponding to different types of edges, the autoencoder approximates the feature matrix of the input graph instead, and changes the encoder to a specially designed multi-channel pre-processing network with two layers. In both cases, the error incurred in the autoencoder approximation enters the loss function as a penalty term. In extensive experiments on citation networks and other heterogeneous graphs, we demonstrate that adding autoencoder constraints significantly improves the performance of graph convolutional networks. Further, we notice that our technique can be applied to graph attention networks to improve the performance as well. This reveals the wide applicability of the proposed autoencoder technique.
Submitted 10 February, 2021; v1 submitted 3 July, 2020;
originally announced July 2020.
-
Joint User Pairing and Association for Multicell NOMA: A Pointer Network-based Approach
Authors:
Manyou Ma,
Vincent W. S. Wong
Abstract:
In this paper, we investigate the joint user pairing and association problem for multicell non-orthogonal multiple access (NOMA) systems. We consider a scenario where the user equipments (UEs) are located in a multicell network equipped with multiple base stations. Each base station has multiple orthogonal physical resource blocks (PRBs). Each PRB can be allocated to a pair of UEs using NOMA. Each UE has the additional freedom to be served by any one of the base stations, which further increases the complexity of the joint user pairing and association algorithm design. Leveraging the recent success on using machine learning to solve numerical optimization problems, we formulate the joint user pairing and association problem as a combinatorial optimization problem. The solution is found using an emerging deep learning architecture called Pointer Network (PtrNet), which has a lower computational complexity compared to solutions based on iterative algorithms and has been proven to achieve near-optimal performance. The training phase of the PtrNet is based on deep reinforcement learning (DRL), and does not require the use of the optimal solution of the formulated problem as training labels. Simulation results show that the proposed joint user pairing and association scheme achieves near-optimal performance in terms of the aggregate data rate, and outperforms the random user pairing and association heuristic by up to 30%.
Submitted 15 April, 2020;
originally announced April 2020.
-
Complex Transformer: A Framework for Modeling Complex-Valued Sequence
Authors:
Muqiao Yang,
Martin Q. Ma,
Dongyu Li,
Yao-Hung Hubert Tsai,
Ruslan Salakhutdinov
Abstract:
While deep learning has received a surge of interest in a variety of fields in recent years, major deep learning models barely use complex numbers. However, speech, signal and audio data are naturally complex-valued after Fourier Transform, and studies have shown a potentially richer representation of complex nets. In this paper, we propose a Complex Transformer, which incorporates the transformer model as a backbone for sequence modeling; we also develop attention and encoder-decoder network operating for complex input. The model achieves state-of-the-art performance on the MusicNet dataset and an In-phase Quadrature (IQ) signal dataset.
Submitted 6 August, 2021; v1 submitted 22 October, 2019;
originally announced October 2019.
-
Adversarial Sensor Attack on LiDAR-based Perception in Autonomous Driving
Authors:
Yulong Cao,
Chaowei Xiao,
Benjamin Cyr,
Yimeng Zhou,
Won Park,
Sara Rampazzi,
Qi Alfred Chen,
Kevin Fu,
Z. Morley Mao
Abstract:
In Autonomous Vehicles (AVs), one fundamental pillar is perception, which leverages sensors like cameras and LiDARs (Light Detection and Ranging) to understand the driving environment. Due to its direct impact on road safety, multiple prior efforts have been made to study the security of perception systems. In contrast to prior work that concentrates on camera-based perception, in this work we perform the first security study of LiDAR-based perception in AV settings, which is highly important but unexplored. We consider LiDAR spoofing attacks as the threat model and set the attack goal as spoofing obstacles close to the front of a victim AV. We find that blindly applying LiDAR spoofing is insufficient to achieve this goal due to the machine learning-based object detection process. Thus, we then explore the possibility of strategically controlling the spoofed attack to fool the machine learning model. We formulate this task as an optimization problem and design modeling methods for the input perturbation function and the objective function. We also identify the inherent limitations of directly solving the problem using optimization and design an algorithm that combines optimization and global sampling, which improves the attack success rates to around 75%. As a case study to understand the attack impact at the AV driving decision level, we construct and evaluate two attack scenarios that may damage road safety and mobility. We also discuss defense directions at the AV system, sensor, and machine learning model levels.
Submitted 20 August, 2019; v1 submitted 16 July, 2019;
originally announced July 2019.
-
Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale
Authors:
Atılım Güneş Baydin,
Lei Shao,
Wahid Bhimji,
Lukas Heinrich,
Lawrence Meadows,
Jialin Liu,
Andreas Munk,
Saeid Naderiparizi,
Bradley Gram-Hansen,
Gilles Louppe,
Mingfei Ma,
Xiaohui Zhao,
Philip Torr,
Victor Lee,
Kyle Cranmer,
Prabhat,
Frank Wood
Abstract:
Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN--LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global minibatch size of 128k: achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use-case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL.
Submitted 27 August, 2019; v1 submitted 7 July, 2019;
originally announced July 2019.
-
On the Convergence of SARAH and Beyond
Authors:
Bingcong Li,
Meng Ma,
Georgios B. Giannakis
Abstract:
The main theme of this work is a unifying algorithm, \textbf{L}oop\textbf{L}ess \textbf{S}ARAH (L2S) for problems formulated as summation of $n$ individual loss functions. L2S broadens a recently developed variance reduction method known as SARAH. To find an $ε$-accurate solution, L2S enjoys a complexity of ${\cal O}\big( (n+κ) \ln (1/ε)\big)$ for strongly convex problems. For convex problems, when adopting an $n$-dependent step size, the complexity of L2S is ${\cal O}(n+ \sqrt{n}/ε)$; while for more frequently adopted $n$-independent step size, the complexity is ${\cal O}(n+ n/ε)$. Distinct from SARAH, our theoretical findings support an $n$-independent step size in convex problems without extra assumptions. For nonconvex problems, the complexity of L2S is ${\cal O}(n+ \sqrt{n}/ε)$. Our numerical tests on neural networks suggest that L2S can have better generalization properties than SARAH. Along with L2S, our side results include the linear convergence of the last iteration for SARAH in strongly convex problems.
Submitted 16 January, 2020; v1 submitted 5 June, 2019;
originally announced June 2019.
-
Gaining Extra Supervision via Multi-task learning for Multi-Modal Video Question Answering
Authors:
Junyeong Kim,
Minuk Ma,
Kyungsu Kim,
Sungjin Kim,
Chang D. Yoo
Abstract:
This paper proposes a method to gain extra supervision via multi-task learning for multi-modal video question answering. Multi-modal video question answering is an important task that aims at the joint understanding of vision and language. However, establishing a large-scale dataset for multi-modal video question answering is expensive, and the existing benchmarks are too small to provide sufficient supervision. To overcome this challenge, this paper proposes a multi-task learning method composed of three main components: (1) a multi-modal video question answering network that answers the question based on both video and subtitle features, (2) a temporal retrieval network that predicts the time in the video clip from which the question was generated, and (3) a modality alignment network that solves a metric learning problem to find the correct association of video and subtitle modalities. By simultaneously solving related auxiliary tasks with hierarchically shared intermediate layers, extra synergistic supervision is provided. Motivated by curriculum learning, multi-task ratio scheduling is proposed to learn easier tasks earlier, setting an inductive bias at the beginning of training. The experiments on the publicly available TVQA dataset show state-of-the-art results, and ablation studies are conducted to prove the statistical validity.
Submitted 27 May, 2019;
originally announced May 2019.
-
The Graph-Based Behavior-Aware Recommendation for Interactive News
Authors:
Mingyuan Ma,
Sen Na,
Hongyu Wang,
Congzhou Chen,
Jin Xu
Abstract:
Interactive news recommendation has been launched and has attracted much attention recently. In this scenario, user behavior evolves from a single click behavior to multiple behaviors, including like, comment, share, etc. However, most existing methods still use the single click behavior as the sole criterion for judging user preferences. Further, although heterogeneous graphs have been applied in different areas, a proper way to construct a heterogeneous graph for interactive news data, with an appropriate learning mechanism on it, is still needed. To address these concerns, we propose a graph-based behavior-aware network, which simultaneously considers six different types of behaviors as well as the user's demand for news diversity. Our method has three main steps. First, we build an interaction behavior graph for multi-level and multi-category data. Second, we apply DeepWalk on the behavior graph to obtain entity semantics, then build a graph-based convolutional neural network called G-CNN to learn news representations, and an attention-based LSTM to learn behavior sequence representations. Third, we introduce core and coritivity features for the behavior graph, which measure the concentration degree of a user's interests. These features affect the trade-off between accuracy and diversity of our personalized recommendation system. Taking these features into account, our system recommends news to different users according to their different concentration degrees.
Submitted 20 May, 2021; v1 submitted 30 November, 2018;
originally announced December 2018.
-
Kernel-based Inference of Functions over Graphs
Authors:
Vassilis N. Ioannidis,
Meng Ma,
Athanasios N. Nikolakopoulos,
Georgios B. Giannakis,
Daniel Romero
Abstract:
The study of networks has witnessed an explosive growth over the past decades with several ground-breaking methods introduced. A particularly interesting -- and prevalent in several fields of study -- problem is that of inferring a function defined over the nodes of a network. This work presents a versatile kernel-based framework for tackling this inference problem that naturally subsumes and generalizes the reconstruction approaches put forth recently by the signal processing on graphs community. Both the static and the dynamic settings are considered along with effective modeling approaches for addressing real-world problems. The herein analytical discussion is complemented by a set of numerical examples, which showcase the effectiveness of the presented techniques, as well as their merits related to state-of-the-art methods.
Submitted 10 April, 2018; v1 submitted 28 November, 2017;
originally announced November 2017.
-
Scalable Peaceman-Rachford Splitting Method with Proximal Terms
Authors:
Sen Na,
Mingyuan Ma,
Mladen Kolar
Abstract:
Along with the development of the Peaceman-Rachford Splitting Method (PRSM), many batch algorithms based on it have been studied in depth. However, almost no algorithm has focused on the performance of a stochastic version of PRSM. In this paper, we propose a new stochastic algorithm based on PRSM, prove its convergence rate in the ergodic sense, and test its performance on both artificial and real data. We show that our proposed algorithm, Stochastic Scalable PRSM (SS-PRSM), enjoys an $O(1/K)$ convergence rate, which matches that of the newest stochastic algorithms based on ADMM and is faster than that of general Stochastic ADMM (which is $O(1/\sqrt{K})$). Our algorithm is also highly flexible, outperforms many state-of-the-art stochastic algorithms derived from ADMM, and has a low memory cost in large-scale splitting optimization problems.
Submitted 9 February, 2018; v1 submitted 14 November, 2017;
originally announced November 2017.
-
Kernel-based Reconstruction of Graph Signals
Authors:
Daniel Romero,
Meng Ma,
Georgios B. Giannakis
Abstract:
A number of applications in engineering, social sciences, physics, and biology involve inference over networks. In this context, graph signals are widely encountered as descriptors of vertex attributes or features in graph-structured data. Estimating such signals in all vertices given noisy observations of their values on a subset of vertices has been extensively analyzed in the literature of signal processing on graphs (SPoG). This paper advocates kernel regression as a framework generalizing popular SPoG modeling and reconstruction and expanding their capabilities. Formulating signal reconstruction as a regression task on reproducing kernel Hilbert spaces of graph signals permeates benefits from statistical learning, offers fresh insights, and allows for estimators to leverage richer forms of prior information than existing alternatives. A number of SPoG notions such as bandlimitedness, graph filters, and the graph Fourier transform are naturally accommodated in the kernel framework. Additionally, this paper capitalizes on the so-called representer theorem to devise simpler versions of existing Thikhonov regularized estimators, and offers a novel probabilistic interpretation of kernel methods on graphs based on graphical models. Motivated by the challenges of selecting the bandwidth parameter in SPoG estimators or the kernel map in kernel-based methods, the present paper further proposes two multi-kernel approaches with complementary strengths. Whereas the first enables estimation of the unknown bandwidth of bandlimited signals, the second allows for efficient graph filter selection. Numerical tests with synthetic as well as real data demonstrate the merits of the proposed methods relative to state-of-the-art alternatives.
Submitted 23 May, 2016;
originally announced May 2016.
-
LARSEN-ELM: Selective Ensemble of Extreme Learning Machines using LARS for Blended Data
Authors:
Bo Han,
Bo He,
Rui Nian,
Mengmeng Ma,
Shujing Zhang,
Minghui Li,
Amaury Lendasse
Abstract:
Extreme learning machine (ELM), as a neural network algorithm, has shown good performance, such as fast speed and simple structure, but weak robustness is an unavoidable defect of the original ELM for blended data. We present a new machine learning framework called LARSEN-ELM to overcome this problem. LARSEN-ELM has two key steps. In the first step, preprocessing, we select the input variables highly related to the output using least angle regression (LARS). In the second step, training, we employ a Genetic Algorithm (GA) based selective ensemble and the original ELM. In the experiments, we apply a sum of two sines and four datasets from the UCI repository to verify the robustness of our approach. The experimental results show that, compared with the original ELM and other methods such as OP-ELM, GASEN-ELM and LSBoost, LARSEN-ELM significantly improves robustness while keeping a relatively high speed.
Submitted 26 August, 2014; v1 submitted 8 August, 2014;
originally announced August 2014.