-
Wasserstein Robust Reinforcement Learning
Authors:
Mohammed Amin Abdullah,
Hang Ren,
Haitham Bou Ammar,
Vladimir Milenkovic,
Rui Luo,
Mingtian Zhang,
Jun Wang
Abstract:
Reinforcement learning algorithms, though successful, tend to overfit to training environments, hampering their application to the real world. This paper proposes $\text{W}\text{R}^{2}\text{L}$, a robust reinforcement learning algorithm with significant robust performance on low- and high-dimensional control tasks. Our method formalises robust reinforcement learning as a novel min-max game with a Wasserstein constraint for a correct and convergent solver. Apart from the formulation, we also propose an efficient and scalable solver following a novel zero-order optimisation method that we believe can be useful to numerical optimisation in general. We empirically demonstrate significant gains compared to standard and robust state-of-the-art algorithms on high-dimensional MuJoCo environments.
Submitted 16 September, 2019; v1 submitted 30 July, 2019;
originally announced July 2019.
-
Optimal Downlink Transmission for Cell Free SWIPT Massive MIMO Systems with Active Eavesdropping
Authors:
Mahmoud Alageli,
Aissa Ikhlef,
Fahad Alsifiany,
Mohammed A. M. Abdullah,
Gaojie Chen,
Jonathon Chambers
Abstract:
This paper considers secure simultaneous wireless information and power transfer (SWIPT) in cell-free massive multiple-input multiple-output (MIMO) systems. The system consists of a large number of randomly located (Poisson-distributed) access points (APs) serving multiple information users (IUs) and an information-untrusted dual-antenna active energy harvester (EH). The active EH uses one antenna to legitimately harvest energy and the other antenna to eavesdrop on information. The APs are networked by a centralized infinite backhaul, which allows the APs to synchronize and cooperate via a central processing unit (CPU). Closed-form expressions for the average harvested energy (AHE) and a tight lower bound on the ergodic secrecy rate (ESR) are derived. The obtained lower bound on the ESR takes into account the IUs' knowledge attained by downlink effective precoded-channel training. Since the transmit power constraint is per AP, the ESR is nonlinear in the transmit power elements of the APs, which imposes new challenges in formulating a convex power control problem for the downlink transmission. To deal with these nonlinearities, we derive a new method of balancing the transmit power among the APs via relaxed semidefinite programming (SDP), which is proved to be rank-one globally optimal. A fair comparison between the proposed cell-free and the colocated massive MIMO systems shows that the cell-free MIMO outperforms the colocated MIMO over the interval in which the AHE constraint is low, and vice versa. The cell-free MIMO is also found to be more immune than the colocated MIMO to an increase in the active eavesdropping power.
Submitted 23 April, 2019;
originally announced April 2019.
-
Reinforcement Learning with Wasserstein Distance Regularisation, with Applications to Multipolicy Learning
Authors:
Mohammed Amin Abdullah,
Aldo Pacchiano,
Moez Draief
Abstract:
We describe an application of Wasserstein distance to Reinforcement Learning. The Wasserstein distance in question is between the distribution of mappings of trajectories of a policy into some metric space, and some other fixed distribution (which may, for example, come from another policy). Different policies induce different distributions, so given an underlying metric, the Wasserstein distance quantifies how different policies are. This can be used to learn multiple policies that differ in terms of such Wasserstein distances by using a Wasserstein regulariser. By changing the sign of the regularisation parameter, one can instead learn a policy whose trajectory mapping distribution is attracted to a given fixed distribution.
Submitted 30 July, 2019; v1 submitted 12 February, 2018;
originally announced February 2018.
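As a rough illustration of the quantity discussed in the abstract above (a sketch only, not the authors' method), the following assumes each trajectory has already been mapped to a single real number, so the metric space is the real line, and that both policies supply the same number of samples. Under those assumptions the 1-Wasserstein distance between the two empirical distributions is the mean absolute difference of the sorted samples. The names `wasserstein_1d` and `regularised_objective`, and the sign convention for the parameter `lam`, are hypothetical choices made for this example.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two empirical distributions on the
    real line, assuming equally many, equally weighted samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    # On the real line the optimal coupling pairs the sorted samples.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


def regularised_objective(expected_return, distance, lam):
    """Hypothetical objective to maximise: return plus lam * distance.
    Assumed convention: lam > 0 rewards being far from the fixed
    distribution, lam < 0 rewards being close to it."""
    return expected_return + lam * distance


if __name__ == "__main__":
    # Made-up scalar trajectory mappings for two policies.
    policy_a = [1.0, 2.0, 3.0]
    policy_b = [2.0, 3.0, 4.0]
    d = wasserstein_1d(policy_a, policy_b)
    print(d, regularised_objective(10.0, d, 0.5), regularised_objective(10.0, d, -0.5))
```

Flipping the sign of `lam` switches the same objective between pushing a policy away from the reference distribution and pulling it towards it, which is the behaviour the abstract describes.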
-
Robust On-line Matrix Completion on Graphs
Authors:
Symeon Chouvardas,
Mohammed Amin Abdullah,
Lucas Claude,
Moez Draief
Abstract:
We study online robust matrix completion on graphs. At each iteration a vector with some entries missing is revealed, and our goal is to reconstruct it by identifying the underlying low-dimensional subspace from which the vectors are drawn. We assume there is an underlying graph structure to the data, that is, the components of each vector correspond to nodes of a certain (known) graph, and their values are related accordingly. We give algorithms that exploit the graph to reconstruct the incomplete data, even in the presence of outlier noise. The theoretical properties of the algorithms are studied, and numerical experiments using both synthetic and real-world datasets verify the improved performance of the proposed technique compared to other state-of-the-art algorithms.
Submitted 13 May, 2016;
originally announced May 2016.
-
Global Majority Consensus by Local Majority Polling on Graphs of a Given Degree Sequence
Authors:
Mohammed Amin Abdullah,
Moez Draief
Abstract:
Suppose in a graph $G$ vertices can be either red or blue. Let $k$ be odd. At each time step, each vertex $v$ in $G$ polls $k$ random neighbours and takes the majority colour. If it doesn't have $k$ neighbours, it simply polls all of them, or all less one if the degree of $v$ is even. We study this protocol on graphs of a given degree sequence, in the following setting: initially each vertex of $G$ is red independently with probability $\alpha < \frac{1}{2}$, and is otherwise blue. We show that if $\alpha$ is sufficiently biased, then with high probability consensus is reached on the initial global majority within $O(\log_k \log_k n)$ steps if $5 \leq k \leq d$, and $O(\log_d \log_d n)$ steps if $k > d$. Here, $d \geq 5$ is the effective minimum degree, the smallest integer which occurs $\Theta(n)$ times in the degree sequence. We further show that on such graphs, any local protocol in which a vertex does not change colour if all its neighbours have that same colour takes time at least $\Omega(\log_d \log_d n)$, with high probability. Additionally, we demonstrate how the technique for the above sparse graphs can be applied in a straightforward manner to get bounds for the Erdős-Rényi random graphs in the connected regime.
Submitted 30 July, 2019; v1 submitted 22 September, 2012;
originally announced September 2012.
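The polling rule stated in the abstract above can be sketched as a small simulation. This is an illustrative reading of the protocol, not the authors' code, and it makes several assumptions the abstract does not settle: neighbours are sampled without replacement, all vertices update synchronously, the neighbour dropped by an even-degree vertex is chosen at random, and an isolated vertex keeps its colour. The names `poll_step` and `run_until_consensus` are hypothetical.

```python
import random


def poll_step(adj, colour, k, rng):
    """One synchronous round of local majority polling.

    adj maps each vertex to a list of neighbours; colour maps each
    vertex to "red" or "blue"; k must be odd."""
    if k % 2 == 0:
        raise ValueError("k must be odd")
    new_colour = {}
    for v, nbrs in adj.items():
        deg = len(nbrs)
        if deg >= k:
            m = k          # enough neighbours: poll k of them
        elif deg % 2 == 1:
            m = deg        # odd degree below k: poll all neighbours
        else:
            m = deg - 1    # even degree below k: poll all less one
        if m <= 0:
            new_colour[v] = colour[v]  # assumption: isolated vertex keeps its colour
            continue
        polled = rng.sample(nbrs, m)   # assumption: sampling without replacement
        reds = sum(1 for u in polled if colour[u] == "red")
        # m is always odd here, so a tie cannot occur.
        new_colour[v] = "red" if 2 * reds > m else "blue"
    return new_colour


def run_until_consensus(adj, colour, k, max_rounds=100, seed=0):
    """Repeat poll_step until one colour remains; return (colour, rounds),
    or None if max_rounds passes without consensus."""
    rng = random.Random(seed)
    for rounds in range(max_rounds + 1):
        if len(set(colour.values())) <= 1:
            return colour, rounds
        colour = poll_step(adj, colour, k, rng)
    return None


if __name__ == "__main__":
    # Complete graph on 6 vertices: every vertex has degree 5, so with
    # k = 5 each vertex polls all of its neighbours.
    vertices = range(6)
    adj = {v: [u for u in vertices if u != v] for v in vertices}
    colour = {v: "red" if v == 0 else "blue" for v in vertices}
    print(run_until_consensus(adj, colour, k=5))
```

On the small complete graph in the usage example the outcome does not depend on the random choices, because every vertex polls its entire neighbourhood.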
-
Off-Line Arabic Handwriting Character Recognition Using Word Segmentation
Authors:
Manal A. Abdullah,
Lulwah M. Al-Harigy,
Hanadi H. Al-Fraidi
Abstract:
The ultimate aim of handwriting recognition is to make computers able to read and/or authenticate human-written texts, with a performance comparable to or even better than that of humans. Reading means that the computer is given a piece of handwriting and provides its electronic transcription (e.g. in ASCII format). There are two types of handwriting recognition: on-line and off-line. The most important use of off-line handwriting recognition is in protection systems and authentication. Arabic handwriting scripts are much more complicated than Latin scripts. This paper introduces a simple and novel methodology to authenticate Arabic handwritten characters. To this end, we built our own character database. The research methodology has two stages. The first is character extraction, in which the word is preprocessed and then segmented to obtain the characters. The second is character recognition, in which the characters comprising the word are matched against the letters in the database. Our results show character recognition of 81%. We eliminate FAR by using a similarity percentage between 45% and 55%. Our method is implemented in MATLAB.
Submitted 7 June, 2012;
originally announced June 2012.