-
Revisiting Design Choices in Proximal Policy Optimization
Authors:
Chloe Ching-Yun Hsu,
Celestine Mendler-Dünner,
Moritz Hardt
Abstract:
Proximal Policy Optimization (PPO) is a popular deep policy gradient algorithm. In standard implementations, PPO regularizes policy updates with clipped probability ratios, and parameterizes policies with either continuous Gaussian distributions or discrete Softmax distributions. These design choices are widely accepted, and motivated by empirical performance comparisons on MuJoCo and Atari benchmarks.
We revisit these practices outside the regime of current benchmarks, and expose three failure modes of standard PPO. We explain why standard design choices are problematic in these cases, and show that alternative choices of surrogate objectives and policy parameterizations can prevent the failure modes. We hope that our work serves as a reminder that many algorithmic design choices in reinforcement learning are tied to specific simulation environments. We should not implicitly accept these choices as a standard part of a more general algorithm.
Submitted 22 September, 2020;
originally announced September 2020.
-
Linear Dynamics: Clustering without identification
Authors:
Chloe Ching-Yun Hsu,
Michaela Hardt,
Moritz Hardt
Abstract:
Linear dynamical systems are a fundamental and powerful parametric model class. However, identifying the parameters of a linear dynamical system is a venerable task, permitting provably efficient solutions only in special cases. This work shows that the eigenspectrum of unknown linear dynamics can be identified without full system identification. We analyze a computationally efficient and provably convergent algorithm to estimate the eigenvalues of the state-transition matrix in a linear dynamical system.
When applied to time series clustering, our algorithm can efficiently cluster multi-dimensional time series with temporal offsets and varying lengths, under the assumption that the time series are generated from linear dynamical systems. Evaluating our algorithm on both synthetic data and real electrocardiogram (ECG) signals, we see improvements in clustering quality over existing baselines.
Submitted 29 February, 2020; v1 submitted 2 August, 2019;
originally announced August 2019.
-
A new algorithm for fast generalized DFTs
Authors:
Chloe Ching-Yun Hsu,
Chris Umans
Abstract:
We give a new arithmetic algorithm to compute the generalized Discrete Fourier Transform (DFT) over finite groups $G$. The new algorithm uses $O(|G|^{ω/2 + o(1)})$ operations to compute the generalized DFT over finite groups of Lie type, including the linear, orthogonal, and symplectic families and their variants, as well as all finite simple groups of Lie type. Here $ω$ is the exponent of matrix multiplication, so the exponent $ω/2$ is optimal if $ω = 2$. Previously, "exponent one" algorithms were known for supersolvable groups and the symmetric and alternating groups. No exponent one algorithms were known (even under the assumption $ω = 2$) for families of linear groups of fixed dimension, and indeed the previous best-known algorithm for $SL_2(F_q)$ had exponent $4/3$ despite being the focus of significant effort. We unconditionally achieve exponent at most $1.19$ for this group, and exponent one if $ω = 2$. Our algorithm also yields an improved exponent for computing the generalized DFT over general finite groups $G$, which beats the longstanding previous best upper bound, for any $ω$. In particular, assuming $ω = 2$, we achieve exponent $\sqrt{2}$, while the previous best was $3/2$.
Submitted 30 March, 2018; v1 submitted 2 July, 2017;
originally announced July 2017.