Search | arXiv e-print repository

arXiv:2009.10897 [pdf, other]

Revisiting Design Choices in Proximal Policy Optimization

Authors: Chloe Ching-Yun Hsu, Celestine Mendler-Dünner, Moritz Hardt

Abstract: Proximal Policy Optimization (PPO) is a popular deep policy gradient algorithm. In standard implementations, PPO regularizes policy updates with clipped probability ratios, and parameterizes policies with either continuous Gaussian distributions or discrete Softmax distributions. These design choices are widely accepted, and motivated by empirical performance comparisons on MuJoCo and Atari benchmarks. We revisit these practices outside the regime of current benchmarks, and expose three failure modes of standard PPO. We explain why standard design choices are problematic in these cases, and show that alternative choices of surrogate objectives and policy parameterizations can prevent the failure modes. We hope that our work serves as a reminder that many algorithmic design choices in reinforcement learning are tied to specific simulation environments. We should not implicitly accept these choices as a standard part of a more general algorithm.

Submitted 22 September, 2020; originally announced September 2020.

arXiv:1908.01039 [pdf, other]

Linear Dynamics: Clustering without identification

Authors: Chloe Ching-Yun Hsu, Michaela Hardt, Moritz Hardt

Abstract: Linear dynamical systems are a fundamental and powerful parametric model class. However, identifying the parameters of a linear dynamical system is a venerable task, permitting provably efficient solutions only in special cases. This work shows that the eigenspectrum of unknown linear dynamics can be identified without full system identification. We analyze a computationally efficient and provably convergent algorithm to estimate the eigenvalues of the state-transition matrix in a linear dynamical system. When applied to time series clustering, our algorithm can efficiently cluster multi-dimensional time series with temporal offsets and varying lengths, under the assumption that the time series are generated from linear dynamical systems. Evaluating our algorithm on both synthetic data and real electrocardiogram (ECG) signals, we see improvements in clustering quality over existing baselines.

Submitted 29 February, 2020; v1 submitted 2 August, 2019; originally announced August 2019.

arXiv:1707.00349 [pdf, ps, other]

A new algorithm for fast generalized DFTs

Authors: Chloe Ching-Yun Hsu, Chris Umans

Abstract: We give a new arithmetic algorithm to compute the generalized Discrete Fourier Transform (DFT) over finite groups $G$. The new algorithm uses $O(|G|^{ω/2 + o(1)})$ operations to compute the generalized DFT over finite groups of Lie type, including the linear, orthogonal, and symplectic families and their variants, as well as all finite simple groups of Lie type. Here $ω$ is the exponent of matrix multiplication, so the exponent $ω/2$ is optimal if $ω= 2$. Previously, "exponent one" algorithms were known for supersolvable groups and the symmetric and alternating groups. No exponent one algorithms were known (even under the assumption $ω= 2$) for families of linear groups of fixed dimension, and indeed the previous best-known algorithm for $SL_2(F_q)$ had exponent $4/3$ despite being the focus of significant effort. We unconditionally achieve exponent at most $1.19$ for this group, and exponent one if $ω= 2$. Our algorithm also yields an improved exponent for computing the generalized DFT over general finite groups $G$, which beats the longstanding previous best upper bound, for any $ω$. In particular, assuming $ω= 2$, we achieve exponent $\sqrt{2}$, while the previous best was $3/2$.

Submitted 30 March, 2018; v1 submitted 2 July, 2017; originally announced July 2017.

Showing 1–3 of 3 results for author: Hsu, C C