Search | arXiv e-print repository

Provable Hierarchy-Based Meta-Reinforcement Learning

Authors: Kurtland Chua, Qi Lei, Jason D. Lee

Abstract: Hierarchical reinforcement learning (HRL) has seen widespread interest as an approach to tractable learning of complex modular behaviors. However, existing work either assumes access to expert-constructed hierarchies, or uses hierarchy-learning heuristics with no provable guarantees. To address this gap, we analyze HRL in the meta-RL setting, where a learner learns latent hierarchical structure during meta-training for use in a downstream task. We consider a tabular setting where natural hierarchical structure is embedded in the transition dynamics. Analogous to supervised meta-learning theory, we provide "diversity conditions" which, together with a tractable optimism-based algorithm, guarantee sample-efficient recovery of this natural hierarchy. Furthermore, we provide regret bounds on a learner using the recovered hierarchy to solve a meta-test task. Our bounds incorporate common notions in HRL literature such as temporal and state/action abstractions, suggesting that our setting and analysis capture important features of HRL in practice.

Submitted 18 October, 2021; originally announced October 2021.

arXiv:2105.02221

How Fine-Tuning Allows for Effective Meta-Learning

Authors: Kurtland Chua, Qi Lei, Jason D. Lee

Abstract: Representation learning has been widely studied in the context of meta-learning, enabling rapid learning of new tasks through shared representations. Recent works such as MAML have explored using fine-tuning-based metrics, which measure the ease by which fine-tuning can achieve good performance, as proxies for obtaining representations. We present a theoretical framework for analyzing representations derived from a MAML-like algorithm, assuming the available tasks use approximately the same underlying representation. We then provide risk bounds on the best predictor found by fine-tuning via gradient descent, demonstrating that the algorithm can provably leverage the shared structure. The upper bound applies to general function classes, which we demonstrate by instantiating the guarantees of our framework in the logistic regression and neural network settings. In contrast, we establish the existence of settings where any algorithm, using a representation trained with no consideration for task-specific fine-tuning, performs as well as a learner with no access to source tasks in the worst case. This separation result underscores the benefit of fine-tuning-based methods, such as MAML, over methods with "frozen representation" objectives in few-shot learning.

Submitted 5 May, 2021; originally announced May 2021.

arXiv:2006.08918

On parametric tests of relativity with false degrees of freedom

Authors: Alvin J. K. Chua, Michele Vallisneri

Abstract: General relativity can be tested by comparing the binary-inspiral signals found in LIGO–Virgo data against waveform models that are augmented with artificial degrees of freedom. This approach suffers from a number of logical and practical pitfalls. 1) It is difficult to ascribe meaning to the stringency of the resultant constraints. 2) It is doubtful that the Bayesian model comparison of relativity against these artificial models can offer actual validation for the former. 3) It is unknown to what extent these tests might detect alternative theories of gravity for which there are no computed waveforms; conversely, when waveforms are available, tests that employ them will be superior.

Submitted 16 June, 2020; originally announced June 2020.

Comments: 4 pages, 2 figures

arXiv:1909.05966

doi 10.1103/PhysRevLett.124.041102

Learning Bayesian posteriors with neural networks for gravitational-wave inference

Authors: Alvin J. K. Chua, Michele Vallisneri

Abstract: We seek to achieve the Holy Grail of Bayesian inference for gravitational-wave astronomy: using deep-learning techniques to instantly produce the posterior $p(θ|D)$ for the source parameters $θ$, given the detector data $D$. To do so, we train a deep neural network to take as input a signal + noise data set (drawn from the astrophysical source-parameter prior and the sampling distribution of detector noise), and to output a parametrized approximation of the corresponding posterior. We rely on a compact representation of the data based on reduced-order modeling, which we generate efficiently using a separate neural-network waveform interpolant [A. J. K. Chua, C. R. Galley & M. Vallisneri, Phys. Rev. Lett. 122, 211101 (2019)]. Our scheme has broad relevance to gravitational-wave applications such as low-latency parameter estimation and characterizing the science returns of future experiments. Source code and trained networks are available online at https://github.com/vallis/truebayes.

Submitted 29 January, 2020; v1 submitted 12 September, 2019; originally announced September 2019.

Comments: (Superior-to-)published version; source code and trained networks available at https://github.com/vallis/truebayes

Journal ref: Phys. Rev. Lett. 124, 041102 (2020)

arXiv:1811.05494

doi 10.1007/s11222-019-09907-8

Sampling from manifold-restricted distributions using tangent bundle projections

Authors: Alvin J. K. Chua

Abstract: A common problem in Bayesian inference is the sampling of target probability distributions at sufficient resolution and accuracy to estimate the probability density, and to compute credible regions. Often by construction, many target distributions can be expressed as some higher-dimensional closed-form distribution with parametrically constrained variables, i.e., one that is restricted to a smooth submanifold of Euclidean space. I propose a derivative-based importance sampling framework for such distributions. A base set of $n$ samples from the target distribution is used to map out the tangent bundle of the manifold, and to seed $nm$ additional points that are projected onto the tangent bundle and weighted appropriately. The method essentially acts as an upsampling complement to any standard algorithm. It is designed for the efficient production of approximate high-resolution histograms from manifold-restricted Gaussian distributions, and can provide large computational savings when sampling directly from the target distribution is expensive.

Submitted 16 October, 2019; v1 submitted 13 November, 2018; originally announced November 2018.

Comments: Published version; Python implementation available at https://github.com/alvincjk/sampling-manifold-restricted-gaussians

Journal ref: Stat. Comput. 30, 587 (2020)

arXiv:1811.05491

doi 10.1103/PhysRevLett.122.211101

Reduced-order modeling with artificial neurons for gravitational-wave inference

Authors: Alvin J. K. Chua, Chad R. Galley, Michele Vallisneri

Abstract: Gravitational-wave data analysis is rapidly absorbing techniques from deep learning, with a focus on convolutional networks and related methods that treat noisy time series as images. We pursue an alternative approach, in which waveforms are first represented as weighted sums over reduced bases (reduced-order modeling); we then train artificial neural networks to map gravitational-wave source parameters into basis coefficients. Statistical inference proceeds directly in coefficient space, where it is theoretically straightforward and computationally efficient. The neural networks also provide analytic waveform derivatives, which are useful for gradient-based sampling schemes. We demonstrate fast and accurate coefficient interpolation for the case of a four-dimensional binary-inspiral waveform family, and discuss promising applications of our framework in parameter estimation.

Submitted 30 May, 2019; v1 submitted 13 November, 2018; originally announced November 2018.

Comments: Published version

Journal ref: Phys. Rev. Lett. 122, 211101 (2019)

arXiv:1805.12114

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Authors: Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine

Abstract: Model-based reinforcement learning (RL) algorithms can attain excellent sample efficiency, but often lag behind the best model-free algorithms in terms of asymptotic performance. This is especially true with high-capacity parametric function approximators, such as deep networks. In this paper, we study how to bridge this gap, by employing uncertainty-aware dynamics models. We propose a new algorithm called probabilistic ensembles with trajectory sampling (PETS) that combines uncertainty-aware deep network dynamics models with sampling-based uncertainty propagation. Our comparison to state-of-the-art model-based and model-free deep RL algorithms shows that our approach matches the asymptotic performance of model-free algorithms on several challenging benchmark tasks, while requiring significantly fewer samples (e.g., 8 and 125 times fewer samples than Soft Actor Critic and Proximal Policy Optimization respectively on the half-cheetah task).

Submitted 2 November, 2018; v1 submitted 30 May, 2018; originally announced May 2018.

Comments: NIPS 2018, video and code available at https://sites.google.com/view/drl-in-a-handful-of-trials/

arXiv:1604.01250

doi 10.1098/rsos.160125

Fast methods for training Gaussian processes on large data sets

Authors: Christopher J. Moore, Alvin J. K. Chua, Christopher P. L. Berry, Jonathan R. Gair

Abstract: Gaussian process regression (GPR) is a non-parametric Bayesian technique for interpolating or fitting data. The main barrier to further uptake of this powerful tool rests in the computational costs associated with the matrices which arise when dealing with large data sets. Here, we derive some simple results which we have found useful for speeding up the learning stage in the GPR algorithm, and especially for performing Bayesian model comparison between different covariance functions. We apply our techniques to both synthetic and real data and quantify the speed-up relative to using nested sampling to numerically evaluate model evidences.

Submitted 13 May, 2016; v1 submitted 5 April, 2016; originally announced April 2016.

Comments: Fixed missing references

Journal ref: R. Soc. Open Sci. 3, 160125 (2016)

arXiv:1407.1576

A Statistical Modelling and Analysis of PHEVs' Power Demand in Smart Grids

Authors: Farshad Rassaei, Wee-Seng Soh, Kee-Chaing Chua

Abstract: Electric vehicles (EVs) and particularly plug-in hybrid electric vehicles (PHEVs) are foreseen to become popular in the near future. Not only are they much more environmentally friendly than conventional internal combustion engine (ICE) vehicles, their fuel can also be catered from diverse energy sources and resources. However, they add significant load on the power grid as they become widespread. The characteristics of this extra load follow the patterns of people's driving behaviours. In particular, random parameters such as arrival time and driven distance of the vehicles determine their expected demand profile from the power grid. In this paper, we first present a model for uncoordinated charging power demand of PHEVs based on a stochastic process and accordingly we characterize the EV's expected daily power demand profile. Next, we adopt different distributions for the EV's charging time following some available empirical research data in the literature. Simulation results show that the EV's expected daily power demand profiles obtained under the uniform, Gaussian with positive support and Rician distributions for charging time are identical when the first and second order statistics of these distributions are the same. This gives us useful insights into the long-term planning for upgrading power systems' infrastructure to accommodate PHEVs. In addition, the results from this modelling can be incorporated into designing demand response (DR) algorithms and evaluating the available DR techniques more accurately.

Submitted 7 July, 2014; originally announced July 2014.

Comments: 6 pages, 10 figures

Showing 1–9 of 9 results for author: Chua, K