Program-Based Strategy Induction for Reinforcement Learning

Authors: Carlos G. Correa, Thomas L. Griffiths, Nathaniel D. Daw

Abstract: Typical models of learning assume incremental estimation of continuously-varying decision variables like expected rewards. However, this class of models fails to capture more idiosyncratic, discrete heuristics and strategies that people and animals appear to exhibit. Despite recent advances in strategy discovery using tools like recurrent networks that generalize the classic models, the resulting strategies are often onerous to interpret, making connections to cognition difficult to establish. We use Bayesian program induction to discover strategies implemented by programs, letting the simplicity of strategies trade off against their effectiveness. Focusing on bandit tasks, we find strategies that are difficult or unexpected with classical incremental learning, like asymmetric learning from rewarded and unrewarded trials, adaptive horizon-dependent random exploration, and discrete state switching.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2311.18644

doi 10.1016/j.cognition.2024.105990

Exploring the hierarchical structure of human plans via program generation

Authors: Carlos G. Correa, Sophia Sanborn, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths

Abstract: Human behavior is often assumed to be hierarchically structured, made up of abstract actions that can be decomposed into concrete actions. However, behavior is typically measured as a sequence of actions, which makes it difficult to infer its hierarchical structure. In this paper, we explore how people form hierarchically structured plans, using an experimental paradigm with observable hierarchical representations: participants create programs that produce sequences of actions in a language with explicit hierarchical structure. This task lets us test two well-established principles of human behavior: utility maximization (i.e. using fewer actions) and minimum description length (MDL; i.e. having a shorter program). We find that humans are sensitive to both metrics, but that both accounts fail to predict a qualitative feature of human-created programs, namely that people prefer programs with reuse over and above the predictions of MDL. We formalize this preference for reuse by extending the MDL account into a generative model over programs, modeling hierarchy choice as the induction of a grammar over actions. Our account can explain the preference for reuse and provides better predictions of human behavior, going beyond simple accounts of compressibility to highlight a principle that guides hierarchical planning.

Submitted 3 December, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

arXiv:2310.02221

Structurally guided task decomposition in spatial navigation tasks

Authors: Ruiqi He, Carlos G. Correa, Thomas L. Griffiths, Mark K. Ho

Abstract: How are people able to plan so efficiently despite limited cognitive resources? We aimed to answer this question by extending an existing model of human task decomposition that can explain a wide range of simple planning problems by adding structure information to the task to facilitate planning in more complex tasks. The extended model was then applied to a more complex planning domain of spatial navigation. Our results suggest that our framework can correctly predict the navigation strategies of the majority of the participants in an online experiment.

Submitted 3 October, 2023; originally announced October 2023.

arXiv:2211.03890

doi 10.1371/journal.pcbi.1011087

Humans decompose tasks by trading off utility and computational cost

Authors: Carlos G. Correa, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths

Abstract: Human behavior emerges from planning over elaborate decompositions of tasks into goals, subgoals, and low-level actions. How are these decompositions created and used? Here, we propose and evaluate a normative framework for task decomposition based on the simple idea that people decompose tasks to reduce the overall cost of planning while maintaining task performance. Analyzing 11,117 distinct graph-structured planning tasks, we find that our framework justifies several existing heuristics for task decomposition and makes predictions that can be distinguished from two alternative normative accounts. We report a behavioral study of task decomposition ($N=806$) that uses 30 randomly sampled graphs, a larger and more diverse set than that of any previous behavioral study on this topic. We find that human responses are more consistent with our framework for task decomposition than alternative normative accounts and are most consistent with a heuristic -- betweenness centrality -- that is justified by our approach. Taken together, our results provide new theoretical insight into the computational principles underlying the intelligent structuring of goal-directed behavior.

Submitted 7 November, 2022; originally announced November 2022.

arXiv:2205.11558

Using Natural Language and Program Abstractions to Instill Human Inductive Biases in Machines

Authors: Sreejan Kumar, Carlos G. Correa, Ishita Dasgupta, Raja Marjieh, Michael Y. Hu, Robert D. Hawkins, Nathaniel D. Daw, Jonathan D. Cohen, Karthik Narasimhan, Thomas L. Griffiths

Abstract: Strong inductive biases give humans the ability to quickly learn to perform a variety of tasks. Although meta-learning is a method to endow neural networks with useful inductive biases, agents trained by meta-learning may sometimes acquire very different strategies from humans. We show that co-training these agents on predicting representations from natural language task descriptions and programs induced to generate such tasks guides them toward more human-like inductive biases. Human-generated language descriptions and program induction models that add new learned primitives both contain abstract concepts that can compress description length. Co-training on these representations results in more human-like behavior in downstream meta-reinforcement learning agents than less abstract controls (synthetic language descriptions, program induction without learned primitives), suggesting that the abstraction supported by these representations is key.

Submitted 5 February, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

Comments: In Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), winner of Outstanding Paper Award

arXiv:2105.06948

doi 10.1038/s41586-022-04743-9

People construct simplified mental representations to plan

Authors: Mark K. Ho, David Abel, Carlos G. Correa, Michael L. Littman, Jonathan D. Cohen, Thomas L. Griffiths

Abstract: One of the most striking features of human cognition is the capacity to plan. Two aspects of human planning stand out: its efficiency and flexibility. Efficiency is especially impressive because plans must often be made in complex environments, and yet people successfully plan solutions to myriad everyday problems despite having limited cognitive resources. Standard accounts in psychology, economics, and artificial intelligence have suggested human planning succeeds because people have a complete representation of a task and then use heuristics to plan future actions in that representation. However, this approach generally assumes that task representations are fixed. Here, we propose that task representations can be controlled and that such control provides opportunities to quickly simplify problems and more easily reason about them. We propose a computational account of this simplification process and, in a series of pre-registered behavioral experiments, show that it is subject to online cognitive control and that people optimally balance the complexity of a task representation and its utility for planning and acting. These results demonstrate how strategically perceiving and conceiving problems facilitates the effective use of limited cognitive resources.

Submitted 26 November, 2022; v1 submitted 14 May, 2021; originally announced May 2021.

Comments: 56 pages, 5 main figures, 10 extended data figures, supplementary information is included in ancillary files

Journal ref: Nature, 606(7912), 129-136 (2022)

arXiv:2007.13862

Resource-rational Task Decomposition to Minimize Planning Costs

Authors: Carlos G. Correa, Mark K. Ho, Fred Callaway, Thomas L. Griffiths

Abstract: People often plan hierarchically. That is, rather than planning over a monolithic representation of a task, they decompose the task into simpler subtasks and then plan to accomplish those. Although much work explores how people decompose tasks, there is less analysis of why people decompose tasks in the way they do. Here, we address this question by formalizing task decomposition as a resource-rational representation problem. Specifically, we propose that people decompose tasks in a manner that facilitates efficient use of limited cognitive resources given the structure of the environment and their own planning algorithms. Using this model, we replicate several existing findings. Our account provides a normative explanation for how people identify subtasks as well as a framework for studying how people reason, plan, and act using resource-rational representations.

Submitted 27 July, 2020; originally announced July 2020.

Comments: The first two authors contributed equally. To appear in Proceedings of the 42nd Annual Conference of the Cognitive Science Society (CogSci 2020)

Showing 1–7 of 7 results for author: Correa, C G