Showing 1–1 of 1 results for author: Jucys, K

  1. arXiv:2407.12161

    cs.AI

    Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent

    Authors: Karolis Jucys, George Adamopoulos, Mehrab Hamidi, Stephanie Milani, Mohammad Reza Samsami, Artem Zholus, Sonia Joseph, Blake Richards, Irina Rish, Özgür Şimşek

    Abstract: Understanding the mechanisms behind decisions taken by large foundation models in sequential decision making tasks is critical to ensuring that such systems operate transparently and safely. In this work, we perform exploratory analysis on the Video PreTraining (VPT) Minecraft playing agent, one of the largest open-source vision-based agents. We aim to illuminate its reasoning mechanisms by applyi…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Mechanistic Interpretability Workshop at ICML 2024