Showing 1–2 of 2 results for author: Pacek, M

  1. arXiv:2307.03170

    cs.CL cs.AI cs.LG

    Focused Transformer: Contrastive Training for Context Scaling

    Authors: Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś

    Abstract: Large language models have an exceptional capability to incorporate new information in a contextual manner. However, the full potential of such an approach is often restrained due to a limitation in the effective context length. One solution to this issue is to endow an attention layer with access to an external memory, which comprises of (key, value) pairs. Yet, as the number of documents increas…

    Submitted 30 November, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Accepted at 37th Conference on Neural Information Processing Systems (NeurIPS 2023). 28 pages, 10 figures, 11 tables

  2. Planning and Learning Using Adaptive Entropy Tree Search

    Authors: Piotr Kozakowski, Mikołaj Pacek, Piotr Miłoś

    Abstract: Recent breakthroughs in Artificial Intelligence have shown that the combination of tree-based planning with deep learning can lead to superior performance. We present Adaptive Entropy Tree Search (ANTS) - a novel algorithm combining planning and learning in the maximum entropy paradigm. Through a comprehensive suite of experiments on the Atari benchmark we show that ANTS significantly outperforms…

    Submitted 14 March, 2023; v1 submitted 12 February, 2021; originally announced February 2021.