Showing 1–3 of 3 results for author: Yağmurlu, Ö E

  1. arXiv:2506.06072  [pdf, ps, other]

    cs.RO cs.LG

    BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning

    Authors: Hongyi Zhou, Weiran Liao, Xi Huang, Yucheng Tang, Fabian Otto, Xiaogang Jia, Xinkai Jiang, Simon Hilber, Ge Li, Qian Wang, Ömer Erdinç Yağmurlu, Nils Blank, Moritz Reuss, Rudolf Lioutikov

    Abstract: We present the B-spline Encoded Action Sequence Tokenizer (BEAST), a novel action tokenizer that encodes action sequences into compact discrete or continuous tokens using B-splines. In contrast to existing action tokenizers based on vector quantization or byte pair encoding, BEAST requires no separate tokenizer training and consistently produces tokens of uniform length, enabling fast action seque…

    Submitted 10 June, 2025; v1 submitted 6 June, 2025; originally announced June 2025.

  2. arXiv:2410.17772  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Scaling Robot Policy Learning via Zero-Shot Labeling with Foundation Models

    Authors: Nils Blank, Moritz Reuss, Marcel Rühle, Ömer Erdinç Yağmurlu, Fabian Wenzel, Oier Mees, Rudolf Lioutikov

    Abstract: A central challenge towards developing robots that can relate human language to their perception and actions is the scarcity of natural language annotations in diverse robot datasets. Moreover, robot policies that follow natural language instructions are typically trained on either templated language or expensive human-labeled instructions, hindering their scalability. To this end, we introduce NI…

    Submitted 26 October, 2024; v1 submitted 23 October, 2024; originally announced October 2024.

    Comments: Project Website at https://robottasklabeling.github.io/

  3. arXiv:2407.05996  [pdf, other]

    cs.RO

    Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals

    Authors: Moritz Reuss, Ömer Erdinç Yağmurlu, Fabian Wenzel, Rudolf Lioutikov

    Abstract: This work introduces the Multimodal Diffusion Transformer (MDT), a novel diffusion policy framework that excels at learning versatile behavior from multimodal goal specifications with few language annotations. MDT leverages a diffusion-based multimodal transformer backbone and two self-supervised auxiliary objectives to master long-horizon manipulation tasks based on multimodal goals. The vast ma…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: RSS 2024