Showing 1–7 of 7 results for author: Alakuijala, M

  1. arXiv:2505.13180  [pdf, ps, other]

    cs.AI

    ViPlan: A Benchmark for Visual Planning with Symbolic Predicates and Vision-Language Models

    Authors: Matteo Merler, Nicola Dainese, Minttu Alakuijala, Giovanni Bonetta, Pietro Ferrazzi, Yu Tian, Bernardo Magnini, Pekka Marttinen

    Abstract: Integrating Large Language Models with symbolic planners is a promising direction for obtaining verifiable and grounded plans compared to planning in natural language, with recent works extending this idea to visual domains using Vision-Language Models (VLMs). However, rigorous comparison between VLM-grounded symbolic approaches and methods that plan directly with a VLM has been hindered by a lack…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: 9 pages, 5 figures and 1 table in the main text; 43 pages, 9 figures and 16 tables including supplementary material

  2. arXiv:2505.02576  [pdf, other]

    cs.AI cs.LG

    Recursive Decomposition with Dependencies for Generic Divide-and-Conquer Reasoning

    Authors: Sergio Hernández-Gutiérrez, Minttu Alakuijala, Alexander V. Nikitin, Pekka Marttinen

    Abstract: Reasoning tasks are crucial in many domains, especially in science and engineering. Although large language models (LLMs) have made progress in reasoning tasks using techniques such as chain-of-thought and least-to-most prompting, these approaches still do not effectively scale to complex problems in either their performance or execution time. Moreover, they often require additional supervision fo…

    Submitted 5 May, 2025; originally announced May 2025.

  3. arXiv:2502.01562  [pdf, other]

    cs.LG

    Memento No More: Coaching AI Agents to Master Multiple Tasks via Hints Internalization

    Authors: Minttu Alakuijala, Ya Gao, Georgy Ananov, Samuel Kaski, Pekka Marttinen, Alexander Ilin, Harri Valpola

    Abstract: As the general capabilities of artificial intelligence (AI) agents continue to evolve, their ability to learn to master multiple complex tasks through experience remains a key challenge. Current LLM agents, particularly those based on proprietary language models, typically rely on prompts to incorporate knowledge about the target tasks. This approach does not allow the agent to internalize this in…

    Submitted 28 May, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

  4. arXiv:2405.19988  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV cs.LG

    Video-Language Critic: Transferable Reward Functions for Language-Conditioned Robotics

    Authors: Minttu Alakuijala, Reginald McLean, Isaac Woungang, Nariman Farsad, Samuel Kaski, Pekka Marttinen, Kai Yuan

    Abstract: Natural language is often the easiest and most convenient modality for humans to specify tasks for robots. However, learning to ground language to behavior typically requires impractical amounts of diverse, language-annotated demonstrations collected on each target robot. In this work, we aim to separate the problem of what to accomplish from how to accomplish it, as the former can benefit from su…

    Submitted 7 November, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: 10 pages in the main text, 16 pages including references and supplementary materials. 4 figures and 3 tables in the main text, 1 table in supplementary materials

  5. arXiv:2405.15383  [pdf, other]

    cs.AI

    Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search

    Authors: Nicola Dainese, Matteo Merler, Minttu Alakuijala, Pekka Marttinen

    Abstract: In this work we consider Code World Models, world models generated by a Large Language Model (LLM) in the form of Python code for model-based Reinforcement Learning (RL). Calling code instead of LLMs for planning has potential to be more precise, reliable, interpretable, and extremely efficient. However, writing appropriate Code World Models requires the ability to understand complex instructions,…

    Submitted 30 October, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted at NeurIPS 2024, Main Track. 11 pages in main text, 40 pages including references and supplementary materials. 2 figures and 3 tables in the main text, 9 figures and 12 tables when including the supplementary materials. Website at https://sites.google.com/view/code-world-models/home

  6. arXiv:2211.09019  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Learning Reward Functions for Robotic Manipulation by Observing Humans

    Authors: Minttu Alakuijala, Gabriel Dulac-Arnold, Julien Mairal, Jean Ponce, Cordelia Schmid

    Abstract: Observing a human demonstrator manipulate objects provides a rich, scalable and inexpensive source of data for learning robotic policies. However, transferring skills from human videos to a robotic manipulator poses several challenges, not least a difference in action and observation spaces. In this work, we use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-…

    Submitted 7 March, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

  7. arXiv:2106.08050  [pdf, other]

    cs.LG

    Residual Reinforcement Learning from Demonstrations

    Authors: Minttu Alakuijala, Gabriel Dulac-Arnold, Julien Mairal, Jean Ponce, Cordelia Schmid

    Abstract: Residual reinforcement learning (RL) has been proposed as a way to solve challenging robotic tasks by adapting control actions from a conventional feedback controller to maximize a reward signal. We extend the residual formulation to learn from visual inputs and sparse rewards using demonstrations. Learning from images, proprioceptive inputs and a sparse task-completion reward relaxes the requirem…

    Submitted 15 June, 2021; originally announced June 2021.