
Showing 1–42 of 42 results for author: Ikeuchi, K

  1. arXiv:2505.00871 [pdf, other]

    cs.RO cs.AI

    IK Seed Generator for Dual-Arm Human-like Physicality Robot with Mobile Base

    Authors: Jun Takamatsu, Atsushi Kanehira, Kazuhiro Sasabuchi, Naoki Wake, Katsushi Ikeuchi

    Abstract: Robots are strongly expected as a means of replacing human tasks. If a robot has a human-like physicality, the possibility of replacing human tasks increases. In the case of household service robots, it is desirable for them to be of a human-like size so that they do not become excessively large in order to coexist with humans in their operating environment. However, robots with size limitations t…

    Submitted 1 May, 2025; originally announced May 2025.

    Comments: 8 pages, 12 figures, 4 tables

  2. arXiv:2504.18084 [pdf, other]

    cs.RO

    RL-Driven Data Generation for Robust Vision-Based Dexterous Grasping

    Authors: Atsushi Kanehira, Naoki Wake, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: This work presents reinforcement learning (RL)-driven data augmentation to improve the generalization of vision-action (VA) models for dexterous grasping. While real-to-sim-to-real frameworks, where a few real demonstrations seed large-scale simulated data, have proven effective for VA models, applying them to dexterous settings remains challenging: obtaining stable multi-finger contacts is nontri…

    Submitted 25 April, 2025; originally announced April 2025.

  3. arXiv:2504.04939 [pdf, other]

    cs.RO cs.AI cs.CV

    A Taxonomy of Self-Handover

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: Self-handover, transferring an object between one's own hands, is a common but understudied bimanual action. While it facilitates seamless transitions in complex tasks, the strategies underlying its execution remain largely unexplored. Here, we introduce the first systematic taxonomy of self-handover, derived from manual annotation of over 12 hours of cooking activity performed by 21 participants.…

    Submitted 8 April, 2025; v1 submitted 7 April, 2025; originally announced April 2025.

    Comments: 8 pages, 8 figures, 1 table, Last updated on April 7th, 2025

  4. arXiv:2504.01252 [pdf, other]

    cs.RO cs.AI

    Plan-and-Act using Large Language Models for Interactive Agreement

    Authors: Kazuhiro Sasabuchi, Naoki Wake, Atsushi Kanehira, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: Recent large language models (LLMs) are capable of planning robot actions. In this paper, we explore how LLMs can be used for planning actions with tasks involving situational human-robot interaction (HRI). A key problem of applying LLMs in situational HRI is balancing between "respecting the current human's activity" and "prioritizing the robot's task," as well as understanding the timing of when…

    Submitted 1 April, 2025; originally announced April 2025.

  5. arXiv:2503.15491 [pdf, other]

    cs.HC cs.CL cs.LG cs.RO

    Agreeing to Interact in Human-Robot Interaction using Large Language Models and Vision Language Models

    Authors: Kazuhiro Sasabuchi, Naoki Wake, Atsushi Kanehira, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: In human-robot interaction (HRI), the beginning of an interaction is often complex. Whether the robot should communicate with the human is dependent on several situational factors (e.g., the current human's activity, urgency of the interaction, etc.). We test whether large language models (LLM) and vision language models (VLM) can provide solutions to this problem. We compare four different system…

    Submitted 7 January, 2025; originally announced March 2025.

  6. arXiv:2501.03968 [pdf, other]

    cs.RO cs.AI cs.CV cs.HC

    VLM-driven Behavior Tree for Context-aware Task Planning

    Authors: Naoki Wake, Atsushi Kanehira, Jun Takamatsu, Kazuhiro Sasabuchi, Katsushi Ikeuchi

    Abstract: The use of Large Language Models (LLMs) for generating Behavior Trees (BTs) has recently gained attention in the robotics community, yet remains in its early stages of development. In this paper, we propose a novel framework that leverages Vision-Language Models (VLMs) to interactively generate and edit BTs that address visual conditions, enabling context-aware robot operations in visually complex…

    Submitted 10 January, 2025; v1 submitted 7 January, 2025; originally announced January 2025.

    Comments: 10 pages, 11 figures, 5 tables. Last updated on January 9th, 2024

  7. arXiv:2412.11337 [pdf, other]

    cs.RO cs.AI cs.CV

    Modality-Driven Design for Multi-Step Dexterous Manipulation: Insights from Neuroscience

    Authors: Naoki Wake, Atsushi Kanehira, Daichi Saito, Jun Takamatsu, Kazuhiro Sasabuchi, Hideki Koike, Katsushi Ikeuchi

    Abstract: Multi-step dexterous manipulation is a fundamental skill in household scenarios, yet remains an underexplored area in robotics. This paper proposes a modular approach, where each step of the manipulation process is addressed with dedicated policies based on effective modality input, rather than relying on a single end-to-end model. To demonstrate this, a dexterous robotic hand performs a manipulat…

    Submitted 15 December, 2024; originally announced December 2024.

    Comments: 8 pages, 5 figures, 2 tables. Last updated on December 14th, 2024

  8. Open-Vocabulary Action Localization with Iterative Visual Prompting

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: Video action localization aims to find the timings of specific actions from a long video. Although existing learning-based approaches have been successful, they require annotating videos, which comes with a considerable labor cost. This paper proposes a training-free, open-vocabulary approach based on emerging off-the-shelf vision-language models (VLMs). The challenge stems from the fact that VLMs…

    Submitted 7 April, 2025; v1 submitted 30 August, 2024; originally announced August 2024.

    Comments: 9 pages, 5 figures, 6 tables. Published in IEEE Access. Last updated on April 7th, 2025

  9. arXiv:2407.11436 [pdf, other]

    cs.RO

    APriCoT: Action Primitives based on Contact-state Transition for In-Hand Tool Manipulation

    Authors: Daichi Saito, Atsushi Kanehira, Kazuhiro Sasabuchi, Naoki Wake, Jun Takamatsu, Hideki Koike, Katsushi Ikeuchi

    Abstract: In-hand tool manipulation is an operation that not only manipulates a tool within the hand (i.e., in-hand manipulation) but also achieves a grasp suitable for a task after the manipulation. This study aims to achieve an in-hand tool manipulation skill through deep reinforcement learning. The difficulty of learning the skill arises because this manipulation requires (A) exploring long-term contact-…

    Submitted 16 July, 2024; originally announced July 2024.

  10. arXiv:2403.02316 [pdf, other]

    cs.RO

    Designing Library of Skill-Agents for Hardware-Level Reusability

    Authors: Jun Takamatsu, Daichi Saito, Katsushi Ikeuchi, Atsushi Kanehira, Kazuhiro Sasabuchi, Naoki Wake

    Abstract: To use new robot hardware in a new environment, it is necessary to develop a control program tailored to that specific robot in that environment. Considering the reusability of software among robots is crucial to minimize the effort involved in this process and maximize software reuse across different robots in different environments. This paper proposes a method to remedy this process by consider…

    Submitted 20 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  11. arXiv:2403.00833 [pdf, other]

    cs.AI

    Position Paper: Agent AI Towards a Holistic Intelligence

    Authors: Qiuyuan Huang, Naoki Wake, Bidipta Sarkar, Zane Durante, Ran Gong, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Noboru Kuno, Ade Famoti, Ashley Llorens, John Langford, Hoi Vo, Li Fei-Fei, Katsu Ikeuchi, Jianfeng Gao

    Abstract: Recent advancements in large foundation models have remarkably enhanced our understanding of sensory information in open-world environments. In leveraging the power of foundation models, it is crucial for AI research to pivot away from excessive reductionism and toward an emphasis on systems that function as cohesive wholes. Specifically, we emphasize developing Agent AI -- an embodied system that…

    Submitted 28 February, 2024; originally announced March 2024.

    Comments: 22 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2401.03568

  12. arXiv:2402.05929 [pdf, other]

    cs.AI cs.LG cs.RO

    An Interactive Agent Foundation Model

    Authors: Zane Durante, Bidipta Sarkar, Ran Gong, Rohan Taori, Yusuke Noda, Paul Tang, Ehsan Adeli, Shrinidhi Kowshika Lakshmikanth, Kevin Schulman, Arnold Milstein, Demetri Terzopoulos, Ade Famoti, Noboru Kuno, Ashley Llorens, Hoi Vo, Katsu Ikeuchi, Li Fei-Fei, Jianfeng Gao, Naoki Wake, Qiuyuan Huang

    Abstract: The development of artificial intelligence systems is transitioning from creating static, task-specific models to dynamic, agent-based systems capable of performing well in a wide range of applications. We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents across a wide range of domains, datasets, and tasks. Our training paradi…

    Submitted 17 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  13. arXiv:2401.03568 [pdf, other]

    cs.AI cs.HC cs.LG

    Agent AI: Surveying the Horizons of Multimodal Interaction

    Authors: Zane Durante, Qiuyuan Huang, Naoki Wake, Ran Gong, Jae Sung Park, Bidipta Sarkar, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Yejin Choi, Katsushi Ikeuchi, Hoi Vo, Li Fei-Fei, Jianfeng Gao

    Abstract: Multi-modal AI systems will likely become a ubiquitous presence in our everyday lives. A promising approach to making these systems more interactive is to embody them as agents within physical and virtual environments. At present, systems leverage existing foundation models as the basic building blocks for the creation of embodied agents. Embedding agents within such environments facilitates the a…

    Submitted 25 January, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  14. arXiv:2311.12015 [pdf, other]

    cs.RO cs.CL cs.CV

    GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: We introduce a pipeline that enhances a general-purpose Vision Language Model, GPT-4V(ision), to facilitate one-shot visual teaching for robotic manipulation. This system analyzes videos of humans performing tasks and outputs executable robot programs that incorporate insights into affordances. The process begins with GPT-4V analyzing the videos to obtain textual explanations of environmental and…

    Submitted 26 September, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: 8 pages, 10 figures, 3 tables. Published in IEEE Robotics and Automation Letters (RA-L) (in press). Last updated on September 26th, 2024

  15. arXiv:2311.11007 [pdf, other]

    cs.RO

    Constraint-aware Policy for Compliant Manipulation

    Authors: Daichi Saito, Kazuhiro Sasabuchi, Naoki Wake, Atsushi Kanehira, Jun Takamatsu, Hideki Koike, Katsushi Ikeuchi

    Abstract: Robot manipulation in a physically-constrained environment requires compliant manipulation. Compliant manipulation is a manipulation skill to adjust hand motion based on the force imposed by the environment. Recently, reinforcement learning (RL) has been applied to solve household operations involving compliant manipulation. However, previous RL methods have primarily focused on designing a policy…

    Submitted 18 November, 2023; originally announced November 2023.

  16. arXiv:2310.11753 [pdf, other]

    cs.RO cs.CL

    Bias in Emotion Recognition with ChatGPT

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: This technical report explores the ability of ChatGPT in recognizing emotions from text, which can be the basis of various applications like interactive chatbots, data annotation, and mental health analysis. While prior research has shown ChatGPT's basic ability in sentiment analysis, its performance in more nuanced emotion recognition is not yet explored. Here, we conducted experiments to evaluat…

    Submitted 4 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: 5 pages, 4 figures, 6 tables

  17. arXiv:2309.16162 [pdf, other]

    cs.HC

    ACT2G: Attention-based Contrastive Learning for Text-to-Gesture Generation

    Authors: Hitoshi Teshima, Naoki Wake, Diego Thomas, Yuta Nakashima, Hiroshi Kawasaki, Katsushi Ikeuchi

    Abstract: The recent increase in remote work, online meetings, and tele-operation tasks has made people realize that gestures for avatars and communication robots are more important than previously thought. Gesture is one of the key factors in achieving smooth and natural communication between humans and AI systems and has been intensively researched. Current gesture generation methods are mostly based on deep neural network using…

    Submitted 28 September, 2023; originally announced September 2023.

  18. arXiv:2306.01741 [pdf, other]

    cs.RO cs.CL

    GPT Models Meet Robotic Applications: Co-Speech Gesturing Chat System

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: This technical paper introduces a chatting robot system that utilizes recent advancements in large-scale language models (LLMs) such as GPT-3 and ChatGPT. The system is integrated with a co-speech gesture generation system, which selects appropriate gestures based on the conceptual meaning of speech. Our motivation is to explore ways of utilizing the recent progress in LLMs for practical robotic a…

    Submitted 10 May, 2023; originally announced June 2023.

  19. arXiv:2304.09966 [pdf, other]

    cs.RO

    Applying Learning-from-observation to household service robots: three common-sense formulation

    Authors: Katsushi Ikeuchi, Jun Takamatsu, Kazuhiro Sasabuchi, Naoki Wake, Atsushi Kanehira

    Abstract: Utilizing a robot in a new application requires the robot to be programmed each time. To reduce such programming effort, we have been developing "Learning-from-observation (LfO)" that automatically generates robot programs by observing human demonstrations. One of the main issues with introducing this LfO system into the domain of household tasks is the cluttered environments, which cause d…

    Submitted 19 April, 2023; originally announced April 2023.

  20. ChatGPT Empowered Long-Step Robot Control in Various Environments: A Case Application

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: This paper demonstrates how OpenAI's ChatGPT can be used in a few-shot setting to convert natural language instructions into a sequence of executable robot actions. The paper proposes easy-to-customize input prompts for ChatGPT that meet common requirements in practical applications, such as easy integration with robot execution systems and applicability to various environments while minimizing th…

    Submitted 29 August, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: 21 figures, 7 tables. Published in IEEE Access (in press). Last updated August 29th, 2023

  21. arXiv:2301.01382 [pdf, other]

    cs.RO

    Task-sequencing Simulator: Integrated Machine Learning to Execution Simulation for Robot Manipulation

    Authors: Kazuhiro Sasabuchi, Daichi Saito, Atsushi Kanehira, Naoki Wake, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: A task-sequencing simulator in robotics manipulation to integrate simulation-for-learning and simulation-for-execution is introduced. Unlike existing machine-learning simulation where a non-decomposed simulation is used to simulate a training scenario, the task-sequencing simulator runs a composed simulation using building blocks. This way, the simulation-for-learning is structured similarly to a…

    Submitted 3 January, 2023; originally announced January 2023.

    Comments: 7 pages, 6 figures

  22. Interactive Task Encoding System for Learning-from-Observation

    Authors: Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: We present the Interactive Task Encoding System (ITES) for teaching robots to perform manipulative tasks. ITES is designed as an input system for the Learning-from-Observation (LfO) framework, which enables household robots to be programmed using few-shot human demonstrations without the need for coding. In contrast to previous LfO systems that rely solely on visual demonstrations, ITES leverages…

    Submitted 28 April, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: 6 pages, 9 figures. Submitted to and accepted by 2023 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM). Last updated April 28th, 2023

  23. arXiv:2212.09242 [pdf, other]

    cs.RO

    Learning-from-Observation System Considering Hardware-Level Reusability

    Authors: Jun Takamatsu, Kazuhiro Sasabuchi, Naoki Wake, Atsushi Kanehira, Katsushi Ikeuchi

    Abstract: Robot developers develop various types of robots for satisfying users' various demands. Users' demands are related to their backgrounds and robots suitable for users may vary. If a certain developer would offer a robot that is different from the usual to a user, the robot-specific software has to be changed. On the other hand, robot-software developers would like to reuse their developed software…

    Submitted 18 December, 2022; originally announced December 2022.

    Comments: 5 pages, 4 figures

  24. arXiv:2210.06790 [pdf, other]

    cs.RO cs.MM

    Deep Gesture Generation for Social Robots Using Type-Specific Libraries

    Authors: Hitoshi Teshima, Naoki Wake, Diego Thomas, Yuta Nakashima, Hiroshi Kawasaki, Katsushi Ikeuchi

    Abstract: Body language such as conversational gesture is a powerful way to ease communication. Conversational gestures do not only make a speech more lively but also contain semantic meaning that helps to stress important information in the discussion. In the field of robotics, giving conversational agents (humanoid robots or virtual avatars) the ability to properly use gestures is critical, yet remain a t…

    Submitted 13 October, 2022; originally announced October 2022.

  25. arXiv:2203.15290 [pdf, other]

    cs.RO

    Design strategies for controlling neuron-connected robots using reinforcement learning

    Authors: Haruto Sawada, Naoki Wake, Kazuhiro Sasabuchi, Jun Takamatsu, Hirokazu Takahashi, Katsushi Ikeuchi

    Abstract: Despite the growing interest in robot control utilizing the computation of biological neurons, context-dependent behavior by neuron-connected robots remains a challenge. Context-dependent behavior here is defined as behavior that is not the result of a simple sensory-motor coupling, but rather based on an understanding of the task goal. This paper proposes design principles for training neuron-con…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: Last updated March 29th, 2022

  26. arXiv:2203.00733 [pdf, other]

    cs.RO

    Task-grasping from human demonstration

    Authors: Daichi Saito, Kazuhiro Sasabuchi, Naoki Wake, Jun Takamatsu, Hideki Koike, Katsushi Ikeuchi

    Abstract: A challenge in robot grasping is to achieve task-grasping which is to select a grasp that is advantageous to the success of tasks before and after grasps. One of the frameworks to address this difficulty is Learning-from-Observation (LfO), which obtains various hints from human demonstrations. This paper solves three issues in the grasping skills in the LfO framework: 1) how to functionally mimic…

    Submitted 1 March, 2022; originally announced March 2022.

    Comments: 7 pages, 8 figures

  27. arXiv:2107.03000 [pdf, other]

    cs.CV

    PoseRN: A 2D pose refinement network for bias-free multi-view 3D human pose estimation

    Authors: Akihiko Sayo, Diego Thomas, Hiroshi Kawasaki, Yuta Nakashima, Katsushi Ikeuchi

    Abstract: We propose a new 2D pose refinement network that learns to predict the human bias in the estimated 2D pose. There are biases in 2D pose estimations that are due to differences between annotations of 2D joint locations based on annotators' perception and those defined by motion capture (MoCap) systems. These biases are crafted into publicly available 2D pose datasets and cannot be removed with exis…

    Submitted 6 July, 2021; originally announced July 2021.

  28. arXiv:2103.02201 [pdf, other]

    cs.RO

    Semantic constraints to represent common sense required in household actions for multi-modal Learning-from-observation robot

    Authors: Katsushi Ikeuchi, Naoki Wake, Riku Arakawa, Kazuhiro Sasabuchi, Jun Takamatsu

    Abstract: The paradigm of learning-from-observation (LfO) enables a robot to learn how to perform actions by observing human-demonstrated actions. Previous research in LfO has mainly focused on the industrial domain, which only consists of the observable physical constraints between a manipulating tool and the robot's working environment. In order to extend this paradigm to the household domain which consist…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: 18 pages, 31 figures

  29. Text-driven object affordance for guiding grasp-type recognition in multimodal robot teaching

    Authors: Naoki Wake, Daichi Saito, Kazuhiro Sasabuchi, Hideki Koike, Katsushi Ikeuchi

    Abstract: This study investigates how text-driven object affordance, which provides prior knowledge about grasp types for each object, affects image-based grasp-type recognition in robot teaching. The researchers created labeled datasets of first-person hand images to examine the impact of object affordance on recognition performance. They evaluated scenarios with real and illusory objects, considering mixe…

    Submitted 12 May, 2023; v1 submitted 27 February, 2021; originally announced March 2021.

    Comments: 8 pages, 11 figures. Last updated March 12, 2023. Accepted for publication in Machine Vision and Applications

  30. arXiv:2101.05061 [pdf, other]

    cs.CV cs.RO

    Understanding Action Sequences based on Video Captioning for Learning-from-Observation

    Authors: Iori Yanokura, Naoki Wake, Kazuhiro Sasabuchi, Katsushi Ikeuchi, Masayuki Inaba

    Abstract: Learning actions from human demonstration video is promising for intelligent robotic systems. Extracting the exact section and re-observing the extracted video section in detail is important for imitating complex skills because human motions give valuable hints for robots. However, the general video understanding methods focus more on the understanding of the full frame, lacking consideration on ex…

    Submitted 9 December, 2020; originally announced January 2021.

  31. arXiv:2010.06194 [pdf]

    cs.RO

    Labeling the Phrases of a Conversational Agent with a Unique Personalized Vocabulary

    Authors: Naoki Wake, Machiko Sato, Kazuhiro Sasabuchi, Minako Nakamura, Katsushi Ikeuchi

    Abstract: Mapping spoken text to gestures is an important research topic for robots with conversation capabilities. According to studies on human co-speech gestures, a reasonable solution for mapping is using a concept-based approach in which a text is first mapped to a semantic cluster (i.e., a concept) containing texts with similar meanings. Subsequently, each concept is mapped to a predefined gesture. By…

    Submitted 12 November, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

    Comments: 8 pages, 3 figures. Submitted to and accepted by IEEE/SICE SII 2022. Last updated November 12th, 2021

  32. arXiv:2009.09813 [pdf]

    cs.RO cs.CV

    Grasp-type Recognition Leveraging Object Affordance

    Authors: Naoki Wake, Kazuhiro Sasabuchi, Katsushi Ikeuchi

    Abstract: A key challenge in robot teaching is grasp-type recognition with a single RGB image and a target object name. Here, we propose a simple yet effective pipeline to enhance learning-based recognition by leveraging a prior distribution of grasp types for each object. In the pipeline, a convolutional neural network (CNN) recognizes the grasp type from an RGB image. The recognition result is further cor…

    Submitted 26 August, 2020; originally announced September 2020.

    Comments: 2 pages, 2 figures. Submitted to and accepted by HOBI (IEEE RO-MAN Workshop 2020). Last updated August 26th, 2020

  33. A Learning-from-Observation Framework: One-Shot Robot Teaching for Grasp-Manipulation-Release Household Operations

    Authors: Naoki Wake, Riku Arakawa, Iori Yanokura, Takuya Kiyokawa, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

    Abstract: A household robot is expected to perform various manipulative operations with an understanding of the purpose of the task. To this end, a desirable robotic application should provide an on-site robot teaching framework for non-experts. Here we propose a Learning-from-Observation (LfO) framework for grasp-manipulation-release class household operations (GMR-operations). The framework maps human dem…

    Submitted 20 October, 2020; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: 6 pages, 6 figures. Submitted to and accepted by IEEE/SICE SII 2021. Last updated October 20th, 2020

  34. Task-oriented Motion Mapping on Robots of Various Configuration using Body Role Division

    Authors: Kazuhiro Sasabuchi, Naoki Wake, Katsushi Ikeuchi

    Abstract: Many works in robot teaching either focus only on teaching task knowledge, such as geometric constraints, or motion knowledge, such as the motion for accomplishing a task. However, to effectively teach a complex task sequence to a robot, it is important to take advantage of both task and motion knowledge. The task knowledge provides the goals of each individual task within the sequence and reduces…

    Submitted 30 December, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

    Comments: 8 pages, 10 figures

  35. arXiv:2007.08705 [pdf]

    cs.RO cs.HC

    Verbal Focus-of-Attention System for Learning-from-Observation

    Authors: Naoki Wake, Iori Yanokura, Kazuhiro Sasabuchi, Katsushi Ikeuchi

    Abstract: The learning-from-observation (LfO) framework aims to map human demonstrations to a robot to reduce programming effort. To this end, an LfO system encodes a human demonstration into a series of execution units for a robot, which are referred to as task models. Although previous research has proposed successful task-model encoders, there has been little discussion on how to guide a task-model encod…

    Submitted 24 March, 2021; v1 submitted 16 July, 2020; originally announced July 2020.

    Comments: 8 pages, 7 figures. Submitted to and accepted by IEEE ICRA 2021. Last updated March 3rd, 2021

  36. arXiv:1905.08702 [pdf]

    cs.RO

    Design of conversational humanoid robot based on hardware independent gesture generation

    Authors: Katsushi Ikeuchi, David Baumert, Shunsuke Kudoh, Masaru Takizawa

    Abstract: With an increasing need for elderly and disability care, there is an increasing opportunity for intelligent and mobile devices such as robots to provide care and support solutions. In order to naturally assist and interact with humans, a robot must possess effective conversational capabilities. Gestures accompanying spoken sentences are an important factor in human-to-human conversational communic…

    Submitted 21 May, 2019; originally announced May 2019.

    Comments: 7 pages, 8 figures

  37. arXiv:1807.02632 [pdf, other]

    cs.CV

    Representing a Partially Observed Non-Rigid 3D Human Using Eigen-Texture and Eigen-Deformation

    Authors: Ryosuke Kimura, Akihiko Sayo, Fabian Lorenzo Dayrit, Yuta Nakashima, Hiroshi Kawasaki, Ambrosio Blanco, Katsushi Ikeuchi

    Abstract: Reconstruction of the shape and motion of humans from RGB-D is a challenging problem, receiving much attention in recent years. Recent approaches for full-body reconstruction use a statistic shape model, which is built upon accurate full-body scans of people in skin-tight clothes, to complete invisible parts due to occlusion. Such a statistic model may still be fit to an RGB-D measurement with loo…

    Submitted 7 July, 2018; originally announced July 2018.

    Comments: 6 pages, accepted to ICPR

  38. arXiv:1804.05178 [pdf, other]

    cs.CV cs.RO

    LiDAR and Camera Calibration using Motion Estimated by Sensor Fusion Odometry

    Authors: Ryoichi Ishikawa, Takeshi Oishi, Katsushi Ikeuchi

    Abstract: In this paper, we propose a method of targetless and automatic Camera-LiDAR calibration. Our approach is an extension of hand-eye calibration framework to 2D-3D calibration. By using the sensor fusion odometry method, the scaled camera motions are calculated with high accuracy. In addition to this, we clarify the suitable motion for this calibration method. The proposed method only requires the…

    Submitted 14 April, 2018; originally announced April 2018.

  39. arXiv:1804.04817 [pdf, other]

    cs.CV cs.RO

    Offline and Online calibration of Mobile Robot and SLAM Device for Navigation

    Authors: Ryoichi Ishikawa, Takeshi Oishi, Katsushi Ikeuchi

    Abstract: Robot navigation technology is required to accomplish difficult tasks in various environments. In navigation, it is necessary to know the information of the external environments and the state of the robot under the environment. On the other hand, various studies have been done on SLAM technology, which is also used for navigation, but also applied to devices for Mixed Reality and the like. In t…

    Submitted 13 April, 2018; originally announced April 2018.

  40. arXiv:1609.05429 [pdf]

    cs.RO

    Describing upper body motions based on the Labanotation for learning-from-observation robots

    Authors: Katsushi Ikeuchi, Zengqiang Yan, Zhaoyuan Ma, Yoshihiro Sato, Minako Nakamura, Shunsuke Kudoh

    Abstract: We have been developing a paradigm, which we refer to as Learning-from-observation, for a robot to automatically acquire what-to-do through observation of human performance. Since a simple mimicking method to repeat exact joint angles does not work due to the kinematic and dynamic difference between a human and a robot, the method introduces an intermediate symbolic representation, task models, to…

    Submitted 18 September, 2016; originally announced September 2016.

    Comments: 9 pages, 11 figures

  41. arXiv:1606.00166 [pdf, other]

    cs.CV

    Multiview Rectification of Folded Documents

    Authors: Shaodi You, Yasuyuki Matsushita, Sudipta Sinha, Yusuke Bou, Katsushi Ikeuchi

    Abstract: Digitally unwrapping images of paper sheets is crucial for accurate document scanning and text recognition. This paper presents a method for automatically rectifying curved or folded paper sheets from a few images captured from multiple viewpoints. Prior methods either need expensive 3D scanners or model deformable surfaces using over-simplified parametric representations. In contrast, our method…

    Submitted 1 June, 2016; originally announced June 2016.

    Comments: 8 pages; under review

  42. arXiv:1604.00730 [pdf, other]

    cs.CV

    Waterdrop Stereo

    Authors: Shaodi You, Robby T. Tan, Rei Kawakami, Yasuhiro Mukaigawa, Katsushi Ikeuchi

    Abstract: This paper introduces depth estimation from water drops. The key idea is that a single water drop adhered to window glass is totally transparent and convex, and thus optically acts like a fisheye lens. If we have more than one water drop in a single image, then through each of them we can see the environment with different view points, similar to stereo. To realize this idea, we need to rectify ev…

    Submitted 3 April, 2016; originally announced April 2016.

    Comments: 12 pages, 15 figures