-
Sketch Interface for Teleoperation of Mobile Manipulator to Enable Intuitive and Intended Operation: A Proof of Concept
Authors:
Yuka Iwanaga,
Masayoshi Tsuchinaga,
Kosei Tanada,
Yuji Nakamura,
Takemitsu Mori,
Takashi Yamamoto
Abstract:
Recent advancements in robotics have underscored the need for effective collaboration between humans and robots. Traditional interfaces often struggle to balance robot autonomy with human oversight, limiting their practical application in complex tasks like mobile manipulation. This study aims to develop an intuitive interface that enables a mobile manipulator to autonomously interpret user-provided sketches, enhancing user experience while minimizing burden. We implemented a web-based application utilizing machine learning algorithms to process sketches, making the interface accessible on mobile devices for use anytime, anywhere, by anyone. In the first validation, we examined natural sketches drawn by users for 27 selected manipulation and navigation tasks, gaining insights into trends related to sketch instructions. The second validation involved comparative experiments with five grasping tasks, showing that the sketch interface reduces workload and enhances intuitiveness compared to conventional axis control interfaces. These findings suggest that the proposed sketch interface improves the efficiency of mobile manipulators and opens new avenues for integrating intuitive human-robot collaboration in various applications.
Submitted 21 May, 2025; v1 submitted 20 May, 2025;
originally announced May 2025.
-
Sketch-MoMa: Teleoperation for Mobile Manipulator via Interpretation of Hand-Drawn Sketches
Authors:
Kosei Tanada,
Yuka Iwanaga,
Masayoshi Tsuchinaga,
Yuji Nakamura,
Takemitsu Mori,
Remi Sakai,
Takashi Yamamoto
Abstract:
To use assistive robots in everyday life, a remote control system that runs on common devices, such as 2D devices, helps users control the robots anytime and anywhere as intended. Hand-drawn sketches are one of the intuitive ways to control robots with 2D devices. However, since similar sketches have different intentions from scene to scene, existing work needs additional modalities to set the sketches' semantics. This requires complex operations from users and reduces usability. In this paper, we propose Sketch-MoMa, a teleoperation system that uses user-given hand-drawn sketches as instructions to control a robot. We use Vision-Language Models (VLMs) to understand the user-given sketches superimposed on an observation image and infer drawn shapes and low-level tasks of the robot. We utilize the sketches and the generated shapes for recognition and motion planning of the generated low-level tasks for precise and intuitive operations. We validate our approach using state-of-the-art VLMs with 7 tasks and 5 sketch shapes. We also demonstrate that our approach effectively specifies the detailed motions, such as how to grasp and how much to rotate. Moreover, we show the competitive usability of our approach compared with the existing 2D interface through a user experiment with 14 participants.
Submitted 7 January, 2025; v1 submitted 26 December, 2024;
originally announced December 2024.
-
Exploring Modular Mobility: Industry Advancements, Research Trends, and Future Directions on Modular Autonomous Vehicles
Authors:
Lanhang Ye,
Toshiyuki Yamamoto
Abstract:
Modular autonomous vehicles (MAVs) represent a transformative paradigm in the rapidly advancing field of autonomous vehicle technology. The integration of modularity offers numerous advantages, poised to reshape urban mobility systems and foster innovation in this emerging domain. Although publications on MAVs have only gained traction in the past five years, these pioneering efforts are critical for envisioning the future of modular mobility. This work provides a comprehensive review of industry and academic contributions to MAV development up to 2024, encompassing conceptualization, design, and applications in both passenger and logistics transport. The review systematically defines MAVs and outlines their technical framework, highlighting groundbreaking efforts in vehicular conceptualization, system design, and business models by the automotive industry and emerging mobility service providers. It also synthesizes academic research on key topics, including passenger and logistics transport, and their integration within future mobility ecosystems. The review concludes by identifying challenges, summarizing the current state of the art, and proposing future research directions to advance the development of modular autonomous mobility systems.
Submitted 23 December, 2024;
originally announced December 2024.
-
Development of Low-Cost IoT Units for Thermal Comfort Measurement and AC Energy Consumption Prediction System
Authors:
Yutong Chen,
Daisuke Sumiyoshi,
Riki Sakai,
Takahiro Yamamoto,
Takahiro Ueno,
Jewon Oh
Abstract:
In response to the substantial energy consumption in buildings, the Japanese government initiated the BI-Tech (Behavioral Insights X Technology) project in 2019, aimed at promoting voluntary energy-saving behaviors through the utilization of AI and IoT technologies. Our study, aimed at small and medium-sized office buildings, introduces a cost-effective IoT-based BI-Tech system, utilizing the Raspberry Pi 4B+ platform for real-time monitoring of indoor thermal conditions and air conditioner (AC) set-point temperature. Employing machine learning and image recognition, the system analyzes data to calculate the PMV index and predict energy consumption changes due to temperature adjustments. The integration of mobile and desktop applications conveys this information to users, encouraging energy-efficient behavior modifications. The machine learning model achieved an R2 value of 97%, demonstrating the system's efficiency in promoting energy-saving habits among users.
Submitted 29 November, 2024;
originally announced November 2024.
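The abstract above reports an R2 value for its prediction model without stating how it was computed. As a minimal sketch, assuming the usual coefficient of determination (1 minus the ratio of residual to total sum of squares), the metric can be calculated as follows. The function name and the sample readings are hypothetical and are not taken from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviation of the targets from their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical energy readings, invented for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(measured, predicted)
print(f"R^2 = {score:.2f}")
```

A score of 1.0 means the predictions match the measurements exactly; a reported 97% corresponds to a score of 0.97 on this scale.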
-
EnchantedClothes: Visual and Tactile Feedback with an Abdomen-Attached Robot through Clothes
Authors:
Takumi Yamamoto,
Rin Yoshimura,
Yuta Sugiura
Abstract:
Wearable robots are designed to be worn on the human body. Taking advantage of their physical form, various applications for wearable robots are being considered. This study proposes a wearable robot worn on the abdomen and a new interaction with it. Our robot enables a variety of applications related to communication between the wearer and surrounding humans through visual and tactile feedback. The contributions of this research will be (1) the proposal of a novel wearable robot worn on the abdomen and (2) a new interaction with it.
Submitted 7 November, 2024;
originally announced November 2024.
-
Modular Autonomous Vehicle in Heterogeneous Traffic Flow: Modeling, Simulation, and Implication
Authors:
Lanhang Ye,
Toshiyuki Yamamoto
Abstract:
Modular autonomous vehicles (MAVs) represent a groundbreaking concept that integrates modularity into the ongoing development of autonomous vehicles. This innovative design introduces unique features to traffic flow, allowing multiple modules to seamlessly join together and operate collectively. To understand the traffic flow characteristics involving these vehicles and their collective operations, this study established a modeling framework specifically designed to simulate their behavior within traffic flow. The mixed traffic flow, incorporating arbitrarily formed trains of various modular sizes, is modeled and studied. Simulations are conducted under varying levels of traffic demand and penetration rates to examine the traffic flow dynamics in the presence of these vehicles and their operations. The microscopic trajectories, MAV train compositions, and macroscopic fundamental diagrams of the mixed traffic flow are analyzed. The simulation findings indicate that integrating MAVs and their collective operations can substantially enhance capacity, with the extent of improvement depending on the penetration rate in mixed traffic flow. Notably, the capacity nearly doubles when the penetration rate exceeds 75%. Furthermore, their presence significantly influences and regulates the free-flow speed of the mixed traffic. Particularly, when variations in operational speed limits exist between the MAVs and the background traffic, the mixed traffic adjusts to the operating velocity of these vehicles. This study provides insights into potential future traffic flow systems incorporating emerging MAV technologies.
Submitted 23 December, 2024; v1 submitted 26 September, 2024;
originally announced September 2024.
-
Informational Health --Toward the Reduction of Risks in the Information Space
Authors:
Fujio Toriumi,
Tatsuhiko Yamamoto
Abstract:
The modern information society, markedly influenced by the advent of the internet and subsequent developments such as Web 2.0, has seen an explosive increase in information availability, fundamentally altering human interaction with information spaces. This transformation has not only facilitated unprecedented access to information but has also raised significant challenges, particularly highlighted by the spread of "fake news" during critical events like the 2016 U.S. presidential election and the COVID-19 pandemic. The latter event underscored the dangers of an "infodemic," where the large amount of information made distinguishing between factual and non-factual content difficult, thereby complicating public health responses and posing risks to democratic processes. In response to these challenges, this paper introduces the concept of "informational health," drawing an analogy between dietary habits and information consumption. It argues that just as balanced diets are crucial for physical health, well-considered information behavior is essential for maintaining a healthy information environment. This paper proposes three strategies for fostering informational health: literacy education, visualization of meta-information, and informational health assessments. These strategies aim to empower users and platforms to navigate and enhance the information ecosystem effectively. By focusing on long-term informational well-being, we highlight the necessity of addressing the social risks inherent in the current attention economy, advocating for a paradigm shift towards a more sustainable information consumption model.
Submitted 19 July, 2024;
originally announced July 2024.
-
Exploring the impact of spatiotemporal granularity on the demand prediction of dynamic ride-hailing
Authors:
Kai Liu,
Zhiju Chen,
Toshiyuki Yamamoto,
Liheng Tuo
Abstract:
Dynamic demand prediction is a key issue in ride-hailing dispatching. Many methods have been developed to improve demand prediction accuracy for the growing number of demand-responsive ride-hailing transport services. However, the uncertainties in predicting ride-hailing demands due to multiscale spatiotemporal granularity, as well as the resulting statistical errors, are seldom explored. This paper attempts to fill this gap and to examine the spatiotemporal granularity effects on ride-hailing demand prediction accuracy by using empirical data for Chengdu, China. A convolutional long short-term memory model combined with a hexagonal convolution operation (H-ConvLSTM) is proposed to explore the complex spatial and temporal relations. Experimental analysis results show that the proposed approach outperforms conventional methods in terms of prediction accuracy. A comparison of 36 spatiotemporal granularities with both departure demands and arrival demands shows that the combination of a hexagonal spatial partition with an 800 m side length and a 30 min time interval achieves the best comprehensive prediction accuracy. However, the departure demands and arrival demands reveal different variation trends in the prediction errors for various spatiotemporal granularities.
Submitted 19 March, 2022;
originally announced March 2022.
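To give a sense of scale for the best-performing granularity named in the abstract above (hexagonal cells with an 800 m side length and 30 min intervals), the sketch below derives two simple quantities: the area of one regular hexagonal cell and the number of time steps per day. It only illustrates the geometry and is not the H-ConvLSTM model; the function names are hypothetical.

```python
import math


def hexagon_area_m2(side_m):
    """Area of a regular hexagon with the given side length: (3*sqrt(3)/2) * s^2."""
    return 1.5 * math.sqrt(3) * side_m ** 2


def intervals_per_day(minutes):
    """Number of equal time steps in 24 hours for a given interval length."""
    if minutes <= 0 or (24 * 60) % minutes != 0:
        raise ValueError("interval must be positive and divide 1440 minutes evenly")
    return (24 * 60) // minutes


area_km2 = hexagon_area_m2(800) / 1e6   # one cell covers roughly 1.66 km^2
steps = intervals_per_day(30)           # 48 prediction steps per day
print(f"cell area: {area_km2:.2f} km^2, steps per day: {steps}")
```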
-
GenéLive! Generating Rhythm Actions in Love Live!
Authors:
Atsushi Takada,
Daichi Yamazaki,
Likun Liu,
Yudai Yoshida,
Nyamkhuu Ganbat,
Takayuki Shimotomai,
Taiga Yamamoto,
Daisuke Sakurai,
Naoki Hamada
Abstract:
This article presents our generative model for rhythm action games together with applications in business operations. Rhythm action games are video games in which the player is challenged to issue commands at the right timings during a music session. The timings are rendered in the chart, which consists of visual symbols, called notes, flying through the screen. We introduce our deep generative model, GenéLive!, which outperforms the state-of-the-art model by taking into account musical structures through beats and temporal scales. Thanks to its favorable performance, GenéLive! was put into operation at KLab Inc., a Japan-based video game developer, and reduced the business cost of chart generation by as much as half. The application target included the phenomenal "Love Live!," which has more than 10 million users across Asia and beyond, and is one of the few rhythm action franchises that has led the online era of the genre. In this article, we evaluate the generative performance of GenéLive! using production datasets at KLab as well as open datasets for reproducibility, while the model continues to operate in their business. Our code and the model, tuned and trained using a supercomputer, are publicly available.
Submitted 20 December, 2022; v1 submitted 25 February, 2022;
originally announced February 2022.
-
Mission Design of DESTINY+: Toward Active Asteroid (3200) Phaethon and Multiple Small Bodies
Authors:
Naoya Ozaki,
Takayuki Yamamoto,
Ferran Gonzalez-Franquesa,
Roger Gutierrez-Ramon,
Nishanth Pushparaj,
Takuya Chikazawa,
Diogene Alessandro Dei Tos,
Onur Çelik,
Nicola Marmo,
Yasuhiro Kawakatsu,
Tomoko Arai,
Kazutaka Nishiyama,
Takeshi Takashima
Abstract:
DESTINY+ is an upcoming JAXA Epsilon medium-class mission to fly by the Geminids meteor shower parent body (3200) Phaethon. It will be the world's first spacecraft to escape from a near-geostationary transfer orbit into deep space using a low-thrust propulsion system. In doing so, DESTINY+ will demonstrate a number of technologies that include a highly efficient ion engine system, lightweight solar array panels, and advanced asteroid flyby observation instruments. These demonstrations will pave the way for JAXA's envisioned low-cost, high-frequency space exploration plans. Following the Phaethon flyby observation, DESTINY+ will visit additional asteroids as its extended mission. The mission design is divided into three phases: a spiral-shaped apogee-raising phase, a multi-lunar-flyby phase to escape Earth, and an interplanetary and asteroids flyby phase. The main challenges include the optimization of the many-revolution low-thrust spiral phase under operational constraints; the design of a multi-lunar-flyby sequence in a multi-body environment; and the design of multiple asteroid flybys connected via Earth gravity assists. This paper shows a novel, practical approach to tackle these complex problems, and presents feasible solutions found within the mass budget and mission constraints. Among them, the baseline solution is shown and discussed in depth; DESTINY+ will spend two years raising its apogee with ion engines, followed by four lunar gravity assists, and a flyby of asteroids (3200) Phaethon and (155140) 2005 UD. Finally, the flight operations plan for the spiral phase and the asteroid flyby phase are presented in detail.
Submitted 14 April, 2022; v1 submitted 6 January, 2022;
originally announced January 2022.
-
Explicitly Multi-Modal Benchmarks for Multi-Objective Optimization
Authors:
Ryosuke Ota,
Reiya Hagiwara,
Naoki Hamada,
Likun Liu,
Takahiro Yamamoto,
Daisuke Sakurai
Abstract:
In multi-objective optimization, designing good benchmark problems is an important issue for improving solvers.
Controlling the global location of Pareto optima in existing benchmark problems has been problematic, and it is even more difficult when the design space is high-dimensional since visualization is extremely challenging.
As a benchmark with explicit local Pareto fronts, we introduce benchmarking based on basin connectivity (3BC), which uses basins of attraction.
The 3BC allows for the specification of a multimodal landscape through a kind of topological analysis called the basin graph, effectively generating optimization problems from this graph.
Various known indicators measure the performance of a solver in searching for global Pareto optima, but 3BC lets us localize them to each local Pareto front by restricting each indicator to that front's basin.
3BC's mathematical formulation ensures the accurate representation of the specified optimization landscape, guaranteeing the existence of intended local and global Pareto optima.
Submitted 9 February, 2024; v1 submitted 7 October, 2021;
originally announced October 2021.
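The abstract above relies on the notion of Pareto optima. As background only, here is a minimal sketch of the standard Pareto-dominance test and a brute-force non-dominated filter, assuming all objectives are minimized. It does not implement the 3BC construction or the basin graph; the function names and sample points are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Return the points that no other point dominates (O(n^2) comparison)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective values, invented for illustration only.
candidates = [(1, 4), (2, 2), (3, 3), (4, 1)]
front = non_dominated(candidates)
print(front)  # (3, 3) is dropped because (2, 2) is better in both objectives
```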
-
Impact of GPU uncertainty on the training of predictive deep neural networks
Authors:
Maciej Pietrowski,
Andrzej Gajda,
Takuto Yamamoto,
Taisuke Kobayashi,
Lana Sinapayen,
Eiji Watanabe
Abstract:
[retracted] We found that the difference depended on the Chainer library and does not replicate with another library (PyTorch), which indicates that the results are probably due to a bug in Chainer rather than being hardware-dependent. -- old abstract Deep neural networks often present uncertainties such as hardware- and software-derived noise and randomness. We studied the effects of such uncertainty on learning outcomes, with a particular focus on the function of graphics processing units (GPUs), and found that GPU-induced uncertainty increased learning accuracy of a certain deep neural network. When training a predictive deep neural network using only the CPU without the GPU, the learning error is higher than when training the same number of epochs using the GPU, suggesting that the GPU plays a different role in the learning process than just increasing the computational speed. Because this effect cannot be observed in learning by a simple autoencoder, it could be a phenomenon specific to certain types of neural networks. GPU-specific computational processing is more indeterminate than that by CPUs, and hardware-derived uncertainties, which are often considered obstacles that need to be eliminated, might, in some cases, be successfully incorporated into the training of deep neural networks. Moreover, such uncertainties might be interesting phenomena to consider in brain-related computational processing, which comprises a large mass of uncertain signals.
Submitted 6 October, 2021; v1 submitted 3 September, 2021;
originally announced September 2021.
-
Action Units Recognition Using Improved Pairwise Deep Architecture
Authors:
Junya Saito,
Xiaoyu Mi,
Akiyoshi Uchida,
Sachihiro Youoku,
Takahisa Yamamoto,
Kentaro Murase,
Osafumi Nakayama
Abstract:
Facial Action Units (AUs) represent a set of facial muscular activities and various combinations of AUs can represent a wide range of emotions. AU recognition is often used in many applications, including marketing, healthcare, education, and so forth. Although a lot of studies have developed various methods to improve recognition accuracy, it still remains a major challenge for AU recognition. In the Affective Behavior Analysis in-the-wild (ABAW) 2020 competition, we proposed a new automatic Action Units (AUs) recognition method using a pairwise deep architecture to derive the Pseudo-Intensities of each AU and then convert them into predicted intensities. This year, we introduced a new technique to last year's framework to further reduce AU recognition errors due to temporary face occlusion such as hands on face or large face orientation. We obtained a score of 0.65 in the validation data set for this year's competition.
Submitted 8 July, 2021; v1 submitted 7 July, 2021;
originally announced July 2021.
-
Multi-modal Affect Analysis using standardized data within subjects in the Wild
Authors:
Sachihiro Youoku,
Takahisa Yamamoto,
Junya Saito,
Akiyoshi Uchida,
Xiaoyu Mi,
Ziqiang Shi,
Liu Liu,
Zhongling Liu,
Osafumi Nakayama,
Kentaro Murase
Abstract:
Human affective recognition is an important factor in human-computer interaction. However, the method development with in-the-wild data is not yet accurate enough for practical usage. In this paper, we introduce the affective recognition method focusing on facial expression (EXP) and valence-arousal calculation that was submitted to the Affective Behavior Analysis in-the-wild (ABAW) 2021 Contest.
When annotating facial expressions from a video, we reasoned that they are judged not only from features common to all people but also from relative changes in each individual's time series. Therefore, after learning the common features for each frame, we constructed a facial expression estimation model and a valence-arousal model using time-series data that combine the common features with the features standardized for each video. Furthermore, these features were learned using multi-modal data such as image features, AUs, head pose, and gaze. In the validation set, our model achieved a facial expression score of 0.546. These verification results reveal that our proposed framework can improve estimation accuracy and robustness effectively.
Submitted 10 July, 2021; v1 submitted 7 July, 2021;
originally announced July 2021.
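The abstract above mentions features standardized for each video but does not give the formula. A minimal sketch, assuming a plain z-score computed per video with that video's own mean and population standard deviation, might look like the following. The function name, the scalar-feature simplification, and the sample values are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean, pstdev


def standardize_per_video(features_by_video):
    """Z-score each video's feature sequence using that video's own statistics.

    features_by_video maps a video id to a list of scalar feature values
    (one per frame). Real features would be vectors; scalars keep the sketch short.
    """
    standardized = {}
    for video_id, values in features_by_video.items():
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation (an assumption)
        # A constant sequence has zero spread, so map it to zeros.
        standardized[video_id] = [
            0.0 if sigma == 0 else (v - mu) / sigma for v in values
        ]
    return standardized


# Hypothetical per-frame feature values, invented for illustration only.
raw = {"video_a": [1.0, 2.0, 3.0], "video_b": [10.0, 10.0]}
result = standardize_per_video(raw)
print(result)
```

Standardizing within each video removes a subject's baseline, so the downstream model sees relative changes over time instead of absolute feature levels.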
-
Crowdsourcing Evaluation of Saliency-based XAI Methods
Authors:
Xiaotian Lu,
Arseny Tolmachev,
Tatsuya Yamamoto,
Koh Takeuchi,
Seiji Okajima,
Tomoyoshi Takebayashi,
Koji Maruhashi,
Hisashi Kashima
Abstract:
Understanding the reasons behind the predictions made by deep neural networks is critical for gaining human trust in many important applications, which is reflected in the increasing demand for explainability in AI (XAI) in recent years. Saliency-based feature attribution methods, which highlight important parts of images that contribute to decisions by classifiers, are often used as XAI methods, especially in the field of computer vision. In order to compare various saliency-based XAI methods quantitatively, several approaches for automated evaluation schemes have been proposed; however, there is no guarantee that such automated evaluation metrics correctly evaluate explainability, and a high rating by an automated evaluation scheme does not necessarily mean a high explainability for humans. In this study, instead of the automated evaluation, we propose a new human-based evaluation scheme using crowdsourcing to evaluate XAI methods. Our method is inspired by a human computation game, "Peek-a-boom", and can efficiently compare different XAI methods by exploiting the power of crowds. We evaluate the saliency maps of various XAI methods on two datasets with automated and crowd-based evaluation schemes. Our experiments show that the result of our crowd-based evaluation scheme is different from those of automated evaluation schemes. In addition, we regard the crowd-based evaluation results as ground truths and provide a quantitative performance measure to compare different automated evaluation schemes. We also discuss the impact of crowd workers on the results and show that the varying ability of crowd workers does not significantly impact the results.
Submitted 30 August, 2021; v1 submitted 27 June, 2021;
originally announced July 2021.
-
Passive Flow Control for Series Inflatable Actuators: Application on a Wearable Soft-Robot for Posture Assistance
Authors:
Diego Paez-Granados,
Takehiro Yamamoto,
Hideki Kadone,
Kenji Suzuki
Abstract:
This paper presents a passive control method for multiple degrees of freedom in a soft pneumatic robot through the combination of flow resistor tubes with series inflatable actuators. We designed and developed these 3D printed resistors based on the pressure drop principle of multiple capillary orifices, which allows a passive control of its sequential activation from a single source of pressure. Our design fits in standard tube connectors, making it easy to adopt it on any other type of actuator with pneumatic inlets. We present its characterization of pressure drop and evaluation of the activation sequence for series and parallel circuits of actuators. Moreover, we present an application for the assistance of postural transition from lying to sitting. We embedded it in a wearable garment robot-suit designed for infants with cerebral palsy. Then, we performed the test with a dummy baby for emulating the upper-body motion control. The results show a sequential motion control of the sitting and lying transitions validating the proposed system for flow control and its application on the robot-suit.
Submitted 9 March, 2021;
originally announced March 2021.
-
Action Units Recognition by Pairwise Deep Architecture
Authors:
Junya Saito,
Ryosuke Kawamura,
Akiyoshi Uchida,
Sachihiro Youoku,
Yuushi Toyoda,
Takahisa Yamamoto,
Xiaoyu Mi,
Kentaro Murase
Abstract:
In this paper, we propose a new automatic Action Units (AUs) recognition method used in a competition, Affective Behavior Analysis in-the-wild (ABAW). Our method tackles a problem of AUs label inconsistency among subjects by using pairwise deep architecture. While the baseline score is 0.31, our method achieved 0.67 in validation dataset of the competition.
In this paper, we propose a new automatic Action Units (AUs) recognition method used in a competition, Affective Behavior Analysis in-the-wild (ABAW). Our method tackles a problem of AUs label inconsistency among subjects by using pairwise deep architecture. While the baseline score is 0.31, our method achieved 0.67 in validation dataset of the competition.
Submitted 2 October, 2020; v1 submitted 1 October, 2020;
originally announced October 2020.
-
A Multi-term and Multi-task Analyzing Framework for Affective Analysis in-the-wild
Authors:
Sachihiro Youoku,
Yuushi Toyoda,
Takahisa Yamamoto,
Junya Saito,
Ryosuke Kawamura,
Xiaoyu Mi,
Kentaro Murase
Abstract:
Human affective recognition is an important factor in human-computer interaction. However, methods developed with in-the-wild data are not yet accurate enough for practical usage. In this paper, we introduce the affective recognition method focusing on valence-arousal (VA) and expression (EXP) that was submitted to the Affective Behavior Analysis in-the-wild (ABAW) 2020 Contest. Since we considered that affective behaviors have many observable features, each with its own time frame, we introduced multiple optimized time windows (short-term, middle-term, and long-term) into our analysis framework for extracting feature parameters from video data. Moreover, data from multiple modalities are used, including action units, head poses, gaze, posture, and ResNet-50 or EfficientNet features, and are optimized during feature extraction. Then, we generated affective recognition models for each time window and ensembled these models together. We also fused the valence, arousal, and expression models to enable multi-task learning, considering that the basic psychological states behind facial expressions are closely related to each other. On the validation set, our model achieved a valence-arousal score of 0.498 and a facial expression score of 0.471. These verification results show that our proposed framework can effectively improve estimation accuracy and robustness.
Submitted 2 October, 2020; v1 submitted 29 September, 2020;
originally announced September 2020.
-
Machine Learning Guided Discovery of Gigantic Magnetocaloric Effect in HoB$_{2}$ Near Hydrogen Liquefaction Temperature
Authors:
Pedro Baptista de Castro,
Kensei Terashima,
Takafumi D Yamamoto,
Zhufeng Hou,
Suguru Iwasaki,
Ryo Matsumoto,
Shintaro Adachi,
Yoshito Saito,
Peng Song,
Hiroyuki Takeya,
Yoshihiko Takano
Abstract:
Magnetic refrigeration exploits the magnetocaloric effect, which is the entropy change upon application and removal of magnetic fields in materials, providing an alternative path for refrigeration to conventional gas cycles. While intensive research has uncovered a vast number of magnetic materials that exhibit a large magnetocaloric effect, these properties remain unknown for a large number of compounds. To explore new functional materials in this unknown space, machine learning is used as a guide for selecting materials that could exhibit a large magnetocaloric effect. By this approach, HoB$_{2}$ is singled out and synthesized, and its magnetocaloric properties are evaluated, leading to the experimental discovery of a gigantic magnetic entropy change of 40.1 J kg$^{-1}$ K$^{-1}$ (0.35 J cm$^{-3}$ K$^{-1}$) for a field change of 5 T in the vicinity of a ferromagnetic second-order phase transition with a Curie temperature of 15 K. This is the highest value reported so far, to our knowledge, near the hydrogen liquefaction temperature; thus, it is a highly suitable material for hydrogen liquefaction and low-temperature magnetic cooling applications.
Submitted 12 May, 2020;
originally announced May 2020.
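As a reading aid for the entry above: the magnetic entropy change is commonly estimated from magnetization data through a Maxwell relation, and the gravimetric and volumetric figures quoted in the abstract are related by the mass density. The abstract does not say which evaluation method the authors used, so the first formula is the standard textbook relation rather than their stated procedure; the symbols $M$, $H$, $T$, $\mu_0$ and $\rho$ are notation introduced here.

```latex
% Standard Maxwell-relation estimate of the isothermal magnetic entropy change
\Delta S_M(T, \Delta H) = \mu_0 \int_{0}^{H_{\max}} \left( \frac{\partial M}{\partial T} \right)_{H} \, dH

% Gravimetric vs. volumetric values differ by the mass density \rho
\Delta S_M^{\mathrm{vol}} = \rho \, \Delta S_M^{\mathrm{mass}}
\quad\Rightarrow\quad
\rho \approx \frac{0.35\ \mathrm{J\,cm^{-3}\,K^{-1}}}{40.1\ \mathrm{J\,kg^{-1}\,K^{-1}}}
\approx 8.7\ \mathrm{g\,cm^{-3}}
```

The density value is only what the two quoted numbers imply; it is not stated in the abstract.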
-
Temporal Extension Module for Skeleton-Based Action Recognition
Authors:
Yuya Obinata,
Takuma Yamamoto
Abstract:
We present a module that extends the temporal graph of a graph convolutional network (GCN) for action recognition with a sequence of skeletons. Existing methods attempt to represent a more appropriate spatial graph within each frame (intra-frame), but disregard optimization of the temporal graph between frames (inter-frame). Concretely, these methods connect only vertices corresponding to the same joint across frames. In this work, we focus on adding inter-frame connections to multiple neighboring vertices and extracting additional features based on the extended temporal graph. Our module is a simple yet effective method to extract correlated features of multiple joints in human movement. Moreover, our module yields further performance improvements when combined with other GCN methods that optimize only the spatial graph. We conduct extensive experiments on two large datasets, NTU RGB+D and Kinetics-Skeleton, and demonstrate that our module is effective for several existing models and that our final model achieves state-of-the-art performance.
Submitted 18 October, 2020; v1 submitted 19 March, 2020;
originally announced March 2020.
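The difference between a same-joint temporal graph and an extended one, as described in the entry above, can be illustrated with a small adjacency-matrix sketch. This is an assumption about what "neighboring vertices" means (here: the spatial neighbors of a joint, looked up in the adjacent frame), not the authors' implementation; the function name `temporal_adjacency` and its parameters are hypothetical.

```python
import numpy as np

def temporal_adjacency(spatial_adj, extended=True):
    """Return a (V, V) 0/1 matrix linking joints in frame t (rows)
    to joints in frame t+1 (columns).

    spatial_adj: (V, V) intra-frame skeleton adjacency (assumed input).
    extended=False: each joint links only to the same joint in the next frame.
    extended=True: each joint also links to its spatial neighbors in the
    next frame (an illustrative reading of the extended temporal graph).
    """
    a = np.asarray(spatial_adj, dtype=bool)
    same_joint = np.eye(a.shape[0], dtype=bool)
    if not extended:
        return same_joint.astype(int)
    # Symmetrize so the result does not depend on edge direction in the input.
    return (same_joint | a | a.T).astype(int)

# Toy skeleton: a 3-joint chain 0 - 1 - 2.
chain = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]])
baseline = temporal_adjacency(chain, extended=False)
extended = temporal_adjacency(chain, extended=True)
```

In the toy chain, the baseline graph has one inter-frame edge per joint, while the extended graph additionally links each joint to its chain neighbors in the next frame; joints 0 and 2 stay unconnected because they are not spatial neighbors.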
-
Precise Estimation of Renal Vascular Dominant Regions Using Spatially Aware Fully Convolutional Networks, Tensor-Cut and Voronoi Diagrams
Authors:
Chenglong Wang,
Holger R. Roth,
Takayuki Kitasaka,
Masahiro Oda,
Yuichiro Hayashi,
Yasushi Yoshino,
Tokunori Yamamoto,
Naoto Sassa,
Momokazu Goto,
Kensaku Mori
Abstract:
This paper presents a new approach for precisely estimating the renal vascular dominant region using a Voronoi diagram. To provide computer-assisted diagnostics for the pre-surgical simulation of partial nephrectomy surgery, we must obtain information on the renal arteries and the renal vascular dominant regions. We propose a fully automatic segmentation method that combines a neural network and tensor-based graph-cut methods to precisely extract the kidney and renal arteries. First, we use a convolutional neural network to localize the kidney regions and extract tiny renal arteries with a tensor-based graph-cut method. Then we generate a Voronoi diagram to estimate the renal vascular dominant regions based on the segmented kidney and renal arteries. The accuracy of kidney segmentation in 27 cases with 8-fold cross validation reached a Dice score of 95%. The accuracy of renal artery segmentation in 8 cases obtained a centerline overlap ratio of 80%. Each partition region corresponds to a renal vascular dominant region. The final dominant-region estimation accuracy achieved a Dice coefficient of 80%. A clinical application showed the potential of our proposed estimation approach in a real clinical surgical environment. Further validation on a large-scale database is left for future work.
Submitted 5 August, 2019;
originally announced August 2019.
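The two quantitative ideas in the entry above, a Voronoi partition of a segmented organ and the Dice coefficient used to score it, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the seeds are hypothetical point coordinates standing in for segmented artery branches, plain Euclidean distance is used, and the helper names `dice` and `voronoi_labels` are introduced here, not taken from the paper.

```python
import numpy as np

def dice(a, b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) between two boolean masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total

def voronoi_labels(mask, seeds):
    """Label every voxel inside `mask` with the index of its nearest seed.

    mask:  boolean array of any dimension (e.g. a segmented kidney).
    seeds: (K, ndim) seed coordinates (hypothetical stand-ins for
           artery branches). Voxels outside the mask get label -1.
    """
    mask = np.asarray(mask, dtype=bool)
    seeds = np.asarray(seeds, dtype=float)
    coords = np.argwhere(mask)  # (N, ndim) coordinates of masked voxels
    # Squared Euclidean distance from every masked voxel to every seed.
    d2 = ((coords[:, None, :] - seeds[None, :, :]) ** 2).sum(axis=2)
    labels = np.full(mask.shape, -1, dtype=int)
    labels[tuple(coords.T)] = d2.argmin(axis=1)
    return labels

# Toy 2D example: a 4x4 "organ" with one excluded pixel and two seeds.
organ = np.ones((4, 4), dtype=bool)
organ[1, 1] = False
regions = voronoi_labels(organ, seeds=[[0, 0], [3, 3]])
```

Each resulting label region plays the role of one dominant region; comparing a predicted region mask against a reference mask with `dice` gives an overlap score of the kind reported in the abstract. The all-pairs distance matrix is fine for a toy grid but would need a distance-transform approach for full CT volumes.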
-
Millimeter-wave Wireless LAN and its Extension toward 5G Heterogeneous Networks
Authors:
Kei Sakaguchi,
Ehab Mahmoud Mohamed,
Hideyuki Kusano,
Makoto Mizukami,
Shinichi Miyamoto,
Roya Rezagah,
Koji Takinami,
Kazuaki Takahashi,
Naganori Shirakata,
Hailan Peng,
Toshiaki Yamamoto,
Shinobu Namba
Abstract:
Millimeter-wave (mmw) frequency bands, especially the 60 GHz unlicensed band, are considered a promising solution for gigabit short-range wireless communication systems. IEEE standard 802.11ad, also known as WiGig, is standardized for the use of the 60 GHz unlicensed band in wireless local area networks (WLANs). With this mmw WLAN, multi-Gbps rates can be achieved to support bandwidth-intensive multimedia applications. Exhaustive search along with beamforming (BF) is usually used to overcome 60 GHz channel propagation loss and accomplish data transmissions in such mmw WLANs. Because of their short-range transmission and high susceptibility to path blocking, multiple mmw access points (APs) should be used to fully cover a typical target environment for future high-capacity multi-Gbps WLANs. Therefore, coordination among mmw APs is needed to overcome packet collisions resulting from uncoordinated exhaustive-search BF and to increase the total capacity of mmw WLANs. In this paper, we first present the current status of mmw WLANs with our developed WiGig AP prototype. Then, we highlight the need for coordinated transmissions among mmw APs as a key enabler for future high-capacity mmw WLANs. Two different types of coordinated mmw WLAN architecture are introduced. One is a distributed-antenna architecture that realizes centralized coordination, while the other is an autonomous coordination scheme assisted by legacy Wi-Fi signaling. Moreover, two heterogeneous network (HetNet) architectures are introduced to efficiently extend the coordinated mmw WLANs for use in future 5th Generation (5G) cellular networks.
Submitted 16 July, 2015;
originally announced July 2015.