-
Investigation of Frame Differences as Motion Cues for Video Object Segmentation
Authors:
Sota Kawamura,
Hirotada Honda,
Shugo Nakamura,
Takashi Sano
Abstract:
Automatic Video Object Segmentation (AVOS) refers to the task of autonomously segmenting target objects in video sequences without relying on human-provided annotations in the first frames. In AVOS, the use of motion information is crucial, with optical flow being a commonly employed method for capturing motion cues. However, the computation of optical flow is resource-intensive, making it unsuitable for real-time applications, especially on edge devices with limited computational resources. In this study, we propose using frame differences as an alternative to optical flow for motion cue extraction. We developed an extended U-Net-like AVOS model that takes a frame on which segmentation is performed and a frame difference as inputs, and outputs an estimated segmentation map. Our experimental results demonstrate that the proposed model achieves performance comparable to the model with optical flow as an input, particularly when applied to videos captured by stationary cameras. Our results suggest the usefulness of employing frame differences as motion cues in cases with limited computational resources.
Submitted 12 March, 2025;
originally announced March 2025.
-
RAO-SS: A Prototype of Run-time Auto-tuning Facility for Sparse Direct Solvers
Authors:
Takahiro Katagiri,
Yoshinori Ishii,
Hiroki Honda
Abstract:
In this paper, a run-time auto-tuning method that sets performance parameters according to input matrices is proposed. RAO-SS (Run-time Auto-tuning Optimizer for Sparse Solvers), a prototype of auto-tuning software using the proposed method, is also evaluated. RAO-SS is implemented with Autopilot, middleware that supports run-time auto-tuning with a fuzzy logic function. The target numerical library is SuperLU, a sparse direct solver for linear equations. The results indicate that: (1) speedup factors of 1.2 on average and 3.6 at maximum over default executions were obtained; (2) the software overhead of Autopilot is negligible in RAO-SS.
Submitted 20 August, 2024;
originally announced August 2024.
-
From Coupled Oscillators to Graph Neural Networks: Reducing Over-smoothing via a Kuramoto Model-based Approach
Authors:
Tuan Nguyen,
Hirotada Honda,
Takashi Sano,
Vinh Nguyen,
Shugo Nakamura,
Tan M. Nguyen
Abstract:
We propose the Kuramoto Graph Neural Network (KuramotoGNN), a novel class of continuous-depth graph neural networks (GNNs) that employs the Kuramoto model to mitigate the over-smoothing phenomenon, in which node features in GNNs become indistinguishable as the number of layers increases. The Kuramoto model captures the synchronization behavior of non-linear coupled oscillators. From the viewpoint of coupled oscillators, we first show the connection between the Kuramoto model and a basic GNN, and then show that the over-smoothing phenomenon in GNNs can be interpreted as phase synchronization in the Kuramoto model. The KuramotoGNN replaces this phase synchronization with frequency synchronization to prevent the node features from converging into each other while allowing the system to reach a stable synchronized state. We experimentally verify the advantages of the KuramotoGNN over the baseline GNNs and existing methods in reducing over-smoothing on various graph deep learning benchmark tasks.
Submitted 5 March, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
CLRerNet: Improving Confidence of Lane Detection with LaneIoU
Authors:
Hiroto Honda,
Yusuke Uchida
Abstract:
Lane marker detection is a crucial component of autonomous driving and driver assistance systems. Modern deep lane detection methods with row-based lane representation exhibit excellent performance on lane detection benchmarks. Through preliminary oracle experiments, we first disentangle the lane representation components to determine the direction of our approach. We show that correct lane positions are already among the predictions of an existing row-based detector, and that confidence scores that accurately represent intersection-over-union (IoU) with ground truths are the most beneficial. Based on this finding, we propose LaneIoU, which better correlates with the metric by taking local lane angles into consideration. We develop a novel detector, CLRerNet, featuring LaneIoU for the target assignment cost and loss functions, aiming at improved quality of confidence scores. Through careful and fair benchmarking, including cross-validation, we demonstrate that CLRerNet outperforms the state of the art by a large margin, achieving an F1 score of 81.43% compared with 80.47% for the existing method on CULane, and 86.47% compared with 86.10% on CurveLanes.
Submitted 15 May, 2023;
originally announced May 2023.
-
End-to-End Monocular Vanishing Point Detection Exploiting Lane Annotations
Authors:
Hiroto Honda,
Motoki Kimura,
Takumi Karasawa,
Yusuke Uchida
Abstract:
Vanishing points (VPs) play a vital role in various computer vision tasks, especially for recognizing 3D scenes from an image. In real-world automotive applications, it is costly to manually obtain the external camera parameters when the camera is attached to the vehicle or when the attachment is accidentally perturbed. In this paper, we introduce a simple but effective end-to-end vanishing point detection method. By automatically calculating the intersection of the extrapolated lane marker annotations, we obtain geometrically consistent VP labels and mitigate human annotation errors caused by manual VP labeling. With the calculated VP labels, we train an end-to-end VP Detector via heatmap estimation. The VP Detector achieves higher accuracy than methods utilizing manual annotation or lane detection, paving the way for accurate online camera calibration.
Submitted 31 August, 2021;
originally announced August 2021.
-
Leveraging Temporal Joint Depths for Improving 3D Human Pose Estimation in Video
Authors:
Naoki Kato,
Hiroto Honda,
Yusuke Uchida
Abstract:
The effectiveness of approaches that predict 3D poses from 2D poses estimated in each frame of a video has been demonstrated for 3D human pose estimation. However, 2D poses without appearance information about the persons are highly ambiguous with respect to joint depths. In this paper, we propose to estimate a 3D pose in each frame of a video and refine it using temporal information. The proposed approach reduces the ambiguity of the joint depths and improves the 3D pose estimation accuracy.
Submitted 4 November, 2020;
originally announced November 2020.
-
Sensitivity of quantum PageRank
Authors:
Hirotada Honda
Abstract:
In this paper, we discuss the sensitivity of quantum PageRank. Using finite-dimensional perturbation theory, we estimate the change in quantum PageRank under a small analytic perturbation of the Google matrix. In addition, we show how to estimate a lower bound on the convergence radius as well as an error bound for the finite sum in the expansion of the perturbed PageRank.
Submitted 27 June, 2019;
originally announced July 2019.
-
Full-body High-resolution Anime Generation with Progressive Structure-conditional Generative Adversarial Networks
Authors:
Koichi Hamada,
Kentaro Tachibana,
Tianqi Li,
Hiroto Honda,
Yusuke Uchida
Abstract:
We propose Progressive Structure-conditional Generative Adversarial Networks (PSGAN), a new framework that can generate full-body and high-resolution character images based on structural information. Recent progress in generative adversarial networks with progressive training has made it possible to generate high-resolution images. However, existing approaches have limitations in achieving both high image quality and structural consistency at the same time. Our method tackles the limitations by progressively increasing the resolution of both generated images and structural conditions during training. In this paper, we empirically demonstrate the effectiveness of this method by showing the comparison with existing approaches and video generation results of diverse anime characters at 1024x1024 based on target pose sequences. We also create a novel dataset containing full-body 1024x1024 high-resolution images and exact 2D pose keypoints using Unity 3D Avatar models.
Submitted 6 September, 2018;
originally announced September 2018.
-
Geometric Analysis of Observability of Target Object Shape Using Location-Unknown Distance Sensors
Authors:
Hiroshi Saito,
Hirotada Honda
Abstract:
We geometrically analyze the problem of estimating parameters related to the shape and size of a two-dimensional target object on the plane by using randomly distributed distance sensors whose locations are unknown. Based on the analysis using geometric probability, we discuss the observability of these parameters: which parameters we can estimate and what conditions are required to estimate them. For a convex target object, its size and perimeter length are observable, and other parameters are not observable. For a general polygon target object, convexity in addition to its size and perimeter length is observable. Parameters related to a concave vertex can be observable when some conditions are satisfied. We also propose a method for estimating the convexity of a target object and the perimeter length of the target object.
Submitted 15 May, 2017;
originally announced July 2017.
-
Multi-physics Extension of OpenFMO Framework
Authors:
Toshiya Takami,
Jun Maki,
Jun'ichi Ooba,
Yuuichi Inadomi,
Hiroaki Honda,
Ryutaro Susukita,
Koji Inoue,
Taizo Kobayashi,
Rie Nogita,
Mutsumi Aoyagi
Abstract:
The OpenFMO framework, an open-source software (OSS) platform for the Fragment Molecular Orbital (FMO) method, is extended to multi-physics simulations (MPS). After a review of several FMO implementations on distributed computing environments, the subsequent development plan for MPS is presented. Which form scientific software should take is also discussed: a lightweight, reconfigurable form or a large, self-contained form.
Submitted 18 July, 2007;
originally announced July 2007.
-
Open-architecture Implementation of Fragment Molecular Orbital Method for Peta-scale Computing
Authors:
Toshiya Takami,
Jun Maki,
Jun-ichi Ooba,
Yuichi Inadomi,
Hiroaki Honda,
Taizo Kobayashi,
Rie Nogita,
Mutsumi Aoyagi
Abstract:
We present our perspective and goals on high-performance computing for nanoscience in accordance with the global trend toward "peta-scale computing." After reviewing our results obtained through the grid-enabled version of the fragment molecular orbital method (FMO) on the grid testbed by the Japanese Grid Project, National Research Grid Initiative (NAREGI), we show that FMO is one of the best candidates for peta-scale applications by predicting its effective performance on peta-scale computers. Finally, we introduce our new project constructing a peta-scale application in an open-architecture implementation of FMO in order to realize both goals of high performance on peta-scale computers and extendibility to multiphysics simulations.
Submitted 10 January, 2007;
originally announced January 2007.