Search | arXiv e-print repository

Deep Curvilinear Editing: Commutative and Nonlinear Image Manipulation for Pretrained Deep Generative Model

Authors: Takehiro Aoshima, Takashi Matsubara

Abstract: Semantic editing of images is the fundamental goal of computer vision. Although deep learning methods, such as generative adversarial networks (GANs), are capable of producing high-quality images, they often do not have an inherent way of editing generated images semantically. Recent studies have investigated a way of manipulating the latent variable to determine the images to be generated. However, methods that assume linear semantic arithmetic have certain limitations in terms of the quality of image editing, whereas methods that discover nonlinear semantic pathways provide non-commutative editing, which is inconsistent when applied in different orders. This study proposes a novel method called deep curvilinear editing (DeCurvEd) to determine semantic commuting vector fields on the latent space. We theoretically demonstrate that owing to commutativity, the editing of multiple attributes depends only on the quantities and not on the order. Furthermore, we experimentally demonstrate that compared to previous methods, the nonlinear and commutative nature of DeCurvEd facilitates the disentanglement of image attributes and provides higher-quality editing.

Submitted 29 August, 2023; v1 submitted 26 November, 2022; originally announced November 2022.

Comments: 15 pages. The last update made no changes except for adding the following link to the CVF repository, where our code to reproduce the results can be found: https://openaccess.thecvf.com/content/CVPR2023/html/Aoshima_Deep_Curvilinear_Editing_Commutative_and_Nonlinear_Image_Manipulation_for_Pretrained_CVPR_2023_paper.html

Journal ref: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (CVPR2023)

arXiv:1808.03812

Swarm Robots Inspired by Friendship Formation Process

Authors: Takeshi Kano, Naoki Matsui, Eiichi Naito, Takenobu Aoshima, Akio Ishiguro

Abstract: Swarm robotic systems are systems in which multiple robots having simple functionality perform tasks through their cooperation, and are advantageous in that they can exhibit non-trivial macroscopic functions such as adaptability, fault tolerance, and scalability. We previously proposed a simple model of swarm formation inspired by the friendship formation process in human society, and demonstrated via simulation that various non-trivial patterns emerge. In this study, we examine the applicability of the proposed model to a swarm robotic system. As a first step, we developed five robots and demonstrated via real-world experiments that the simulation results can be largely reproduced.

Submitted 11 August, 2018; originally announced August 2018.

Comments: 9 pages, 8 figures

arXiv:1802.03938

Revisiting the Vector Space Model: Sparse Weighted Nearest-Neighbor Method for Extreme Multi-Label Classification

Authors: Tatsuhiro Aoshima, Kei Kobayashi, Mihoko Minami

Abstract: Machine learning has played an important role in information retrieval (IR) in recent times. In search engines, for example, query keywords are accepted and documents are returned in order of relevance to the given query; this can be cast as a multi-label ranking problem in machine learning. Generally, the number of candidate documents is extremely large (from several thousand to several million); thus, the classifier must handle many labels. This problem is referred to as extreme multi-label classification (XMLC). In this paper, we propose a novel approach to XMLC termed the Sparse Weighted Nearest-Neighbor Method. This technique can be derived as a fast implementation of state-of-the-art (SOTA) one-versus-rest linear classifiers for very sparse datasets. In addition, we show that the classifier can be written as a sparse generalization of a representer theorem with a linear kernel. Furthermore, our method can be viewed as the vector space model used in IR. Finally, we show that the Sparse Weighted Nearest-Neighbor Method can process data points in real time on XMLC datasets with equivalent performance to SOTA models, with a single thread and smaller storage footprint. In particular, our method exhibits superior performance to the SOTA models on a dataset with 3 million labels.

Submitted 12 February, 2018; originally announced February 2018.

arXiv:1706.09597

Path Integral Networks: End-to-End Differentiable Optimal Control

Authors: Masashi Okada, Luca Rigazio, Takenobu Aoshima

Abstract: In this paper, we introduce Path Integral Networks (PI-Net), a recurrent network representation of the Path Integral optimal control algorithm. The network includes both system dynamics and cost models, used for optimal control based planning. PI-Net is fully differentiable, learning both dynamics and cost models end-to-end by back-propagation and stochastic gradient descent. Because of this, PI-Net can learn to plan. PI-Net has several advantages: it can generalize to unseen states thanks to planning, it can be applied to continuous control tasks, and it allows for a wide variety of learning schemes, including imitation and reinforcement learning. Preliminary experimental results show that PI-Net, trained by imitation learning, can mimic control demonstrations for two simulated problems: a linear system and a pendulum swing-up problem. We also show that PI-Net is able to learn dynamics and cost models latent in the demonstrations.

Submitted 29 June, 2017; originally announced June 2017.

Showing 1–4 of 4 results for author: Aoshima, T