Showing 1–7 of 7 results for author: Shirakawa, T

  1. arXiv:2504.02361

    cs.GR cs.CV

    MG-Gen: Single Image to Motion Graphics Generation with Layer Decomposition

    Authors: Takahiro Shirakawa, Tomoyuki Suzuki, Daichi Haraguchi

    Abstract: General image-to-video generation methods often produce suboptimal animations that do not meet the requirements of animated graphics, as they lack active text motion and exhibit object distortion. Also, code-based animation generation methods typically require layer-structured vector data which are often not readily available for motion graphic generation. To address these challenges, we propose a…

    Submitted 3 April, 2025; v1 submitted 3 April, 2025; originally announced April 2025.

  2. arXiv:2411.15580

    cs.CV

    TKG-DM: Training-free Chroma Key Content Generation Diffusion Model

    Authors: Ryugo Morita, Stanislav Frolov, Brian Bernhard Moser, Takahiro Shirakawa, Ko Watanabe, Andreas Dengel, Jinjia Zhou

    Abstract: Diffusion models have enabled the generation of high-quality images with a strong focus on realism and textual fidelity. Yet, large-scale text-to-image models, such as Stable Diffusion, struggle to generate images where foreground objects are placed over a chroma key background, limiting their ability to separate foreground and background elements without fine-tuning. To address this limitation, w…

    Submitted 8 March, 2025; v1 submitted 23 November, 2024; originally announced November 2024.

    Comments: Accepted to CVPR2025

  3. arXiv:2405.04767

    cs.LG cs.AI

    Test-Time Augmentation for Traveling Salesperson Problem

    Authors: Ryo Ishiyama, Takahiro Shirakawa, Seiichi Uchida, Shinnosuke Matsuo

    Abstract: We propose Test-Time Augmentation (TTA) as an effective technique for addressing combinatorial optimization problems, including the Traveling Salesperson Problem. In general, deep learning models possessing the property of invariance, where the output is uniquely determined regardless of the node indices, have been proposed to learn graph structures efficiently. In contrast, we interpret the permu…

    Submitted 7 May, 2024; originally announced May 2024.

  4. arXiv:2404.04399

    stat.ML cs.AI cs.LG stat.AP stat.ME

    Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

    Authors: Toru Shirakawa, Yi Li, Yulun Wu, Sky Qiu, Yuxuan Li, Mingduo Zhao, Hiroyasu Iso, Mark van der Laan

    Abstract: We propose Deep Longitudinal Targeted Minimum Loss-based Estimation (Deep LTMLE), a novel approach to estimate the counterfactual mean of outcome under dynamic treatment policies in longitudinal problem settings. Our approach utilizes a transformer architecture with heterogeneous type embedding trained using temporal-difference learning. After obtaining an initial estimate using the transformer, f…

    Submitted 5 April, 2024; originally announced April 2024.

  5. arXiv:2403.03485

    cs.CV

    NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging

    Authors: Takahiro Shirakawa, Seiichi Uchida

    Abstract: Layout-aware text-to-image generation is a task to generate multi-object images that reflect layout conditions in addition to text conditions. The current layout-aware text-to-image diffusion models still have several issues, including mismatches between the text and layout conditions and quality degradation of generated images. This paper proposes a novel layout-aware text-to-image diffusion mode…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Accepted at CVPR 2024

  6. arXiv:2310.02772

    cs.NE cs.AI

    Spike Accumulation Forwarding for Effective Training of Spiking Neural Networks

    Authors: Ryuji Saiin, Tomoya Shirakawa, Sota Yoshihara, Yoshihide Sawada, Hiroyuki Kusumoto

    Abstract: In this article, we propose a new paradigm for training spiking neural networks (SNNs), spike accumulation forwarding (SAF). It is known that SNNs are energy-efficient but difficult to train. Consequently, many researchers have proposed various methods to solve this problem, among which online training through time (OTTT) is a method that allows inferring at each time step while suppressing the me…

    Submitted 28 June, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: 14 pages, 5 figures; Appendix: 10 pages, 2 figures; v6: Published in Transactions on Machine Learning Research

  7. arXiv:2306.12049

    cs.CV

    Ambigram Generation by A Diffusion Model

    Authors: Takahiro Shirakawa, Seiichi Uchida

    Abstract: Ambigrams are graphical letter designs that can be read not only from the original direction but also from a rotated direction (especially with 180 degrees). Designing ambigrams is difficult even for human experts because keeping their dual readability from both directions is often difficult. This paper proposes an ambigram generation model. As its generation module, we use a diffusion model, whic…

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: Accepted at ICDAR 2023