Showing 1–4 of 4 results for author: Komatsu, H

Searching in archive cs.
  1. arXiv:2503.14976  [pdf, other]

    cs.LG cs.AI

    Application of linear regression and quasi-Newton methods to the deep reinforcement learning in continuous action cases

    Authors: Hisato Komatsu

    Abstract: The linear regression (LR) method offers the advantage that optimal parameters can be calculated relatively easily, although its representation capability is more limited than that of the deep learning technique. To improve deep reinforcement learning, the Least Squares Deep Q Network (LS-DQN) method was proposed by Levine et al., which combines Deep Q Network (DQN) with the LR method. However, the LS-DQN…

    Submitted 25 April, 2025; v1 submitted 19 March, 2025; originally announced March 2025.

    Comments: 23 pages, 8 figures

  2. arXiv:2410.07563  [pdf, other]

    cs.CL cs.AI cs.LG

    PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency

    Authors: Preferred Elements: Kenshin Abe, Kaizaburo Chubachi, Yasuhiro Fujita, Yuta Hirokawa, Kentaro Imajo, Toshiki Kataoka, Hiroyoshi Komatsu, Hiroaki Mikami, Tsuguo Mogami, Shogo Murai, Kosuke Nakago, Daisuke Nishino, Toru Ogawa, Daisuke Okanohara, Yoshihiko Ozaki, Shotaro Sano, Shuji Suzuki, Tianqi Xu, Toshihiko Yanase

    Abstract: We introduce PLaMo-100B, a large-scale language model designed for Japanese proficiency. The model was trained from scratch on 2 trillion tokens, with architectural techniques such as QK Normalization and Z-Loss to ensure stability during training. Post-training techniques, including Supervised Fine-Tuning and Direct Preference Optimization, were applied to refine the model's performan…

    Submitted 22 October, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

  3. arXiv:2312.11834  [pdf, other]

    cs.MA cs.AI cs.LG physics.soc-ph

    Multi-agent reinforcement learning using echo-state network and its application to pedestrian dynamics

    Authors: Hisato Komatsu

    Abstract: In recent years, simulations of pedestrians using multi-agent reinforcement learning (MARL) have been studied. This study considered roads in a grid-world environment, and implemented pedestrians as MARL agents using an echo-state network and the least squares policy iteration method. Under this environment, the ability of these agents to learn to move forward by avoiding other agents was…

    Submitted 7 October, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: 25 pages, 19 figures

    Journal ref: J. Stat. Mech. (2025) 043401

  4. arXiv:2101.03634  [pdf]

    cs.CL

    The Logic for a Mildly Context-Sensitive Fragment of the Lambek-Grishin Calculus

    Authors: Hiroyoshi Komatsu

    Abstract: While context-free grammars are characterized by a simple proof-theoretic grammatical formalism, namely categorial grammar and its logic, the Lambek calculus, no such characterizations were known for tree-adjoining grammars, or even for any mildly context-sensitive language classes, in the last forty years despite some efforts. We settle this problem in this paper. On the basis of the existing frag…

    Submitted 10 January, 2021; originally announced January 2021.