-
EDINET-Bench: Evaluating LLMs on Complex Financial Tasks using Japanese Financial Statements
Authors:
Issa Sugiura,
Takashi Ishida,
Taro Makino,
Chieko Tazuke,
Takanori Nakagawa,
Kosuke Nakago,
David Ha
Abstract:
Financial analysis presents complex challenges that could leverage large language model (LLM) capabilities. However, the scarcity of challenging financial datasets, particularly for Japanese financial data, impedes academic innovation in financial analytics. As LLMs advance, this lack of accessible research resources increasingly hinders their development and evaluation in this specialized domain. To address this gap, we introduce EDINET-Bench, an open-source Japanese financial benchmark designed to evaluate the performance of LLMs on challenging financial tasks including accounting fraud detection, earnings forecasting, and industry prediction. EDINET-Bench is constructed by downloading annual reports from the past 10 years from Japan's Electronic Disclosure for Investors' NETwork (EDINET) and automatically assigning labels corresponding to each evaluation task. Our experiments reveal that even state-of-the-art LLMs struggle, performing only slightly better than logistic regression in binary classification for fraud detection and earnings forecasting. These results highlight significant challenges in applying LLMs to real-world financial applications and underscore the need for domain-specific adaptation. Our dataset, benchmark construction code, and evaluation code are publicly available to facilitate future research in finance with LLMs.
Submitted 10 June, 2025;
originally announced June 2025.
-
PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency
Authors:
Preferred Elements,
Kenshin Abe,
Kaizaburo Chubachi,
Yasuhiro Fujita,
Yuta Hirokawa,
Kentaro Imajo,
Toshiki Kataoka,
Hiroyoshi Komatsu,
Hiroaki Mikami,
Tsuguo Mogami,
Shogo Murai,
Kosuke Nakago,
Daisuke Nishino,
Toru Ogawa,
Daisuke Okanohara,
Yoshihiko Ozaki,
Shotaro Sano,
Shuji Suzuki,
Tianqi Xu,
Toshihiko Yanase
Abstract:
We introduce PLaMo-100B, a large-scale language model designed for Japanese proficiency. The model was trained from scratch using 2 trillion tokens, with techniques such as QK Normalization and Z-Loss to ensure training stability. Post-training techniques, including Supervised Fine-Tuning and Direct Preference Optimization, were applied to refine the model's performance. Benchmark evaluations suggest that PLaMo-100B performs well, particularly in Japanese-specific tasks, achieving results that are competitive with frontier models like GPT-4. The base model is available at https://huggingface.co/pfnet/plamo-100b.
Submitted 22 October, 2024; v1 submitted 9 October, 2024;
originally announced October 2024.
-
Calculations of Real-System Nanoparticles Using Universal Neural Network Potential PFP
Authors:
Gerardo Valadez Huerta,
Yusuke Nanba,
Iori Kurata,
Kosuke Nakago,
So Takamoto,
Chikashi Shinagawa,
Michihisa Koyama
Abstract:
It is essential to explore the stability and activity of real-system nanoparticles theoretically. While applications of theoretical methods for this purpose can be found in literature, the expensive computational costs of conventional theoretical methods hinder their massive applications to practical materials design. With the recent development of neural network algorithms along with the advancement of computer systems, neural network potentials have emerged as a promising candidate for the description of a wide range of materials, including metals and molecules, with a reasonable computational time. In this study, we successfully validate a universal neural network potential, PFP, for the description of monometallic Ru nanoparticles, PdRuCu ternary alloy nanoparticles, and the NO adsorption on Rh nanoparticles against first-principles calculations. We further conduct molecular dynamics simulations on the NO-Rh system and challenge the PFP to describe a large, supported Pt nanoparticle system.
Submitted 2 July, 2021;
originally announced July 2021.
-
Towards Universal Neural Network Potential for Material Discovery Applicable to Arbitrary Combination of 45 Elements
Authors:
So Takamoto,
Chikashi Shinagawa,
Daisuke Motoki,
Kosuke Nakago,
Wenwen Li,
Iori Kurata,
Taku Watanabe,
Yoshihiro Yayama,
Hiroki Iriguchi,
Yusuke Asano,
Tasuku Onodera,
Takafumi Ishii,
Takao Kudo,
Hideki Ono,
Ryohto Sawada,
Ryuichiro Ishitani,
Marc Ong,
Taiki Yamaguchi,
Toshiki Kataoka,
Akihide Hayashi,
Nontawat Charoenphakdee,
Takeshi Ibuka
Abstract:
Computational material discovery is under intense study owing to its ability to explore the vast space of chemical systems. Neural network potentials (NNPs) have been shown to be particularly effective in conducting atomistic simulations for such purposes. However, existing NNPs are generally designed for narrow target materials, making them unsuitable for broader applications in material discovery. To overcome this issue, we have developed a universal NNP called PreFerred Potential (PFP), which is able to handle any combination of 45 elements. Particular emphasis is placed on the datasets, which include a diverse set of virtual structures used to attain the universality. We demonstrated the applicability of PFP in selected domains: lithium diffusion in LiFeSO$_4$F, molecular adsorption in metal-organic frameworks, an order-disorder transition of Cu-Au alloys, and material discovery for a Fischer-Tropsch catalyst. They showcase the power of PFP, and this technology provides a highly useful tool for material discovery.
Submitted 1 April, 2022; v1 submitted 28 June, 2021;
originally announced June 2021.
-
GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Authors:
Kaushalya Madhawa,
Katushiko Ishiguro,
Kosuke Nakago,
Motoki Abe
Abstract:
We propose GraphNVP, the first invertible, normalizing flow-based molecular graph generation model. We decompose the generation of a graph into two steps: generation of (i) an adjacency tensor and (ii) node attributes. This decomposition yields the exact likelihood maximization on graph-structured data, combined with two novel reversible flows. We empirically demonstrate that our model efficiently generates valid molecular graphs with almost no duplicated molecules. In addition, we observe that the learned latent space can be used to generate molecules with desired chemical properties.
Submitted 28 May, 2019;
originally announced May 2019.
-
BayesGrad: Explaining Predictions of Graph Convolutional Networks
Authors:
Hirotaka Akita,
Kosuke Nakago,
Tomoki Komatsu,
Yohei Sugawara,
Shin-ichi Maeda,
Yukino Baba,
Hisashi Kashima
Abstract:
Recent advances in graph convolutional networks have significantly improved the performance of chemical predictions, raising a new research question: "how do we explain the predictions of graph convolutional networks?" A possible approach to answer this question is to visualize evidence substructures responsible for the predictions. For chemical property prediction tasks, the sample size of the training data is often small and/or a label imbalance problem occurs, where a few samples belong to a single class and the majority of samples belong to the other classes. This can lead to uncertainty related to the learned parameters of the machine learning model. To address this uncertainty, we propose BayesGrad, utilizing the Bayesian predictive distribution, to define the importance of each node in an input graph, which is computed efficiently using the dropout technique. We demonstrate that BayesGrad successfully visualizes the substructures responsible for the label prediction in the artificial experiment, even when the sample size is small. Furthermore, we use a real dataset to evaluate the effectiveness of the visualization. The basic idea of BayesGrad is not limited to graph-structured data and can be applied to other data types.
Submitted 4 July, 2018;
originally announced July 2018.
-
Parallelizable adiabatic gate teleportation
Authors:
Kosuke Nakago,
Michal Hajdušek,
Shojun Nakayama,
Mio Murao
Abstract:
We introduce a twisted Heisenberg-type interaction Hamiltonian, a Heisenberg-type spin interaction where the coordinates of the second qubit are twisted according to a unitary gate. We develop parallelizable adiabatic gate teleportation (PAGT) where a sequence of unitary gates is performed in a single step of the adiabatic process. In PAGT, numeric calculations suggest the necessary time for the adiabatic evolution implementing a sequence of $L$ unitary gates increases at most as $O(L^5)$. However, we show that it has the interesting property that it can map the temporal order of gates to the spatial order of interactions specified by the final Hamiltonian. Using this property, we present a controlled-PAGT scheme to manipulate the order of gates by a control-qubit. In the controlled-PAGT scheme, two differently ordered sequential unitary gates $FG$ and $GF$ are coherently performed depending on the state of a control-qubit by simultaneously applying the twisted Heisenberg-type interaction Hamiltonians implementing unitary gates $F$ and $G$. We investigate why the twisted Heisenberg-type interaction Hamiltonian allows PAGT. We show that the twisted Heisenberg-type interaction Hamiltonian has an ability to perform a transposed unitary gate by just modifying the space ordering of the final Hamiltonian implementing a unitary gate in adiabatic gate teleportation. The dynamics generated by the time-reversed Hamiltonian represented by the transposed unitary gate enables deterministic simulation of a postselected event of parallelized gate teleportation in adiabatic implementation.
Submitted 4 September, 2015; v1 submitted 15 October, 2013;
originally announced October 2013.