
Intersectional Bias in Japanese Large Language Models from a Contextualized Perspective

Authors: Hitomi Yanaka, Xinqi He, Jie Lu, Namgi Han, Sunjin Oh, Ryoma Kumon, Yuma Matsuoka, Katsuhiko Watabe, Yuko Itatsu

Abstract: A growing number of studies have examined the social bias of rapidly developed large language models (LLMs). Although most of these studies have focused on bias occurring in a single social attribute, research in social science has shown that social bias often occurs in the form of intersectionality -- the constitutive and contextualized perspective on bias aroused by social attributes. In this study, we construct the Japanese benchmark inter-JBBQ, designed to evaluate the intersectional bias in LLMs on the question-answering setting. Using inter-JBBQ to analyze GPT-4o and Swallow, we find that biased output varies according to its contexts even with the equal combination of social attributes.

Submitted 13 June, 2025; originally announced June 2025.

Comments: Accepted to the 6th Workshop on Gender Bias in Natural Language Processing (GeBNLP2025) at ACL2025

arXiv:2502.15277

Analyzing the Inner Workings of Transformers in Compositional Generalization

Authors: Ryoma Kumon, Hitomi Yanaka

Abstract: The compositional generalization abilities of neural models have been sought after for human-like linguistic competence. The popular method to evaluate such abilities is to assess the models' input-output behavior. However, that does not reveal the internal mechanisms, and the underlying competence of such models in compositional generalization remains unclear. To address this problem, we explore the inner workings of a Transformer model by finding an existing subnetwork that contributes to the generalization performance and by performing causal analyses on how the model utilizes syntactic features. We find that the model depends on syntactic features to output the correct answer, but that the subnetwork with much better generalization performance than the whole model relies on a non-compositional algorithm in addition to the syntactic features. We also show that the subnetwork improves its generalization performance relatively slowly during the training compared to the in-distribution one, and the non-compositional solution is acquired in the early stages of the training.

Submitted 21 February, 2025; originally announced February 2025.

Comments: Accepted to NAACL 2025 main

arXiv:2407.03963

LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

Authors: LLM-jp: Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, et al. (58 additional authors not shown)

Abstract: This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its activities, and technical reports on the LLMs developed by LLM-jp. For the latest activities, visit https://llm-jp.nii.ac.jp/en/.

Submitted 30 December, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

arXiv:2406.13363

doi 10.18653/v1/2024.findings-acl.783

Evaluating Structural Generalization in Neural Machine Translation

Authors: Ryoma Kumon, Daiki Matsuoka, Hitomi Yanaka

Abstract: Compositional generalization refers to the ability to generalize to novel combinations of previously observed words and syntactic structures. Since it is regarded as a desired property of neural models, recent work has assessed compositional generalization in machine translation as well as semantic parsing. However, previous evaluations with machine translation have focused mostly on lexical generalization (i.e., generalization to unseen combinations of known words). Thus, it remains unclear to what extent models can translate sentences that require structural generalization (i.e., generalization to different sorts of syntactic structures). To address this question, we construct SGET, a machine translation dataset covering various types of compositional generalization with control of words and sentence structures. We evaluate neural machine translation models on SGET and show that they struggle more in structural generalization than in lexical generalization. We also find different performance trends in semantic parsing and machine translation, which indicates the importance of evaluations across various tasks.

Submitted 19 June, 2024; originally announced June 2024.

Comments: To appear at ACL 2024 findings

arXiv:2406.02050

JBBQ: Japanese Bias Benchmark for Analyzing Social Biases in Large Language Models

Authors: Hitomi Yanaka, Namgi Han, Ryoma Kumon, Jie Lu, Masashi Takeshita, Ryo Sekizawa, Taisei Kato, Hiromi Arai

Abstract: With the development of large language models (LLMs), social biases in these LLMs have become a pressing issue. Although there are various benchmarks for social biases across languages, the extent to which Japanese LLMs exhibit social biases has not been fully investigated. In this study, we construct the Japanese Bias Benchmark dataset for Question Answering (JBBQ) based on the English bias benchmark BBQ, with analysis of social biases in Japanese LLMs. The results show that while current open Japanese LLMs with more parameters show improved accuracies on JBBQ, their bias scores increase. In addition, prompts with a warning about social biases and chain-of-thought prompting reduce the effect of biases in model outputs, but there is room for improvement in extracting the correct evidence from contexts in Japanese. Our dataset is available at https://github.com/ynklab/JBBQ_data.

Submitted 13 June, 2025; v1 submitted 4 June, 2024; originally announced June 2024.

Comments: Accepted to the 6th Workshop on Gender Bias in Natural Language Processing (GeBNLP2025) at ACL2025

arXiv:1507.07471

doi 10.1063/1.4928202

A method for measuring the Neel relaxation time in a frozen ferrofluid

Authors: R. J. Tackett, J. Thakur, N. Mosher, E. Perkins-Harbin, R. E. Kumon, L. Wang, C. Rablau, P. P. Vaishnava

Abstract: We report a novel method of determining the average Neel relaxation time and its temperature dependence by calculating derivatives of the measured time dependence of temperature for a frozen ferrofluid exposed to an alternating magnetic field. The ferrofluid, composed of dextran-coated Fe3O4 nanoparticles (diameter 13.7 nm +/- 4.7 nm), was synthesized via wet chemical precipitation and characterized by x-ray diffraction and transmission electron microscopy. An alternating magnetic field of constant amplitude (H0 = 20 kA/m) driven at frequencies of 171 kHz, 232 kHz and 343 kHz was used to determine the temperature dependent magnetic energy absorption rate in the temperature range from 160 K to 210 K. We found that the specific absorption rate of the ferrofluid decreased monotonically with temperature over this range at the given frequencies. From these measured data, we determined the temperature dependence of the Neel relaxation time and estimate a room-temperature magnetocrystalline anisotropy constant of 40 kJ/m3, in agreement with previously published results.

Submitted 27 July, 2015; originally announced July 2015.

Showing 1–6 of 6 results for author: Kumon, R