-
Intersectional Bias in Japanese Large Language Models from a Contextualized Perspective
Authors:
Hitomi Yanaka,
Xinqi He,
Jie Lu,
Namgi Han,
Sunjin Oh,
Ryoma Kumon,
Yuma Matsuoka,
Katsuhiko Watabe,
Yuko Itatsu
Abstract:
A growing number of studies have examined social bias in rapidly developing large language models (LLMs). Although most of these studies have focused on bias associated with a single social attribute, research in social science has shown that social bias often occurs in the form of intersectionality: the constitutive and contextualized perspective on bias arising from social attributes. In this study, we construct the Japanese benchmark inter-JBBQ, designed to evaluate intersectional bias in LLMs in a question-answering setting. Using inter-JBBQ to analyze GPT-4o and Swallow, we find that biased output varies with context even for the same combination of social attributes.
Submitted 13 June, 2025;
originally announced June 2025.
-
Analyzing the Inner Workings of Transformers in Compositional Generalization
Authors:
Ryoma Kumon,
Hitomi Yanaka
Abstract:
The compositional generalization abilities of neural models have been sought after for human-like linguistic competence. A popular method of evaluating such abilities is to assess the models' input-output behavior. However, this does not reveal the internal mechanisms, and the underlying competence of such models in compositional generalization remains unclear. To address this problem, we explore the inner workings of a Transformer model by finding an existing subnetwork that contributes to the generalization performance and by performing causal analyses on how the model utilizes syntactic features. We find that the model depends on syntactic features to output the correct answer, but that the subnetwork, which has much better generalization performance than the whole model, relies on a non-compositional algorithm in addition to the syntactic features. We also show that the subnetwork improves its generalization performance relatively slowly during training compared to its in-distribution performance, and that the non-compositional solution is acquired in the early stages of training.
Submitted 21 February, 2025;
originally announced February 2025.
-
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs
Authors:
LLM-jp:
Akiko Aizawa,
Eiji Aramaki,
Bowen Chen,
Fei Cheng,
Hiroyuki Deguchi,
Rintaro Enomoto,
Kazuki Fujii,
Kensuke Fukumoto,
Takuya Fukushima,
Namgi Han,
Yuto Harada,
Chikara Hashimoto,
Tatsuya Hiraoka,
Shohei Hisada,
Sosuke Hosokawa,
Lu Jie,
Keisuke Kamata,
Teruhito Kanazawa,
Hiroki Kanezashi,
Hiroshi Kataoka,
Satoru Katsumata,
Daisuke Kawahara,
Seiya Kawano
, et al. (58 additional authors not shown)
Abstract:
This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its activities, and technical reports on the LLMs developed by LLM-jp. For the latest activities, visit https://llm-jp.nii.ac.jp/en/.
Submitted 30 December, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
-
Evaluating Structural Generalization in Neural Machine Translation
Authors:
Ryoma Kumon,
Daiki Matsuoka,
Hitomi Yanaka
Abstract:
Compositional generalization refers to the ability to generalize to novel combinations of previously observed words and syntactic structures. Since it is regarded as a desired property of neural models, recent work has assessed compositional generalization in machine translation as well as semantic parsing. However, previous evaluations with machine translation have focused mostly on lexical generalization (i.e., generalization to unseen combinations of known words). Thus, it remains unclear to what extent models can translate sentences that require structural generalization (i.e., generalization to different sorts of syntactic structures). To address this question, we construct SGET, a machine translation dataset covering various types of compositional generalization with control of words and sentence structures. We evaluate neural machine translation models on SGET and show that they struggle more in structural generalization than in lexical generalization. We also find different performance trends in semantic parsing and machine translation, which indicates the importance of evaluations across various tasks.
Submitted 19 June, 2024;
originally announced June 2024.
-
JBBQ: Japanese Bias Benchmark for Analyzing Social Biases in Large Language Models
Authors:
Hitomi Yanaka,
Namgi Han,
Ryoma Kumon,
Jie Lu,
Masashi Takeshita,
Ryo Sekizawa,
Taisei Kato,
Hiromi Arai
Abstract:
With the development of large language models (LLMs), social biases in these LLMs have become a pressing issue. Although there are various benchmarks for social biases across languages, the extent to which Japanese LLMs exhibit social biases has not been fully investigated. In this study, we construct the Japanese Bias Benchmark dataset for Question Answering (JBBQ) based on the English bias benchmark BBQ and analyze social biases in Japanese LLMs. The results show that while current open Japanese LLMs with more parameters show improved accuracy on JBBQ, their bias scores also increase. In addition, prompts with a warning about social biases and chain-of-thought prompting reduce the effect of biases in model outputs, but there is room for improvement in extracting the correct evidence from contexts in Japanese. Our dataset is available at https://github.com/ynklab/JBBQ_data.
Submitted 13 June, 2025; v1 submitted 4 June, 2024;
originally announced June 2024.
-
A method for measuring the Neel relaxation time in a frozen ferrofluid
Authors:
R. J. Tackett,
J. Thakur,
N. Mosher,
E. Perkins-Harbin,
R. E. Kumon,
L. Wang,
C. Rablau,
P. P. Vaishnava
Abstract:
We report a novel method of determining the average Neel relaxation time and its temperature dependence by calculating derivatives of the measured time dependence of temperature for a frozen ferrofluid exposed to an alternating magnetic field. The ferrofluid, composed of dextran-coated Fe3O4 nanoparticles (diameter 13.7 nm ± 4.7 nm), was synthesized via wet chemical precipitation and characterized by x-ray diffraction and transmission electron microscopy. An alternating magnetic field of constant amplitude (H0 = 20 kA/m) driven at frequencies of 171 kHz, 232 kHz, and 343 kHz was used to determine the temperature-dependent magnetic energy absorption rate in the temperature range from 160 K to 210 K. We found that the specific absorption rate of the ferrofluid decreased monotonically with temperature over this range at the given frequencies. From these measured data, we determined the temperature dependence of the Neel relaxation time and estimated a room-temperature magnetocrystalline anisotropy constant of 40 kJ/m3, in agreement with previously published results.
Submitted 27 July, 2015;
originally announced July 2015.