Teaching Specific Scientific Knowledge into Large Language Models through Additional Training

Hatakeyama-Sato, Kan; Igarashi, Yasuhiko; Katakami, Shun; Nabae, Yuta; Hayakawa, Teruaki

arXiv:2312.03360 (cs)

[Submitted on 6 Dec 2023 (v1), last revised 18 Dec 2023 (this version, v2)]

Abstract: Through additional training, we explore embedding specialized scientific knowledge into the Llama 2 Large Language Model (LLM). Key findings reveal that effective knowledge integration requires reading texts from multiple perspectives, especially in instructional formats. We use text augmentation, including style conversions and translations, to address the scarcity of specialized texts. Hyperparameter optimization proves crucial, and models of different sizes (7b, 13b, and 70b) can reasonably undergo additional training. To validate our methods, we construct a dataset of 65,000 scientific papers. Although we partially succeed in embedding knowledge, the study highlights the complexities and limitations of incorporating specialized information into LLMs and suggests areas for further improvement.

Comments:	added token information for some texts, and fixed typo
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2312.03360 [cs.CL]
	(or arXiv:2312.03360v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2312.03360

Submission history

From: Kan Hatakeyama-Sato
[v1] Wed, 6 Dec 2023 08:55:55 UTC (2,125 KB)
[v2] Mon, 18 Dec 2023 01:43:56 UTC (2,129 KB)
