
Sequential Exchange Monte Carlo: Sampling Method for Multimodal Distribution without Parameter Tuning

Authors: Tomohiro Nabika, Kenji Nagata, Shun Katakami, Masaichiro Mizumaki, Masato Okada

Abstract: The Replica Exchange Monte Carlo (REMC) method, a Markov Chain Monte Carlo (MCMC) algorithm for sampling multimodal distributions, is typically employed in Bayesian inference for complex models. Using the REMC method, multiple probability distributions with different temperatures are defined to enhance sampling efficiency and allow for the high-precision computation of Bayesian free energy. However, the REMC method requires the tuning of many parameters, including the number of distributions, temperature, and step size, which makes it difficult for nonexperts to effectively use. Thus, we propose the Sequential Exchange Monte Carlo (SEMC) method, which automates the tuning of parameters by sequentially determining the temperature and step size. Numerical experiments showed that SEMC is as efficient as parameter-tuned REMC and parameter-tuned Sequential Monte Carlo Samplers (SMCS), which is also effective for the Bayesian inference of complex models.

Submitted 25 February, 2025; originally announced February 2025.

arXiv:2312.03360 [pdf]

Teaching Specific Scientific Knowledge into Large Language Models through Additional Training

Authors: Kan Hatakeyama-Sato, Yasuhiko Igarashi, Shun Katakami, Yuta Nabae, Teruaki Hayakawa

Abstract: Through additional training, we explore embedding specialized scientific knowledge into the Llama 2 Large Language Model (LLM). Key findings reveal that effective knowledge integration requires reading texts from multiple perspectives, especially in instructional formats. We utilize text augmentation to tackle the scarcity of specialized texts, including style conversions and translations. Hyperparameter optimization proves crucial, with different size models (7b, 13b, and 70b) reasonably undergoing additional training. Validating our methods, we construct a dataset of 65,000 scientific papers. Although we have succeeded in partially embedding knowledge, the study highlights the complexities and limitations of incorporating specialized information into LLMs, suggesting areas for further improvement.

Submitted 17 December, 2023; v1 submitted 6 December, 2023; originally announced December 2023.

Comments: added token information for some texts, and fixed typo

arXiv:2305.07040 [pdf, other]

Sequential Experimental Design for Spectral Measurement: Active Learning Using a Parametric Model

Authors: Tomohiro Nabika, Kenji Nagata, Shun Katakami, Masaichiro Mizumaki, Masato Okada

Abstract: In this study, we demonstrate a sequential experimental design for spectral measurements by active learning using parametric models as predictors. In spectral measurements, it is necessary to reduce the measurement time because of sample fragility and high energy costs. To improve the efficiency of experiments, sequential experimental designs are proposed, in which the subsequent measurement is designed by active learning using the data obtained before the measurement. Conventionally, parametric models are employed in data analysis; when employed for active learning, they are expected to afford a sequential experimental design that improves the accuracy of data analysis. However, due to the complexity of the formulas, a sequential experimental design using general parametric models has not been realized. Therefore, we applied Bayesian inference-based data analysis using the exchange Monte Carlo method to realize a sequential experimental design with general parametric models. In this study, we evaluated the effectiveness of the proposed method by applying it to Bayesian spectral deconvolution and Bayesian Hamiltonian selection in X-ray photoelectron spectroscopy. Using numerical experiments with artificial data, we demonstrated that the proposed method improves the accuracy of model selection and parameter estimation while reducing the measurement time compared with the results achieved without active learning or with active learning using the Gaussian process regression.

Submitted 11 May, 2023; originally announced May 2023.

Showing 1–3 of 3 results for author: Katakami, S