-
Degradation-Aware and Machine Learning-Driven Uncertainty Quantification in Crystal Plasticity Finite Element: Texture-Driven Plasticity in 316L Stainless Steel
Authors:
Dinesh Kumar,
Eralp Demir,
Julio Spadotto,
Kazuma Kobayashi,
Syed Bahauddin Alam,
Brian Connolly,
Ed Pickering,
Paul Wilcox,
David Knowles,
Mahmoud Mostafavi
Abstract:
The mechanical properties and long-term structural reliability of crystalline materials are strongly influenced by microstructural features such as grain size, morphology, and crystallographic texture. These characteristics not only determine the initial mechanical behavior but also govern the progression of degradation mechanisms, such as strain localization, fatigue damage, and microcrack initiation under service conditions. Variability in these microstructural attributes, introduced during manufacturing or evolving through in-service degradation, leads to uncertainty in material performance. Therefore, understanding and quantifying microstructure-sensitive plastic deformation is critical for assessing degradation risk in high-value mechanical systems. This study presents a first-of-its-kind machine learning-driven framework that couples high-fidelity crystal plasticity finite element (CPFE) simulations with data-driven surrogate modeling to accelerate degradation-aware uncertainty quantification in welded structural alloys. Specifically, the impact of crystallographic texture variability in 316L stainless steel weldments, characterized via high-throughput electron backscatter diffraction (EBSD), is examined through CPFE simulations on calibrated representative volume elements (RVEs). A polynomial chaos expansion-based surrogate model is then trained to efficiently emulate the CPFE response using only 200 simulations, reducing computational cost by several orders of magnitude compared to conventional Monte Carlo analysis. The surrogate enables rapid quantification of uncertainty in stress-strain behavior and identifies texture components such as Cube and Goss as key drivers of degradation-relevant plastic response.
Submitted 24 May, 2025;
originally announced May 2025.
-
AI-driven Uncertainty Quantification & Multi-Physics Approach to Evaluate Cladding Materials in a Microreactor
Authors:
Alex Foutch,
Kazuma Kobayashi,
Ayodeji Alajo,
Dinesh Kumar,
Syed Bahauddin Alam
Abstract:
The pursuit of enhanced nuclear safety has spurred the development of accident-tolerant cladding (ATC) materials for light water reactors (LWRs). This study investigates the potential of repurposing these ATCs in advanced reactor designs, aiming to expedite material development and reduce costs. The research employs a multi-physics approach, encompassing neutronics, heat transfer, thermodynamics, and structural mechanics, to evaluate four candidate materials (Haynes 230, Zircaloy-4, FeCrAl, and SiC-SiC) within the context of a high-temperature, sodium-cooled microreactor, exemplified by the Kilopower design. While neutronic simulations revealed negligible power profile variations among the materials, finite element analyses highlighted the superior thermal stability of SiC-SiC and the favorable stress resistance of Haynes 230. The high-temperature environment significantly impacted material performance, particularly for Zircaloy-4 and FeCrAl, while SiC-SiC's inherent properties limited its ability to withstand stress loads. Additionally, AI-driven uncertainty quantification and sensitivity analysis were conducted to assess the influence of material property variations on maximum hoop stress. The findings underscore the need for further research into high-temperature material properties to facilitate broader applicability of existing materials to advanced reactors. Haynes 230 is identified as the most promising candidate based on the evaluated criteria.
Submitted 18 March, 2025;
originally announced March 2025.
-
Benign Overfitting under Learning Rate Conditions for $α$ Sub-exponential Input
Authors:
Kota Okudo,
Kei Kobayashi
Abstract:
This paper investigates the phenomenon of benign overfitting in binary classification problems with heavy-tailed input distributions, extending the analysis of maximum margin classifiers to $α$ sub-exponential distributions ($α\in (0, 2]$). This generalizes previous work focused on sub-gaussian inputs. We provide generalization error bounds for linear classifiers trained using gradient descent on unregularized logistic loss in this heavy-tailed setting. Our results show that, under certain conditions on the dimensionality $p$ and the distance between the centers of the distributions, the misclassification error of the maximum margin classifier asymptotically approaches the noise level, the theoretical optimal value. Moreover, we derive an upper bound on the learning rate $β$ for benign overfitting to occur and show that as the tail heaviness of the input distribution $α$ increases, the upper bound on the learning rate decreases. These results demonstrate that benign overfitting persists even in settings with heavier-tailed inputs than previously studied, contributing to a deeper understanding of the phenomenon in more realistic data environments.
Submitted 16 October, 2024; v1 submitted 1 September, 2024;
originally announced September 2024.
-
Learning Decision Trees and Forests with Algorithmic Recourse
Authors:
Kentaro Kanamori,
Takuya Takagi,
Ken Kobayashi,
Yuichi Ike
Abstract:
This paper proposes a new algorithm for learning accurate tree-based models while ensuring the existence of recourse actions. Algorithmic Recourse (AR) aims to provide a recourse action for altering the undesired prediction result given by a model. Typical AR methods provide a reasonable action by solving an optimization task of minimizing the required effort among executable actions. In practice, however, such actions do not always exist for models optimized only for predictive performance. To alleviate this issue, we formulate the task of learning an accurate classification tree under the constraint of ensuring the existence of reasonable actions for as many instances as possible. Then, we propose an efficient top-down greedy algorithm by leveraging adversarial training techniques. We also show that our proposed algorithm can be applied to random forests, a popular framework for learning tree ensembles. Experimental results demonstrated that our method successfully provided reasonable actions to more instances than the baselines without significantly degrading accuracy and computational efficiency.
Submitted 3 June, 2024;
originally announced June 2024.
-
Deep Neural Operator Driven Real Time Inference for Nuclear Systems to Enable Digital Twin Solutions
Authors:
Kazuma Kobayashi,
Syed Bahauddin Alam
Abstract:
This paper focuses on the feasibility of Deep Neural Operator (DeepONet) as a robust surrogate modeling method within the context of digital twin (DT) for nuclear energy systems. Through benchmarking and evaluation, this study showcases the generalizability and computational efficiency of DeepONet in solving a challenging particle transport problem. DeepONet also exhibits remarkable prediction accuracy and speed, outperforming traditional ML methods, making it a suitable algorithm for real-time DT inference. However, the application of DeepONet also reveals challenges related to optimal sensor placement and model evaluation, critical aspects of real-world implementation. Addressing these challenges will further enhance the method's practicality and reliability. Overall, DeepONet is a promising and transformative tool for nuclear engineering research and applications. Its prediction accuracy and computational efficiency can revolutionize DT systems, advancing nuclear engineering research. This study marks an important step towards harnessing the power of surrogate modeling techniques in critical engineering domains.
Submitted 28 April, 2024; v1 submitted 14 August, 2023;
originally announced August 2023.
-
An IPW-based Unbiased Ranking Metric in Two-sided Markets
Authors:
Keisho Oh,
Naoki Nishimura,
Minje Sung,
Ken Kobayashi,
Kazuhide Nakata
Abstract:
In modern recommendation systems, unbiased learning-to-rank (LTR) is crucial for prioritizing items from biased implicit user feedback, such as click data. Several techniques, such as Inverse Propensity Weighting (IPW), have been proposed for single-sided markets. However, less attention has been paid to two-sided markets, such as job platforms or dating services, where successful conversions require matching preferences from both users. This paper addresses the complex interaction of biases between users in two-sided markets and proposes a tailored LTR approach. We first present a formulation of feedback mechanisms in two-sided matching platforms and point out that their implicit feedback may include position bias from both user groups. On the basis of this observation, we extend the IPW estimator and propose a new estimator, named two-sided IPW, to address the position biases in two-sided markets. We prove that the proposed estimator is unbiased for the ground-truth ranking metric. We conducted numerical experiments on real-world two-sided platforms and demonstrated the effectiveness of our proposed method in terms of both precision and robustness. Our experiments showed that our method outperformed baselines, especially when handling rare items, which are less frequently observed in the training data.
Submitted 13 July, 2023;
originally announced July 2023.
-
Algorithmic Recourse with Missing Values
Authors:
Kentaro Kanamori,
Takuya Takagi,
Ken Kobayashi,
Yuichi Ike
Abstract:
This paper proposes a new framework of algorithmic recourse (AR) that works even in the presence of missing values. AR aims to provide a recourse action for altering the undesired prediction result given by a classifier. Existing AR methods assume that we can access complete information on the features of an input instance. However, we often encounter missing values in a given instance (e.g., due to privacy concerns), and previous studies have not discussed such a practical situation. In this paper, we first empirically and theoretically show the risk that a naive approach with a single imputation technique fails to obtain good actions regarding their validity, cost, and features to be changed. To alleviate this risk, we formulate the task of obtaining a valid and low-cost action for a given incomplete instance by incorporating the idea of multiple imputation. Then, we provide some theoretical analyses of our task and propose a practical solution based on mixed-integer linear optimization. Experimental results demonstrated the efficacy of our method in the presence of missing values compared to the baselines.
Submitted 22 May, 2024; v1 submitted 27 April, 2023;
originally announced April 2023.
-
Improved generalization with deep neural operators for engineering systems: Path towards digital twin
Authors:
Kazuma Kobayashi,
James Daniell,
Syed Bahauddin Alam
Abstract:
Neural Operator Networks (ONets) represent a novel advancement in machine learning algorithms, offering a robust and generalizable alternative for approximating the solutions of partial differential equations (PDEs). Unlike traditional Neural Networks (NN), which directly approximate functions, ONets specialize in approximating mathematical operators, enhancing their efficacy in addressing complex PDEs. In this work, we evaluate the capabilities of Deep Operator Networks (DeepONets), an ONets implementation using a branch/trunk architecture. Three test cases are studied: a system of ODEs, a general diffusion system, and the convection/diffusion Burgers equation. It is demonstrated that DeepONets can accurately learn the solution operators, achieving prediction accuracy scores above 0.96 for the ODE and diffusion problems over the observed domain while achieving zero-shot (without retraining) capability. More importantly, when evaluated on unseen scenarios (the zero-shot feature), the trained models exhibit excellent generalization ability. This underscores ONets' vital niche for surrogate modeling and digital twin development across physical systems. While convection-diffusion poses a greater challenge, the results confirm the promise of ONets and motivate further enhancements to the DeepONet algorithm. This work represents an important step towards unlocking the potential of digital twins through robust and generalizable surrogates.
Submitted 28 April, 2024; v1 submitted 16 January, 2023;
originally announced January 2023.
-
Explainable, Interpretable & Trustworthy AI for Intelligent Digital Twin: Case Study on Remaining Useful Life
Authors:
Kazuma Kobayashi,
Syed Bahauddin Alam
Abstract:
Artificial intelligence (AI) and machine learning (ML) are increasingly used in energy and engineering systems, but these models must be fair, unbiased, and explainable. It is critical to have confidence in AI's trustworthiness. ML techniques have been useful in predicting important parameters and in improving model performance. However, for these AI techniques to be useful for making decisions, they need to be audited, accounted for, and easy to understand. Therefore, the use of explainable AI (XAI) and interpretable machine learning (IML) is crucial for the accurate prediction of prognostics, such as remaining useful life (RUL), in a digital twin system, to make it intelligent while ensuring that the AI model is transparent in its decision-making processes and that the predictions it generates can be understood and trusted by users. By using AI that is explainable, interpretable, and trustworthy, intelligent digital twin systems can make more accurate predictions of RUL, leading to better maintenance and repair planning, and ultimately, improved system performance. The objective of this paper is to explain the ideas of XAI and IML and to justify the important role of AI/ML in the digital twin framework and components, which requires XAI to understand the prediction better. This paper explains the importance of XAI and IML in both local and global aspects to ensure the use of trustworthy AI/ML applications for RUL prediction. We used the RUL prediction for the XAI and IML studies and leveraged the integrated Python toolbox for interpretable machine learning (PiML).
Submitted 28 April, 2024; v1 submitted 16 January, 2023;
originally announced January 2023.
-
AI-driven non-intrusive uncertainty quantification of advanced nuclear fuels for digital twin-enabling technology
Authors:
Kazuma Kobayashi,
Dinesh Kumar,
Syed Bahauddin Alam
Abstract:
In response to the urgent need to establish AI/ML-integrated Digital Twin (DT) technology within next-generation nuclear systems, advancements in modeling methods and simulation codes are necessary. The increased complexity of models demands significant computational resources to quantify their uncertainties. To address this challenge, a data-driven non-intrusive uncertainty quantification method via polynomial chaos expansion is introduced as an efficient strategy within the finite element analysis-based fuel performance code BISON. Models of and fuels, alongside SiC/SiC cladding material, were prepared to demonstrate the proposed method. The impact of four independent uncertain input variables on the system output was quantified, requiring fewer than 100 BISON simulations for each model. This approach not only accelerates the modeling and simulation task but also enhances the reliability in the development of DT-enabling technologies.
Submitted 28 April, 2024; v1 submitted 24 November, 2022;
originally announced November 2022.
-
Digital Twin-Centered Hybrid Data-Driven Multi-Stage Deep Learning Framework for Enhanced Nuclear Reactor Power Prediction
Authors:
James Daniell,
Kazuma Kobayashi,
Ayodeji Alajo,
Syed Bahauddin Alam
Abstract:
The accurate and efficient modeling of nuclear reactor transients is crucial for ensuring safe and optimal reactor operation. Traditional physics-based models, while valuable, can be computationally intensive and may not fully capture the complexities of real-world reactor behavior. This paper introduces a novel hybrid digital twin-focused multi-stage deep learning framework that addresses these limitations, offering a faster and more robust solution for predicting the final steady-state power of reactor transients. By leveraging a combination of feed-forward neural networks with both classification and regression stages, and training on a unique dataset that integrates real-world measurements of reactor power and controls state from the Missouri University of Science and Technology Reactor (MSTR) with noise-enhanced simulated data, our approach achieves remarkable accuracy (96% classification, 2.3% MAPE). The incorporation of simulated data with noise significantly improves the model's generalization capabilities, mitigating the risk of overfitting. Designed as a digital twin supporting system, this framework integrates real-time, synchronized predictions of reactor state transitions, enabling dynamic operational monitoring and optimization. This innovative solution not only enables rapid and precise prediction of reactor behavior but also has the potential to revolutionize nuclear reactor operations, facilitating enhanced safety protocols, optimized performance, and streamlined decision-making processes. By aligning data-driven insights with the principles of digital twins, this work lays the groundwork for adaptable and scalable solutions in nuclear system management.
Submitted 27 November, 2024; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Uncertainty Quantification and Sensitivity analysis for Digital Twin Enabling Technology: Application for BISON Fuel Performance Code
Authors:
Kazuma Kobayashi,
Dinesh Kumar,
Matthew Bonney,
Souvik Chakraborty,
Kyle Paaren,
Syed Alam
Abstract:
To understand the potential of intelligent confirmatory tools, the U.S. Nuclear Regulatory Commission (NRC) initiated a future-focused research project to assess the regulatory viability of machine learning (ML) and artificial intelligence (AI)-driven Digital Twins (DTs) for nuclear power applications. Advanced accident tolerant fuel (ATF) is one of the priority focus areas of the U.S. Department of Energy (DOE). A DT framework can offer game-changing yet practical and informed solutions to the complex problem of qualifying advanced ATFs. Considering the regulatory standpoint of the modeling and simulation (M&S) aspect of DT, uncertainty quantification and sensitivity analysis are paramount to the DT framework's success in terms of multi-criteria and risk-informed decision-making. This chapter introduces the ML-based uncertainty quantification and sensitivity analysis methods while exhibiting actual applications to the finite element-based nuclear fuel performance code BISON.
Submitted 14 October, 2022;
originally announced October 2022.
-
Reliability-Based Robust Design Optimization Method for Engineering Systems with Uncertainty Quantification
Authors:
Richa Verma,
Dinesh Kumar,
Kazuma Kobayashi,
Syed Alam
Abstract:
Robust optimization is a method for optimization under uncertainties in engineering systems and designs for applications ranging from aeronautics to nuclear. In a robust design process, parameter variability (or uncertainty) is incorporated into the engineering systems' optimization process to assure the systems' quality and reliability. This chapter focuses on a robust optimization approach for developing robust and reliable advanced systems and explains the framework for using uncertainty quantification and optimization techniques. For the uncertainty analysis, a polynomial chaos-based approach is combined with the optimization algorithm MOSA (Multi-Objective Simulated Annealing), and the process is discussed with a simplified test function. For the optimization process, gradient-free genetic algorithms are considered because the optimizer scans the whole design space, and the optimal values are not always dependent on the initial values.
Submitted 14 October, 2022;
originally announced October 2022.
-
Surrogate Modeling-Driven Physics-Informed Multi-fidelity Kriging: Path Forward to Digital Twin Enabling Simulation for Accident Tolerant Fuel
Authors:
Kazuma Kobayashi,
James Daniell,
Shoaib Usman,
Dinesh Kumar,
Syed Alam
Abstract:
The Gaussian Process (GP)-based surrogate model has the inherent capability of capturing the anomaly arising from limited data, lack of data, missing data, and data inconsistencies (noisy/erroneous data) present in the modeling and simulation component of the digital twin framework, specifically for the accident tolerant fuel (ATF) concepts. However, GP will not be very accurate when we have limited high-fidelity (experimental) data. In addition, it is challenging to apply higher dimensional functions (>20-dimensional function) to approximate predictions with the GP. Furthermore, noisy data or data containing erroneous observations and outliers are major challenges for advanced ATF concepts. Also, the governing differential equation is empirical for longer-term ATF candidates, and data availability is an issue. Physics-informed multi-fidelity Kriging (MFK) can be useful for identifying and predicting the required material properties. MFK is particularly useful with low-fidelity physics (approximating physics) and limited high-fidelity data - which is the case for ATF candidates since there is limited data availability. This chapter explores the method and presents its application to experimental thermal conductivity measurement data for ATF. The MFK method showed its significance for a small number of data that could not be modeled by the conventional Kriging method. Mathematical models constructed with this method can be easily connected to later-stage analysis such as uncertainty quantification and sensitivity analysis and are expected to be applied to fundamental research and a wide range of product development fields. The overarching objective of this chapter is to show the capability of MFK surrogates that can be embedded in a digital twin system for ATF.
Submitted 4 November, 2022; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Leveraging Industry 4.0 -- Deep Learning, Surrogate Model and Transfer Learning with Uncertainty Quantification Incorporated into Digital Twin for Nuclear System
Authors:
M. Rahman,
Abid Khan,
Sayeed Anowar,
Md Al-Imran,
Richa Verma,
Dinesh Kumar,
Kazuma Kobayashi,
Syed Alam
Abstract:
Industry 4.0 targets the conversion of traditional industries into intelligent ones through technological revolution. This revolution is only possible through innovation, optimization, interconnection, and rapid decision-making capability. Numerical models are believed to be the key components of Industry 4.0, facilitating quick decision-making through simulations instead of costly experiments. However, numerical investigation of precise, high-fidelity models for optimization or decision-making is usually time-consuming and computationally expensive. In such instances, data-driven surrogate models are excellent substitutes for fast computational analysis and the probabilistic prediction of the output parameter for new input parameters. The emergence of Internet of Things (IoT) and Machine Learning (ML) has made the concept of surrogate modeling even more viable. However, these surrogate models carry uncertainties that may be intrinsic, originate from modeling defects, or both. These uncertainties, if not quantified and minimized, can produce a skewed result. Therefore, proper implementation of uncertainty quantification techniques is crucial when analyzing optimization, cost reduction, or safety enhancement processes. This chapter begins with a brief overview of the concept of surrogate modeling, transfer learning, IoT and digital twins. After that, a detailed overview of uncertainties, uncertainty quantification frameworks, and specifics of uncertainty quantification methodologies for a surrogate model linked to a digital twin is presented. Finally, the use of uncertainty quantification approaches in the nuclear industry is addressed.
Submitted 30 September, 2022;
originally announced October 2022.
-
Machine Learning and Artificial Intelligence-Driven Multi-Scale Modeling for High Burnup Accident-Tolerant Fuels for Light Water-Based SMR Applications
Authors:
Md. Shamim Hassan,
Abid Hossain Khan,
Richa Verma,
Dinesh Kumar,
Kazuma Kobayashi,
Shoaib Usman,
Syed Alam
Abstract:
The concept of the small modular reactor has changed the outlook for tackling future energy crises. This new reactor technology is very promising considering its lower investment requirements, modularity, design simplicity, and enhanced safety features. The application of artificial intelligence-driven multi-scale modeling (neutronics, thermal hydraulics, fuel performance, etc.) incorporating Digital Twin and associated uncertainties in the research of small modular reactors is a recent concept. In this work, a comprehensive study is conducted on the multi-scale modeling of accident-tolerant fuels. The application of these fuels in light water-based small modular reactors is explored. This chapter also focuses on the application of machine learning and artificial intelligence in the design optimization, control, and monitoring of small modular reactors. Finally, a brief assessment of the research gaps in the application of artificial intelligence to the development of high burnup composite accident-tolerant fuels is provided. Necessary actions to fill these gaps are also discussed.
Submitted 25 September, 2022;
originally announced September 2022.
-
Approximate Bayesian Computation of Bézier Simplices
Authors:
Akinori Tanaka,
Akiyoshi Sannai,
Ken Kobayashi,
Naoki Hamada
Abstract:
Bézier simplex fitting algorithms have been recently proposed to approximate the Pareto set/front of multi-objective continuous optimization problems. These new methods have been shown to be successful at approximating various shapes of Pareto sets/fronts when sample points lie exactly on the Pareto set/front. However, if the sample points scatter away from the Pareto set/front, those methods are likely to suffer from over-fitting. To overcome this issue, in this paper, we extend the Bézier simplex model to a probabilistic one and propose a new learning algorithm for it, which falls into the framework of approximate Bayesian computation (ABC) based on the Wasserstein distance. We also study the convergence property of the Wasserstein ABC algorithm. An extensive experimental evaluation on publicly available problem instances shows that the new algorithm converges on a finite sample. Moreover, it outperforms the deterministic fitting methods on noisy instances.
Submitted 12 April, 2021; v1 submitted 10 April, 2021;
originally announced April 2021.
-
Representing Hierarchical Structure by Using Cone Embedding
Authors:
Daisuke Takehara,
Kei Kobayashi
Abstract:
Graph embedding is becoming an important method with applications in various areas, including social networks and knowledge graph completion. In particular, Poincaré embedding has been proposed to capture the hierarchical structure of graphs, and its effectiveness has been reported. However, most of the existing methods have isometric mappings in the embedding space, and the choice of the origin point can be arbitrary. This is not desirable when the distance from the origin is used as an indicator of hierarchy, as in the case of Poincaré embedding. In this paper, we propose cone embedding, an embedding method in a metric cone, which solves these problems, and we gain further benefits: 1) we provide an indicator of hierarchical information that is both geometrically and intuitively natural to interpret, and 2) we can extract the hierarchical structure from a graph embedding output of other methods by learning additional one-dimensional parameters.
Submitted 10 May, 2022; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Ordered Counterfactual Explanation by Mixed-Integer Linear Optimization
Authors:
Kentaro Kanamori,
Takuya Takagi,
Ken Kobayashi,
Yuichi Ike,
Kento Uemura,
Hiroki Arimura
Abstract:
Post-hoc explanation methods for machine learning models have been widely used to support decision-making. One of the popular methods is Counterfactual Explanation (CE), also known as Actionable Recourse, which provides a user with a perturbation vector of features that alters the prediction result. Given a perturbation vector, a user can interpret it as an "action" for obtaining one's desired decision result. In practice, however, showing only a perturbation vector is often insufficient for users to execute the action. The reason is that if there is an asymmetric interaction among features, such as causality, the total cost of the action is expected to depend on the order of changing features. Therefore, practical CE methods are required to provide an appropriate order of changing features in addition to a perturbation vector. For this purpose, we propose a new framework called Ordered Counterfactual Explanation (OrdCE). We introduce a new objective function that evaluates a pair of an action and an order based on feature interaction. To extract an optimal pair, we propose a mixed-integer linear optimization approach with our objective function. Numerical experiments on real datasets demonstrated the effectiveness of our OrdCE in comparison with unordered CE methods.
Submitted 14 March, 2021; v1 submitted 21 December, 2020;
originally announced December 2020.
-
Prediction of hierarchical time series using structured regularization and its application to artificial neural networks
Authors:
Tomokaze Shiratori,
Ken Kobayashi,
Yuichi Takano
Abstract:
This paper discusses the prediction of hierarchical time series, where each upper-level time series is calculated by summing appropriate lower-level time series. Forecasts for such hierarchical time series should be coherent, meaning that the forecast for an upper-level time series equals the sum of forecasts for corresponding lower-level time series. Previous methods for making coherent forecasts consist of two phases: first computing base (incoherent) forecasts and then reconciling those forecasts based on their inherent hierarchical structure. With the aim of improving time series predictions, we propose a structured regularization method for completing both phases simultaneously. The proposed method is based on a prediction model for bottom-level time series and uses a structured regularization term to incorporate upper-level forecasts into the prediction model. We also develop a backpropagation algorithm specialized for application of our method to artificial neural networks for time series prediction. Experimental results using synthetic and real-world datasets demonstrate the superiority of our method in terms of prediction accuracy and computational efficiency.
Submitted 29 July, 2020;
originally announced July 2020.
-
Why is the Mahalanobis Distance Effective for Anomaly Detection?
Authors:
Ryo Kamoi,
Kei Kobayashi
Abstract:
The Mahalanobis distance-based confidence score, a recently proposed anomaly detection method for pre-trained neural classifiers, achieves state-of-the-art performance on both out-of-distribution (OoD) and adversarial example detection. This work analyzes why this method exhibits such strong performance in practical settings while imposing an implausible assumption; namely, that class conditional distributions of pre-trained features have tied covariance. Although the Mahalanobis distance-based method is claimed to be motivated by classification prediction confidence, we find that its superior performance stems from information not useful for classification. This suggests that the stated reason the Mahalanobis confidence score works so well is mistaken, and that the score makes use of different information from ODIN, another popular OoD detection method based on prediction confidence. This perspective motivates us to combine these two methods, and the combined detector exhibits improved performance and robustness. These findings provide insight into the behavior of neural classifiers in response to anomalous inputs.
Submitted 30 April, 2020; v1 submitted 29 February, 2020;
originally announced March 2020.
-
Likelihood Assignment for Out-of-Distribution Inputs in Deep Generative Models is Sensitive to Prior Distribution Choice
Authors:
Ryo Kamoi,
Kei Kobayashi
Abstract:
Recent work has shown that deep generative models assign higher likelihood to out-of-distribution inputs than to training data. We show that a factor underlying this phenomenon is a mismatch between the nature of the prior distribution and that of the data distribution, a problem found in widely used deep generative models such as VAEs and Glow. While a typical choice for a prior distribution is a standard Gaussian distribution, properties of distributions of real data sets may not be consistent with a unimodal prior distribution. This paper focuses on the relationship between the choice of a prior distribution and the likelihoods assigned to out-of-distribution inputs. We propose the use of a mixture distribution as a prior to make likelihoods assigned by deep generative models sensitive to out-of-distribution inputs. Furthermore, we explain the theoretical advantages of adopting a mixture distribution as the prior, and we present experimental results to support our claims. Finally, we demonstrate that a mixture prior lowers the out-of-distribution likelihood with respect to two pairs of real image data sets: Fashion-MNIST vs. MNIST and CIFAR10 vs. SVHN.
Submitted 15 November, 2019;
originally announced November 2019.
-
Asymptotic Risk of Bezier Simplex Fitting
Authors:
Akinori Tanaka,
Akiyoshi Sannai,
Ken Kobayashi,
Naoki Hamada
Abstract:
Bézier simplex fitting is a novel data modeling technique which exploits geometric structures of data to approximate the Pareto front of multi-objective optimization problems. There are two fitting methods based on different sampling strategies. The inductive skeleton fitting employs a stratified subsampling from each skeleton of a simplex, whereas the all-at-once fitting uses a non-stratified sampling which treats a simplex as a whole. In this paper, we analyze the asymptotic risks of those Bézier simplex fitting methods and derive the optimal subsample ratio for the inductive skeleton fitting. It is shown that the inductive skeleton fitting with the optimal ratio has a smaller risk when the degree of a Bézier simplex is less than three. Those results are verified numerically under small to moderate sample sizes. In addition, we provide two complementary applications of our theory: a generalized location problem and a multi-objective hyper-parameter tuning of the group lasso. The former can be represented by a Bézier simplex of degree two, where the inductive skeleton fitting outperforms the all-at-once fitting. The latter can be represented by a Bézier simplex of degree three, where the all-at-once fitting has an advantage.
Submitted 17 June, 2019;
originally announced June 2019.
-
DNN-based Source Enhancement to Increase Objective Sound Quality Assessment Score
Authors:
Yuma Koizumi,
Kenta Niwa,
Yusuke Hioka,
Kazunori Kobayashi,
Yoichi Haneda
Abstract:
We propose a training method for deep neural network (DNN)-based source enhancement to increase objective sound quality assessment (OSQA) scores such as the perceptual evaluation of speech quality (PESQ). In many conventional studies, DNNs have been used as a mapping function to estimate time-frequency masks and trained to minimize an analytically tractable objective function such as the mean squared error (MSE). Since OSQA scores have been used widely for sound-quality evaluation, constructing DNNs to increase OSQA scores would be better than using the minimum-MSE to create high-quality output signals. However, since most OSQA scores are not analytically tractable, i.e., they are black boxes, the gradient of the objective function cannot be calculated by simply applying back-propagation. To calculate the gradient of the OSQA-based objective function, we formulated a DNN optimization scheme on the basis of black-box optimization, which is used for training a computer that plays a game. For a black-box-optimization scheme, we adopt the policy gradient method for calculating the gradient on the basis of a sampling algorithm. To simulate output signals using the sampling algorithm, DNNs are used to estimate the probability density function of the output signals that maximize OSQA scores. The OSQA scores are calculated from the simulated output signals, and the DNNs are trained to increase the probability of generating the simulated output signals that achieve high OSQA scores. Through several experiments, we found that OSQA scores significantly increased by applying the proposed method, even though the MSE was not minimized.
Submitted 22 October, 2018;
originally announced October 2018.
-
Revisiting the Vector Space Model: Sparse Weighted Nearest-Neighbor Method for Extreme Multi-Label Classification
Authors:
Tatsuhiro Aoshima,
Kei Kobayashi,
Mihoko Minami
Abstract:
Machine learning has played an important role in information retrieval (IR) in recent times. In search engines, for example, query keywords are accepted and documents are returned in order of relevance to the given query; this can be cast as a multi-label ranking problem in machine learning. Generally, the number of candidate documents is extremely large (from several thousand to several million); thus, the classifier must handle many labels. This problem is referred to as extreme multi-label classification (XMLC). In this paper, we propose a novel approach to XMLC termed the Sparse Weighted Nearest-Neighbor Method. This technique can be derived as a fast implementation of state-of-the-art (SOTA) one-versus-rest linear classifiers for very sparse datasets. In addition, we show that the classifier can be written as a sparse generalization of a representer theorem with a linear kernel. Furthermore, our method can be viewed as the vector space model used in IR. Finally, we show that the Sparse Weighted Nearest-Neighbor Method can process data points in real time on XMLC datasets with equivalent performance to SOTA models, with a single thread and smaller storage footprint. In particular, our method exhibits superior performance to the SOTA models on a dataset with 3 million labels.
Submitted 12 February, 2018;
originally announced February 2018.