Search | arXiv e-print repository

Comparison of CNN-based deep learning architectures for unsteady CFD acceleration on small datasets

Authors: Sangam Khanal, Shilaj Baral, Joongoo Jeon

Abstract: CFD acceleration for virtual nuclear reactors or digital twin technology is a primary goal in the nuclear industry. This study compares advanced convolutional neural network (CNN) architectures for accelerating unsteady computational fluid dynamics (CFD) simulations using small datasets based on a challenging natural convection flow dataset. The advanced architectures such as autoencoders, UNet, a… ▽ More CFD acceleration for virtual nuclear reactors or digital twin technology is a primary goal in the nuclear industry. This study compares advanced convolutional neural network (CNN) architectures for accelerating unsteady computational fluid dynamics (CFD) simulations using small datasets based on a challenging natural convection flow dataset. The advanced architectures such as autoencoders, UNet, and ConvLSTM-UNet, were evaluated under identical conditions to determine their predictive accuracy and robustness in autoregressive time-series predictions. ConvLSTM-UNet consistently outperformed other models, particularly in difference value calculation, achieving lower maximum errors and stable residuals. However, error accumulation remains a challenge, limiting reliable predictions to approximately 10 timesteps. This highlights the need for enhanced strategies to improve long-term prediction stability. The novelty of this work lies in its fair comparison of state-of-the-art CNN models within the RePIT framework, demonstrating their potential for accelerating CFD simulations while identifying limitations under small data conditions. Future research will focus on exploring alternative models, such as graph neural networks and implicit neural representations. These efforts aim to develop a robust hybrid approach for long-term unsteady CFD acceleration, contributing to practical applications in virtual nuclear reactor. △ Less

Submitted 5 February, 2025; originally announced February 2025.

Comments: 9 figures, 3 Tables

arXiv:2501.14877 [pdf, other]

DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images

Authors: Sami Baral, Li Lucy, Ryan Knight, Alice Ng, Luca Soldaini, Neil T. Heffernan, Kyle Lo

Abstract: In real-world settings, vision language models (VLMs) should robustly handle naturalistic, noisy visual content as well as domain-specific language and concepts. For example, K-12 educators using digital learning platforms may need to examine and provide feedback across many images of students' math work. To assess the potential of VLMs to support educators in settings like this one, we introduce… ▽ More In real-world settings, vision language models (VLMs) should robustly handle naturalistic, noisy visual content as well as domain-specific language and concepts. For example, K-12 educators using digital learning platforms may need to examine and provide feedback across many images of students' math work. To assess the potential of VLMs to support educators in settings like this one, we introduce DrawEduMath, an English-language dataset of 2,030 images of students' handwritten responses to K-12 math problems. Teachers provided detailed annotations, including free-form descriptions of each image and 11,661 question-answer (QA) pairs. These annotations capture a wealth of pedagogical insights, ranging from students' problem-solving strategies to the composition of their drawings, diagrams, and writing. We evaluate VLMs on teachers' QA pairs, as well as 44,362 synthetic QA pairs derived from teachers' descriptions using language models (LMs). We show that even state-of-the-art VLMs leave much room for improvement on DrawEduMath questions. We also find that synthetic QAs, though imperfect, can yield similar model rankings as teacher-written QAs. We release DrawEduMath to support the evaluation of VLMs' abilities to reason mathematically over images gathered with educational contexts in mind. △ Less

Submitted 24 January, 2025; originally announced January 2025.

Comments: 19 pages, 10 figures, Accepted to NAACL 2025

arXiv:2411.08910 [pdf, other]

Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses

Authors: Sami Baral, Eamon Worden, Wen-Chiang Lim, Zhuang Luo, Christopher Santorelli, Ashish Gurung, Neil Heffernan

Abstract: The effectiveness of feedback in enhancing learning outcomes is well documented within Educational Data Mining (EDM). Various prior research has explored methodologies to enhance the effectiveness of feedback. Recent developments in Large Language Models (LLMs) have extended their utility in enhancing automated feedback systems. This study aims to explore the potential of LLMs in facilitating auto… ▽ More The effectiveness of feedback in enhancing learning outcomes is well documented within Educational Data Mining (EDM). Various prior research has explored methodologies to enhance the effectiveness of feedback. Recent developments in Large Language Models (LLMs) have extended their utility in enhancing automated feedback systems. This study aims to explore the potential of LLMs in facilitating automated feedback in math education. We examine the effectiveness of LLMs in evaluating student responses by comparing 3 different models: Llama, SBERT-Canberra, and GPT4 model. The evaluation requires the model to provide both a quantitative score and qualitative feedback on the student's responses to open-ended math problems. We employ Mistral, a version of Llama catered to math, and fine-tune this model for evaluating student responses by leveraging a dataset of student responses and teacher-written feedback for middle-school math problems. A similar approach was taken for training the SBERT model as well, while the GPT4 model used a zero-shot learning approach. We evaluate the model's performance in scoring accuracy and the quality of feedback by utilizing judgments from 2 teachers. The teachers utilized a shared rubric in assessing the accuracy and relevance of the generated feedback. We conduct both quantitative and qualitative analyses of the model performance. By offering a detailed comparison of these methods, this study aims to further the ongoing development of automated feedback systems and outlines potential future directions for leveraging generative LLMs to create more personalized learning experiences. △ Less

Submitted 29 October, 2024; originally announced November 2024.

Comments: 12 pages including references, 4 figures, 9 tables

arXiv:2409.19566 [pdf, other]

Abstractive Summarization of Low resourced Nepali language using Multilingual Transformers

Authors: Prakash Dhakal, Daya Sagar Baral

Abstract: Automatic text summarization in Nepali language is an unexplored area in natural language processing (NLP). Although considerable research has been dedicated to extractive summarization, the area of abstractive summarization, especially for low-resource languages such as Nepali, remains largely unexplored. This study explores the use of multilingual transformer models, specifically mBART and mT5,… ▽ More Automatic text summarization in Nepali language is an unexplored area in natural language processing (NLP). Although considerable research has been dedicated to extractive summarization, the area of abstractive summarization, especially for low-resource languages such as Nepali, remains largely unexplored. This study explores the use of multilingual transformer models, specifically mBART and mT5, for generating headlines for Nepali news articles through abstractive summarization. The research addresses key challenges associated with summarizing texts in Nepali by first creating a summarization dataset through web scraping from various Nepali news portals. These multilingual models were then fine-tuned using different strategies. The performance of the fine-tuned models were then assessed using ROUGE scores and human evaluation to ensure the generated summaries were coherent and conveyed the original meaning. During the human evaluation, the participants were asked to select the best summary among those generated by the models, based on criteria such as relevance, fluency, conciseness, informativeness, factual accuracy, and coverage. During the evaluation with ROUGE scores, the 4-bit quantized mBART with LoRA model was found to be effective in generating better Nepali news headlines in comparison to other models and also it was selected 34.05% of the time during the human evaluation, outperforming all other fine-tuned models created for Nepali News headline generation. △ Less

Submitted 29 September, 2024; originally announced September 2024.

arXiv:2409.13177 [pdf, other]

An Adaptive End-to-End IoT Security Framework Using Explainable AI and LLMs

Authors: Sudipto Baral, Sajal Saha, Anwar Haque

Abstract: The exponential growth of the Internet of Things (IoT) has significantly increased the complexity and volume of cybersecurity threats, necessitating the development of advanced, scalable, and interpretable security frameworks. This paper presents an innovative, comprehensive framework for real-time IoT attack detection and response that leverages Machine Learning (ML), Explainable AI (XAI), and La… ▽ More The exponential growth of the Internet of Things (IoT) has significantly increased the complexity and volume of cybersecurity threats, necessitating the development of advanced, scalable, and interpretable security frameworks. This paper presents an innovative, comprehensive framework for real-time IoT attack detection and response that leverages Machine Learning (ML), Explainable AI (XAI), and Large Language Models (LLM). By integrating XAI techniques such as SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) with a model-independent architecture, we ensure our framework's adaptability across various ML algorithms. Additionally, the incorporation of LLMs enhances the interpretability and accessibility of detection decisions, providing system administrators with actionable, human-understandable explanations of detected threats. Our end-to-end framework not only facilitates a seamless transition from model development to deployment but also represents a real-world application capability that is often lacking in existing research. Based on our experiments with the CIC-IOT-2023 dataset \cite{neto2023ciciot2023}, Gemini and OPENAI LLMS demonstrate unique strengths in attack mitigation: Gemini offers precise, focused strategies, while OPENAI provides extensive, in-depth security measures. Incorporating SHAP and LIME algorithms within XAI provides comprehensive insights into attack detection, emphasizing opportunities for model improvement through detailed feature analysis, fine-tuning, and the adaptation of misclassifications to enhance accuracy. △ Less

Submitted 19 September, 2024; originally announced September 2024.

Comments: 6 pages, 1 figure, Accepted in 2024 IEEE WF-IoT Conference

arXiv:2407.17822 [pdf]

Advanced deep-reinforcement-learning methods for flow control: group-invariant and positional-encoding networks improve learning speed and quality

Authors: Joongoo Jeon, Jean Rabault, Joel Vasanth, Francisco Alcántara-Ávila, Shilaj Baral, Ricardo Vinuesa

Abstract: Flow control is key to maximize energy efficiency in a wide range of applications. However, traditional flow-control methods face significant challenges in addressing non-linear systems and high-dimensional data, limiting their application in realistic energy systems. This study advances deep-reinforcement-learning (DRL) methods for flow control, particularly focusing on integrating group-invarian… ▽ More Flow control is key to maximize energy efficiency in a wide range of applications. However, traditional flow-control methods face significant challenges in addressing non-linear systems and high-dimensional data, limiting their application in realistic energy systems. This study advances deep-reinforcement-learning (DRL) methods for flow control, particularly focusing on integrating group-invariant networks and positional encoding into DRL architectures. Our methods leverage multi-agent reinforcement learning (MARL) to exploit policy invariance in space, in combination with group-invariant networks to ensure local symmetry invariance. Additionally, a positional encoding inspired by the transformer architecture is incorporated to provide location information to the agents, mitigating action constraints from strict invariance. The proposed methods are verified using a case study of Rayleigh-Bénard convection, where the goal is to minimize the Nusselt number Nu. The group-invariant neural networks (GI-NNs) show faster convergence compared to the base MARL, achieving better average policy performance. The GI-NNs not only cut DRL training time in half but also notably enhance learning reproducibility. Positional encoding further enhances these results, effectively reducing the minimum Nu and stabilizing convergence. Interestingly, group invariant networks specialize in improving learning speed and positional encoding specializes in improving learning quality. These results demonstrate that choosing a suitable feature-representation method according to the purpose as well as the characteristics of each control problem is essential. We believe that the results of this study will not only inspire novel DRL methods with invariant and unique representations, but also provide useful insights for industrial applications. △ Less

Submitted 25 October, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

arXiv:2306.03412 [pdf, other]

DEK-Forecaster: A Novel Deep Learning Model Integrated with EMD-KNN for Traffic Prediction

Authors: Sajal Saha, Sudipto Baral, Anwar Haque

Abstract: Internet traffic volume estimation has a significant impact on the business policies of the ISP (Internet Service Provider) industry and business successions. Forecasting the internet traffic demand helps to shed light on the future traffic trend, which is often helpful for ISPs decision-making in network planning activities and investments. Besides, the capability to understand future trend contr… ▽ More Internet traffic volume estimation has a significant impact on the business policies of the ISP (Internet Service Provider) industry and business successions. Forecasting the internet traffic demand helps to shed light on the future traffic trend, which is often helpful for ISPs decision-making in network planning activities and investments. Besides, the capability to understand future trend contributes to managing regular and long-term operations. This study aims to predict the network traffic volume demand using deep sequence methods that incorporate Empirical Mode Decomposition (EMD) based noise reduction, Empirical rule based outlier detection, and $K$-Nearest Neighbour (KNN) based outlier mitigation. In contrast to the former studies, the proposed model does not rely on a particular EMD decomposed component called Intrinsic Mode Function (IMF) for signal denoising. In our proposed traffic prediction model, we used an average of all IMFs components for signal denoising. Moreover, the abnormal data points are replaced by $K$ nearest data points average, and the value for $K$ has been optimized based on the KNN regressor prediction error measured in Root Mean Squared Error (RMSE). Finally, we selected the best time-lagged feature subset for our prediction model based on AutoRegressive Integrated Moving Average (ARIMA) and Akaike Information Criterion (AIC) value. Our experiments are conducted on real-world internet traffic datasets from industry, and the proposed method is compared with various traditional deep sequence baseline models. Our results show that the proposed EMD-KNN integrated prediction models outperform comparative models. △ Less

Submitted 6 June, 2023; originally announced June 2023.

Comments: 13 pages, 9 figures

arXiv:2205.15219 [pdf, other]

Automatic Short Math Answer Grading via In-context Meta-learning

Authors: Mengxue Zhang, Sami Baral, Neil Heffernan, Andrew Lan

Abstract: Automatic short answer grading is an important research direction in the exploration of how to use artificial intelligence (AI)-based tools to improve education. Current state-of-the-art approaches use neural language models to create vectorized representations of students responses, followed by classifiers to predict the score. However, these approaches have several key limitations, including i)… ▽ More Automatic short answer grading is an important research direction in the exploration of how to use artificial intelligence (AI)-based tools to improve education. Current state-of-the-art approaches use neural language models to create vectorized representations of students responses, followed by classifiers to predict the score. However, these approaches have several key limitations, including i) they use pre-trained language models that are not well-adapted to educational subject domains and/or student-generated text and ii) they almost always train one model per question, ignoring the linkage across a question and result in a significant model storage problem due to the size of advanced language models. In this paper, we study the problem of automatic short answer grading for students' responses to math questions and propose a novel framework for this task. First, we use MathBERT, a variant of the popular language model BERT adapted to mathematical content, as our base model and fine-tune it for the downstream task of student response grading. Second, we use an in-context learning approach that provides scoring examples as input to the language model to provide additional context information and promote generalization to previously unseen questions. We evaluate our framework on a real-world dataset of student responses to open-ended math questions and show that our framework (often significantly) outperforms existing approaches, especially for new questions that are not seen during training. △ Less

Submitted 8 July, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

Comments: To appear EDM 2022

arXiv:2012.07565 [pdf, other]

Automating Document Classification with Distant Supervision to Increase the Efficiency of Systematic Reviews

Authors: Xiaoxiao Li, Rabah Al-Zaidy, Amy Zhang, Stefan Baral, Le Bao, C. Lee Giles

Abstract: Objective: Systematic reviews of scholarly documents often provide complete and exhaustive summaries of literature relevant to a research question. However, well-done systematic reviews are expensive, time-demanding, and labor-intensive. Here, we propose an automatic document classification approach to significantly reduce the effort in reviewing documents. Methods: We first describe a manual docu… ▽ More Objective: Systematic reviews of scholarly documents often provide complete and exhaustive summaries of literature relevant to a research question. However, well-done systematic reviews are expensive, time-demanding, and labor-intensive. Here, we propose an automatic document classification approach to significantly reduce the effort in reviewing documents. Methods: We first describe a manual document classification procedure that is used to curate a pertinent training dataset and then propose three classifiers: a keyword-guided method, a cluster analysis-based refined method, and a random forest approach that utilizes a large set of feature tokens. As an example, this approach is used to identify documents studying female sex workers that are assumed to contain content relevant to either HIV or violence. We compare the performance of the three classifiers by cross-validation and conduct a sensitivity analysis on the portion of data utilized in training the model. Results: The random forest approach provides the highest area under the curve (AUC) for both receiver operating characteristic (ROC) and precision/recall (PR). Analyses of precision and recall suggest that random forest could facilitate manually reviewing 20\% of the articles while containing 80\% of the relevant cases. Finally, we found a good classifier could be obtained by using a relatively small training sample size. Conclusions: In sum, the automated procedure of document classification presented here could improve both the precision and efficiency of systematic reviews, as well as facilitating live reviews, where reviews are updated regularly. △ Less

Submitted 9 December, 2020; originally announced December 2020.

Showing 1–9 of 9 results for author: Baral, S