-
Accurate and Uncertainty-Aware Multi-Task Prediction of HEA Properties Using Prior-Guided Deep Gaussian Processes
Authors:
Sk Md Ahnaf Akif Alvi,
Mrinalini Mulukutla,
Nicolas Flores,
Danial Khatamsaz,
Jan Janssen,
Danny Perez,
Douglas Allaire,
Vahid Attari,
Raymundo Arroyave
Abstract:
Surrogate modeling techniques have become indispensable in accelerating the discovery and optimization of high-entropy alloys (HEAs), especially when integrating computational predictions with sparse experimental observations. This study systematically evaluates the fitting performance of four prominent surrogate models: conventional Gaussian Processes (cGP), Deep Gaussian Processes (DGP), encoder-decoder neural networks for multi-output regression, and XGBoost, applied to a hybrid dataset of experimental and computational properties in the AlCoCrCuFeMnNiV HEA system. We specifically assess their capabilities in predicting correlated material properties, including yield strength, hardness, modulus, ultimate tensile strength, elongation, and average hardness under dynamic and quasi-static conditions, alongside auxiliary computational properties. The comparison highlights the strengths of hierarchical and deep modeling approaches in handling heteroscedastic, heterotopic, and incomplete data commonly encountered in materials informatics. Our findings illustrate that DGPs infused with a machine learning-based prior outperform the other surrogates by effectively capturing inter-property correlations and input-dependent uncertainty. This enhanced predictive accuracy positions advanced surrogate models as powerful tools for robust and data-efficient materials design.
Submitted 13 June, 2025;
originally announced June 2025.
-
Semiconductor SEM Image Defect Classification Using Supervised and Semi-Supervised Learning with Vision Transformers
Authors:
Chien-Fu Huang,
Katherine Sieg,
Leonid Karlinksy,
Nash Flores,
Rebekah Sheraw,
Xin Zhang
Abstract:
Controlling defects in semiconductor processes is important for maintaining yield, reducing production cost, and preventing time-dependent critical component failures. Electron beam-based imaging has been used as a tool to survey wafers in the line and inspect for defects. However, manual classification of images for these nano-scale defects is limited by time, labor constraints, and human biases. In recent years, deep learning computer vision algorithms have been shown to be effective solutions for image-based inspection applications in industry. This work proposes the application of vision transformer (ViT) neural networks for automatic defect classification (ADC) of scanning electron microscope (SEM) images of wafer defects. We evaluated our proposed methods on 300 mm wafer semiconductor defect data from our fab at IBM Albany. We studied 11 defect types from over 7400 total images and investigated the potential of transfer learning with DinoV2 and of semi-supervised learning for improved classification accuracy and efficient computation. We were able to achieve classification accuracies of over 90% with fewer than 15 images per defect class. Our work demonstrates the potential to apply the proposed framework as a platform-agnostic in-house classification tool with faster turnaround time and flexibility.
Submitted 3 June, 2025;
originally announced June 2025.
-
Jailbreaking Large Language Models with Symbolic Mathematics
Authors:
Emet Bethany,
Mazal Bethany,
Juan Arturo Nolazco Flores,
Sumit Kumar Jha,
Peyman Najafirad
Abstract:
Recent advancements in AI safety have led to increased efforts in training and red-teaming large language models (LLMs) to mitigate unsafe content generation. However, these safety mechanisms may not be comprehensive, leaving potential vulnerabilities unexplored. This paper introduces MathPrompt, a novel jailbreaking technique that exploits LLMs' advanced capabilities in symbolic mathematics to bypass their safety mechanisms. By encoding harmful natural language prompts into mathematical problems, we demonstrate a critical vulnerability in current AI safety measures. Our experiments across 13 state-of-the-art LLMs reveal an average attack success rate of 73.6%, highlighting the inability of existing safety training mechanisms to generalize to mathematically encoded inputs. Analysis of embedding vectors shows a substantial semantic shift between original and encoded prompts, helping explain the attack's success. This work emphasizes the importance of a holistic approach to AI safety, calling for expanded red-teaming efforts to develop robust safeguards across all potential input types and their associated risks.
Submitted 5 November, 2024; v1 submitted 16 September, 2024;
originally announced September 2024.
-
Effects of the COVID-19 Pandemic on Learning and Teaching: a Case Study from Higher Education
Authors:
Nidia Guadalupe López Flores,
Anna Sigridur Islind,
María Óskarsdóttir
Abstract:
In December 2019, the first case of SARS-CoV-2 infection was identified in Wuhan, China. Since then, COVID-19 has spread worldwide, affecting 153 million people. Education, like many other sectors, has managed to adapt to the requirements and barriers imposed by the impossibility of teaching students face-to-face as was done before. Yet, little is known about the implications of emergency remote teaching (ERT) during the pandemic. This study describes and analyzes the impact of the pandemic on the study patterns of higher education students. The analysis integrates three main components: (1) interaction with the learning management system (LMS), (2) assignment submission rate, and (3) teachers' perspective. Several variables were created to analyze the study patterns: clicks on different LMS components, usage during the day, week, and part of the term, the time span of interaction with the LMS, and grade categories. The results showed significant differences in study patterns depending on the year of study, and the variables reflecting the effect of teachers' changes in the course structure are identified. This study outlines the first insights into higher education's new normality, providing important implications for supporting teachers in creating academic material that adequately addresses students' particular needs depending on their year of study, changes in study pattern, and distribution of time and activity through the term.
Submitted 4 May, 2021;
originally announced May 2021.
-
Fair Algorithms for Clustering
Authors:
Suman K. Bera,
Deeparnab Chakrabarty,
Nicolas J. Flores,
Maryam Negahbani
Abstract:
We study the problem of finding low-cost Fair Clusterings in data where each data point may belong to many protected groups. Our work significantly generalizes the seminal work of Chierichetti et al. (NIPS 2017) as follows.
- We allow the user to specify the parameters that define fair representation. More precisely, these parameters define the maximum over- and minimum under-representation of any group in any cluster.
- Our clustering algorithm works on any $\ell_p$-norm objective (e.g. $k$-means, $k$-median, and $k$-center). Indeed, our algorithm transforms any vanilla clustering solution into a fair one incurring only a slight loss in quality.
- Our algorithm also allows individuals to lie in multiple protected groups. In other words, we do not need the protected groups to partition the data and we can maintain fairness across different groups simultaneously.
Our experiments show that on established data sets, our algorithm performs much better in practice than what our theoretical results suggest.
Submitted 17 June, 2019; v1 submitted 8 January, 2019;
originally announced January 2019.
-
Implementing generating functions to obtain power indices with coalition configuration
Authors:
Jorge Rodríguez Veiga,
Guido I. Novoa Flores,
Balbina V. Casas Méndez
Abstract:
We consider the Banzhaf-Coleman and Owen power indices for weighted majority games modified by a coalition configuration. We present algorithms for computing them that make use of the method of generating functions. We implemented the procedure in the open-source language R and illustrate it with a real-life example taken from the social sciences.
Submitted 1 July, 2015;
originally announced July 2015.