-
LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders
Authors:
Borna Khodabandeh,
Amirabbas Afzali,
Amirhossein Afsharrad,
Seyed Shahabeddin Mousavi,
Sanjay Lall,
Sajjad Amini,
Seyed-Mohsen Moosavi-Dezfooli
Abstract:
Visual encoders have become fundamental components in modern computer vision pipelines. However, ensuring robustness against adversarial perturbations remains a critical challenge. Recent efforts have explored both supervised and unsupervised adversarial fine-tuning strategies. We identify two key limitations in these approaches: (i) they often suffer from instability, especially during the early stages of fine-tuning, resulting in suboptimal convergence and degraded performance on clean data, and (ii) they exhibit a suboptimal trade-off between robustness and clean data accuracy, hindering the simultaneous optimization of both objectives. To overcome these challenges, we propose Lagrangian-Optimized Robust Embeddings (LORE), a novel unsupervised adversarial fine-tuning framework. LORE utilizes constrained optimization, which offers a principled approach to balancing competing goals, such as improving robustness while preserving nominal performance. By enforcing embedding-space proximity constraints, LORE effectively maintains clean data performance throughout adversarial fine-tuning. Extensive experiments show that LORE significantly improves zero-shot adversarial robustness with minimal degradation in clean data accuracy. Furthermore, we demonstrate the effectiveness of the adversarially fine-tuned CLIP image encoder in out-of-distribution generalization and enhancing the interpretability of image embeddings.
Submitted 24 May, 2025;
originally announced May 2025.
-
A tutorial on kriging-based stochastic simulation optimization
Authors:
Sasan Amini,
Inneke Van Nieuwenhuyse
Abstract:
This tutorial focuses on kriging-based simulation optimization, emphasizing the importance of data efficiency in optimization problems involving expensive simulation models. It discusses how kriging models contribute to developing algorithms that minimize the number of required simulations, particularly in the presence of noisy evaluations. The tutorial compares the performance of kriging-based algorithms against traditional polynomial-based optimization methods using an illustrative example. Additionally, it discusses key extensions of kriging-based algorithms, including multi-objective and constrained optimization, providing insights into their application in complex, real-world settings.
Submitted 4 February, 2025;
originally announced February 2025.
-
GEC-RAG: Improving Generative Error Correction via Retrieval-Augmented Generation for Automatic Speech Recognition Systems
Authors:
Amin Robatian,
Mohammad Hajipour,
Mohammad Reza Peyghan,
Fatemeh Rajabi,
Sajjad Amini,
Shahrokh Ghaemmaghami,
Iman Gholampour
Abstract:
Automatic Speech Recognition (ASR) systems have demonstrated remarkable performance across various applications. However, limited data and the unique language features of specific domains, such as low-resource languages, significantly degrade their performance and lead to higher Word Error Rates (WER). In this study, we propose Generative Error Correction via Retrieval-Augmented Generation (GEC-RAG), a novel approach designed to improve ASR accuracy for low-resource domains, like Persian. Our approach treats the ASR system as a black-box, a common practice in cloud-based services, and proposes a Retrieval-Augmented Generation (RAG) approach within the In-Context Learning (ICL) scheme to enhance the quality of ASR predictions. By constructing a knowledge base that pairs ASR predictions (1-best and 5-best hypotheses) with their corresponding ground truths, GEC-RAG retrieves lexically similar examples to the ASR transcription using the Term Frequency-Inverse Document Frequency (TF-IDF) measure. This process provides relevant error patterns of the system alongside the ASR transcription to the Generative Large Language Model (LLM), enabling targeted corrections. Our results demonstrate that this strategy significantly reduces WER in Persian and highlights a potential for domain adaptation and low-resource scenarios. This research underscores the effectiveness of using RAG in enhancing ASR systems without requiring direct model modification or fine-tuning, making it adaptable to any domain by simply updating the transcription knowledge base with domain-specific data.
Submitted 18 January, 2025;
originally announced January 2025.
-
ULTra: Unveiling Latent Token Interpretability in Transformer-Based Understanding and Segmentation
Authors:
Hesam Hosseini,
Ghazal Hosseini Mighan,
Amirabbas Afzali,
Sajjad Amini,
Amir Houmansadr
Abstract:
Transformers have revolutionized Computer Vision (CV) through self-attention mechanisms. However, their complexity makes latent token representations difficult to interpret. We introduce ULTra, a framework for interpreting Transformer embeddings and uncovering meaningful semantic patterns within them. ULTra enables unsupervised semantic segmentation using pre-trained models without requiring fine-tuning. Additionally, we propose a self-supervised training approach that refines segmentation performance by learning an external transformation matrix without modifying the underlying model. Our method achieves state-of-the-art performance in unsupervised semantic segmentation, outperforming existing segmentation methods. Furthermore, we validate ULTra for model interpretation on both synthetic and real-world scenarios, including Object Selection and interpretable text summarization using LLMs, demonstrating its broad applicability in explaining the semantic structure of latent token representations.
Submitted 22 March, 2025; v1 submitted 15 November, 2024;
originally announced November 2024.
-
Twin branching in shape memory alloys: a 1D model with energy dissipation effects
Authors:
Stanislaw Stupkiewicz,
Seyedshoja Amini,
Mohsen Rezaee-Hajidehi
Abstract:
We develop a 1D model of twin branching in shape memory alloys. The free energy of the branched microstructure comprises the interfacial and elastic strain energy contributions, both expressed in terms of the average twin spacing treated as a continuous function of the position. The total free energy is then minimized, and the corresponding Euler-Lagrange equation is solved numerically using the finite element method. The model can be considered as a continuous counterpart of the recent discrete model of Seiner et al. (2020), and our results show a very good agreement with that model in the entire range of physically relevant parameters. Furthermore, our continuous setting facilitates incorporation of energy dissipation into the model. The effect of rate-independent dissipation on the evolution of the branched microstructure is thus studied. The results show that significant effects on the microstructure and energy of the system are expected only for relatively small domain sizes.
Submitted 24 April, 2025; v1 submitted 11 September, 2024;
originally announced September 2024.
-
MeanSparse: Post-Training Robustness Enhancement Through Mean-Centered Feature Sparsification
Authors:
Sajjad Amini,
Mohammadreza Teymoorianfard,
Shiqing Ma,
Amir Houmansadr
Abstract:
We present a simple yet effective method to improve the robustness of both convolutional and attention-based neural networks against adversarial examples by post-processing an adversarially trained model. Our technique, MeanSparse, cascades the activation functions of a trained model with novel operators that sparsify mean-centered feature vectors. This is equivalent to reducing feature variations around the mean, and we show that such reduced variations barely affect the model's utility, yet they strongly attenuate the adversarial perturbations and decrease the attacker's success rate. Our experiments show that, when applied to the top models in the RobustBench leaderboard, MeanSparse achieves a new robustness record of 75.28% (from 73.71%), 44.78% (from 42.67%) and 62.12% (from 59.56%) on CIFAR-10, CIFAR-100 and ImageNet, respectively, in terms of AutoAttack accuracy. Code is available at https://github.com/SPIN-UMass/MeanSparse
Submitted 2 October, 2024; v1 submitted 9 June, 2024;
originally announced June 2024.
-
Matrix Completion via Nonsmooth Regularization of Fully Connected Neural Networks
Authors:
Sajad Faramarzi,
Farzan Haddadi,
Sajjad Amini,
Masoud Ahookhosh
Abstract:
Conventional matrix completion methods approximate the missing values by assuming the matrix to be low-rank, which leads to a linear approximation of missing values. It has been shown that enhanced performance could be attained by using nonlinear estimators such as deep neural networks. Deep fully connected neural networks (FCNNs), one of the most suitable architectures for matrix completion, suffer from over-fitting due to their high capacity, which leads to low generalizability. In this paper, we control over-fitting by regularizing the FCNN model in terms of the $\ell_{1}$ norm of intermediate representations and the nuclear norm of weight matrices. As a result, the regularized objective function becomes nonsmooth and nonconvex, so existing gradient-based methods cannot be applied to our model. We propose a variant of the proximal gradient method and investigate its convergence to a critical point. In the initial epochs of FCNN training, the regularization terms are ignored, and their effect increases over subsequent epochs. This gradual addition of nonsmooth regularization terms is the main reason for the better performance of the deep neural network with nonsmooth regularization terms (DNN-NSR) algorithm. Our simulations indicate the superiority of the proposed algorithm in comparison with existing linear and nonlinear algorithms.
Submitted 15 March, 2024;
originally announced March 2024.
-
Energy and morphology of martensite-twinned martensite interface in CuAlNi shape memory alloy: a phase-field study
Authors:
Seyedshoja Amini,
Mohsen Rezaee-Hajidehi,
Stanislaw Stupkiewicz
Abstract:
Needle-like twins are observed experimentally within the transition layer at the martensite-twinned martensite interface. We utilize a phase-field approach to investigate this microstructure. Our goal is to simulate the morphology of the transition layer and to perform a detailed analysis to characterize its interfacial and elastic micro-strain energy. To illustrate the micromechanical framework developed for that purpose, sample computations are carried out for a CuAlNi shape memory alloy undergoing the cubic-to-orthorhombic martensitic transformation. A particular focus of the study is on size-dependent morphology through examining the impact of twin spacing. Additionally, our results reveal that certain twin volume fractions lead to the emergence of twin branching, as a way to minimize the total free energy stored in the microstructure.
Submitted 5 September, 2023;
originally announced September 2023.
-
Finite element stress analysis of a combined stacker-reclaimer machine: A design audit report
Authors:
Erfan Khodabandeh,
Shahaboddin Amini,
Aliakbar Taghipour
Abstract:
Design audit, or design verification, is an important step in the engineering of heavy mobile materials handling equipment. Usually, customers employ third parties to audit the contractors' engineering. Here, part of the design audit of a combined stacker-reclaimer machine is reported. This equipment is designed and constructed by a local supplier in Iran for the iron ore pelletizing plants at GOHARZAMIN Iron Ore Company. The structure plays an important role in mobile material handling machines such as stackers and reclaimers, and its failure or damage may cause considerable financial losses and loss of human life. In this report, the undercarriage of the stacker-reclaimer machine, including the gantry and traveling system, is numerically analyzed. The Finite Element Method is used for stress prediction under the critical operating loads according to the design standards. The critical areas of the undercarriage are identified, and it is observed that the maximum stress is in the safe range.
Submitted 4 June, 2021;
originally announced June 2021.
-
Growth of Large-Area Graphene Films from Metal-Carbon Melts
Authors:
Shaahin Amini,
Javier Garay,
Guanxiong Liu,
Alexander A. Balandin,
Reza Abbaschian
Abstract:
We have demonstrated a new method for large-area graphene growth, which can lead to a scalable low-cost high-throughput production technology. The method is based on growing single-layer or few-layer graphene films from a molten phase. The process involves dissolving carbon inside a molten metal at a specified temperature and then allowing the dissolved carbon to nucleate and grow on top of the melt at a lower temperature. The examined metals for the metal-carbon melts included copper and nickel. For the latter, pristine single-layer graphene was grown successfully. The resulting graphene layers were subjected to detailed microscopic and Raman spectroscopic characterization. The deconvolution of the Raman 2D band was used to accurately determine the number of atomic planes in the resulting graphene layers and assess their quality. The results indicate that our technology can provide bulk graphite films, few-layer graphene, as well as high-quality single-layer graphene on metals. Our approach can also be used for producing graphene-metal thermal interface materials for thermal management applications.
Submitted 17 November, 2010;
originally announced November 2010.