Showing 1–50 of 115 results for author: Mousavi, S

Searching in archive cs.
  1. arXiv:2509.05330  [pdf]

    cs.AI

    MVRS: The Multimodal Virtual Reality Stimuli-based Emotion Recognition Dataset

    Authors: Seyed Muhammad Hossein Mousavi, Atiye Ilanloo

    Abstract: Automatic emotion recognition has become increasingly important with the rise of AI, especially in fields like healthcare, education, and automotive systems. However, there is a lack of multimodal datasets, particularly involving body motion and physiological signals, which limits progress in the field. To address this, the MVRS dataset is introduced, featuring synchronized recordings from 13 part…

    Submitted 31 August, 2025; originally announced September 2025.

  2. arXiv:2508.15850  [pdf, ps, other]

    cs.CR cs.LG

    Linkage Attacks Expose Identity Risks in Public ECG Data Sharing

    Authors: Ziyu Wang, Elahe Khatibi, Farshad Firouzi, Sanaz Rahimi Mousavi, Krishnendu Chakrabarty, Amir M. Rahmani

    Abstract: The increasing availability of publicly shared electrocardiogram (ECG) data raises critical privacy concerns, as its biometric properties make individuals vulnerable to linkage attacks. Unlike prior studies that assume idealized adversarial capabilities, we evaluate ECG privacy risks under realistic conditions where attackers operate with partial knowledge. Using data from 109 participants across…

    Submitted 20 August, 2025; originally announced August 2025.

  3. arXiv:2508.09188  [pdf]

    cs.CV

    Synthetic Data Generation for Emotional Depth Faces: Optimizing Conditional DCGANs via Genetic Algorithms in the Latent Space and Stabilizing Training with Knowledge Distillation

    Authors: Seyed Muhammad Hossein Mousavi, S. Younes Mirinezhad

    Abstract: Affective computing faces a major challenge: the lack of high-quality, diverse depth facial datasets for recognizing subtle emotional expressions. We propose a framework for synthetic depth face generation using an optimized GAN with Knowledge Distillation (EMA teacher models) to stabilize training, improve quality, and prevent mode collapse. We also apply Genetic Algorithms to evolve GAN latent v…

    Submitted 7 August, 2025; originally announced August 2025.

  4. arXiv:2507.16835  [pdf, ps, other]

    eess.AS cs.CL

    Evaluating Speech-to-Text x LLM x Text-to-Speech Combinations for AI Interview Systems

    Authors: Rumi Allbert, Nima Yazdani, Ali Ansari, Aruj Mahajan, Amirhossein Afsharrad, Seyed Shahabeddin Mousavi

    Abstract: Voice-based conversational AI systems increasingly rely on cascaded architectures that combine speech-to-text (STT), large language models (LLMs), and text-to-speech (TTS) components. We present a large-scale empirical comparison of STT x LLM x TTS stacks using data sampled from over 300,000 AI-conducted job interviews. We used an LLM-as-a-Judge automated evaluation framework to assess conversatio…

    Submitted 21 August, 2025; v1 submitted 15 July, 2025; originally announced July 2025.

  5. arXiv:2506.23864  [pdf, ps, other]

    cs.CL

    Garbage In, Reasoning Out? Why Benchmark Scores are Unreliable and What to Do About It

    Authors: Seyed Mahed Mousavi, Edoardo Cecchinato, Lucia Hornikova, Giuseppe Riccardi

    Abstract: We conduct a systematic audit of three widely used reasoning benchmarks, SocialIQa, FauxPas-EAI, and ToMi, and uncover pervasive flaws in both benchmark items and evaluation methodology. Using five LLMs (GPT-{3, 3.5, 4, o1}, and LLaMA 3.1) as diagnostic tools, we identify structural, semantic, and pragmatic issues in benchmark design (e.g., duplicated items, ambiguous wording, and implausible answ…

    Submitted 30 June, 2025; originally announced June 2025.

  6. arXiv:2506.15524  [pdf, ps, other]

    cs.CV

    NTIRE 2025 Image Shadow Removal Challenge Report

    Authors: Florin-Alexandru Vasluianu, Tim Seizinger, Zhuyun Zhou, Cailian Chen, Zongwei Wu, Radu Timofte, Mingjia Li, Jin Hu, Hainuo Wang, Hengxing Liu, Jiarui Wang, Qiming Hu, Xiaojie Guo, Xin Lu, Jiarong Yang, Yuanfei Bao, Anya Hu, Zihao Fan, Kunyu Wang, Jie Xiao, Xi Wang, Xueyang Fu, Zheng-Jun Zha, Yu-Fan Lin, Chia-Ming Lee, et al. (57 additional authors not shown)

    Abstract: This work examines the findings of the NTIRE 2025 Shadow Removal Challenge. A total of 306 participants have registered, with 17 teams successfully submitting their solutions during the final evaluation phase. Following the last two editions, this challenge had two evaluation tracks: one focusing on reconstruction fidelity and the other on visual perception through a user study. Both tracks were e…

    Submitted 18 June, 2025; originally announced June 2025.

  7. arXiv:2506.07666  [pdf, ps, other]

    cs.LG

    ProARD: progressive adversarial robustness distillation: provide wide range of robust students

    Authors: Seyedhamidreza Mousavi, Seyedali Mousavi, Masoud Daneshtalab

    Abstract: Adversarial Robustness Distillation (ARD) has emerged as an effective method to enhance the robustness of lightweight deep neural networks against adversarial attacks. Current ARD approaches have leveraged a large robust teacher network to train one robust lightweight student. However, due to the diverse range of edge devices and resource constraints, current approaches require training a new stud…

    Submitted 27 August, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

  8. arXiv:2506.05431  [pdf]

    cs.CV cs.AI cs.LG

    Robustness Evaluation for Video Models with Reinforcement Learning

    Authors: Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Sahand Ghorbanpour, Avisek Naug, Antonio Guillen, Ricardo Luna Gutierrez, Soumyendu Sarkar

    Abstract: Evaluating the robustness of Video classification models is very challenging, specifically when compared to image-based models. With their increased temporal dimension, there is a significant increase in complexity and computational cost. One of the key challenges is to keep the perturbations to a minimum to induce misclassification. In this work, we propose a multi-agent reinforcement learning ap…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: Accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2025

  9. arXiv:2506.05429  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Coordinated Robustness Evaluation Framework for Vision-Language Models

    Authors: Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Sahand Ghorbanpour, Avisek Naug, Antonio Guillen, Ricardo Luna Gutierrez, Soumyendu Sarkar

    Abstract: Vision-language models, which integrate computer vision and natural language processing capabilities, have demonstrated significant advancements in tasks such as image captioning and visual question and answering. However, similar to traditional models, they are susceptible to small perturbations, posing a challenge to their robustness, particularly in deployment scenarios. Evaluating the robustne…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: Accepted: IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2025

  10. arXiv:2506.05146  [pdf, ps, other]

    cs.CV cs.CL

    CIVET: Systematic Evaluation of Understanding in VLMs

    Authors: Massimo Rizzoli, Simone Alghisi, Olha Khomyn, Gabriel Roccabruna, Seyed Mahed Mousavi, Giuseppe Riccardi

    Abstract: While Vision-Language Models (VLMs) have achieved competitive performance in various tasks, their comprehension of the underlying structure and semantics of a scene remains understudied. To investigate the understanding of VLMs, we study their capability regarding object properties and relations in a controlled and interpretable manner. To this scope, we introduce CIVET, a novel and extensible fra…

    Submitted 19 June, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

  11. arXiv:2505.18884  [pdf, other]

    cs.LG cs.AI cs.CV math.OC

    LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders

    Authors: Borna Khodabandeh, Amirabbas Afzali, Amirhossein Afsharrad, Seyed Shahabeddin Mousavi, Sanjay Lall, Sajjad Amini, Seyed-Mohsen Moosavi-Dezfooli

    Abstract: Visual encoders have become fundamental components in modern computer vision pipelines. However, ensuring robustness against adversarial perturbations remains a critical challenge. Recent efforts have explored both supervised and unsupervised adversarial fine-tuning strategies. We identify two key limitations in these approaches: (i) they often suffer from instability, especially during the early…

    Submitted 24 May, 2025; originally announced May 2025.

  12. arXiv:2505.18781  [pdf, ps, other]

    cs.LG

    Geometry Aware Operator Transformer as an Efficient and Accurate Neural Surrogate for PDEs on Arbitrary Domains

    Authors: Shizheng Wen, Arsh Kumbhat, Levi Lingsch, Sepehr Mousavi, Yizhou Zhao, Praveen Chandrashekar, Siddhartha Mishra

    Abstract: The very challenging task of learning solution operators of PDEs on arbitrary domains accurately and efficiently is of vital importance to engineering and industrial simulations. Despite the existence of many operator learning algorithms to approximate such PDEs, we find that accurate models are not necessarily computationally efficient and vice versa. We address this issue by proposing a geometry…

    Submitted 27 May, 2025; v1 submitted 24 May, 2025; originally announced May 2025.

  13. arXiv:2505.13518  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Data Balancing Strategies: A Survey of Resampling and Augmentation Methods

    Authors: Behnam Yousefimehr, Mehdi Ghatee, Mohammad Amin Seifi, Javad Fazli, Sajed Tavakoli, Zahra Rafei, Shervin Ghaffari, Abolfazl Nikahd, Mahdi Razi Gandomani, Alireza Orouji, Ramtin Mahmoudi Kashani, Sarina Heshmati, Negin Sadat Mousavi

    Abstract: Imbalanced data poses a significant obstacle in machine learning, as an unequal distribution of class labels often results in skewed predictions and diminished model accuracy. To mitigate this problem, various resampling strategies have been developed, encompassing both oversampling and undersampling techniques aimed at modifying class proportions. Conventional oversampling approaches like SMOTE e…

    Submitted 17 May, 2025; originally announced May 2025.

  14. arXiv:2504.14092  [pdf, other]

    cs.CV

    Retinex-guided Histogram Transformer for Mask-free Shadow Removal

    Authors: Wei Dong, Han Zhou, Seyed Amirreza Mousavi, Jun Chen

    Abstract: While deep learning methods have achieved notable progress in shadow removal, many existing approaches rely on shadow masks that are difficult to obtain, limiting their generalization to real-world scenes. In this work, we propose ReHiT, an efficient mask-free shadow removal framework based on a hybrid CNN-Transformer architecture guided by Retinex theory. We first introduce a dual-branch pipeline…

    Submitted 18 April, 2025; originally announced April 2025.

    Comments: Accepted by CVPR 2025 NTIRE Workshop, Retinex Guidance, Histogram Transformer

  15. arXiv:2503.14513  [pdf]

    cs.CV cs.AI eess.IV

    Synthetic Data Generation of Body Motion Data by Neural Gas Network for Emotion Recognition

    Authors: Seyed Muhammad Hossein Mousavi

    Abstract: In the domain of emotion recognition using body motion, the primary challenge lies in the scarcity of diverse and generalizable datasets. Automatic emotion recognition uses machine learning and artificial intelligence techniques to recognize a person's emotional state from various data types, such as text, images, sound, and body motion. Body motion poses unique challenges as many factors, such as…

    Submitted 11 March, 2025; originally announced March 2025.

    Comments: 18 pages

    MSC Class: A.I

  16. arXiv:2503.13495  [pdf, other]

    eess.SP cs.LG

    TransECG: Leveraging Transformers for Explainable ECG Re-identification Risk Analysis

    Authors: Ziyu Wang, Elahe Khatibi, Kianoosh Kazemi, Iman Azimi, Sanaz Mousavi, Shaista Malik, Amir M. Rahmani

    Abstract: Electrocardiogram (ECG) signals are widely shared across multiple clinical applications for diagnosis, health monitoring, and biometric authentication. While valuable for healthcare, they also carry unique biometric identifiers that pose privacy risks, especially when ECG data shared across multiple entities. These risks are amplified in shared environments, where re-identification threats can com…

    Submitted 11 March, 2025; originally announced March 2025.

  17. arXiv:2503.12301  [pdf, ps, other]

    cs.LG cs.CL

    One Goal, Many Challenges: Robust Preference Optimization Amid Content-Aware and Multi-Source Noise

    Authors: Amirabbas Afzali, Amirhossein Afsharrad, Seyed Shahabeddin Mousavi, Sanjay Lall

    Abstract: Large Language Models (LLMs) have made significant strides in generating human-like responses, largely due to preference alignment techniques. However, these methods often assume unbiased human feedback, which is rarely the case in real-world scenarios. This paper introduces Content-Aware Noise-Resilient Preference Optimization (CNRPO), a novel framework that addresses multiple sources of content-…

    Submitted 15 September, 2025; v1 submitted 15 March, 2025; originally announced March 2025.

  18. arXiv:2503.02228  [pdf, other]

    cs.CV cs.AI

    One Patient's Annotation is Another One's Initialization: Towards Zero-Shot Surgical Video Segmentation with Cross-Patient Initialization

    Authors: Seyed Amir Mousavi, Utku Ozbulak, Francesca Tozzi, Nikdokht Rashidian, Wouter Willaert, Joris Vankerschaver, Wesley De Neve

    Abstract: Video object segmentation is an emerging technology that is well-suited for real-time surgical video segmentation, offering valuable clinical assistance in the operating room by ensuring consistent frame tracking. However, its adoption is limited by the need for manual intervention to select the tracked object, making it impractical in surgical settings. In this work, we tackle this challenge with…

    Submitted 3 March, 2025; originally announced March 2025.

  19. arXiv:2502.20934  [pdf, ps, other]

    cs.CV cs.AI

    Revisiting the Evaluation Bias Introduced by Frame Sampling Strategies in Surgical Video Segmentation Using SAM2

    Authors: Utku Ozbulak, Seyed Amir Mousavi, Francesca Tozzi, Niki Rashidian, Wouter Willaert, Wesley De Neve, Joris Vankerschaver

    Abstract: Real-time video segmentation is a promising opportunity for AI-assisted surgery, offering intraoperative guidance by identifying tools and anatomical structures. Despite growing interest in surgical video segmentation, annotation protocols vary widely across datasets -- some provide dense, frame-by-frame labels, while others rely on sparse annotations sampled at low frame rates such as 1 FPS. In t…

    Submitted 30 July, 2025; v1 submitted 28 February, 2025; originally announced February 2025.

    Comments: Accepted for publication in the 28th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) Workshop on Fairness of AI in Medical Imaging (FAIMI), 2025

  20. arXiv:2502.11208  [pdf, other]

    cs.CY cs.HC

    Setting the Course, but Forgetting to Steer: Analyzing Compliance with GDPR's Right of Access to Data by Instagram, TikTok, and YouTube

    Authors: Sai Keerthana Karnam, Abhisek Dash, Sepehr Mousavi, Stefan Bechtold, Krishna P. Gummadi, Animesh Mukherjee, Ingmar Weber, Savvas Zannettou

    Abstract: The comprehensibility and reliability of data download packages (DDPs) provided under the General Data Protection Regulation's (GDPR) right of access are vital for both individuals and researchers. These DDPs enable users to understand and control their personal data, yet issues like complexity and incomplete information often limit their utility. Also, despite their growing use in research to stu…

    Submitted 16 February, 2025; originally announced February 2025.

    Comments: This is a work in progress and the draft may undergo some changes in future

  21. arXiv:2502.08337  [pdf]

    cs.LG cs.AI eess.SY

    Hierarchical Multi-Agent Framework for Carbon-Efficient Liquid-Cooled Data Center Clusters

    Authors: Soumyendu Sarkar, Avisek Naug, Antonio Guillen, Vineet Gundecha, Ricardo Luna Gutierrez, Sahand Ghorbanpour, Sajad Mousavi, Ashwin Ramesh Babu, Desik Rengarajan, Cullen Bash

    Abstract: Reducing the environmental impact of cloud computing requires efficient workload distribution across geographically dispersed Data Center Clusters (DCCs) and simultaneously optimizing liquid and air (HVAC) cooling with time shift of workloads within individual data centers (DC). This paper introduces Green-DCC, which proposes a Reinforcement Learning (RL) based hierarchical controller to optimize…

    Submitted 12 February, 2025; originally announced February 2025.

  22. arXiv:2501.19205  [pdf, other]

    cs.LG

    RIGNO: A Graph-based framework for robust and accurate operator learning for PDEs on arbitrary domains

    Authors: Sepehr Mousavi, Shizheng Wen, Levi Lingsch, Maximilian Herde, Bogdan Raonić, Siddhartha Mishra

    Abstract: Learning the solution operators of PDEs on arbitrary domains is challenging due to the diversity of possible domain shapes, in addition to the often intricate underlying physics. We propose an end-to-end graph neural network (GNN) based neural operator to learn PDE solution operators from data on point clouds in arbitrary domains. Our multi-scale model maps data between input/output point clouds b…

    Submitted 31 January, 2025; originally announced January 2025.

  23. arXiv:2501.16353  [pdf]

    cs.NE cs.AI cs.LG eess.SP

    Synthetic Data Generation by Supervised Neural Gas Network for Physiological Emotion Recognition Data

    Authors: S. Muhammad Hossein Mousavi

    Abstract: Data scarcity remains a significant challenge in the field of emotion recognition using physiological signals, as acquiring comprehensive and diverse datasets is often prevented by privacy concerns and logistical constraints. This limitation restricts the development and generalization of robust emotion recognition models, making the need for effective synthetic data generation methods more critic…

    Submitted 19 January, 2025; originally announced January 2025.

    Comments: 14 pages

  24. arXiv:2501.15539  [pdf, other]

    cs.SI cs.CY

    Studying Behavioral Addiction by Combining Surveys and Digital Traces: A Case Study of TikTok

    Authors: Cai Yang, Sepehr Mousavi, Abhisek Dash, Krishna P. Gummadi, Ingmar Weber

    Abstract: Opaque algorithms disseminate and mediate the content that users consume on online social media platforms. This algorithmic mediation serves users with contents of their liking, on the other hand, it may cause several inadvertent risks to society at scale. While some of these risks, e.g., filter bubbles or dissemination of hateful content, are well studied in the community, behavioral addiction, d…

    Submitted 26 January, 2025; originally announced January 2025.

    Comments: Accepted at ICWSM 2025, to appear

  25. arXiv:2501.14122  [pdf]

    cs.LG cs.AI cs.CR cs.CV

    Reinforcement Learning Platform for Adversarial Black-box Attacks with Custom Distortion Filters

    Authors: Soumyendu Sarkar, Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Sahand Ghorbanpour, Avisek Naug, Ricardo Luna Gutierrez, Antonio Guillen

    Abstract: We present a Reinforcement Learning Platform for Adversarial Black-box untargeted and targeted attacks, RLAB, that allows users to select from various distortion filters to create adversarial examples. The platform uses a Reinforcement Learning agent to add minimum distortion to input images while still causing misclassification by the target model. The agent uses a novel dual-action method to exp…

    Submitted 15 April, 2025; v1 submitted 23 January, 2025; originally announced January 2025.

    Comments: Accepted at the 2025 AAAI Conference on Artificial Intelligence Proceedings

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, Volume 39, 2025

  26. arXiv:2501.12774  [pdf, other]

    cs.CL

    LLMs as Repositories of Factual Knowledge: Limitations and Solutions

    Authors: Seyed Mahed Mousavi, Simone Alghisi, Giuseppe Riccardi

    Abstract: LLMs' sources of knowledge are data snapshots containing factual information about entities collected at different timestamps and from different media types (e.g. wikis, social media, etc.). Such unstructured knowledge is subject to change due to updates through time from past to present. Equally important are the inconsistencies and inaccuracies occurring in different information sources. Consequ…

    Submitted 22 January, 2025; originally announced January 2025.

  27. arXiv:2412.18409  [pdf, other]

    cs.CV

    The Impact of the Single-Label Assumption in Image Recognition Benchmarking

    Authors: Esla Timothy Anzaku, Seyed Amir Mousavi, Arnout Van Messem, Wesley De Neve

    Abstract: Deep neural networks (DNNs) are typically evaluated under the assumption that each image has a single correct label. However, many images in benchmarks like ImageNet contain multiple valid labels, creating a mismatch between evaluation protocols and the actual complexity of visual data. This mismatch can penalize DNNs for predicting correct but unannotated labels, which may partly explain reported…

    Submitted 27 May, 2025; v1 submitted 24 December, 2024; originally announced December 2024.

    Comments: 34 pages, 7 figures

  28. arXiv:2412.05576  [pdf, ps, other]

    cs.LG cs.CE cs.NE physics.flu-dyn

    STONet: A neural operator for modeling solute transport in micro-cracked reservoirs

    Authors: Ehsan Haghighat, Mohammad Hesan Adeli, S Mohammad Mousavi, Ruben Juanes

    Abstract: In this work, we introduce a novel neural operator, the Solute Transport Operator Network (STONet), to efficiently model contaminant transport in micro-cracked porous media. STONet's model architecture is specifically designed for this problem and uniquely integrates an enriched DeepONet structure with a transformer-based multi-head attention mechanism, enhancing performance without incurring addi…

    Submitted 1 July, 2025; v1 submitted 7 December, 2024; originally announced December 2024.

  29. arXiv:2408.10228  [pdf, other]

    eess.SP cs.LG

    ECG Unveiled: Analysis of Client Re-identification Risks in Real-World ECG Datasets

    Authors: Ziyu Wang, Anil Kanduri, Seyed Amir Hossein Aqajari, Salar Jafarlou, Sanaz R. Mousavi, Pasi Liljeberg, Shaista Malik, Amir M. Rahmani

    Abstract: While ECG data is crucial for diagnosing and monitoring heart conditions, it also contains unique biometric information that poses significant privacy risks. Existing ECG re-identification studies rely on exhaustive analysis of numerous deep learning features, confining to ad-hoc explainability towards clinicians decision making. In this work, we delve into explainability of ECG re-identification…

    Submitted 2 August, 2024; originally announced August 2024.

  30. arXiv:2408.07841  [pdf]

    cs.LG cs.AI eess.SY

    SustainDC: Benchmarking for Sustainable Data Center Control

    Authors: Avisek Naug, Antonio Guillen, Ricardo Luna, Vineet Gundecha, Desik Rengarajan, Sahand Ghorbanpour, Sajad Mousavi, Ashwin Ramesh Babu, Dejan Markovikj, Lekhapriya D Kashyap, Soumyendu Sarkar

    Abstract: Machine learning has driven an exponential increase in computational demand, leading to massive data centers that consume significant amounts of energy and contribute to climate change. This makes sustainable data center control a priority. In this paper, we introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DC)…

    Submitted 30 April, 2025; v1 submitted 14 August, 2024; originally announced August 2024.

    Comments: Accepted at Advances in Neural Information Processing Systems 2024 (NeurIPS 2024)

    Report number: volume 37, year 2024, pages 100630-100669

    Journal ref: Advances in Neural Information Processing Systems 37 (NeurIPS 2024)

  31. arXiv:2408.05639  [pdf]

    cs.AR cs.AI

    Enhancing Computational Efficiency in Intensive Domains via Redundant Residue Number Systems

    Authors: Soudabeh Mousavi, Dara Rahmati, Saeid Gorgin, Jeong-A Lee

    Abstract: In computation-intensive domains such as digital signal processing, encryption, and neural networks, the performance of arithmetic units, including adders and multipliers, is pivotal. Conventional numerical systems often fall short of meeting the efficiency requirements of these applications concerning area, time, and power consumption. Innovative approaches like residue number systems (RNS) and r…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: This paper has been accepted by the 21st International SoC Conference (ISOCC), 2024, 2 pages

  32. arXiv:2407.14202  [pdf, other]

    cs.NE cs.AI

    SHS: Scorpion Hunting Strategy Swarm Algorithm

    Authors: Abhilash Singh, Seyed Muhammad Hossein Mousavi, Kumar Gaurav

    Abstract: We introduced the Scorpion Hunting Strategy (SHS), a novel population-based, nature-inspired optimisation algorithm. This algorithm draws inspiration from the hunting strategy of scorpions, which identify, locate, and capture their prey using the alpha and beta vibration operators. These operators control the SHS algorithm's exploitation and exploration abilities. To formulate an optimisation meth…

    Submitted 30 August, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

  33. arXiv:2407.09950  [pdf]

    cs.LG cs.NE

    PSO Fuzzy XGBoost Classifier Boosted with Neural Gas Features on EEG Signals in Emotion Recognition

    Authors: Seyed Muhammad Hossein Mousavi

    Abstract: Emotion recognition is the technology-driven process of identifying and categorizing human emotions from various data sources, such as facial expressions, voice patterns, body motion, and physiological signals, such as EEG. These physiological indicators, though rich in data, present challenges due to their complexity and variability, necessitating sophisticated feature selection and extraction me…

    Submitted 30 August, 2024; v1 submitted 13 July, 2024; originally announced July 2024.

    Comments: PSO, Fuzzy, XGBoost, Neural Gas Network (NGN), Feature Selection, EEG Signals, Emotion Recognition

  34. The Magic XRoom: A Flexible VR Platform for Controlled Emotion Elicitation and Recognition

    Authors: S. M. Hossein Mousavi, Matteo Besenzoni, Davide Andreoletti, Achille Peternier, Silvia Giordano

    Abstract: Affective computing has recently gained popularity, especially in the field of human-computer interaction systems, where effectively evoking and detecting emotions is of paramount importance to enhance users experience. However, several issues are hindering progress in the field. In fact, the complexity of emotions makes it difficult to understand their triggers and control their elicitation. Addi…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Proceedings of the 25th International Conference on Mobile Human-Computer Interaction

  35. arXiv:2407.05189  [pdf]

    cs.CL

    Enhancing Language Learning through Technology: Introducing a New English-Azerbaijani (Arabic Script) Parallel Corpus

    Authors: Jalil Nourmohammadi Khiarak, Ammar Ahmadi, Taher Ak-bari Saeed, Meysam Asgari-Chenaghlu, Toğrul Atabay, Mohammad Reza Baghban Karimi, Ismail Ceferli, Farzad Hasanvand, Seyed Mahboub Mousavi, Morteza Noshad

    Abstract: This paper introduces a pioneering English-Azerbaijani (Arabic Script) parallel corpus, designed to bridge the technological gap in language learning and machine translation (MT) for under-resourced languages. Consisting of 548,000 parallel sentences and approximately 9 million words per language, this dataset is derived from diverse sources such as news articles and holy texts, aiming to enhance…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: This paper is accepted and published at NeTTT 2024 Conf

  36. arXiv:2407.00463  [pdf, other]

    cs.LG cs.AI cs.CL cs.HC eess.AS

    Open-Source Conversational AI with SpeechBrain 1.0

    Authors: Mirco Ravanelli, Titouan Parcollet, Adel Moumen, Sylvain de Langen, Cem Subakan, Peter Plantinga, Yingzhi Wang, Pooneh Mousavi, Luca Della Libera, Artem Ploujnikov, Francesco Paissan, Davide Borra, Salah Zaiem, Zeyu Zhao, Shucong Zhang, Georgios Karakasidis, Sung-Lin Yeh, Pierre Champion, Aku Rouhe, Rudolf Braun, Florian Mai, Juan Zuluaga-Gomez, Seyed Mahed Mousavi, Andreas Nautsch, Ha Nguyen, et al. (8 additional authors not shown)

    Abstract: SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and replicability by releasing both the pre-trained models and the complete "recipes" of code and algorithms required for training them. This paper prese…

    Submitted 16 October, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

    Comments: Accepted to the Journal of Machine Learning Research (JMLR), Machine Learning Open Source Software

  37. arXiv:2406.06399  [pdf, other]

    cs.CL cs.AI

    Should We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue

    Authors: Simone Alghisi, Massimo Rizzoli, Gabriel Roccabruna, Seyed Mahed Mousavi, Giuseppe Riccardi

    Abstract: We study the limitations of Large Language Models (LLMs) for the task of response generation in human-machine dialogue. Several techniques have been proposed in the literature for different dialogue types (e.g., Open-Domain). However, the evaluations of these techniques have been limited in terms of base LLMs, dialogue types and evaluation metrics. In this work, we extensively analyze different LL…

    Submitted 3 August, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted at INLG 2024

  38. arXiv:2406.06313  [pdf, other]

    cs.LG

    ProAct: Progressive Training for Hybrid Clipped Activation Function to Enhance Resilience of DNNs

    Authors: Seyedhamidreza Mousavi, Mohammad Hasan Ahmadilivani, Jaan Raik, Maksim Jenihhin, Masoud Daneshtalab

    Abstract: Deep Neural Networks (DNNs) are extensively employed in safety-critical applications where ensuring hardware reliability is a primary concern. To enhance the reliability of DNNs against hardware faults, activation restriction techniques significantly mitigate the fault effects at the DNN structure level, irrespective of accelerator architectures. State-of-the-art methods offer either neuron-wise o…

    Submitted 10 June, 2024; originally announced June 2024.

  39. arXiv:2405.18732  [pdf, other]

    physics.geo-ph cs.AI cs.LG physics.app-ph

    Gemini & Physical World: Large Language Models Can Estimate the Intensity of Earthquake Shaking from Multi-Modal Social Media Posts

    Authors: S. Mostafa Mousavi, Marc Stogaitis, Tajinder Gadh, Richard M Allen, Alexei Barski, Robert Bosch, Patrick Robertson, Nivetha Thiruverahan, Youngmin Cho, Aman Raj

    Abstract: This paper presents a novel approach to extract scientifically valuable information about Earth's physical phenomena from unconventional sources, such as multi-modal social media posts. Employing a state-of-the-art large language model (LLM), Gemini 1.5 Pro (Reid et al. 2024), we estimate earthquake ground shaking intensity from these unstructured posts. The model's output, in the form of Modified…

    Submitted 14 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  40. arXiv:2405.10658  [pdf, other]

    cs.LG

    Cost-Effective Fault Tolerance for CNNs Using Parameter Vulnerability Based Hardening and Pruning

    Authors: Mohammad Hasan Ahmadilivani, Seyedhamidreza Mousavi, Jaan Raik, Masoud Daneshtalab, Maksim Jenihhin

    Abstract: Convolutional Neural Networks (CNNs) have become integral in safety-critical applications, thus raising concerns about their fault tolerance. Conventional hardware-dependent fault tolerance methods, such as Triple Modular Redundancy (TMR), are computationally expensive, imposing a remarkable overhead on CNNs. Whereas fault tolerance techniques can be applied either at the hardware level or at the…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 7 pages, 7 figures, 2 tables, 32 references, the paper is accepted at IOLTS 2024

  41. arXiv:2404.16198  [pdf]

    cs.CL

    Towards Efficient Patient Recruitment for Clinical Trials: Application of a Prompt-Based Learning Model

    Authors: Mojdeh Rahmanian, Seyed Mostafa Fakhrahmad, Seyedeh Zahra Mousavi

    Abstract: Objective: Clinical trials are essential for advancing pharmaceutical interventions, but they face a bottleneck in selecting eligible participants. Although leveraging electronic health records (EHR) for recruitment has gained popularity, the complex nature of unstructured medical texts presents challenges in efficiently identifying participants. Natural Language Processing (NLP) techniques have e…

    Submitted 24 April, 2024; originally announced April 2024.

    ACM Class: I.7

  42. arXiv:2404.12498  [pdf]

    cs.LG cs.AI eess.SY

    A Configurable Pythonic Data Center Model for Sustainable Cooling and ML Integration

    Authors: Avisek Naug, Antonio Guillen, Ricardo Luna Gutierrez, Vineet Gundecha, Sahand Ghorbanpour, Sajad Mousavi, Ashwin Ramesh Babu, Soumyendu Sarkar

    Abstract: There have been growing discussions on estimating and subsequently reducing the operational carbon footprint of enterprise data centers. The design and intelligent control for data centers have an important impact on data center carbon footprint. In this paper, we showcase PyDCM, a Python library that enables extremely fast prototyping of data center design and applies reinforcement learning-enabl…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: NeurIPS 2023 Workshop on Tackling Climate Change with Machine Learning https://www.climatechange.ai/papers/neurips2023/15. arXiv admin note: substantial text overlap with arXiv:2310.03906

  43. arXiv:2404.10786  [pdf]

    cs.DC cs.AI cs.LG cs.MA eess.SY

    Sustainability of Data Center Digital Twins with Reinforcement Learning

    Authors: Soumyendu Sarkar, Avisek Naug, Antonio Guillen, Ricardo Luna, Vineet Gundecha, Ashwin Ramesh Babu, Sajad Mousavi

    Abstract: The rapid growth of machine learning (ML) has led to an increased demand for computational power, resulting in larger data centers (DCs) and higher energy consumption. To address this issue and reduce carbon emissions, intelligent design and control of DC components such as IT servers, cabinets, HVAC cooling, flexible load shifting, and battery energy storage are essential. However, the complexity…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 2024 Proceedings of the AAAI Conference on Artificial Intelligence

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 20, pp. 22322-22330, Mar. 2024

  44. arXiv:2404.08700  [pdf, other]

    cs.CL cs.AI

    DyKnow: Dynamically Verifying Time-Sensitive Factual Knowledge in LLMs

    Authors: Seyed Mahed Mousavi, Simone Alghisi, Giuseppe Riccardi

    Abstract: LLMs acquire knowledge from massive data snapshots collected at different timestamps. Their knowledge is then commonly evaluated using static benchmarks. However, factual knowledge is generally subject to time-sensitive changes, and static benchmarks cannot address those cases. We present an approach to dynamically evaluate the knowledge in LLMs and their time-sensitiveness against Wikidata, a pub…

    Submitted 2 October, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  45. arXiv:2403.18985  [pdf]

    cs.LG cs.AI cs.CR cs.CV cs.MA

    Robustness and Visual Explanation for Black Box Image, Video, and ECG Signal Classification with Reinforcement Learning

    Authors: Soumyendu Sarkar, Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Avisek Naug, Sahand Ghorbanpour

    Abstract: We present a generic Reinforcement Learning (RL) framework optimized for crafting adversarial attacks on different model types spanning from ECG signal analysis (1D), image classification (2D), and video classification (3D). The framework focuses on identifying sensitive regions and inducing misclassifications with minimal distortions and various distortion types. The novel RL method outperforms s…

    Submitted 22 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: AAAI Proceedings reference: https://ojs.aaai.org/index.php/AAAI/article/view/30579

    Journal ref: 2024 Proceedings of the AAAI Conference on Artificial Intelligence

  46. arXiv:2403.14092  [pdf]

    cs.LG cs.AI cs.MA eess.SY

    Carbon Footprint Reduction for Sustainable Data Centers in Real-Time

    Authors: Soumyendu Sarkar, Avisek Naug, Ricardo Luna, Antonio Guillen, Vineet Gundecha, Sahand Ghorbanpour, Sajad Mousavi, Dejan Markovikj, Ashwin Ramesh Babu

    Abstract: As machine learning workloads significantly increase energy consumption, sustainable data centers with low carbon emissions are becoming a top priority for governments and corporations worldwide. This requires a paradigm shift in optimizing power consumption in cooling and IT loads, shifting flexible loads based on the availability of renewable energy in the power grid, and leveraging battery stor…

    Submitted 18 May, 2025; v1 submitted 20 March, 2024; originally announced March 2024.

    Journal ref: 2024 Proceedings of the AAAI Conference on Artificial Intelligence

  47. arXiv:2403.12410  [pdf]

    cs.SI

    TikTok and the Art of Personalization: Investigating Exploration and Exploitation on Social Media Feeds

    Authors: Karan Vombatkere, Sepehr Mousavi, Savvas Zannettou, Franziska Roesner, Krishna P. Gummadi

    Abstract: Recommendation algorithms for social media feeds often function as black boxes from the perspective of users. We aim to detect whether social media feed recommendations are personalized to users, and to characterize the factors contributing to personalization in these feeds. We introduce a general framework to examine a set of social media feed recommendations for a user as a timeline. We label it…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: ACM Web Conference 2024

  48. arXiv:2402.01826  [pdf, other]

    cs.CL cs.AI

    Leveraging Large Language Models for Analyzing Blood Pressure Variations Across Biological Sex from Scientific Literature

    Authors: Yuting Guo, Seyedeh Somayyeh Mousavi, Reza Sameni, Abeed Sarker

    Abstract: Hypertension, defined as blood pressure (BP) that is above normal, holds paramount significance in the realm of public health, as it serves as a critical precursor to various cardiovascular diseases (CVDs) and significantly contributes to elevated mortality rates worldwide. However, many existing BP measurement technologies and standards might be biased because they do not consider clinical outcom…

    Submitted 2 February, 2024; originally announced February 2024.

  49. arXiv:2402.01598  [pdf, other]

    q-bio.QM cs.LG stat.AP

    Learning from Two Decades of Blood Pressure Data: Demography-Specific Patterns Across 75 Million Patient Encounters

    Authors: Seyedeh Somayyeh Mousavi, Yuting Guo, Abeed Sarker, Reza Sameni

    Abstract: Hypertension is a global health concern with an increasing prevalence, underscoring the need for effective monitoring and analysis of blood pressure (BP) dynamics. We analyzed a substantial BP dataset comprising 75,636,128 records from 2,054,462 unique patients collected between 2000 and 2022 at Emory Healthcare in Georgia, USA, representing a demographically diverse population. We examined and co…

    Submitted 23 April, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  50. arXiv:2401.02297  [pdf, other]

    cs.CL

    Are LLMs Robust for Spoken Dialogues?

    Authors: Seyed Mahed Mousavi, Gabriel Roccabruna, Simone Alghisi, Massimo Rizzoli, Mirco Ravanelli, Giuseppe Riccardi

    Abstract: Large Pre-Trained Language Models have demonstrated state-of-the-art performance in different downstream tasks, including dialogue state tracking and end-to-end response generation. Nevertheless, most of the publicly available datasets and benchmarks on task-oriented dialogues focus on written conversations. Consequently, the robustness of the developed models to spoken interactions is unknown. In…

    Submitted 4 January, 2024; originally announced January 2024.