Search | arXiv e-print repository

Red Teaming Contemporary AI Models: Insights from Spanish and Basque Perspectives

Authors: Miguel Romero-Arjona, Pablo Valle, Juan C. Alonso, Ana B. Sánchez, Miriam Ugarte, Antonia Cazalilla, Vicente Cambrón, José A. Parejo, Aitor Arrieta, Sergio Segura

Abstract: The battle for AI leadership is on, with OpenAI in the United States and DeepSeek in China as key contenders. In response to these global trends, the Spanish government has proposed ALIA, a public and transparent AI infrastructure incorporating small language models designed to support Spanish and co-official languages such as Basque. This paper presents the results of Red Teaming sessions, where… ▽ More The battle for AI leadership is on, with OpenAI in the United States and DeepSeek in China as key contenders. In response to these global trends, the Spanish government has proposed ALIA, a public and transparent AI infrastructure incorporating small language models designed to support Spanish and co-official languages such as Basque. This paper presents the results of Red Teaming sessions, where ten participants applied their expertise and creativity to manually test three of the latest models from these initiatives$\unicode{x2013}$OpenAI o3-mini, DeepSeek R1, and ALIA Salamandra$\unicode{x2013}$focusing on biases and safety concerns. The results, based on 670 conversations, revealed vulnerabilities in all the models under test, with biased or unsafe responses ranging from 29.5% in o3-mini to 50.6% in Salamandra. These findings underscore the persistent challenges in developing reliable and trustworthy AI systems, particularly those intended to support Spanish and Basque languages. △ Less

Submitted 13 March, 2025; originally announced March 2025.

arXiv:2410.10609 [pdf, other]

Lambda-Skip Connections: the architectural component that prevents Rank Collapse

Authors: Federico Arangath Joseph, Jerome Sieber, Melanie N. Zeilinger, Carmen Amo Alonso

Abstract: Rank collapse, a phenomenon where embedding vectors in sequence models rapidly converge to a uniform token or equilibrium state, has recently gained attention in the deep learning literature. This phenomenon leads to reduced expressivity and potential training instabilities due to vanishing gradients. Empirical evidence suggests that architectural components like skip connections, LayerNorm, and M… ▽ More Rank collapse, a phenomenon where embedding vectors in sequence models rapidly converge to a uniform token or equilibrium state, has recently gained attention in the deep learning literature. This phenomenon leads to reduced expressivity and potential training instabilities due to vanishing gradients. Empirical evidence suggests that architectural components like skip connections, LayerNorm, and MultiLayer Perceptrons (MLPs) play critical roles in mitigating rank collapse. While this issue is well-documented for transformers, alternative sequence models, such as State Space Models (SSMs), which have recently gained prominence, have not been thoroughly examined for similar vulnerabilities. This paper extends the theory of rank collapse from transformers to SSMs using a unifying framework that captures both architectures. We study how a parametrized version of the classic skip connection component, which we call \emph{lambda-skip connections}, provides guarantees for rank collapse prevention. Through analytical results, we present a sufficient condition to guarantee prevention of rank collapse across all the aforementioned architectures. We also study the necessity of this condition via ablation studies and analytical examples. To our knowledge, this is the first study that provides a general guarantee to prevent rank collapse, and that investigates rank collapse in the context of SSMs, offering valuable understanding for both theoreticians and practitioners. Finally, we validate our findings with experiments demonstrating the crucial role of architectural components such as skip connections and gating mechanisms in preventing rank collapse. △ Less

Submitted 13 February, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

arXiv:2408.11841 [pdf, other]

doi 10.1073/pnas.2414955121

Could ChatGPT get an Engineering Degree? Evaluating Higher Education Vulnerability to AI Assistants

Authors: Beatriz Borges, Negar Foroutan, Deniz Bayazit, Anna Sotnikova, Syrielle Montariol, Tanya Nazaretzky, Mohammadreza Banaei, Alireza Sakhaeirad, Philippe Servant, Seyed Parsa Neshaei, Jibril Frej, Angelika Romanou, Gail Weiss, Sepideh Mamooler, Zeming Chen, Simin Fan, Silin Gao, Mete Ismayilzada, Debjit Paul, Alexandre Schöpfer, Andrej Janchevski, Anja Tiede, Clarence Linden, Emanuele Troiani, Francesco Salvi , et al. (65 additional authors not shown)

Abstract: AI assistants are being increasingly used by students enrolled in higher education institutions. While these tools provide opportunities for improved teaching and education, they also pose significant challenges for assessment and learning outcomes. We conceptualize these challenges through the lens of vulnerability, the potential for university assessments and learning outcomes to be impacted by… ▽ More AI assistants are being increasingly used by students enrolled in higher education institutions. While these tools provide opportunities for improved teaching and education, they also pose significant challenges for assessment and learning outcomes. We conceptualize these challenges through the lens of vulnerability, the potential for university assessments and learning outcomes to be impacted by student use of generative AI. We investigate the potential scale of this vulnerability by measuring the degree to which AI assistants can complete assessment questions in standard university-level STEM courses. Specifically, we compile a novel dataset of textual assessment questions from 50 courses at EPFL and evaluate whether two AI assistants, GPT-3.5 and GPT-4 can adequately answer these questions. We use eight prompting strategies to produce responses and find that GPT-4 answers an average of 65.8% of questions correctly, and can even produce the correct answer across at least one prompting strategy for 85.1% of questions. When grouping courses in our dataset by degree program, these systems already pass non-project assessments of large numbers of core courses in various degree programs, posing risks to higher education accreditation that will be amplified as these models improve. Our results call for revising program-level assessment design in higher education in light of advances in generative AI. △ Less

Submitted 27 November, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

Comments: 20 pages, 8 figures

Journal ref: PNAS (2024) Vol. 121 | No. 49

arXiv:2405.15731 [pdf, other]

Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks

Authors: Jerome Sieber, Carmen Amo Alonso, Alexandre Didier, Melanie N. Zeilinger, Antonio Orvieto

Abstract: Softmax attention is the principle backbone of foundation models for various artificial intelligence applications, yet its quadratic complexity in sequence length can limit its inference throughput in long-context settings. To address this challenge, alternative architectures such as linear attention, State Space Models (SSMs), and Recurrent Neural Networks (RNNs) have been considered as more effi… ▽ More Softmax attention is the principle backbone of foundation models for various artificial intelligence applications, yet its quadratic complexity in sequence length can limit its inference throughput in long-context settings. To address this challenge, alternative architectures such as linear attention, State Space Models (SSMs), and Recurrent Neural Networks (RNNs) have been considered as more efficient alternatives. While connections between these approaches exist, such models are commonly developed in isolation and there is a lack of theoretical understanding of the shared principles underpinning these architectures and their subtle differences, greatly influencing performance and scalability. In this paper, we introduce the Dynamical Systems Framework (DSF), which allows a principled investigation of all these architectures in a common representation. Our framework facilitates rigorous comparisons, providing new insights on the distinctive characteristics of each model class. For instance, we compare linear attention and selective SSMs, detailing their differences and conditions under which both are equivalent. We also provide principled comparisons between softmax attention and other model classes, discussing the theoretical conditions under which softmax attention can be approximated. Additionally, we substantiate these new insights with empirical validations and mathematical arguments. This shows the DSF's potential to guide the systematic development of future more efficient and scalable foundation models. △ Less

Submitted 8 December, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

Comments: NeurIPS 2024

arXiv:2405.15454 [pdf, other]

Linearly Controlled Language Generation with Performative Guarantees

Authors: Emily Cheng, Marco Baroni, Carmen Amo Alonso

Abstract: The increasing prevalence of Large Language Models (LMs) in critical applications highlights the need for controlled language generation strategies that are not only computationally efficient but that also enjoy performance guarantees. To achieve this, we use a common model of concept semantics as linearly represented in an LM's latent space. In particular, we take the view that natural language g… ▽ More The increasing prevalence of Large Language Models (LMs) in critical applications highlights the need for controlled language generation strategies that are not only computationally efficient but that also enjoy performance guarantees. To achieve this, we use a common model of concept semantics as linearly represented in an LM's latent space. In particular, we take the view that natural language generation traces a trajectory in this continuous semantic space, realized by the language model's hidden activations. This view permits a control-theoretic treatment of text generation in latent space, in which we propose a lightweight, gradient-free intervention that dynamically steers trajectories away from regions corresponding to undesired meanings. Crucially, we show that this intervention, which we compute in closed form, is guaranteed (in probability) to steer the output into the allowed region. Finally, we demonstrate on a toxicity avoidance objective that the intervention steers language away from undesired content while maintaining text quality. △ Less

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2403.16899 [pdf, other]

State Space Models as Foundation Models: A Control Theoretic Overview

Authors: Carmen Amo Alonso, Jerome Sieber, Melanie N. Zeilinger

Abstract: In recent years, there has been a growing interest in integrating linear state-space models (SSM) in deep neural network architectures of foundation models. This is exemplified by the recent success of Mamba, showing better performance than the state-of-the-art Transformer architectures in language tasks. Foundation models, like e.g. GPT-4, aim to encode sequential data into a latent space in orde… ▽ More In recent years, there has been a growing interest in integrating linear state-space models (SSM) in deep neural network architectures of foundation models. This is exemplified by the recent success of Mamba, showing better performance than the state-of-the-art Transformer architectures in language tasks. Foundation models, like e.g. GPT-4, aim to encode sequential data into a latent space in order to learn a compressed representation of the data. The same goal has been pursued by control theorists using SSMs to efficiently model dynamical systems. Therefore, SSMs can be naturally connected to deep sequence modeling, offering the opportunity to create synergies between the corresponding research areas. This paper is intended as a gentle introduction to SSM-based architectures for control theorists and summarizes the latest research developments. It provides a systematic review of the most successful SSM proposals and highlights their main features from a control theoretic perspective. Additionally, we present a comparative analysis of these models, evaluating their performance on a standardized benchmark designed for assessing a model's efficiency at learning long sequences. △ Less

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.10762 [pdf, other]

NARRATE: Versatile Language Architecture for Optimal Control in Robotics

Authors: Seif Ismail, Antonio Arbues, Ryan Cotterell, René Zurbrügg, Carmen Amo Alonso

Abstract: The impressive capabilities of Large Language Models (LLMs) have led to various efforts to enable robots to be controlled through natural language instructions, opening exciting possibilities for human-robot interaction The goal is for the motor-control task to be performed accurately, efficiently and safely while also enjoying the flexibility imparted by LLMs to specify and adjust the task throug… ▽ More The impressive capabilities of Large Language Models (LLMs) have led to various efforts to enable robots to be controlled through natural language instructions, opening exciting possibilities for human-robot interaction The goal is for the motor-control task to be performed accurately, efficiently and safely while also enjoying the flexibility imparted by LLMs to specify and adjust the task through natural language. In this work, we demonstrate how a careful layering of an LLM in combination with a Model Predictive Control (MPC) formulation allows for accurate and flexible robotic control via natural language while taking into consideration safety constraints. In particular, we rely on the LLM to effectively frame constraints and objective functions as mathematical expressions, which are later used in the motor-control module via MPC. The transparency of the optimization formulation allows for interpretability of the task and enables adjustments through human feedback. We demonstrate the validity of our method through extensive experiments on long-horizon reasoning, contact-rich, and multi-object interaction tasks. Our evaluations show that NARRATE outperforms current existing methods on these benchmarks and effectively transfers to the real world on two different embodiments. Videos, Code and Prompts at narrate-mpc.github.io △ Less

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2306.08004 [pdf]

Detection and classification of faults aimed at preventive maintenance of PV systems

Authors: Edgar Hernando Sepúlveda Oviedo, Louise Travé-Massuyès, Audine Subias, Marko Pavlov, Corinne Alonso

Abstract: Diagnosis in PV systems aims to detect, locate and identify faults. Diagnosing these faults is vital to guarantee energy production and extend the useful life of PV power plants. In the literature, multiple machine learning approaches have been proposed for this purpose. However, few of these works have paid special attention to the detection of fine faults and the specialized process of extractio… ▽ More Diagnosis in PV systems aims to detect, locate and identify faults. Diagnosing these faults is vital to guarantee energy production and extend the useful life of PV power plants. In the literature, multiple machine learning approaches have been proposed for this purpose. However, few of these works have paid special attention to the detection of fine faults and the specialized process of extraction and selection of features for their classification. A fine fault is one whose characteristic signature is difficult to distinguish to that of a healthy panel. As a contribution to the detection of fine faults (especially of the snail trail type), this article proposes an innovative approach based on the Random Forest (RF) algorithm. This approach uses a complex feature extraction and selection method that improves the computational time of fault classification while maintaining high accuracy. △ Less

Submitted 13 June, 2023; originally announced June 2023.

Journal ref: XI Congreso Internacional de Ingenier{í}a Mec{á}nica, Mecatr{ó}nica y Automatizaci{ó}n 2023, Universidad Nacional de Colombia, Apr 2023, Carthag{è}ne, Colombia

arXiv:2306.08003 [pdf]

DTW k-means clustering for fault detection in photovoltaic modules

Authors: Edgar Hernando Sepúlveda Oviedo, Louise Travé-Massuyès, Audine Subias, Marko Pavlov, Corinne Alonso

Abstract: The increase in the use of photovoltaic (PV) energy in the world has shown that the useful life and maintenance of a PV plant directly depend on theability to quickly detect severe faults on a PV plant. To solve this problem of detection, data based approaches have been proposed in the literature.However, these previous solutions consider only specific behavior of one or few faults. Most of these… ▽ More The increase in the use of photovoltaic (PV) energy in the world has shown that the useful life and maintenance of a PV plant directly depend on theability to quickly detect severe faults on a PV plant. To solve this problem of detection, data based approaches have been proposed in the literature.However, these previous solutions consider only specific behavior of one or few faults. Most of these approaches can be qualified as supervised, requiring an enormous labelling effort (fault types clearly identified in each technology). In addition, most of them are validated in PV cells or one PV module. That is hardly applicable in large-scale PV plants considering their complexity. Alternatively, some unsupervised well-known approaches based on data try to detect anomalies but are not able to identify precisely the type of fault. The most performant of these methods do manage to efficiently group healthy panels and separate them from faulty panels. In that way, this article presents an unsupervised approach called DTW K-means. This approach takes advantages of both the dynamic time warping (DWT) metric and the Kmeans clustering algorithm as a data-driven approach. The results of this mixed method in a PV string are compared to diagnostic labels established by visual inspection of the panels. △ Less

Submitted 13 June, 2023; originally announced June 2023.

Journal ref: XI Congreso Internacional de Ingenier{í}a Mec{á}nica, Mecatr{ó}nica y Automatizaci{ó}n 2023, Apr 2023, Carthag{è}ne, Colombia

arXiv:2206.13342 [pdf, other]

Open Set Classification of Untranscribed Handwritten Documents

Authors: José Ramón Prieto, Juan José Flores, Enrique Vidal, Alejandro H. Toselli, David Garrido, Carlos Alonso

Abstract: Huge amounts of digital page images of important manuscripts are preserved in archives worldwide. The amounts are so large that it is generally unfeasible for archivists to adequately tag most of the documents with the required metadata so as to low proper organization of the archives and effective exploration by scholars and the general public. The class or ``typology'' of a document is perhaps t… ▽ More Huge amounts of digital page images of important manuscripts are preserved in archives worldwide. The amounts are so large that it is generally unfeasible for archivists to adequately tag most of the documents with the required metadata so as to low proper organization of the archives and effective exploration by scholars and the general public. The class or ``typology'' of a document is perhaps the most important tag to be included in the metadata. The technical problem is one of automatic classification of documents, each consisting of a set of untranscribed handwritten text images, by the textual contents of the images. The approach considered is based on ``probabilistic indexing'', a relatively novel technology which allows to effectively represent the intrinsic word-level uncertainty exhibited by handwritten text images. We assess the performance of this approach on a large collection of complex notarial manuscripts from the Spanish Archivo Hostórico Provincial de Cádiz, with promising results. △ Less

Submitted 20 June, 2022; originally announced June 2022.

arXiv:2103.14990 [pdf, other]

Effective GPU Parallelization of Distributed and Localized Model Predictive Control

Authors: Carmen Amo Alonso, Shih-Hao Tseng

Abstract: To effectively control large-scale distributed systems online, model predictive control (MPC) has to swiftly solve the underlying high-dimensional optimization. There are multiple techniques applied to accelerate the solving process in the literature, mainly attributed to software-based algorithmic advancements and hardware-assisted computation enhancements. However, those methods focus on arithme… ▽ More To effectively control large-scale distributed systems online, model predictive control (MPC) has to swiftly solve the underlying high-dimensional optimization. There are multiple techniques applied to accelerate the solving process in the literature, mainly attributed to software-based algorithmic advancements and hardware-assisted computation enhancements. However, those methods focus on arithmetic accelerations and overlook the benefits of the underlying system's structure. In particular, the existing decoupled software-hardware algorithm design that naively parallelizes the arithmetic operations by the hardware does not tackle the hardware overheads such as CPU-GPU and thread-to-thread communications in a principled manner. Also, the advantages of parallelizable subproblem decomposition in distributed MPC are not well recognized and exploited. As a result, we have not reached the full potential of hardware acceleration for MPC. In this paper, we explore those opportunities by leveraging GPU to parallelize the distributed and localized MPC (DLMPC) algorithm. We exploit the locality constraints embedded in the DLMPC formulation to reduce the hardware-intrinsic communication overheads. Our parallel implementation achieves up to 50x faster runtime than its CPU counterparts under various parameters. Furthermore, we find that the locality-aware GPU parallelization could halve the optimization runtime comparing to the naive acceleration. Overall, our results demonstrate the performance gains brought by software-hardware co-design with the information exchange structure in mind. △ Less

Submitted 27 March, 2021; originally announced March 2021.

Comments: Submitted to 2021 Control and Decision Conference

arXiv:1802.09353 [pdf, other]

doi 10.1109/IVS.2017.7995984

Novel Common Vehicle Information Model (CVIM) for Future Automotive Vehicle Big Data Marketplaces

Authors: Johannes Pillmann, Christian Wietfeld, Adrian Zarcula, Thomas Raugust, Daniel Calvo Alonso

Abstract: Even though connectivity services have been introduced in many of the most recent car models, access to vehicle data is currently limited due to its proprietary nature. The European project AutoMat has therefore developed an open Marketplace providing a single point of access for brand-independent vehicle data. Thereby, vehicle sensor data can be leveraged for the design and implementation of enti… ▽ More Even though connectivity services have been introduced in many of the most recent car models, access to vehicle data is currently limited due to its proprietary nature. The European project AutoMat has therefore developed an open Marketplace providing a single point of access for brand-independent vehicle data. Thereby, vehicle sensor data can be leveraged for the design and implementation of entirely new services even beyond trafficrelated applications (such as hyper-local traffic forecasts). This paper presents the architecture for a Vehicle Big Data Marketplace as enabler of cross-sectorial and innovative vehicle data services. Therefore, the novel Common Vehicle Information Model (CVIM) is defined as an open and harmonized data model, allowing the aggregation of brand-independent and generic data sets. Within this work the realization of a prototype CVIM and Marketplace implementation is presented. The two use-cases of local weather prediction and road quality measurements are introduced to show the applicability of the AutoMat concept and prototype to non-automotive application △ Less

Submitted 21 February, 2018; originally announced February 2018.

Journal ref: Intelligent Vehicles Symposium (IV), 2017 IEEE

Showing 1–12 of 12 results for author: Alonso, C