Supporting architecture evaluation for ATAM scenarios with LLMs

Authors: Rafael Capilla, J. Andrés Díaz-Pace, Yamid Ramírez, Jennifer Pérez, Vanessa Rodríguez-Horcajo

Abstract: Architecture evaluation methods have long been used to evaluate software designs. Several evaluation methods have been proposed and used to analyze tradeoffs between different quality attributes. Having competing qualities leads to conflicts in selecting which quality-attribute scenarios are the most suitable ones for an architecture to tackle and in prioritizing the scenarios required by the stakeholders. In this context, architecture evaluation is carried out manually, often involving long brainstorming sessions to decide which are the most adequate quality scenarios. To reduce this effort and make the assessment and selection of scenarios more efficient, we suggest using LLMs to partially automate evaluation activities. As a first step to validate this hypothesis, this work studies MS Copilot as an LLM tool to analyze quality scenarios suggested by students in a software architecture course and compares the students' results with the assessment provided by the LLM. Our initial study reveals that the LLM produces in most cases better and more accurate results regarding the risks, sensitivity points and tradeoff analysis of the quality scenarios. Overall, the use of generative AI has the potential to partially automate and support the architecture evaluation tasks, improving the human decision-making process.

Submitted 30 May, 2025; originally announced June 2025.

arXiv:2505.06267 [pdf, other]

AKD : Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks

Authors: Ilyas Oulkadda, Julien Perez

Abstract: The widespread adoption of Large Language Models (LLMs) for code generation, exemplified by GitHub Copilot (a coding extension powered by a Code-LLM to assist in code completion tasks) surpassing a million users, highlights the transformative potential of these tools in improving developer productivity. However, this rapid growth also underscores critical concerns regarding the quality, safety, and reliability of the code they generate. As Code-LLMs evolve, they face significant challenges, including the diminishing returns of model scaling and the scarcity of new, high-quality training data. To address these issues, this paper introduces Adversarial Knowledge Distillation (AKD), a novel approach that leverages adversarially generated synthetic datasets to distill the capabilities of larger models into smaller, more efficient ones. By systematically stress-testing and refining the reasoning capabilities of Code-LLMs, AKD provides a framework for enhancing model robustness, reliability, and security while improving their parameter-efficiency. We believe this work represents a critical step toward ensuring dependable automated code generation within the constraints of existing data and the cost-efficiency of model execution.

Submitted 5 May, 2025; originally announced May 2025.

arXiv:2505.02148 [pdf, other]

Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving

Authors: Alexey Nekrasov, Malcolm Burdorf, Stewart Worrall, Bastian Leibe, Julie Stephany Berrio Perez

Abstract: To operate safely, autonomous vehicles (AVs) need to detect and handle unexpected objects or anomalies on the road. While significant research exists for anomaly detection and segmentation in 2D, the 3D setting remains underexplored. Existing datasets lack the high-quality multimodal data typically found in AVs. This paper presents a novel dataset for anomaly segmentation in driving scenarios. To the best of our knowledge, it is the first publicly available dataset focused on road anomaly segmentation with dense 3D semantic labeling, incorporating both LiDAR and camera data, as well as sequential information to enable anomaly detection across various ranges. This capability is critical for the safe navigation of autonomous vehicles. We adapted and evaluated several baseline models for 3D segmentation, highlighting the challenges of 3D anomaly detection in driving environments. Our dataset and evaluation code will be openly available, facilitating the testing and performance comparison of different approaches.

Submitted 4 May, 2025; originally announced May 2025.

Comments: Accepted for publication at CVPR 2025. Project page: https://www.vision.rwth-aachen.de/stu-dataset

arXiv:2504.16538 [pdf, other]

Streetscape Analysis with Generative AI (SAGAI): Vision-Language Assessment and Mapping of Urban Scenes

Authors: Joan Perez, Giovanni Fusco

Abstract: Streetscapes are an essential component of urban space. Their assessment is presently either limited to morphometric properties of their mass skeleton or requires labor-intensive qualitative evaluations of visually perceived qualities. This paper introduces SAGAI: Streetscape Analysis with Generative Artificial Intelligence, a modular workflow for scoring street-level urban scenes using open-access data and vision-language models. SAGAI integrates OpenStreetMap geometries, Google Street View imagery, and a lightweight version of the LLaVA model to generate structured spatial indicators from images via customizable natural language prompts. The pipeline includes an automated mapping module that aggregates visual scores at both the point and street levels, enabling direct cartographic interpretation. It operates without task-specific training or proprietary software dependencies, supporting scalable and interpretable analysis of urban environments. Two exploratory case studies in Nice and Vienna illustrate SAGAI's capacity to produce geospatial outputs from vision-language inference. The initial results show strong performance for binary urban-rural scene classification, moderate precision in commercial feature detection, and lower, but still informative, estimates of sidewalk width. Fully deployable by any user, SAGAI can be easily adapted to a wide range of urban research themes, such as walkability, safety, or urban design, through prompt modification alone.

Submitted 23 April, 2025; originally announced April 2025.

Comments: 25 pages, 6 figures in main paper, 6 figures in appendices

ACM Class: I.2; I.4; J.4

arXiv:2504.15845 [pdf, ps, other]

Contrasting Deadlock-Free Session Processes (Extended Version)

Authors: Juan C. Jaramillo, Jorge A. Pérez

Abstract: Deadlock freedom is a crucial property for message-passing programs. Over the years, several different type systems for concurrent processes that ensure deadlock freedom have been proposed; this diversity raises the question of how they compare. We address this question, considering two type systems not covered in prior work: Kokke et al.'s HCP, a type system based on a linear logic with hypersequents, and Padovani's priority-based type system for asynchronous processes, dubbed P. Their distinctive features make formal comparisons relevant and challenging. Our findings are two-fold: (1) the hypersequent setting does not drastically change the class of deadlock-free processes induced by linear logic, and (2) we relate the classes of deadlock-free processes induced by HCP and P. We prove that our results hold under both synchronous and asynchronous communication. Our results provide new insights into the essential mechanisms involved in statically avoiding deadlocks in concurrency.

Submitted 25 April, 2025; v1 submitted 22 April, 2025; originally announced April 2025.

Comments: Full version of an ECOOP 25 paper

arXiv:2504.06774 [pdf, other]

Hybrid machine learning models based on physical patterns to accelerate CFD simulations: a short guide on autoregressive models

Authors: Arindam Sengupta, Rodrigo Abadía-Heredia, Ashton Hetherington, José Miguel Pérez, Soledad Le Clainche

Abstract: Accurate modeling of the complex dynamics of fluid flows is a fundamental challenge in computational physics and engineering. This study presents an innovative integration of High-Order Singular Value Decomposition (HOSVD) with Long Short-Term Memory (LSTM) architectures to address the complexities of reduced-order modeling (ROM) in fluid dynamics. HOSVD improves the dimensionality reduction process by preserving multidimensional structures, surpassing the limitations of Singular Value Decomposition (SVD). The methodology is tested across numerical and experimental data sets, including two- and three-dimensional (2D and 3D) cylinder wake flows, spanning both laminar and turbulent regimes. The emphasis is also on exploring how the depth and complexity of LSTM architectures contribute to improving predictive performance. Simpler architectures with a single dense layer effectively capture the periodic dynamics, demonstrating the network's ability to model non-linearities and chaotic dynamics. The addition of extra layers provides higher accuracy at minimal computational cost. These additional layers enable the network to expand its representational capacity, improving the prediction accuracy and reliability. The results demonstrate that HOSVD outperforms SVD in all tested scenarios, as evidenced by using different error metrics. Efficient mode truncation by HOSVD-based models enables the capture of complex temporal patterns, offering reliable predictions even in challenging, noise-influenced data sets. The findings underscore the adaptability and robustness of HOSVD-LSTM architectures, offering a scalable framework for modeling fluid dynamics.

Submitted 9 April, 2025; originally announced April 2025.
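
The dimensionality-reduction step named in the abstract above can be illustrated with a minimal truncated HOSVD in NumPy. This is only a sketch of the textbook decomposition, not the authors' implementation: the helper names (`unfold`, `mode_product`, `hosvd`, `reconstruct`) and the choice of ranks are assumptions made for illustration, and the LSTM forecasting stage is not shown.

```python
import numpy as np


def unfold(tensor, mode):
    """Flatten a tensor into a matrix whose rows index the given mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `matrix` (shape r x n_mode) into `tensor` along `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def hosvd(tensor, ranks):
    """Truncated HOSVD: one SVD per mode unfolding, then project the core.

    `ranks` holds the number of modes retained along each axis
    (a hypothetical truncation choice made by the caller).
    """
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Rebuild the (approximate) tensor from the core and factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out
```

For a snapshot tensor arranged, say, as (x, y, time), calling `hosvd(data, ranks)` returns a small core plus one orthonormal factor per axis. Unlike a plain SVD of flattened snapshots, each axis keeps its own set of modes, which is the "preserving multidimensional structures" property the abstract refers to.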

arXiv:2504.03814 [pdf, other]

Recursive Training Loops in LLMs: How training data properties modulate distribution shift in generated data?

Authors: Grgur Kovač, Jérémy Perez, Rémy Portelas, Peter Ford Dominey, Pierre-Yves Oudeyer

Abstract: Large language models (LLMs) are increasingly contributing to the creation of content on the Internet. This creates a feedback loop as subsequent generations of models will be trained on this generated, synthetic data. This phenomenon is receiving increasing interest, in particular because previous studies have shown that it may lead to distribution shift - models misrepresent and forget the true underlying distributions of human data they are expected to approximate (e.g. resulting in a drastic loss of quality). In this study, we examine the impact of human data properties on distribution shift dynamics in iterated training loops. We first confirm that the distribution shift dynamics greatly vary depending on the human data by comparing four datasets (two based on Twitter and two on Reddit). We then test whether data quality may influence the rate of this shift. We find that it does on the Twitter datasets, but not on the Reddit datasets. We then focus on a Reddit dataset and conduct a more exhaustive evaluation of a large set of dataset properties. This experiment associated lexical diversity with larger, and semantic diversity with smaller detrimental shifts, suggesting that incorporating text with high lexical (but limited semantic) diversity could exacerbate the degradation of generated text. We then focus on the evolution of political bias, and find that the type of shift observed (bias reduction, amplification or inversion) depends on the political lean of the human (true) distribution. Overall, our work extends the existing literature on the consequences of recursive fine-tuning by showing that this phenomenon is highly dependent on features of the human data on which training occurs. This suggests that different parts of the internet (e.g. GitHub, Reddit) may undergo different types of shift depending on their properties.

Submitted 8 April, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

MSC Class: 68T50

ACM Class: I.2.7

arXiv:2502.14156 [pdf, other]

Mixed Signals: A Diverse Point Cloud Dataset for Heterogeneous LiDAR V2X Collaboration

Authors: Katie Z Luo, Minh-Quan Dao, Zhenzhen Liu, Mark Campbell, Wei-Lun Chao, Kilian Q. Weinberger, Ezio Malis, Vincent Fremont, Bharath Hariharan, Mao Shan, Stewart Worrall, Julie Stephany Berrio Perez

Abstract: Vehicle-to-everything (V2X) collaborative perception has emerged as a promising solution to address the limitations of single-vehicle perception systems. However, existing V2X datasets are limited in scope, diversity, and quality. To address these gaps, we present Mixed Signals, a comprehensive V2X dataset featuring 45.1k point clouds and 240.6k bounding boxes collected from three connected autonomous vehicles (CAVs) equipped with two different types of LiDAR sensors, plus a roadside unit with dual LiDARs. Our dataset provides precisely aligned point clouds and bounding box annotations across 10 classes, ensuring reliable data for perception training. We provide detailed statistical analysis on the quality of our dataset and extensively benchmark existing V2X methods on it. The Mixed Signals V2X Dataset is one of the highest-quality, large-scale datasets publicly available for V2X perception research. Details are available on the website: https://mixedsignalsdataset.cs.cornell.edu/.

Submitted 19 February, 2025; originally announced February 2025.

arXiv:2502.10567 [pdf, other]

Efficient Hierarchical Contrastive Self-supervising Learning for Time Series Classification via Importance-aware Resolution Selection

Authors: Kevin Garcia, Juan Manuel Perez, Yifeng Gao

Abstract: Recently, there has been a significant advancement in designing Self-Supervised Learning (SSL) frameworks for time series data to reduce the dependency on data labels. Among these works, hierarchical contrastive learning-based SSL frameworks, which learn representations by contrasting data embeddings at multiple resolutions, have gained considerable attention. Due to their ability to gather more information, they exhibit better generalization in various downstream tasks. However, when the time series is very long, the computational cost is often significantly higher than that of other SSL frameworks. In this paper, to address this challenge, we propose an efficient way to train hierarchical contrastive learning models. Inspired by the fact that the data embeddings at different resolutions are highly dependent, we introduce an importance-aware resolution selection based training framework to reduce the computational cost. In the experiment, we demonstrate that the proposed method significantly improves training time while preserving the original model's integrity in extensive time series classification performance evaluations. Our code can be found at https://github.com/KEEBVIN/IARS

Submitted 14 February, 2025; originally announced February 2025.

Comments: Appears in IEEEBigData-2024

ACM Class: I.2

arXiv:2412.08232 [pdf, ps, other]

doi 10.4204/EPTCS.414.1

A Gentle Overview of Asynchronous Session-based Concurrency: Deadlock Freedom by Typing

Authors: Bas van den Heuvel, Jorge A. Pérez

Abstract: While formal models of concurrency tend to focus on synchronous communication, asynchronous communication is relevant in practice. In this paper, we will discuss asynchronous communication in the context of session-based concurrency, the model of computation in which session types specify the structure of the two-party protocols implemented by the channels of a communicating process. We overview recent work on addressing the challenge of ensuring the deadlock-freedom property for message-passing processes that communicate asynchronously in cyclic process networks governed by session types. We offer a gradual presentation of three typed process frameworks and outline how they may be used to guarantee deadlock freedom for a concurrent functional language with sessions.

Submitted 11 December, 2024; originally announced December 2024.

Comments: In Proceedings ICE 2024, arXiv:2412.07570

ACM Class: D.1.3; D.2.4; D.3.1

Journal ref: EPTCS 414, 2024, pp. 1-20

arXiv:2412.04728 [pdf, other]

doi 10.1145/3638380.3638440

Robots in the Wild: Contextually-Adaptive Human-Robot Interactions in Urban Public Environments

Authors: Xinyan Yu, Yiyuan Wang, Tram Thi Minh Tran, Yi Zhao, Julie Stephany Berrio Perez, Marius Hoggenmuller, Justine Humphry, Lian Loke, Lynn Masuda, Callum Parker, Martin Tomitsch, Stewart Worrall

Abstract: The increasing transition of human-robot interaction (HRI) context from controlled settings to dynamic, real-world public environments calls for enhanced adaptability in robotic systems. This can go beyond algorithmic navigation or traditional HRI strategies in structured settings, requiring the ability to navigate complex public urban systems containing multifaceted dynamics and various socio-technical needs. Therefore, our proposed workshop seeks to extend the boundaries of adaptive HRI research beyond predictable, semi-structured contexts and highlight opportunities for adaptable robot interactions in urban public environments. This half-day workshop aims to explore design opportunities and challenges in creating contextually-adaptive HRI within these spaces and establish a network of interested parties within the OzCHI research community. By fostering ongoing discussions, sharing of insights, and collaborations, we aim to catalyse future research that empowers robots to navigate the inherent uncertainties and complexities of real-world public interactions.

Submitted 9 December, 2024; v1 submitted 5 December, 2024; originally announced December 2024.

arXiv:2412.02384 [pdf, other]

Theory building for empirical software engineering in qualitative research: Operationalization

Authors: Jorge Pérez, Jessica Díaz, Ángel González-Prieto, Sergio Gil-Borrás

Abstract: Context: This work is part of a research project whose ultimate goal is to systematize theory building in qualitative research in the field of software engineering. The proposed methodology involves four phases: conceptualization, operationalization, testing, and application. In previous work, we performed the conceptualization of a theory that investigates the structure of IT departments and teams when software-intensive organizations adopt a culture called DevOps. Objective: This paper presents a set of procedures to systematize the operationalization phase in theory building and their application in the context of DevOps team structures. Method: We operationalize the concepts and propositions that make up our theory to generate constructs and empirically testable hypotheses. Instead of using causal relations to operationalize the propositions, we adopt logical implication, which avoids the problems associated with causal reasoning. Strategies are proposed to ensure that the resulting theory aligns with the criterion of parsimony. Results: The operationalization phase is described from three perspectives: specification, implementation, and practical application. First, the operationalization process is formally defined. Second, a set of procedures for operating both concepts and propositions is described. Finally, the usefulness of the proposed procedures is demonstrated in a case study. Conclusions: This paper is a pioneering contribution in offering comprehensive guidelines for theory operationalization using logical implication. By following established procedures and using concrete examples, researchers can better ensure the success of their theory-building efforts through careful operationalization.

Submitted 3 December, 2024; originally announced December 2024.

Comments: 22 pages, 7 figures

ACM Class: D.2.0

arXiv:2411.18677 [pdf, other]

MatchDiffusion: Training-free Generation of Match-cuts

Authors: Alejandro Pardo, Fabio Pizzati, Tong Zhang, Alexander Pondaven, Philip Torr, Juan Camilo Perez, Bernard Ghanem

Abstract: Match-cuts are powerful cinematic tools that create seamless transitions between scenes, delivering strong visual and metaphorical connections. However, crafting match-cuts is a challenging, resource-intensive process requiring deliberate artistic planning. In MatchDiffusion, we present the first training-free method for match-cut generation using text-to-video diffusion models. MatchDiffusion leverages a key property of diffusion models: early denoising steps define the scene's broad structure, while later steps add details. Guided by this insight, MatchDiffusion employs "Joint Diffusion" to initialize generation for two prompts from shared noise, aligning structure and motion. It then applies "Disjoint Diffusion", allowing the videos to diverge and introduce unique details. This approach produces visually coherent videos suited for match-cuts. User studies and metrics demonstrate MatchDiffusion's effectiveness and potential to democratize match-cut creation.

Submitted 27 November, 2024; originally announced November 2024.

Comments: https://matchdiffusion.github.io

arXiv:2411.07714 [pdf, ps, other]

doi 10.46298/entics.14735

Typed Non-determinism in Concurrent Calculi: The Eager Way

Authors: Bas van den Heuvel, Daniele Nantes-Sobrinho, Joseph W. N. Paulus, Jorge A. Pérez

Abstract: We consider the problem of designing typed concurrent calculi with non-deterministic choice in which types leverage linearity for controlling resources, thereby ensuring strong correctness properties for processes. This problem is constrained by the delicate tension between non-determinism and linearity. Prior work developed a session-typed π-calculus with standard non-deterministic choice; well-typed processes enjoy type preservation and deadlock-freedom. Central to this typed calculus is a lazy semantics that gradually discards branches in choices. This lazy semantics, however, is complex: various technical elements are needed to describe the non-deterministic behavior of typed processes. This paper develops an entirely new approach, based on an eager semantics, which more directly represents choices and commitment. We present a π-calculus in which non-deterministic choices are governed by this eager semantics and session types. We establish its key correctness properties, including deadlock-freedom, and demonstrate its expressivity by correctly translating a typed resource λ-calculus.

Submitted 7 December, 2024; v1 submitted 12 November, 2024; originally announced November 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2408.07915

Journal ref: Electronic Notes in Theoretical Informatics and Computer Science, Volume 4 - Proceedings of MFPS XL (December 11, 2024) entics:14735

arXiv:2410.20280 [pdf, other]

MarDini: Masked Autoregressive Diffusion for Video Generation at Scale

Authors: Haozhe Liu, Shikun Liu, Zijian Zhou, Mengmeng Xu, Yanping Xie, Xiao Han, Juan C. Pérez, Ding Liu, Kumara Kahatapitiya, Menglin Jia, Jui-Chieh Wu, Sen He, Tao Xiang, Jürgen Schmidhuber, Juan-Manuel Pérez-Rúa

Abstract: We introduce MarDini, a new family of video diffusion models that integrate the advantages of masked auto-regression (MAR) into a unified diffusion model (DM) framework. Here, MAR handles temporal planning, while DM focuses on spatial generation in an asymmetric network design: i) a MAR-based planning model containing most of the parameters generates planning signals for each masked frame using low-resolution input; ii) a lightweight generation model uses these signals to produce high-resolution frames via diffusion de-noising. MarDini's MAR enables video generation conditioned on any number of masked frames at any frame positions: a single model can handle video interpolation (e.g., masking middle frames), image-to-video generation (e.g., masking from the second frame onward), and video expansion (e.g., masking half the frames). The efficient design allocates most of the computational resources to the low-resolution planning model, making computationally expensive but important spatio-temporal attention feasible at scale. MarDini sets a new state-of-the-art for video interpolation; meanwhile, within a few inference steps, it efficiently generates videos on par with those of much more expensive advanced image-to-video models.

Submitted 26 October, 2024; originally announced October 2024.

Comments: Project Page: https://mardini-vidgen.github.io

arXiv:2410.15690 [pdf, other]

Efficient Terminology Integration for LLM-based Translation in Specialized Domains

Authors: Sejoon Kim, Mingi Sung, Jeonghwan Lee, Hyunkuk Lim, Jorge Froilan Gimenez Perez

Abstract: Traditional machine translation methods typically involve training models directly on large parallel corpora, with limited emphasis on specialized terminology. However, in specialized fields such as patent, finance, or biomedical domains, terminology is crucial for translation, with many terms that need to be translated following agreed-upon conventions. In this paper we introduce a methodology that efficiently trains models with a smaller amount of data while preserving the accuracy of terminology translation. We achieve this through a systematic process of term extraction and glossary creation using the Trie Tree algorithm, followed by data reconstruction to teach the LLM how to integrate these specialized terms. This methodology enhances the model's ability to handle specialized terminology and ensures high-quality translations, particularly in fields where term consistency is crucial. Our approach has demonstrated exceptional performance, achieving the highest translation score among participants in the WMT patent task to date, showcasing its effectiveness and broad applicability in specialized translation domains where general methods often fall short.

Submitted 21 October, 2024; originally announced October 2024.

Comments: Accepted to WMT 2024
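
Illustrative note on the entry above: the abstract mentions term extraction and glossary creation with a trie. The following is a minimal sketch of how a token-level trie could locate glossary terms in a source sentence using longest-match lookup. It is not the authors' implementation; the names `TrieNode`, `build_trie`, and `find_terms`, the whitespace tokenization, and the sample glossary are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of a token-level trie (hypothetical helper, not from the paper)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Set only on nodes where a complete glossary term ends.
        self.translation: Optional[str] = None


def build_trie(glossary: Dict[str, str]) -> TrieNode:
    """Insert every source-side term, split on whitespace, into the trie."""
    root = TrieNode()
    for term, translation in glossary.items():
        node = root
        for token in term.lower().split():
            node = node.children.setdefault(token, TrieNode())
        node.translation = translation
    return root


def find_terms(sentence: str, root: TrieNode) -> List[Tuple[str, str]]:
    """Scan left to right, keeping the longest glossary term at each position."""
    tokens = sentence.lower().split()
    matches: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        node = root
        best: Optional[Tuple[int, str]] = None  # (end index, translation)
        j = i
        while j < len(tokens) and tokens[j] in node.children:
            node = node.children[tokens[j]]
            j += 1
            if node.translation is not None:
                best = (j, node.translation)
        if best is None:
            i += 1  # no term starts here; move on by one token
        else:
            end, translation = best
            matches.append((" ".join(tokens[i:end]), translation))
            i = end  # skip past the matched term
    return matches


# Sample glossary (invented for this sketch).
glossary = {
    "semiconductor": "Halbleiter",
    "semiconductor device": "Halbleiterbauelement",
    "gate electrode": "Gate-Elektrode",
}
trie = build_trie(glossary)
found = find_terms("The semiconductor device has a gate electrode", trie)
```

Longest-match lookup is one plausible design choice here: it prefers "semiconductor device" over the shorter "semiconductor" when both are glossary entries, which is the behavior term-consistent translation usually needs.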

arXiv:2410.12174 [pdf, other]

Exploring Large Language Models for Hate Speech Detection in Rioplatense Spanish

Authors: Juan Manuel Pérez, Paula Miguel, Viviana Cotik

Abstract: Hate speech detection deals with many language variants, slang, slurs, expression modalities, and cultural nuances. This outlines the importance of working with specific corpora, when addressing hate speech within the scope of Natural Language Processing, recently revolutionized by the irruption of Large Language Models. This work presents a brief analysis of the performance of large language models in the detection of Hate Speech for Rioplatense Spanish. We performed classification experiments leveraging chain-of-thought reasoning with ChatGPT 3.5, Mixtral, and Aya, comparing their results with those of a state-of-the-art BERT classifier. These experiments outline that, even if large language models show a lower precision compared to the fine-tuned BERT classifier and, in some cases, they find hard-to-get slurs or colloquialisms, they still are sensitive to highly nuanced cases (particularly, homophobic/transphobic hate speech). We make our code and models publicly available for future research.

Submitted 15 October, 2024; originally announced October 2024.

arXiv:2410.04855 [pdf, other]

Unsupervised Skill Discovery for Robotic Manipulation through Automatic Task Generation

Authors: Paul Jansonnie, Bingbing Wu, Julien Perez, Jan Peters

Abstract: Learning skills that interact with objects is of major importance for robotic manipulation. These skills can indeed serve as an efficient prior for solving various manipulation tasks. We propose a novel Skill Learning approach that discovers composable behaviors by solving a large and diverse number of autonomously generated tasks. Our method learns skills allowing the robot to consistently and robustly interact with objects in its environment. The discovered behaviors are embedded in primitives which can be composed with Hierarchical Reinforcement Learning to solve unseen manipulation tasks. In particular, we leverage Asymmetric Self-Play to discover behaviors and Multiplicative Compositional Policies to embed them. We compare our method to Skill Learning baselines and find that our skills are more interactive. Furthermore, the learned skills can be used to solve a set of unseen manipulation tasks, in simulation as well as on a real robotic platform.

Submitted 7 October, 2024; originally announced October 2024.

Comments: Accepted at the 2024 IEEE-RAS International Conference on Humanoid Robots

arXiv:2409.05994 [pdf, other]

MessIRve: A Large-Scale Spanish Information Retrieval Dataset

Authors: Francisco Valentini, Viviana Cotik, Damián Furman, Ivan Bercovich, Edgar Altszyler, Juan Manuel Pérez

Abstract: Information retrieval (IR) is the task of finding relevant documents in response to a user query. Although Spanish is the second most spoken native language, current IR benchmarks lack Spanish data, hindering the development of information access tools for Spanish speakers. We introduce MessIRve, a large-scale Spanish IR dataset with around 730 thousand queries from Google's autocomplete API and relevant documents sourced from Wikipedia. MessIRve's queries reflect diverse Spanish-speaking regions, unlike other datasets that are translated from English or do not consider dialectal variations. The large size of the dataset allows it to cover a wide variety of topics, unlike smaller datasets. We provide a comprehensive description of the dataset, comparisons with existing datasets, and baseline evaluations of prominent IR models. Our contributions aim to advance Spanish IR research and improve information access for Spanish speakers.

Submitted 9 September, 2024; originally announced September 2024.

arXiv:2408.13135 [pdf, other]

Deep Learning at the Intersection: Certified Robustness as a Tool for 3D Vision

Authors: Gabriel Pérez S, Juan C. Pérez, Motasem Alfarra, Jesús Zarzar, Sara Rojas, Bernard Ghanem, Pablo Arbeláez

Abstract: This paper presents preliminary work on a novel connection between certified robustness in machine learning and the modeling of 3D objects. We highlight an intriguing link between the Maximal Certified Radius (MCR) of a classifier representing a space's occupancy and the space's Signed Distance Function (SDF). Leveraging this relationship, we propose to use the certification method of randomized smoothing (RS) to compute SDFs. Since RS' high computational cost prevents its practical usage as a way to compute SDFs, we propose an algorithm to efficiently run RS in low-dimensional applications, such as 3D space, by expressing RS' fundamental operations as Gaussian smoothing on pre-computed voxel grids. Our approach offers an innovative and practical tool to compute SDFs, validated through proof-of-concept experiments in novel view synthesis. This paper bridges two previously disparate areas of machine learning, opening new avenues for further exploration and potential cross-domain advancements.

Submitted 23 August, 2024; originally announced August 2024.

Comments: This paper is an accepted extended abstract to the LatinX workshop at ICCV 2023. This was uploaded a year late

arXiv:2408.09223 [pdf, other]

A theoretical framework for reservoir computing on networks of organic electrochemical transistors

Authors: Nicholas W. Landry, Beckett R. Hyde, Jake C. Perez, Sean E. Shaheen, Juan G. Restrepo

Abstract: Efficient and accurate prediction of physical systems is important even when the rules of those systems cannot be easily learned. Reservoir computing, a type of recurrent neural network with fixed nonlinear units, is one such prediction method and is valued for its ease of training. Organic electrochemical transistors (OECTs) are physical devices with nonlinear transient properties that can be used as the nonlinear units of a reservoir computer. We present a theoretical framework for simulating reservoir computers using OECTs as the non-linear units as a test bed for designing physical reservoir computers. We present a proof of concept demonstrating that such an implementation can accurately predict the Lorenz attractor with comparable performance to standard reservoir computer implementations. We explore the effect of operating parameters and find that the prediction performance strongly depends on the pinch-off voltage of the OECTs.

Submitted 17 August, 2024; originally announced August 2024.

Comments: 10 pages, 8 figures

arXiv:2407.07762 [pdf]

doi 10.1109/TE.2023.3241099

Learning and Motivational Impact of Game-Based Learning: Comparing Face-to-Face and Online Formats on Computer Science Education

Authors: Daniel López-Fernández, Aldo Gordillo, Jennifer Pérez, Edmundo Tovar

Abstract: Contribution: This article analyzes the learning and motivational impact of teacher-authored educational video games on computer science education and compares its effectiveness in both face-to-face and online (remote) formats. This work presents comparative data and findings obtained from 217 students who played the game in a face-to-face format (control group) and 104 students who played the game in an online format (experimental group). Background: Serious video games have been proven effective at computer science education; however, it is still unknown whether the effectiveness of these games is the same regardless of their format, face-to-face or online. Moreover, the usage of games created through authoring tools has barely been explored. Research Questions: Are teacher-authored educational video games effective in terms of learning and motivation for computer science students? Does the effectiveness of teacher-authored educational video games depend on whether they are used in a face-to-face or online format? Methodology: A quasi-experiment has been conducted by using three instruments (pre-test, post-test, and questionnaire) with the purpose of comparing the effectiveness of game-based learning in face-to-face and online formats. A total of 321 computer science students played a teacher-authored educational video game aimed to learn about software design. Findings: The results reveal that teacher-authored educational video games are highly effective in terms of knowledge acquisition and motivation both in face-to-face and online formats. The results also show that some students' perceptions were more positive when a face-to-face format was used.

Submitted 10 July, 2024; originally announced July 2024.

Comments: 10 pages, 3 figures. Accepted version of a journal article published in IEEE Transactions on Education

Journal ref: IEEE Transactions on Education, Volume 66, Issue 4, 2023

arXiv:2407.07258 [pdf, other]

Identification of emotions on Twitter during the 2022 electoral process in Colombia

Authors: Juan Jose Iguaran Fernandez, Juan Manuel Perez, German Rosati

Abstract: The study of Twitter as a means for analyzing social phenomena has gained interest in recent years due to the availability of large amounts of data in a relatively spontaneous environment. Within opinion-mining tasks, emotion detection is especially relevant, as it allows for the identification of people's subjective responses to different social events in a more granular way than traditional sentiment analysis based on polarity. In the particular case of political events, the analysis of emotions in social networks can provide valuable information on the perception of candidates, proposals, and other important aspects of the public debate. In spite of this importance, there are few studies on emotion detection in Spanish and, to the best of our knowledge, few resources are public for opinion mining in Colombian Spanish, highlighting the need for generating resources addressing the specific cultural characteristics of this variety. In this work, we present a small corpus of tweets in Spanish related to the 2022 Colombian presidential elections, manually labeled with emotions using a fine-grained taxonomy. We perform classification experiments using supervised state-of-the-art models (BERT models) and compare them with GPT-3.5 in few-shot learning settings. We make our dataset and code publicly available for research purposes.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.06391 [pdf, other]

Around Classical and Intuitionistic Linear Processes

Authors: Juan C. Jaramillo, Dan Frumin, Jorge A. Pérez

Abstract: Curry-Howard correspondences between Linear Logic (LL) and session types provide a firm foundation for concurrent processes. As the correspondences hold for intuitionistic and classic versions of LL (ILL and CLL), we obtain two different families of type systems for concurrency. An open question remains: how do these two families exactly relate to each other? Based upon a translation from CLL to ILL due to Laurent (2018), we provide two complementary answers, in the form of full abstraction results based on a typed observational equivalence due to Atkey (2017). Our results elucidate hitherto missing formal links between seemingly related yet different type systems for concurrency.

Submitted 22 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

Comments: Full version, 19 pages + appendices

arXiv:2407.04503 [pdf, ps, other]

When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings

Authors: Jérémy Perez, Grgur Kovač, Corentin Léger, Cédric Colas, Gaia Molinaro, Maxime Derex, Pierre-Yves Oudeyer, Clément Moulin-Frier

Abstract: As large language models (LLMs) start interacting with each other and generating an increasing amount of text online, it becomes crucial to better understand how information is transformed as it passes from one LLM to the next. While significant research has examined individual LLM behaviors, existing studies have largely overlooked the collective behaviors and information distortions arising from iterated LLM interactions. Small biases, negligible at the single output level, risk being amplified in iterated interactions, potentially leading the content to evolve towards attractor states. In a series of telephone game experiments, we apply a transmission chain design borrowed from the human cultural evolution literature: LLM agents iteratively receive, produce, and transmit texts from the previous to the next agent in the chain. By tracking the evolution of text toxicity, positivity, difficulty, and length across transmission chains, we uncover the existence of biases and attractors, and study their dependence on the initial text, the instructions, language model, and model size. For instance, we find that more open-ended instructions lead to stronger attraction effects compared to more constrained tasks. We also find that different text properties display different sensitivity to attraction effects, with toxicity leading to stronger attractors than length. These findings highlight the importance of accounting for multi-step transmission dynamics and represent a first step towards a more comprehensive understanding of LLM cultural dynamics.

Submitted 2 June, 2025; v1 submitted 5 July, 2024; originally announced July 2024.

Comments: Code available at https://github.com/jeremyperez2/TelephoneGameLLM. Companion website with a Data Explorer tool at https://sites.google.com/view/telephone-game-llm

MSC Class: 68T50; ACM Class: I.2.7
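
Illustrative note on the entry above: the abstract describes a transmission chain in which each agent receives a text from the previous agent, produces a new text, and passes it on, while text properties are tracked along the chain. The sketch below shows only that control flow under loud assumptions: the "agent" is a toy function that keeps the first few words, standing in for an LLM call, and `run_chain`, `track`, `make_truncating_agent`, and `word_count` are hypothetical names. The authors' actual code is at the repository linked in the entry.

```python
from typing import Callable, List

Agent = Callable[[str], str]


def run_chain(initial_text: str, agents: List[Agent]) -> List[str]:
    """Pass a text through the agents in order, recording every generation."""
    history = [initial_text]
    for agent in agents:
        # Each agent sees only the output of the previous one.
        history.append(agent(history[-1]))
    return history


def track(history: List[str], metric: Callable[[str], float]) -> List[float]:
    """Evaluate one text property at every step of the chain."""
    return [metric(text) for text in history]


def make_truncating_agent(max_words: int) -> Agent:
    """Toy stand-in for an LLM (assumption): keep only the first max_words words."""
    def agent(text: str) -> str:
        return " ".join(text.split()[:max_words])
    return agent


def word_count(text: str) -> float:
    """Length metric in words; the paper tracks length among other properties."""
    return float(len(text.split()))


seed = "the quick brown fox jumps over the lazy dog near the quiet river bank"
agents = [make_truncating_agent(n) for n in (12, 9, 9, 9)]
history = run_chain(seed, agents)
lengths = track(history, word_count)
```

In this toy setup the length settles at a fixed value after two steps and stays there, which is a crude analogue of the attractor states the abstract refers to; a real experiment would replace the toy agent with a model call and the metric with a toxicity, positivity, or difficulty scorer.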

arXiv:2406.06474 [pdf, other]

Towards a Personal Health Large Language Model

Authors: Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotra, et al. (9 additional authors not shown)

Abstract: In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We created and curated three datasets that test 1) production of personalized insights and recommendations from sleep patterns, physical activity, and physiological responses, 2) expert domain knowledge, and 3) prediction of self-reported sleep outcomes. For the first task we designed 857 case studies in collaboration with domain experts to assess real-world scenarios in sleep and fitness. Through comprehensive evaluation of domain-specific rubrics, we observed that Gemini Ultra 1.0 and PH-LLM are not statistically different from expert performance in fitness and, while experts remain superior for sleep, fine-tuning PH-LLM provided significant improvements in using relevant domain knowledge and personalizing information for sleep insights. We evaluated PH-LLM domain knowledge using multiple choice sleep medicine and fitness examinations. PH-LLM achieved 79% on sleep and 88% on fitness, exceeding average scores from a sample of human experts. Finally, we trained PH-LLM to predict self-reported sleep quality outcomes from textual and multimodal encoding representations of wearable data, and demonstrate that multimodal encoding is required to match performance of specialized discriminative models. Although further development and evaluation are necessary in the safety-critical personal health domain, these results demonstrate both the broad knowledge and capabilities of Gemini models and the benefit of contextualizing physiological data for personal health applications as done with PH-LLM.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 72 pages

arXiv:2406.06464 [pdf, other]

Transforming Wearable Data into Health Insights using Large Language Model Agents

Authors: Mike A. Merrill, Akshay Paruchuri, Naghmeh Rezaei, Geza Kovacs, Javier Perez, Yun Liu, Erik Schenck, Nova Hammerquist, Jake Sunshine, Shyam Tailor, Kumar Ayush, Hao-Wei Su, Qian He, Cory Y. McLean, Mark Malhotra, Shwetak Patel, Jiening Zhan, Tim Althoff, Daniel McDuff, Xin Liu

Abstract: Despite the proliferation of wearable health trackers and the importance of sleep and exercise to health, deriving actionable personalized insights from wearable data remains a challenge because doing so requires non-trivial open-ended analysis of these data. The recent rise of large language model (LLM) agents, which can use tools to reason about and interact with the world, presents a promising opportunity to enable such personalized analysis at scale. Yet, the application of LLM agents in analyzing personal health is still largely untapped. In this paper, we introduce the Personal Health Insights Agent (PHIA), an agent system that leverages state-of-the-art code generation and information retrieval tools to analyze and interpret behavioral health data from wearables. We curate two benchmark question-answering datasets of over 4000 health insights questions. Based on 650 hours of human and expert evaluation we find that PHIA can accurately address over 84% of factual numerical questions and more than 83% of crowd-sourced open-ended questions. This work has implications for advancing behavioral health across the population, potentially enabling individuals to interpret their own wearable data, and paving the way for a new era of accessible, personalized wellness regimens that are informed by data-driven insights.

Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

Comments: 38 pages

arXiv:2405.17146 [pdf, other]

Compressed-Language Models for Understanding Compressed File Formats: a JPEG Exploration

Authors: Juan C. Pérez, Alejandro Pardo, Mattia Soldan, Hani Itani, Juan Leon-Alcazar, Bernard Ghanem

Abstract: This study investigates whether Compressed-Language Models (CLMs), i.e. language models operating on raw byte streams from Compressed File Formats (CFFs), can understand files compressed by CFFs. We focus on the JPEG format as a representative CFF, given its commonality and its representativeness of key concepts in compression, such as entropy coding and run-length encoding. We test if CLMs understand the JPEG format by probing their capabilities to perform along three axes: recognition of inherent file properties, handling of files with anomalies, and generation of new files. Our findings demonstrate that CLMs can effectively perform these tasks. These results suggest that CLMs can understand the semantics of compressed data when directly operating on the byte streams of files produced by CFFs. The possibility to directly operate on raw compressed files offers the promise to leverage some of their remarkable characteristics, such as their ubiquity, compactness, multi-modality and segment-nature.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2403.08882 [pdf, other]

Cultural evolution in populations of Large Language Models

Authors: Jérémy Perez, Corentin Léger, Marcela Ovando-Tellez, Chris Foulon, Joan Dussauld, Pierre-Yves Oudeyer, Clément Moulin-Frier

Abstract: Research in cultural evolution aims at providing causal explanations for the change of culture over time. Over the past decades, this field has generated an important body of knowledge, using experimental, historical, and computational methods. While computational models have been very successful at generating testable hypotheses about the effects of several factors, such as population structure or transmission biases, some phenomena have so far been more complex to capture using agent-based and formal models. This is in particular the case for the effect of the transformations of social information induced by evolved cognitive mechanisms. We here propose that leveraging the capacity of Large Language Models (LLMs) to mimic human behavior may be fruitful to address this gap. On top of being a useful approximation of human cultural dynamics, multi-agent models featuring generative agents are also important to study for their own sake. Indeed, as artificial agents are bound to participate more and more in the evolution of culture, it is crucial to better understand the dynamics of machine-generated cultural evolution. We here present a framework for simulating cultural evolution in populations of LLMs, allowing the manipulation of variables known to be important in cultural evolution, such as network structure, personality, and the way social information is aggregated and transformed. The software we developed for conducting these simulations is open-source and features an intuitive user-interface, which we hope will help to build bridges between the fields of cultural evolution and generative artificial intelligence.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 17 pages, 20 figures. Open-source code available at https://github.com/jeremyperez2/LLM-Culture

MSC Class: 68T50; ACM Class: I.2.7

arXiv:2403.07842 [pdf, other]

Quantifying and Mitigating Privacy Risks for Tabular Generative Models

Authors: Chaoyi Zhu, Jiayi Tang, Hans Brouwer, Juan F. Pérez, Marten van Dijk, Lydia Y. Chen

Abstract: Synthetic data from generative models emerges as the privacy-preserving data-sharing solution. Such a synthetic data set shall resemble the original data without revealing identifiable private information. The backbone technology of tabular synthesizers is rooted in image generative models, ranging from Generative Adversarial Networks (GANs) to recent diffusion models. Recent prior work sheds light on the utility-privacy tradeoff on tabular data, revealing and quantifying privacy risks on synthetic data. We first conduct an exhaustive empirical analysis, highlighting the utility-privacy tradeoff of five state-of-the-art tabular synthesizers, against eight privacy attacks, with a special focus on membership inference attacks. Motivated by the observation of high data quality but also high privacy risk in tabular diffusion, we propose DP-TLDM, Differentially Private Tabular Latent Diffusion Model, which is composed of an autoencoder network to encode the tabular data and a latent diffusion model to synthesize the latent tables. Following the emerging f-DP framework, we apply DP-SGD to train the auto-encoder in combination with batch clipping and use the separation value as the privacy metric to better capture the privacy gain from DP algorithms. Our empirical evaluation demonstrates that DP-TLDM is capable of achieving a meaningful theoretical privacy guarantee while also significantly enhancing the utility of synthetic data. Specifically, compared to other DP-protected tabular generative models, DP-TLDM improves the synthetic quality by an average of 35% in data resemblance, 15% in the utility for downstream tasks, and 50% in data discriminability, all while preserving a comparable level of privacy risk.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2402.00823 [pdf, other]

SLIM: Skill Learning with Multiple Critics

Authors: David Emukpere, Bingbing Wu, Julien Perez, Jean-Michel Renders

Abstract: Self-supervised skill learning aims to acquire useful behaviors that leverage the underlying dynamics of the environment. Latent variable models, based on mutual information maximization, have been successful in this task but still struggle in the context of robotic manipulation. As it requires impacting a possibly large set of degrees of freedom composing the environment, mutual information maximization fails alone in producing useful and safe manipulation behaviors. Furthermore, tackling this by augmenting skill discovery rewards with additional rewards through a naive combination might fail to produce desired behaviors. To address this limitation, we introduce SLIM, a multi-critic learning approach for skill discovery with a particular focus on robotic manipulation. Our main insight is that utilizing multiple critics in an actor-critic framework to gracefully combine multiple reward functions leads to a significant improvement in latent-variable skill discovery for robotic manipulation while overcoming possible interference occurring among rewards which hinders convergence to useful skills. Furthermore, in the context of tabletop manipulation, we demonstrate the applicability of our novel skill discovery approach to acquire safe and efficient motor primitives in a hierarchical reinforcement learning fashion and leverage them through planning, significantly surpassing baseline approaches for skill discovery.

Submitted 21 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

Comments: Accepted at IEEE ICRA 2024

arXiv:2401.16434 [pdf]

doi 10.1016/j.egyr.2023.01.039

A novel ANROA based control approach for grid-tied multi-functional solar energy conversion system

Authors: Dinanath Prasad, Narendra Kumar, Rakhi Sharma, Hasmat Malik, Fausto Pedro García Márquez, Jesús María Pinar Pérez

Abstract: An adaptive control approach for a three-phase grid-interfaced solar photovoltaic system based on the new Neuro-Fuzzy Inference System with Rain Optimization Algorithm (ANROA) methodology is proposed and discussed in this manuscript. This method incorporates an Adaptive Neuro-fuzzy Inference System (ANFIS) with a Rain Optimization Algorithm (ROA). The ANFIS controller has excellent maximum tracking capability because it includes features of both neural and fuzzy techniques. The ROA technique is in charge of controlling the voltage source converter switching. Avoiding power quality problems including voltage fluctuations, harmonics, and flickers as well as unbalanced loads and reactive power usage is the major goal. Besides, the proposed method performs at zero voltage regulation and unity power factor modes. The suggested control approach has been modeled and simulated, and its performance has been assessed using existing alternative methods. A statistical analysis of proposed and existing techniques has also been presented and discussed. The results of the simulations demonstrate that, when compared to alternative approaches, the suggested strategy may properly and effectively identify the best global solutions. Furthermore, the system's robustness has been studied by using MATLAB/SIMULINK environment and experimentally by Field Programmable Gate Arrays Controller (FPGA)-based Hardware-in-Loop (HLL).

Submitted 26 January, 2024; originally announced January 2024.

Comments: The paper was published in Energy Reports journal (ELSEVIER). Cite as: Prasad, D., Kumar, N., Sharma, R., Malik, H., Márquez, F. P. G., & Pinar-Pérez, J. M. (2023). A novel ANROA based control approach for grid-tied multi-functional solar energy conversion system. Energy Reports, 9, 2044-2057

Journal ref: Energy Reports (2023) Elsevier

arXiv:2401.14763 [pdf, ps, other]

Comparing Session Type Systems derived from Linear Logic

Authors: Bas van den Heuvel, Jorge A. Pérez

Abstract: Session types are a typed approach to message-passing concurrency, where types describe sequences of intended exchanges over channels. Session type systems have been given strong logical foundations via Curry-Howard correspondences with linear logic, a resource-aware logic that naturally captures structured interactions. These logical foundations provide an elegant framework to specify and (statically) verify message-passing processes. In this paper, we rigorously compare different type systems for concurrency derived from the Curry-Howard correspondence between linear logic and session types. We address the main divide between these type systems: the classical and intuitionistic presentations of linear logic. Over the years, these presentations have given rise to separate research strands on logical foundations for concurrency; the differences between their derived type systems have only been addressed informally. To formally assess these differences, we develop $π\mathsf{ULL}$, a session type system that encompasses type systems derived from classical and intuitionistic interpretations of linear logic. Based on a fragment of Girard's Logic of Unity, $π\mathsf{ULL}$ provides a basic reference framework: we compare existing session type systems by characterizing fragments of $π\mathsf{ULL}$ that coincide with classical and intuitionistic formulations. We analyze the significance of our characterizations by considering the locality principle (enforced by intuitionistic interpretations but not by classical ones) and forms of process composition induced by the interpretations.

Submitted 22 August, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

Comments: Preprint to appear in JLAMP; revised/extended version of https://doi.org/10.4204/EPTCS.314.1

arXiv:2401.08251 [pdf]

doi 10.1016/j.rser.2022.112753

A techno-economic model for avoiding conflicts of interest between owners of offshore wind farms and maintenance suppliers

Authors: Alberto Pliego Marugán, Fausto Pedro García Márquez, Jesús María Pinar Pérez

Abstract: Currently, wind energy is one of the most important sources of renewable energy. Offshore locations for wind turbines are increasingly exploited because of their numerous advantages. However, offshore wind farms require high investment in maintenance service. Due to its complexity and special requirements, maintenance service is usually outsourced by wind farm owners. In this paper, we propose a novel approach to determine, quantify, and reduce the possible conflicts of interest between owners and maintenance suppliers. We created a complete techno-economic model to address this problem from an impartial point of view. An iterative process was developed to obtain statistical results that can help stakeholders negotiate the terms of the contract, in which the availability of the wind farm is the reference parameter by which to determine penalisations and incentives. Moreover, a multi-objective programming problem was addressed that maximises the profits of both parties without losing the alignment of their interests. The main scientific contribution of this paper is the maintenance analysis of offshore wind farms from two perspectives: that of the owner and the maintenance supplier. This analysis evaluates the conflicts of interest of both parties. In addition, we demonstrate that proper adjustment of some parameters, such as penalisation, incentives, and resources, and adequate control of availability can help reduce this conflict of interests.

Submitted 16 January, 2024; originally announced January 2024.

Comments: Published in Renewable and Sustainable Energy Reviews (ELSEVIER) 10 July 2022. DOI: https://doi.org/10.1016/j.rser.2022.112753 Cite as: Marugán, A. P., Márquez, F. P. G., & Pérez, J. M. P. (2022). A techno-economic model for avoiding conflicts of interest between owners of offshore wind farms and maintenance suppliers. Renewable and Sustainable Energy Reviews, 168, 112753

arXiv:2312.12487 [pdf, other]

Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models

Authors: Angela Castillo, Jonas Kohler, Juan C. Pérez, Juan Pablo Pérez, Albert Pumarola, Bernard Ghanem, Pablo Arbeláez, Ali Thabet

Abstract: This paper presents a comprehensive study on the role of Classifier-Free Guidance (CFG) in text-conditioned diffusion models from the perspective of inference efficiency. In particular, we relax the default choice of applying CFG in all diffusion steps and instead search for efficient guidance policies. We formulate the discovery of such policies in the differentiable Neural Architecture Search framework. Our findings suggest that the denoising steps proposed by CFG become increasingly aligned with simple conditional steps, which renders the extra neural network evaluation of CFG redundant, especially in the second half of the denoising process. Building upon this insight, we propose "Adaptive Guidance" (AG), an efficient variant of CFG, that adaptively omits network evaluations when the denoising process displays convergence. Our experiments demonstrate that AG preserves CFG's image quality while reducing computation by 25%. Thus, AG constitutes a plug-and-play alternative to Guidance Distillation, achieving 50% of the speed-ups of the latter while being training-free and retaining the capacity to handle negative prompts. Finally, we uncover further redundancies of CFG in the first half of the diffusion process, showing that entire neural function evaluations can be replaced by simple affine transformations of past score estimates. This method, termed LinearAG, offers even cheaper inference at the cost of deviating from the baseline model. Our findings provide insights into the efficiency of the conditional denoising process that contribute to more practical and swift deployment of text-conditioned diffusion models.

Submitted 19 December, 2023; originally announced December 2023.

arXiv:2312.11075 [pdf, other]

doi 10.18653/v1/2024.acl-long.622

Split and Rephrase with Large Language Models

Authors: David Ponce, Thierry Etchegoyhen, Jesús Calleja Pérez, Harritxu Gete

Abstract: The Split and Rephrase (SPRP) task, which consists in splitting complex sentences into a sequence of shorter grammatical sentences, while preserving the original meaning, can facilitate the processing of complex texts for humans and machines alike. It is also a valuable testbed to evaluate natural language processing models, as it requires modelling complex grammatical aspects. In this work, we evaluate large language models on the task, showing that they can provide large improvements over the state of the art on the main metrics, although still lagging in terms of splitting compliance. Results from two human evaluations further support the conclusions drawn from automated metric results. We provide a comprehensive study that includes prompting variants, domain shift, fine-tuned pretrained language models of varying parameter size and training data volumes, contrasted with both zero-shot and few-shot approaches on instruction-tuned language models. Although the latter were markedly outperformed by fine-tuned models, they may constitute a reasonable off-the-shelf alternative. Our results provide a fine-grained analysis of the potential and limitations of large language models for SPRP, with significant improvements achievable using relatively small amounts of training data and model parameters overall, and remaining limitations for all models on the task.

Submitted 3 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2311.07894 [pdf]

Security in Drones

Authors: Jonathan Morgan, Julio Perez, Jordan Wade, Sundar Krishnan

Abstract: Drones are used in our everyday world for private, commercial, and government uses. It is important to establish both the cyber threats drone users face and security practices to combat those threats. Privacy will always be the main concern when using drones. Protecting information legally collected on drones and protecting people from the illegal collection of their data are topics that security professionals should consider before their organization uses drones. In this article, the authors discuss the importance of security in drones.

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2310.19075 [pdf, other]

Bespoke Solvers for Generative Flow Models

Authors: Neta Shaul, Juan Perez, Ricky T. Q. Chen, Ali Thabet, Albert Pumarola, Yaron Lipman

Abstract: Diffusion or flow-based models are powerful generative paradigms that are notoriously hard to sample as samples are defined as solutions to high-dimensional Ordinary or Stochastic Differential Equations (ODEs/SDEs) which require a large Number of Function Evaluations (NFE) to approximate well. Existing methods to alleviate the costly sampling process include model distillation and designing dedicated ODE solvers. However, distillation is costly to train and sometimes can deteriorate quality, while dedicated solvers still require relatively large NFE to produce high quality samples. In this paper we introduce "Bespoke solvers", a novel framework for constructing custom ODE solvers tailored to the ODE of a given pre-trained flow model. Our approach optimizes an order consistent and parameter-efficient solver (e.g., with 80 learnable parameters), is trained for roughly 1% of the GPU time required for training the pre-trained model, and significantly improves approximation and generation quality compared to dedicated solvers. For example, a Bespoke solver for a CIFAR10 model produces samples with Fréchet Inception Distance (FID) of 2.73 with 10 NFE, and gets to 1% of the Ground Truth (GT) FID (2.59) for this model with only 20 NFE. On the more challenging ImageNet-64$\times$64, Bespoke samples at 2.2 FID with 10 NFE, and gets within 2% of GT FID (1.71) with 20 NFE.

Submitted 29 October, 2023; originally announced October 2023.

arXiv:2310.07173 [pdf]

Unleashing quantum algorithms with Qinterpreter: bridging the gap between theory and practice across leading quantum computing platforms

Authors: Wilmer Contreras Sepúlveda, Ángel David Torres-Palencia, José Javier Sánchez Mondragón, Braulio Misael Villegas-Martínez, J. Jesús Escobedo-Alatorre, Sandra Gesing, Néstor Lozano-Crisóstomo, Julio César García-Melgarejo, Juan Carlos Sánchez Pérez, Eddie Nelson Palacios-Pérez, Omar Palillero-Sandoval

Abstract: Quantum computing is a rapidly emerging and promising field that has the potential to revolutionize numerous research domains, including drug design, network technologies and sustainable energy. Due to the inherent complexity and divergence from classical computing, several major quantum computing libraries have been developed to implement quantum algorithms, namely IBM Qiskit, Amazon Braket, Cirq, PyQuil, and PennyLane. These libraries allow for quantum simulations on classical computers and facilitate program execution on corresponding quantum hardware, e.g., Qiskit programs on IBM quantum computers. While all platforms have some differences, the main concepts are the same. Qinterpreter is a tool embedded in the Quantum Science Gateway QubitHub using Jupyter Notebooks that seamlessly translates programs from one library to another and visualizes the results. It combines the five well-known quantum libraries into a unified framework. Designed as an educational tool for beginners, Qinterpreter enables the development and execution of quantum circuits across various platforms in a straightforward way. The work highlights the versatility and accessibility of Qinterpreter in quantum programming and underscores our ultimate goal of pervading Quantum Computing through younger, less specialized, and diverse cultural and national communities.

Submitted 16 October, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: Final article submitted to the PeerJ Computer Science journal

arXiv:2309.16683 [pdf, other]

doi 10.1038/s41598-023-38259-7

Controlling the Solo12 Quadruped Robot with Deep Reinforcement Learning

Authors: Michel Aractingi, Pierre-Alexandre Léziart, Thomas Flayols, Julien Perez, Tomi Silander, Philippe Souères

Abstract: Quadruped robots require robust and general locomotion skills to exploit their mobility potential in complex and challenging environments. In this work, we present the first implementation of a robust end-to-end learning-based controller on the Solo12 quadruped. Our method is based on deep reinforcement learning of joint impedance references. The resulting control policies follow a commanded velocity reference while being energy-efficient, robust and easy to deploy. We detail the learning procedure and the method for transfer to the real robot. In our experiments, we show that the Solo12 robot is a suitable open-source platform for research combining learning and control because of the ease of transferring and deploying learned controllers.

Submitted 2 August, 2023; originally announced September 2023.

Report number: Rapport LAAS no 22263

Journal ref: Scientific Reports, 2023, 13 (11945), pp.12

arXiv:2309.08428 [pdf, other]

Virtual Harassment, Real Understanding: Using a Serious Game and Bayesian Networks to Study Cyberbullying

Authors: Jaime Pérez, Mario Castro, Edmond Awad, Gregorio López

Abstract: Cyberbullying among minors is a pressing concern in our digital society, necessitating effective prevention and intervention strategies. Traditional data collection methods often intrude on privacy and yield limited insights. This study explores an innovative approach, employing a serious game - designed with purposes beyond entertainment - as a non-intrusive tool for data collection and education. In contrast to traditional correlation-based analyses, we propose a causality-based approach using Bayesian Networks to unravel complex relationships in the collected data and quantify result uncertainties. This robust analytical tool yields interpretable outcomes, enhances transparency in assumptions, and fosters open scientific discourse. Preliminary pilot studies with the serious game show promising results, surpassing the informative capacity of traditional demographic and psychological questionnaires, suggesting its potential as an alternative methodology. Additionally, we demonstrate how our approach facilitates the examination of risk profiles and the identification of intervention strategies to mitigate this cybercrime. We also address research limitations and potential enhancements, considering the noise and variability of data in social studies and video games. This research advances our understanding of cyberbullying and showcases the potential of serious games and causality-based approaches in studying complex social issues.

Submitted 15 September, 2023; originally announced September 2023.

arXiv:2309.06046 [pdf, other]

BatMan-CLR: Making Few-shots Meta-Learners Resilient Against Label Noise

Authors: Jeroen M. Galjaard, Robert Birke, Juan Perez, Lydia Y. Chen

Abstract: The negative impact of label noise is well studied in classical supervised learning yet remains an open research question in meta-learning. Meta-learners aim to adapt to unseen learning tasks by learning a good initial model in meta-training and consecutively fine-tuning it according to new tasks during meta-testing. In this paper, we present the first extensive analysis of the impact of varying levels of label noise on the performance of state-of-the-art meta-learners, specifically gradient-based $N$-way $K$-shot learners. We show that the accuracy of Reptile, iMAML, and foMAML drops by up to 42% on the Omniglot and CifarFS datasets when meta-training is affected by label noise. To strengthen the resilience against label noise, we propose two sampling techniques, namely manifold (Man) and batch manifold (BatMan), which transform the noisy supervised learners into semi-supervised ones to increase the utility of noisy labels. We first construct manifold samples of $N$-way $2$-contrastive-shot tasks through augmentation, learning the embedding via a contrastive loss in meta-training, and then perform classification through zeroing on the embedding in meta-testing. We show that our approach can effectively mitigate the impact of meta-training label noise. Even with 60% wrong labels, BatMan and Man can limit the meta-testing accuracy drop to $2.5$, $9.4$, and $1.1$ percentage points, respectively, with existing meta-learners across the Omniglot, CifarFS, and MiniImagenet datasets.

Submitted 12 September, 2023; originally announced September 2023.

Comments: 10 pages, 3 figures

arXiv:2308.15075 [pdf, other]

Benchmarking 5G MEC and Cloud infrastructures for planning IoT messaging of CCAM data

Authors: Felipe Mogollón, Zaloa Fernández, Josu Pérez, Ángel Martín

Abstract: Vehicles embed lots of sensors supporting driving and safety. Combined with connectivity, they bring new possibilities for Connected, Cooperative and Automated Mobility (CCAM) services that exploit local and global data for a wide understanding beyond the myopic view of local sensors. Internet of Things (IoT) messaging solutions are ideal for vehicular data as they ship core features like the separation of geographic areas, the fusion of different producers on data/sensor types, and concurrent subscription support. Multi-access Edge Computing (MEC) and Cloud infrastructures are key to hosting a virtualized and distributed IoT platform. Currently, there are no benchmarks for assessing the appropriate size of an IoT platform for multiple vehicular data types such as text, image, binary point clouds and video-formatted samples. This paper formulates and executes tests to benchmark the performance of a MEC and Cloud platform according to actors' concurrency, data volumes and business-level parameters.

Submitted 29 August, 2023; originally announced August 2023.

Comments: 6 pages, 5 figures, 6 tables, IEEE International Conference on Intelligent Transportation Systems

arXiv:2308.02976 [pdf, ps, other]

Spanish Pre-trained BERT Model and Evaluation Data

Authors: José Cañete, Gabriel Chaperon, Rodrigo Fuentes, Jou-Hui Ho, Hojin Kang, Jorge Pérez

Abstract: The Spanish language is one of the top 5 spoken languages in the world. Nevertheless, finding resources to train or evaluate Spanish language models is not an easy task. In this paper we help bridge this gap by presenting a BERT-based language model pre-trained exclusively on Spanish data. As a second contribution, we also compiled several tasks specifically for the Spanish language in a single repository much in the spirit of the GLUE benchmark. By fine-tuning our pre-trained Spanish model, we obtain better results compared to other BERT-based models pre-trained on multilingual corpora for most of the tasks, even achieving a new state-of-the-art on some of them. We have publicly released our model, the pre-training data, and the compilation of the Spanish benchmarks.

Submitted 5 August, 2023; originally announced August 2023.

Comments: Published as workshop paper at Practical ML for Developing Countries Workshop @ ICLR 2020

arXiv:2308.02197 [pdf, other]

Edge Dynamic Map architecture for C-ITS applications

Authors: Mikel García, Gorka Velez, Josu Pérez, Ángel Martín, Zaloa Fernández, Naiara Aginako

Abstract: Cooperative Intelligent Transport Systems (C-ITS) create, share and process massive amounts of data that need to be managed in real time to enable new cooperative and autonomous driving applications. Vehicle-to-Everything (V2X) communications facilitate information exchange among vehicles and infrastructures using various protocols. By providing computing power, data storage, and low latency capabilities, Multi-access Edge Computing (MEC) has become a key enabling technology in the transport industry. The Local Dynamic Map (LDM) concept has consequently been extended for use in MECs, yielding an efficient, collaborative, and centralised Edge Dynamic Map (EDM) for C-ITS applications. This research presents an EDM architecture for V2X communications and implements a real-time proof-of-concept using a Time-Series Database (TSDB) engine to store vehicular message information. The performance evaluation includes data insertion and querying, assessing the system's capacity and scale for low-latency Cooperative Awareness Message (CAM) applications. Traffic simulations using SUMO have been employed to generate virtual routes for thousands of vehicles, demonstrating the transmission of virtual CAM messages to the EDM.

Submitted 4 August, 2023; originally announced August 2023.

Comments: Accepted in the 26th IEEE International Conference on Intelligent Transportation Systems (ITSC 2023)

arXiv:2308.01165 [pdf, ps, other]

Termination in Concurrency, Revisited

Authors: Joseph W. N. Paulus, Jorge A. Pérez, Daniele Nantes-Sobrinho

Abstract: Termination is a central property in sequential programming models: a term is terminating if all its reduction sequences are finite. Termination is also important in concurrency in general, and for message-passing programs in particular. A variety of type systems that enforce termination by typing have been developed. In this paper, we rigorously compare several type systems for $π$-calculus processes from the unifying perspective of termination. Adopting session types as reference framework, we consider two different type systems: one follows Deng and Sangiorgi's weight-based approach; the other is Caires and Pfenning's Curry-Howard correspondence between linear logic and session types. Our technical results precisely connect these very different type systems, and shed light on the classes of client/server interactions they admit as correct.

Submitted 2 August, 2023; originally announced August 2023.

arXiv:2306.10985 [pdf, other]

LARG, Language-based Automatic Reward and Goal Generation

Authors: Julien Perez, Denys Proux, Claude Roux, Michael Niemaz

Abstract: Goal-conditioned and Multi-Task Reinforcement Learning (GCRL and MTRL) address numerous problems related to robot learning, including locomotion, navigation, and manipulation scenarios. Recent works focusing on language-defined robotic manipulation tasks have led to the tedious production of massive human annotations to create datasets of textual descriptions associated with trajectories. To leverage reinforcement learning with text-based task descriptions, we need to produce reward functions associated with individual tasks in a scalable manner. In this paper, we leverage recent capabilities of Large Language Models (LLMs) and introduce LARG, Language-based Automatic Reward and Goal Generation, an approach that converts a text-based task description into its corresponding reward and goal-generation functions. We evaluate our approach for robotic manipulation and demonstrate its ability to train and execute policies in a scalable manner, without the need for handcrafted reward functions.

Submitted 19 June, 2023; originally announced June 2023.

arXiv:2306.08904 [pdf, other]

Enhancing Neural Rendering Methods with Image Augmentations

Authors: Juan C. Pérez, Sara Rojas, Jesus Zarzar, Bernard Ghanem

Abstract: Faithfully reconstructing 3D geometry and generating novel views of scenes are critical tasks in 3D computer vision. Despite the widespread use of image augmentations across computer vision applications, their potential remains underexplored when learning neural rendering methods (NRMs) for 3D scenes. This paper presents a comprehensive analysis of the use of image augmentations in NRMs, where we explore different augmentation strategies. We found that introducing image augmentations during training presents challenges such as geometric and photometric inconsistencies for learning NRMs from images. Specifically, geometric inconsistencies arise from alterations in shapes, positions, and orientations from the augmentations, disrupting spatial cues necessary for accurate 3D reconstruction. On the other hand, photometric inconsistencies arise from changes in pixel intensities introduced by the augmentations, affecting the ability to capture the underlying 3D structures of the scene. We alleviate these issues by focusing on color manipulations and introducing learnable appearance embeddings that allow NRMs to explain away photometric variations. Our experiments demonstrate the benefits of incorporating augmentations when learning NRMs, including improved photometric quality and surface reconstruction, as well as enhanced robustness against data quality issues, such as reduced training data and image degradations.

Submitted 15 June, 2023; originally announced June 2023.

arXiv:2306.04204

Monitoring Blackbox Implementations of Multiparty Session Protocols

Authors: Bas van den Heuvel, Jorge A. Pérez, Rares A. Dobre

Abstract: We present a framework for the distributed monitoring of networks of components that coordinate by message-passing, following multiparty session protocols specified as global types. We improve over prior works by (i) supporting components whose exact specification is unknown ("blackboxes") and (ii) covering protocols that cannot be analyzed by existing techniques. We first give a procedure for synthesizing monitors for blackboxes from global types, and precisely define when a blackbox correctly satisfies its global type. Then, we prove that monitored blackboxes are sound (they correctly follow the protocol) and transparent (blackboxes with and without monitors are behaviorally equivalent).

Submitted 3 October, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

Comments: Full version with appendices of our RV'23 paper

arXiv:2306.01365

doi: 10.1016/j.knosys.2024.111440

Generation of Probabilistic Synthetic Data for Serious Games: A Case Study on Cyberbullying

Authors: Jaime Pérez, Mario Castro, Edmond Awad, Gregorio López

Abstract: Synthetic data generation has been a growing area of research in recent years. However, its potential applications in serious games have not been thoroughly explored. Advances in this field could anticipate data modelling and analysis, as well as speed up the development process. To try to fill this gap in the literature, we propose a simulator architecture for generating probabilistic synthetic data for serious games based on interactive narratives. This architecture is designed to be generic and modular so that it can be used by other researchers on similar problems. To simulate the interaction of synthetic players with questions, we use a cognitive testing model based on the Item Response Theory framework. We also show how probabilistic graphical models (in particular Bayesian networks) can be used to introduce expert knowledge and external data into the simulation. Finally, we apply the proposed architecture and methods in a use case of a serious game focused on cyberbullying. We perform Bayesian inference experiments using a hierarchical model to demonstrate the identifiability and robustness of the generated data.

Submitted 3 July, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

Journal ref: Knowledge-Based Systems, Volume 286, 2024, Article 111440
