-
A Digital Twin Framework for Generation-IV Reactors with Reinforcement Learning-Enabled Health-Aware Supervisory Control
Authors:
Jasmin Y. Lim,
Dimitrios Pylorof,
Humberto E. Garcia,
Karthik Duraisamy
Abstract:
Generation IV (Gen-IV) nuclear power plants are envisioned to replace the current reactor fleet, bringing improvements in performance, safety, reliability, and sustainability. However, large cost investments currently inhibit the deployment of these advanced reactor concepts. Digital twins bridge real-world systems with digital tools to reduce costs, enhance decision-making, and boost operational efficiency. In this work, a digital twin framework is designed to operate the Gen-IV Fluoride-salt-cooled High-temperature Reactor, utilizing data-enhanced methods to optimize operational and maintenance policies while adhering to system constraints. The closed-loop framework integrates surrogate modeling, reinforcement learning, and Bayesian inference to streamline end-to-end communication for online regulation and self-adjustment. Reinforcement learning is used to consider component health and degradation to drive the target power generations, with constraints enforced through a Reference Governor control algorithm that ensures compliance with pump flow rate and temperature limits. These input driving modules benefit from detailed online simulations that are assimilated to measurement data with Bayesian filtering. The digital twin is demonstrated in three case studies: a one-year long-term operational period showcasing maintenance planning capabilities, short-term accuracy refinement with high-frequency measurements, and system shock capturing that demonstrates real-time recalibration capabilities when boundary conditions change. These demonstrations validate robustness for health-aware and constraint-informed nuclear plant operation, with general applicability to other advanced reactor concepts and complex engineering systems.
Submitted 8 June, 2025;
originally announced June 2025.
-
Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search
Authors:
Dongge Han,
Menglin Xia,
Daniel Madrigal Diaz,
Samuel Kessler,
Ankur Mallick,
Xuchao Zhang,
Mirian Del Carmen Hipolito Garcia,
Jin Xu,
Victor Rühle,
Saravan Rajmohan
Abstract:
Small language models (SLMs) offer promising and efficient alternatives to large language models (LLMs). However, SLMs' limited capacity restricts their reasoning capabilities and makes them sensitive to prompt variations. To address these challenges, we propose a novel framework that enhances SLM reasoning capabilities through LLM-generated blueprints. The blueprints provide structured, high-level reasoning guides that help SLMs systematically tackle related problems. Furthermore, our framework integrates a prompt template search mechanism to mitigate the SLMs' sensitivity to prompt variations. Our framework demonstrates improved SLM performance across various tasks, including math (GSM8K), coding (MBPP), and logic reasoning (BBH). Our approach improves the reasoning capabilities of SLMs without increasing model size or requiring additional training, offering a lightweight and deployment-friendly solution for on-device or resource-constrained environments.
Submitted 10 June, 2025;
originally announced June 2025.
-
Exploring How LLMs Capture and Represent Domain-Specific Knowledge
Authors:
Mirian Hipolito Garcia,
Camille Couturier,
Daniel Madrigal Diaz,
Ankur Mallick,
Anastasios Kyrillidis,
Robert Sim,
Victor Ruhle,
Saravan Rajmohan
Abstract:
We study whether Large Language Models (LLMs) inherently capture domain-specific nuances in natural language. Our experiments probe the domain sensitivity of LLMs by examining their ability to distinguish queries from different domains using hidden states generated during the prefill phase. We reveal latent domain-related trajectories that indicate the model's internal recognition of query domains. We also study the robustness of these domain representations to variations in prompt styles and sources. Our approach leverages these representations for model selection, mapping the LLM that best matches the domain trace of the input query (i.e., the model with the highest performance on similar traces). Our findings show that LLMs can differentiate queries for related domains, and that the fine-tuned model is not always the most accurate. Unlike previous work, our interpretations apply to both closed and open-ended generative tasks.
Submitted 24 April, 2025; v1 submitted 23 April, 2025;
originally announced April 2025.
-
HARP 2.0: Expanding Hosted, Asynchronous, Remote Processing for Deep Learning in the DAW
Authors:
Christodoulos Benetatos,
Frank Cwitkowitz,
Nathan Pruyne,
Hugo Flores Garcia,
Patrick O'Reilly,
Zhiyao Duan,
Bryan Pardo
Abstract:
HARP 2.0 brings deep learning models to digital audio workstation (DAW) software through hosted, asynchronous, remote processing, allowing users to route audio from a plug-in interface through any compatible Gradio endpoint to perform arbitrary transformations. HARP renders endpoint-defined controls and processed audio in-plugin, meaning users can explore a variety of cutting-edge deep learning models without ever leaving the DAW. In the 2.0 release we introduce support for MIDI-based models and audio/MIDI labeling models, provide a streamlined pyharp Python API for model developers, and implement numerous interface and stability improvements. Through this work, we hope to bridge the gap between model developers and creatives, improving access to deep learning models by seamlessly integrating them into DAW workflows.
Submitted 4 March, 2025;
originally announced March 2025.
-
Sketch2Sound: Controllable Audio Generation via Time-Varying Signals and Sonic Imitations
Authors:
Hugo Flores García,
Oriol Nieto,
Justin Salamon,
Bryan Pardo,
Prem Seetharaman
Abstract:
We present Sketch2Sound, a generative audio model capable of creating high-quality sounds from a set of interpretable time-varying control signals: loudness, brightness, and pitch, as well as text prompts. Sketch2Sound can synthesize arbitrary sounds from sonic imitations (i.e., a vocal imitation or a reference sound-shape). Sketch2Sound can be implemented on top of any text-to-audio latent diffusion transformer (DiT), and requires only 40k steps of fine-tuning and a single linear layer per control, making it more lightweight than existing methods like ControlNet. To synthesize from sketchlike sonic imitations, we propose applying random median filters to the control signals during training, allowing Sketch2Sound to be prompted using controls with flexible levels of temporal specificity. We show that Sketch2Sound can synthesize sounds that follow the gist of input controls from a vocal imitation while retaining the adherence to an input text prompt and audio quality compared to a text-only baseline. Sketch2Sound allows sound artists to create sounds with the semantic flexibility of text prompts and the expressivity and precision of a sonic gesture or vocal imitation. Sound examples are available at https://hugofloresgarcia.art/sketch2sound/.
Submitted 14 April, 2025; v1 submitted 11 December, 2024;
originally announced December 2024.
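The Sketch2Sound abstract mentions applying random median filters to the control signals during training. As a rough illustration of that idea only, here is a minimal sketch in plain Python. The function names, the `max_window` parameter, the uniform choice of odd window sizes, and the truncated-window edge handling are all assumptions made for this example and are not taken from the paper.

```python
import random
from statistics import median


def median_filter(signal, window):
    """Apply a 1-D median filter with an odd window size.

    Edges are handled by truncating the window (an assumption for this sketch).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    return [
        median(signal[max(0, i - half): i + half + 1])
        for i in range(len(signal))
    ]


def random_median_filter(signal, max_window=9, rng=random):
    """Filter a control signal with a randomly chosen odd window size.

    Hypothetical helper: a window of 1 leaves the signal untouched, while
    larger windows smooth it into a coarser "sketch" of the original curve.
    """
    window = rng.randrange(1, max_window + 1, 2)  # odd sizes: 1, 3, 5, ...
    return median_filter(signal, window), window
```

Varying the window from one training example to the next would expose a model to controls ranging from fine-grained to heavily smoothed, which is one plausible reading of "flexible levels of temporal specificity".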
-
EcoAct: Economic Agent Determines When to Register What Action
Authors:
Shaokun Zhang,
Jieyu Zhang,
Dujian Ding,
Mirian Hipolito Garcia,
Ankur Mallick,
Daniel Madrigal,
Menglin Xia,
Victor Rühle,
Qingyun Wu,
Chi Wang
Abstract:
Recent advancements have enabled Large Language Models (LLMs) to function as agents that can perform actions using external tools. This requires registering, i.e., integrating tool information into the LLM context prior to taking actions. Current methods indiscriminately incorporate all candidate tools into the agent's context and retain them across multiple reasoning steps. This process remains opaque to LLM agents and is not integrated into their reasoning procedures, leading to inefficiencies due to increased context length from irrelevant tools. To address this, we introduce EcoAct, a tool-using algorithm that allows LLMs to selectively register tools as needed, optimizing context use. By integrating the tool registration process into the reasoning procedure, EcoAct reduces computational costs by over 50% in multi-step reasoning tasks while maintaining performance, as demonstrated through extensive experiments. Moreover, it can be plugged into any reasoning pipeline with only minor modifications to the prompt, making it applicable to LLM agents now and in the future.
Submitted 3 November, 2024;
originally announced November 2024.
-
Socially Pertinent Robots in Gerontological Healthcare
Authors:
Xavier Alameda-Pineda,
Angus Addlesee,
Daniel Hernández García,
Chris Reinke,
Soraya Arias,
Federica Arrigoni,
Alex Auternaud,
Lauriane Blavette,
Cigdem Beyan,
Luis Gomez Camara,
Ohad Cohen,
Alessandro Conti,
Sébastien Dacunha,
Christian Dondrup,
Yoav Ellinson,
Francesco Ferro,
Sharon Gannot,
Florian Gras,
Nancie Gunson,
Radu Horaud,
Moreno D'Incà,
Imad Kimouche,
Séverin Lemaignan,
Oliver Lemon,
Cyril Liotard
, et al. (19 additional authors not shown)
Abstract:
Despite the many recent achievements in developing and deploying social robotics, there are still many underexplored environments and applications for which systematic evaluation of such systems by end-users is necessary. While several robotic platforms have been used in gerontological healthcare, the question of whether or not a social interactive robot with multi-modal conversational capabilities will be useful and accepted in real-life facilities is yet to be answered. This paper is an attempt to partially answer this question, via two waves of experiments with patients and companions in a day-care gerontological facility in Paris with a full-sized humanoid robot endowed with social and conversational interaction capabilities. The software architecture, developed during the H2020 SPRING project, together with the experimental protocol, allowed us to evaluate the acceptability (AES) and usability (SUS) with more than 60 end-users. Overall, the users are receptive to this technology, especially when the robot perception and action skills are robust to environmental clutter and flexible to handle a plethora of different interactions.
Submitted 11 February, 2025; v1 submitted 11 April, 2024;
originally announced April 2024.
-
Exploring Musical Roots: Applying Audio Embeddings to Empower Influence Attribution for a Generative Music Model
Authors:
Julia Barnett,
Hugo Flores Garcia,
Bryan Pardo
Abstract:
Every artist has a creative process that draws inspiration from previous artists and their works. Today, "inspiration" has been automated by generative music models. The black box nature of these models obscures the identity of the works that influence their creative output. As a result, users may inadvertently appropriate, misuse, or copy existing artists' works. We establish a replicable methodology to systematically identify similar pieces of music audio in a manner that is useful for understanding training data attribution. A key aspect of our approach is to harness an effective music audio similarity measure. We compare the effect of applying CLMR and CLAP embeddings to similarity measurement in a set of 5 million audio clips used to train VampNet, a recent open source generative music model. We validate this approach with a human listening study. We also explore the effect that modifications of an audio example (e.g., pitch shifting, time stretching, background noise) have on similarity measurements. This work is foundational to incorporating automated influence attribution into generative modeling, which promises to let model creators and users move from ignorant appropriation to informed creation. Audio samples that accompany this paper are available at https://tinyurl.com/exploring-musical-roots.
Submitted 25 January, 2024;
originally announced January 2024.
-
Detecting Agreement in Multi-party Conversational AI
Authors:
Laura Schauer,
Jason Sweeney,
Charlie Lyttle,
Zein Said,
Aron Szeles,
Cale Clark,
Katie McAskill,
Xander Wickham,
Tom Byars,
Daniel Hernández Garcia,
Nancie Gunson,
Angus Addlesee,
Oliver Lemon
Abstract:
Today, conversational systems are expected to handle conversations in multi-party settings, especially within Socially Assistive Robots (SARs). However, practical usability remains difficult as there are additional challenges to overcome, such as speaker recognition, addressee recognition, and complex turn-taking. In this paper, we present our work on a multi-party conversational system, which invites two users to play a trivia quiz game. The system detects users' agreement or disagreement on a final answer and responds accordingly. Our evaluation includes both performance and user assessment results, with a focus on detecting user agreement. Our annotated transcripts and the code for the proposed system have been released open-source on GitHub.
Submitted 6 November, 2023;
originally announced November 2023.
-
Detecting agreement in multi-party dialogue: evaluating speaker diarisation versus a procedural baseline to enhance user engagement
Authors:
Angus Addlesee,
Daniel Denley,
Andy Edmondson,
Nancie Gunson,
Daniel Hernández Garcia,
Alexandre Kha,
Oliver Lemon,
James Ndubuisi,
Neil O'Reilly,
Lia Perochaud,
Raphaël Valeri,
Miebaka Worika
Abstract:
Conversational agents participating in multi-party interactions face significant challenges in dialogue state tracking, since the identity of the speaker adds significant contextual meaning. It is common to utilise diarisation models to identify the speaker. However, it is not clear if these are accurate enough to correctly identify specific conversational events such as agreement or disagreement during a real-time interaction. This study uses a cooperative quiz, where the conversational agent acts as quiz-show host, to determine whether diarisation or a frequency-and-proximity-based method is more accurate at determining agreement, and whether this translates to feelings of engagement from the players. Experimental results show that our procedural system was more engaging to players, and was more accurate at detecting agreement, reaching an average accuracy of 0.44 compared to 0.28 for the diarised system.
Submitted 6 November, 2023;
originally announced November 2023.
-
Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation
Authors:
Chen Dun,
Mirian Hipolito Garcia,
Guoqing Zheng,
Ahmed Hassan Awadallah,
Anastasios Kyrillidis,
Robert Sim
Abstract:
Large Language Models (LLMs) have the ability to solve a variety of tasks, such as text summarization and mathematical questions, just out of the box, but they are often trained with a single task in mind. Due to high computational costs, the current trend is to use prompt instruction tuning to better adjust monolithic, pretrained LLMs for new -- but often individual -- downstream tasks. Thus, how one would expand prompt tuning to handle -- concomitantly -- heterogeneous tasks and data distributions is a widely open question. To address this gap, we suggest the use of Mixture of Prompts, or MoPs, associated with smart gating functionality: the latter -- whose design is one of the contributions of this paper -- can identify relevant skills embedded in different groups of prompts and dynamically assign combined experts (i.e., collection of prompts), based on the target task. Additionally, MoPs are empirically agnostic to any model compression technique applied -- for efficiency reasons -- as well as instruction data source and task composition. In practice, MoPs can simultaneously mitigate prompt training "interference" in multi-task, multi-source scenarios (e.g., task and data heterogeneity across sources), as well as possible implications from model approximations. As a highlight, MoPs manage to decrease final perplexity from ~20% up to ~70%, as compared to baselines, in the federated scenario, and from ~3% up to ~30% in the centralized scenario.
Submitted 17 January, 2025; v1 submitted 4 October, 2023;
originally announced October 2023.
-
Multi-party Goal Tracking with LLMs: Comparing Pre-training, Fine-tuning, and Prompt Engineering
Authors:
Angus Addlesee,
Weronika Sieińska,
Nancie Gunson,
Daniel Hernández Garcia,
Christian Dondrup,
Oliver Lemon
Abstract:
This paper evaluates the extent to which current Large Language Models (LLMs) can capture task-oriented multi-party conversations (MPCs). We have recorded and transcribed 29 MPCs between patients, their companions, and a social robot in a hospital. We then annotated this corpus for multi-party goal-tracking and intent-slot recognition. People share goals, answer each other's goals, and provide other people's goals in MPCs - none of which occur in dyadic interactions. To understand user goals in MPCs, we compared three methods in zero-shot and few-shot settings: we fine-tuned T5, created pre-training tasks to train DialogLM using LED, and employed prompt engineering techniques with GPT-3.5-turbo, to determine which approach can complete this novel task with limited data. GPT-3.5-turbo significantly outperformed the others in a few-shot setting. The 'reasoning' style prompt, when given 7% of the corpus as example annotated conversations, was the best performing method. It correctly annotated 62.32% of the goal tracking MPCs, and 69.57% of the intent-slot recognition MPCs. A 'story' style prompt increased model hallucination, which could be detrimental if deployed in safety-critical settings. We conclude that multi-party conversations still challenge state-of-the-art LLMs.
Submitted 29 August, 2023;
originally announced August 2023.
-
VampNet: Music Generation via Masked Acoustic Token Modeling
Authors:
Hugo Flores Garcia,
Prem Seetharaman,
Rithesh Kumar,
Bryan Pardo
Abstract:
We introduce VampNet, a masked acoustic token modeling approach to music synthesis, compression, inpainting, and variation. We use a variable masking schedule during training which allows us to sample coherent music from the model by applying a variety of masking approaches (called prompts) during inference. VampNet is non-autoregressive, leveraging a bidirectional transformer architecture that attends to all tokens in a forward pass. With just 36 sampling passes, VampNet can generate coherent high-fidelity musical waveforms. We show that by prompting VampNet in various ways, we can apply it to tasks like music compression, inpainting, outpainting, continuation, and looping with variation (vamping). Appropriately prompted, VampNet is capable of maintaining style, genre, instrumentation, and other high-level aspects of the music. This flexible prompting capability makes VampNet a powerful music co-creation tool. Code and audio samples are available online.
Submitted 12 July, 2023; v1 submitted 10 July, 2023;
originally announced July 2023.
-
Learning to Specialize: Joint Gating-Expert Training for Adaptive MoEs in Decentralized Settings
Authors:
Yehya Farhat,
Hamza ElMokhtar Shili,
Fangshuo Liao,
Chen Dun,
Mirian Hipolito Garcia,
Guoqing Zheng,
Ahmed Hassan Awadallah,
Robert Sim,
Dimitrios Dimitriadis,
Anastasios Kyrillidis
Abstract:
Mixture-of-Experts (MoEs) achieve scalability by dynamically activating subsets of their components. Yet, understanding how expertise emerges through joint training of gating mechanisms and experts remains incomplete, especially in scenarios without clear task partitions. Motivated by inference costs and data heterogeneity, we study how joint training of gating functions and experts can dynamically allocate domain-specific expertise across multiple underlying data distributions. As an outcome of our framework, we develop an instance tailored specifically to decentralized training scenarios, introducing Dynamically Decentralized Orchestration of MoEs, or DDOME. DDOME leverages heterogeneity emerging from distributional shifts across decentralized data sources to specialize experts dynamically. By integrating a pretrained common expert to inform a gating function, DDOME achieves personalized expert subset selection on-the-fly, facilitating just-in-time personalization. We empirically validate DDOME within a Federated Learning (FL) context: DDOME attains from 4% up to a 24% accuracy improvement over state-of-the-art FL baselines in image and text classification tasks, while maintaining competitive zero-shot generalization capabilities. Furthermore, we provide theoretical insights confirming that the joint gating-experts training is critical for achieving meaningful expert specialization.
Submitted 3 June, 2025; v1 submitted 14 June, 2023;
originally announced June 2023.
-
FAENet: Frame Averaging Equivariant GNN for Materials Modeling
Authors:
Alexandre Duval,
Victor Schmidt,
Alex Hernandez Garcia,
Santiago Miret,
Fragkiskos D. Malliaros,
Yoshua Bengio,
David Rolnick
Abstract:
Applications of machine learning techniques for materials modeling typically involve functions known to be equivariant or invariant to specific symmetries. While graph neural networks (GNNs) have proven successful in such tasks, they enforce symmetries via the model architecture, which often reduces their expressivity, scalability and comprehensibility. In this paper, we introduce (1) a flexible framework relying on stochastic frame-averaging (SFA) to make any model E(3)-equivariant or invariant through data transformations, and (2) FAENet, a simple, fast and expressive GNN, optimized for SFA, that processes geometric information without any symmetry-preserving design constraints. We prove the validity of our method theoretically and empirically demonstrate its superior accuracy and computational scalability in materials modeling on the OC20 dataset (S2EF, IS2RE) as well as common molecular modeling tasks (QM9, QM7-X). A package implementation is available at https://faenet.readthedocs.io.
Submitted 28 April, 2023;
originally announced May 2023.
-
MROS: A framework for robot self-adaptation
Authors:
Gustavo Rezende Silva,
Darko Bozhinoski,
Mario Garzon Oviedo,
Mariano Ramírez Montero,
Nadia Hammoudeh Garcia,
Harshavardhan Deshpande,
Andrzej Wasowski,
Carlos Hernandez Corbato
Abstract:
Self-adaptation can be used in robotics to increase system robustness and reliability. This work describes the Metacontrol method for self-adaptation in robotics. In particular, it details how the MROS (Metacontrol for ROS Systems) framework implements and packages Metacontrol, and it demonstrates how MROS can be applied in a navigation scenario where a mobile robot navigates on a factory floor. Video: https://www.youtube.com/watch?v=ISe9aMskJuE
Submitted 16 March, 2023;
originally announced March 2023.
-
Federated Multilingual Models for Medical Transcript Analysis
Authors:
Andre Manoel,
Mirian Hipolito Garcia,
Tal Baumel,
Shize Su,
Jialei Chen,
Dan Miller,
Danny Karmon,
Robert Sim,
Dimitrios Dimitriadis
Abstract:
Federated Learning (FL) is a novel machine learning approach that allows the model trainer to access more data samples, by training the model across multiple decentralized data sources, while data access constraints are in place. Such trained models can achieve significantly higher performance beyond what can be done when trained on a single data source. As part of FL's promises, none of the training data is ever transmitted to any central location, ensuring that sensitive data remains local and private. These characteristics make FL perfectly suited for large-scale applications in healthcare, where a variety of compliance constraints restrict how data may be handled, processed, and stored. Despite the apparent benefits of federated learning, the heterogeneity in the local data distributions poses significant challenges, and such challenges are even more pronounced in the case of multilingual data providers. In this paper we present a federated learning system for training a large-scale multi-lingual model suitable for fine-tuning on downstream tasks such as medical entity tagging. Our work represents one of the first such production-scale systems, capable of training across multiple highly heterogeneous data providers, and achieving levels of accuracy that could not be otherwise achieved by using central training with public data. Finally, we show that the global model performance can be further improved by a training step performed locally.
Submitted 3 November, 2022;
originally announced November 2022.
-
Influencer Detection with Dynamic Graph Neural Networks
Authors:
Elena Tiukhova,
Emiliano Penaloza,
María Óskarsdóttir,
Hernan Garcia,
Alejandro Correa Bahnsen,
Bart Baesens,
Monique Snoeck,
Cristián Bravo
Abstract:
Leveraging network information for prediction tasks has become a common practice in many domains. Being an important part of targeted marketing, influencer detection can potentially benefit from incorporating dynamic network representation. In this work, we investigate different dynamic Graph Neural Network (GNN) configurations for influencer detection and evaluate their prediction performance using a unique corporate data set. We show that using deep multi-head attention in GNN and encoding temporal attributes significantly improves performance. Furthermore, our empirical evaluation illustrates that capturing neighborhood representation is more beneficial than using network centrality measures.
Submitted 15 November, 2022;
originally announced November 2022.
-
Proactive Detractor Detection Framework Based on Message-Wise Sentiment Analysis Over Customer Support Interactions
Authors:
Juan Sebastián Salcedo Gallo,
Jesús Solano,
Javier Hernán García,
David Zarruk-Valencia,
Alejandro Correa-Bahnsen
Abstract:
In this work, we propose a framework relying solely on chat-based customer support (CS) interactions for predicting the recommendation decision of individual users. For our case study, we analyzed a total of 16.4k users and 48.7k customer support conversations within the financial vertical of a large e-commerce company in Latin America. Consequently, our main contributions and objectives are to use Natural Language Processing (NLP) to assess and predict the recommendation behavior where, in addition to using static sentiment analysis, we exploit the predictive power of each user's sentiment dynamics. Our results show that, with respective feature interpretability, it is possible to predict the likelihood that a user will recommend a product or service, based solely on the message-wise sentiment evolution of their CS conversations in a fully automated way.
Submitted 7 November, 2022;
originally announced November 2022.
-
Multi-Environment based Meta-Learning with CSI Fingerprints for Radio Based Positioning
Authors:
Anastasios Foliadis,
Mario H. Castañeda Garcia,
Richard A. Stirling-Gallacher,
Reiner S. Thomä
Abstract:
Radio-based positioning of a user equipment (UE) with deep learning (DL) methods using channel state information (CSI) fingerprints has shown promising results. DL models are able to capture complex properties embedded in the CSI about a particular environment and map a UE's CSI to the UE's position. However, the CSI fingerprints and the DL models trained on such fingerprints are highly dependent on a particular propagation environment, which generally limits the transfer of knowledge of the DL models from one environment to another. In this paper, we propose a DL model consisting of two parts: the first part aims to learn environment-independent features while the second part combines those features depending on the particular environment. To improve transfer learning, we propose a meta-learning scheme for training the first part over multiple environments. We show that for positioning in a new environment, initializing a DL model with the meta-learned environment-independent function achieves higher UE positioning accuracy compared to regular transfer learning from one environment to the new environment, or compared to training the DL model from scratch with only fingerprints from the new environment. Our proposed scheme is able to create an environment-independent function which can embed knowledge from multiple environments and more effectively learn from a new environment.
Submitted 26 October, 2022;
originally announced October 2022.
-
Proceedings of the AI-HRI Symposium at AAAI-FSS 2022
Authors:
Zhao Han,
Emmanuel Senft,
Muneeb I. Ahmad,
Shelly Bagchi,
Amir Yazdani,
Jason R. Wilson,
Boyoung Kim,
Ruchen Wen,
Justin W. Hart,
Daniel Hernández García,
Matteo Leonetti,
Ross Mead,
Reuth Mirsky,
Ahalya Prabhakar,
Megan L. Zimmerman
Abstract:
The Artificial Intelligence (AI) for Human-Robot Interaction (HRI) Symposium has been a successful venue of discussion and collaboration on AI theory and methods aimed at HRI since 2014. This year, after a review of the achievements of the AI-HRI community over the last decade in 2021, we are focusing on a visionary theme: exploring the future of AI-HRI. Accordingly, we added a Blue Sky Ideas track to foster a forward-thinking discussion on future research at the intersection of AI and HRI. As always, we appreciate all contributions related to any topic on AI/HRI and welcome new researchers who wish to take part in this growing community.
With the success of past symposia, AI-HRI impacts a variety of communities and problems, and has pioneered the discussions in recent trends and interests. This year's AI-HRI Fall Symposium aims to bring together researchers and practitioners from around the globe, representing a number of university, government, and industry laboratories. In doing so, we hope to accelerate research in the field, support technology transition and user adoption, and determine future directions for our group and our research.
Submitted 28 November, 2022; v1 submitted 28 September, 2022;
originally announced September 2022.
-
Educating Educators to Integrate Inclusive Design Across a 4-Year CS Degree Program
Authors:
Lara Letaw,
Rosalinda Garcia,
Patricia Morreale,
Gail Verdi,
Heather Garcia,
Geraldine Jimena Noa,
Spencer P. Madsen,
Maria Jesus Alzugaray-Orellana,
Margaret Burnett
Abstract:
How can an entire CS faculty, who together have been teaching the ACM standard CS curricula, shift to teaching elements of inclusive design across a 4-year undergraduate CS program? And will they even want to try? To investigate these questions, we developed an educate-the-educators curriculum to support this shift. The overall goal of the educate-the-educators curriculum was to enable CS faculty to creatively engage with embedding inclusive design into their courses in "minimally invasive" ways. GenderMag, an inclusive design evaluation method, was selected for use. The curriculum targeted the following learning outcomes: to enable CS faculty: (1) to analyze the costs and benefits of integrating inclusive design into their own course(s); (2) to evaluate software using the GenderMag method, and recognize its use to identify meaningful issues in software; (3) to integrate inclusive design into existing course materials with provided resources and collaboration; and (4) to prepare to engage and guide students on learning GenderMag concepts. We conducted a field study over a spring/summer followed by end-of-fall interviews, during which we worked with 18 faculty members to integrate inclusive design into 13 courses. Ten of these faculty then taught 7 of these courses that were on the Fall 2021 schedule, across 16 sections. We present the new educate-the-educators curriculum and report on the faculty's experiences acting upon it over the three-month field study and subsequent interviews. Our results showed that, of the 18 faculty we worked with, 83% chose to modify their courses; by Fall 2021, faculty across all four years of a CS degree program had begun teaching inclusive design concepts. When we followed up with the 10 Fall 2021 faculty, 91% of their reported outcomes indicated that the incorporations of inclusive design concepts in their courses went as well as or better than expected.
Submitted 6 September, 2022;
originally announced September 2022.
-
Deep Optical Coding Design in Computational Imaging
Authors:
Henry Arguello,
Jorge Bacca,
Hasindu Kariyawasam,
Edwin Vargas,
Miguel Marquez,
Ramith Hettiarachchi,
Hans Garcia,
Kithmini Herath,
Udith Haputhanthri,
Balpreet Singh Ahluwalia,
Peter So,
Dushan N. Wadduwage,
Chamira U. S. Edussooriya
Abstract:
Computational optical imaging (COI) systems leverage optical coding elements (CE) in their setups to encode a high-dimensional scene in a single or multiple snapshots and decode it by using computational algorithms. The performance of COI systems highly depends on the design of their main components: the CE pattern and the computational method used to perform a given task. Conventional approaches rely on random patterns or analytical designs to set the distribution of the CE. However, the available data and algorithm capabilities of deep neural networks (DNNs) have opened a new horizon in CE data-driven designs that jointly consider the optical encoder and computational decoder. Specifically, by modeling the COI measurements through a fully differentiable image formation model that considers the physics-based propagation of light and its interaction with the CEs, the parameters that define the CE and the computational decoder can be optimized in an end-to-end (E2E) manner. Moreover, by optimizing just CEs in the same framework, inference tasks can be performed from pure optics. This work surveys the recent advances in CE data-driven design and provides guidelines on how to parametrize different optical elements to include them in the E2E framework. Since the E2E framework can handle different inference applications by changing the loss function and the DNN, we present low-level tasks such as spectral imaging reconstruction or high-level tasks such as pose estimation with privacy preservation enhanced by using optimal task-based optical architectures. Finally, we illustrate classification and 3D object recognition applications performed at the speed of light using all-optics DNN.
Submitted 17 August, 2022; v1 submitted 27 June, 2022;
originally announced July 2022.
-
FLUTE: A Scalable, Extensible Framework for High-Performance Federated Learning Simulations
Authors:
Mirian Hipolito Garcia,
Andre Manoel,
Daniel Madrigal Diaz,
Fatemehsadat Mireshghallah,
Robert Sim,
Dimitrios Dimitriadis
Abstract:
In this paper we introduce "Federated Learning Utilities and Tools for Experimentation" (FLUTE), a high-performance open-source platform for federated learning research and offline simulations. The goal of FLUTE is to enable rapid prototyping and simulation of new federated learning algorithms at scale, including novel optimization, privacy, and communications strategies. We describe the architecture of FLUTE, enabling arbitrary federated modeling schemes to be realized. We compare the platform with other state-of-the-art platforms and describe available features of FLUTE for experimentation in core areas of active research, such as optimization, privacy, and scalability. A comparison with other established platforms shows speed-ups of up to 42x and savings in memory footprint of 3x. A sample of the platform capabilities is also presented for a range of tasks, as well as other functionality, such as linear scaling for the number of participating clients, and a variety of federated optimizers, including FedAdam, DGA, etcetera.
Submitted 14 November, 2022; v1 submitted 25 March, 2022;
originally announced March 2022.
-
A hybrid model-based evolutionary optimization with passive boundaries for physical human-robot interaction
Authors:
Gustavo J. G. Lahr,
Henrique B. Garcia,
Arash Ajoudani,
Thiago Boaventura,
Glauco A. P. Caurin
Abstract:
The field of physical human-robot interaction has dramatically evolved in the last decades. As a result, the robotic system's requirements have become more challenging, including personalized behavior for different tasks and users. Various machine learning techniques have been proposed to give the robot such adaptability features. This paper proposes a model-based evolutionary optimization algorithm to tune the apparent impedance of a wrist rehabilitation device. We used passivity to define boundaries for the possible controller outcomes, limiting the shared autonomy of the robot and ensuring the coupled system stability. The experiment consists of a hardware-in-the-loop optimization and a one-degree-of-freedom robot used for wrist rehabilitation. Experimental results showed that the proposed technique could generate customized passive impedance controllers for three subjects. Furthermore, when compared with a constant impedance controller, the proposed method decreased the root mean square of the interaction torques by 20% while maintaining stability during optimization.
Submitted 1 March, 2022;
originally announced March 2022.
-
Reliable Deep Learning based Localization with CSI Fingerprints and Multiple Base Stations
Authors:
Anastasios Foliadis,
Mario H. Castañeda Garcia,
Richard A. Stirling-Gallacher,
Reiner S. Thomä
Abstract:
Deep learning (DL) methods have been recently proposed for user equipment (UE) localization in wireless communication networks, based on the channel state information (CSI) between a UE and each base station (BS) in the uplink. With the CSI from the available BSs, UE localization can be performed in different ways. On the one hand, a single neural network (NN) can be trained for the UE localization by considering the CSI from all the available BSs as one overall fingerprint of the user's location. On the other hand, the CSI at each BS can be used to obtain an estimate of the UE's position with a separate NN at each BS, and then the position estimates of all BSs are combined to obtain an overall estimate of the UE position. In this work, we show that UE localization with the latter approach can achieve a higher positioning accuracy. We propose to consider the uncertainty in the UE localization at each BS, such that the overall UE position is determined by combining the position estimates of the different BSs based on the uncertainty at each BS. With this approach, a more reliable position estimate can be obtained in case of variations in the channel.
Submitted 23 November, 2021;
originally announced November 2021.
-
Deep Learning Tools for Audacity: Helping Researchers Expand the Artist's Toolkit
Authors:
Hugo Flores Garcia,
Aldo Aguilar,
Ethan Manilow,
Dmitry Vedenko,
Bryan Pardo
Abstract:
We present a software framework that integrates neural networks into the popular open-source audio editing software, Audacity, with a minimal amount of developer effort. In this paper, we showcase some example use cases for both end-users and neural network developers. We hope that this work fosters a new level of interactivity between deep learning practitioners and end-users.
Submitted 28 October, 2021; v1 submitted 25 October, 2021;
originally announced October 2021.
-
AI-HRI 2021 Proceedings
Authors:
Reuth Mirsky,
Megan Zimmerman,
Muneed Ahmad,
Shelly Bagchi,
Felix Gervits,
Zhao Han,
Justin Hart,
Daniel Hernández García,
Matteo Leonetti,
Ross Mead,
Emmanuel Senft,
Jivko Sinapov,
Jason Wilson
Abstract:
The Artificial Intelligence (AI) for Human-Robot Interaction (HRI) Symposium has been a successful venue of discussion and collaboration since 2014. During that time, these symposia provided a fertile ground for numerous collaborations and pioneered many discussions revolving around trust in HRI, XAI for HRI, service robots, interactive learning, and more.
This year, we aim to review the achievements of the AI-HRI community in the last decade, identify the challenges ahead, and welcome new researchers who wish to take part in this growing community. Taking this wide perspective, this year there will be no single theme to lead the symposium and we encourage AI-HRI submissions from across disciplines and research interests. Moreover, with the rising interest in AR and VR as part of an interaction and following the difficulties in running physical experiments during the pandemic, this year we specifically encourage researchers to submit works that do not include a physical robot in their evaluation, but promote HRI research in general. In addition, acknowledging that ethics is an inherent part of the human-robot interaction, we encourage submissions of works on ethics for HRI. Over the course of the two-day meeting, we will host a collaborative forum for discussion of current efforts in AI-HRI, with additional talks focused on the topics of ethics in HRI and ubiquitous HRI.
Submitted 23 September, 2021; v1 submitted 22 September, 2021;
originally announced September 2021.
-
Leveraging Hierarchical Structures for Few-Shot Musical Instrument Recognition
Authors:
Hugo Flores Garcia,
Aldo Aguilar,
Ethan Manilow,
Bryan Pardo
Abstract:
Deep learning work on musical instrument recognition has generally focused on instrument classes for which we have abundant data. In this work, we exploit hierarchical relationships between instruments in a few-shot learning setup to enable classification of a wider set of musical instruments, given a few examples at inference. We apply a hierarchical loss function to the training of prototypical networks, combined with a method to aggregate prototypes hierarchically, mirroring the structure of a predefined musical instrument hierarchy. These extensions require no changes to the network architecture and new levels can be easily added or removed. Compared to a non-hierarchical few-shot baseline, our method leads to a significant increase in classification accuracy and a significant decrease in mistake severity on instrument classes unseen in training.
Submitted 29 July, 2021; v1 submitted 14 July, 2021;
originally announced July 2021.
-
A Tutorial on 5G NR V2X Communications
Authors:
Mario H. Castañeda Garcia,
Alejandro Molina-Galan,
Mate Boban,
Javier Gozalvez,
Baldomero Coll-Perales,
Taylan Şahin,
Apostolos Kousaridas
Abstract:
The Third Generation Partnership Project (3GPP) has recently published its Release 16 that includes the first Vehicle-to-Everything (V2X) standard based on the 5G New Radio (NR) air interface. 5G NR V2X introduces advanced functionalities on top of the 5G NR air interface to support connected and automated driving use cases with stringent requirements. This paper presents an in-depth tutorial of the 3GPP Release 16 5G NR V2X standard for V2X communications, with a particular focus on the sidelink, since it is the most significant part of 5G NR V2X. The main part of the paper is an in-depth treatment of the key aspects of 5G NR V2X: the physical layer, the resource allocation, the quality of service management, the enhancements introduced to the Uu interface and the mobility management for V2N (Vehicle to Network) communications, as well as the co-existence mechanisms between 5G NR V2X and LTE V2X. We also review the use cases and the system architecture, and describe the evaluation methodology and simulation assumptions for 5G NR V2X. Finally, we provide an outlook on possible 5G NR V2X enhancements, including those identified within Release 17.
Submitted 8 February, 2021;
originally announced February 2021.
-
CSI-Based Localization with CNNs Exploiting Phase Information
Authors:
Anastasios Foliadis,
Mario H. Castañeda Garcia,
Richard A. Stirling-Gallacher,
Reiner S. Thomä
Abstract:
In this paper we study the use of the Channel State Information (CSI) as fingerprint inputs of a Convolutional Neural Network (CNN) for localization. We examine whether the CSI can be used as a distinct fingerprint corresponding to a single position by considering the inconsistencies with its raw phase that cause the CSI to be unreliable. We propose two methods to produce reliable fingerprints including the phase information. Furthermore, we examine the structure of the CNN and more specifically the impact of pooling on the positioning performance, and show that pooling over the subcarriers can be more beneficial than over the antennas.
Submitted 22 January, 2021;
originally announced January 2021.
-
Position Information from Single-Bounce Reflections
Authors:
Anastasios Kakkavas,
Mario H. Castañeda García,
Gonzalo Seco-Granados,
Henk Wymeersch,
Richard A. Stirling-Gallacher,
Josef A. Nossek
Abstract:
In the context of positioning a target with a single anchor, this contribution focuses on the Fisher information about the position, orientation and clock offset of the target provided by single-bounce reflections. The availability of prior knowledge of the target's environment is taken into account via a prior distribution of the position of virtual anchors, and the rank, intensity and direction of the provided information are studied. We show that when no prior knowledge is available, single-bounce reflections offer position information in the direction parallel to the reflecting surface, irrespective of the target's and anchor's locations. We provide a geometrically intuitive explanation of the results and present numerical examples demonstrating their potential implications.
Submitted 2 December, 2020;
originally announced December 2020.
-
Power Allocation and Parameter Estimation for Multipath-based 5G Positioning
Authors:
Anastasios Kakkavas,
Henk Wymeersch,
Gonzalo Seco-Granados,
Mario H. Castañeda García,
Richard A. Stirling-Gallacher,
Josef A. Nossek
Abstract:
We consider a single-anchor multiple-input multiple-output (MIMO) orthogonal frequency-division multiplexing (OFDM) system with imperfectly synchronized transmitter (Tx) and receiver (Rx) clocks, where the Rx estimates its position based on the received reference signals. The Tx, having (imperfect) prior knowledge about the Rx location and the surrounding geometry, transmits the reference signals based on a set of fixed beams. In this work, we develop strategies for the power allocation among the beams aiming to minimize the expected Cramér-Rao lower bound (CRLB) for Rx positioning. Additional constraints on the design are included to ensure that the line-of-sight (LOS) path is detected with high probability. Furthermore, the effect of clock asynchronism on the resulting allocation strategies is also studied. We also propose a gridless compressed sensing-based position estimation algorithm, which exploits the information on the clock offset provided by non-line-of-sight paths, and show that it is asymptotically efficient.
Submitted 2 December, 2020;
originally announced December 2020.
-
MROS: Runtime Adaptation For Robot Control Architectures
Authors:
Darko Bozhinoski,
Carlos Hernandez Corbato,
Mario Garzon Oviedo,
Gijs van der Hoorn,
Nadia Hammoudeh Garcia,
Harshavardhan Deshpande,
Jon Tjerngren,
Andrzej Wasowski
Abstract:
Known attempts to build autonomous robots rely on complex control architectures, often implemented with the Robot Operating System platform (ROS). Runtime adaptation is needed in these systems to cope with component failures and with contingencies arising from dynamic environments; otherwise, these affect the reliability and quality of the mission execution. Existing proposals on how to build self-adaptive systems in robotics usually require a major re-design of the control architecture and rely on complex tools unfamiliar to the robotics community. Moreover, they are hard to reuse across applications.
This paper presents MROS: a model-based framework for run-time adaptation of robot control architectures based on ROS. MROS uses a combination of domain-specific languages to model architectural variants and capture mission quality concerns, and an ontology-based implementation of the MAPE-K and meta-control visions for run-time adaptation. The experimental results obtained applying MROS in two realistic ROS-based robotic demonstrators show the benefits of our approach in terms of the quality of the mission execution, and MROS' extensibility and re-usability across robotic applications.
Submitted 23 November, 2021; v1 submitted 18 October, 2020;
originally announced October 2020.
-
Explainable Representations of the Social State: A Model for Social Human-Robot Interactions
Authors:
Daniel Hernández García,
Yanchao Yu,
Weronika Sieińska,
Jose L. Part,
Nancie Gunson,
Oliver Lemon,
Christian Dondrup
Abstract:
In this paper, we propose a minimum set of concepts and signals needed to track the social state during Human-Robot Interaction. We look into the problem of complex continuous interactions in a social context with multiple humans and robots, and discuss the creation of an explainable and tractable representation/model of their social interaction. We discuss these representations according to their representational and communicational properties, and organize them into four cognitive domains (scene-understanding, behaviour-profiling, mental-state, and dialogue-grounding).
Submitted 9 October, 2020;
originally announced October 2020.
-
Recognizing Human Internal States: A Conceptor-Based Approach
Authors:
Madeleine Bartlett,
Daniel Hernandez Garcia,
Serge Thill,
Tony Belpaeme
Abstract:
The past few decades have seen increased interest in the application of social robots to interventions for Autism Spectrum Disorder as behavioural coaches [4]. We consider that robots embedded in therapies could also provide quantitative diagnostic information by observing patient behaviours. The social nature of ASD symptoms means that, to achieve this, robots need to be able to recognize the internal states their human interaction partners are experiencing, e.g. states of confusion, engagement etc. Approaching this problem can be broken down into two questions: (1) what information, accessible to robots, can be used to recognize internal states, and (2) how can a system classify internal states such that it allows for sufficiently detailed diagnostic information? In this paper we discuss these two questions in depth and propose a novel, conceptor-based classifier. We report the initial results of this system in a proof-of-concept study and outline plans for future work.
Submitted 9 September, 2019;
originally announced September 2019.
-
Proceedings of the SREC (Social Robots in Therapy and Care) Workshop at HRI 2019
Authors:
Pablo Gomez Esteban,
Daniel Hernández García,
Hee Rin Lee,
Marta Romeo,
Emmanuel Senft,
Erik Billing
Abstract:
Robot-Assisted Therapy (RAT) has successfully been used in Human-Robot Interaction (HRI) research by including social robots in health-care interventions by virtue of their ability to engage human users in both social and emotional dimensions. Robots used for these tasks must be designed with several user groups in mind, including both individuals receiving therapy and care professionals responsible for the treatment. These robots must also be able to perceive their context of use, recognize human actions and intentions, and follow the therapeutic goals to perform meaningful and personalized treatment. Effective interactions require robots to be capable of coordinated, timely behavior in response to social cues. This means being able to estimate and predict levels of engagement, attention, intentionality and emotional state during human-robot interactions. An additional challenge for social robots in therapy and care is the wide range of needs and conditions the different users can have during their interventions; even if they share the same pathologies, their current requirements and the objectives of their therapies can vary extensively. Therefore, it becomes crucial for robots to adapt their behaviors and interaction scenario to the specific needs, preferences and requirements of the patients they interact with. This personalization should be considered in terms of the robot behavior and the intervention scenario and must reflect the needs, preferences and requirements of the user.
Submitted 5 September, 2019;
originally announced September 2019.
-
Cooperative Localization with Angular Measurements and Posterior Linearization
Authors:
Yibo Wu,
Bile Peng,
Henk Wymeersch,
Gonzalo Seco-Granados,
Anastasios Kakkavas,
Mario H. Castañeda Garcia,
Richard A. Stirling-Gallacher
Abstract:
The application of cooperative localization in vehicular networks is attractive to improve accuracy and coverage. Conventional distance measurements between vehicles are limited by the need for synchronization and provide no heading information of the vehicle. To address this, we present a cooperative localization algorithm using posterior linearization belief propagation (PLBP) utilizing angle-of-arrival (AoA)-only measurements. Simulation results show that both directional and positional root mean squared error (RMSE) of vehicles can be decreased significantly and converge to a low value in a few iterations. Furthermore, the influence of parameters for the vehicular network, such as vehicle density, communication radius, prior uncertainty and AoA measurement noise, is analyzed.
Submitted 10 July, 2019;
originally announced July 2019.
-
5G Downlink Multi-Beam Signal Design for LOS Positioning
Authors:
Anastasios Kakkavas,
Gonzalo Seco-Granados,
Henk Wymeersch,
Mario H. Castañeda García,
Richard A. Stirling-Gallacher,
Josef A. Nossek
Abstract:
In this work, we study optimal transmit strategies for minimizing the positioning error bound in a line-of-sight scenario, under different levels of prior knowledge of the channel parameters. For the case of perfect prior knowledge, we prove that two beams are optimal, and determine their beam directions and optimal power allocation. For the imperfect prior knowledge case, we compute the optimal power allocation among the beams of a codebook for two different robustness-related objectives, namely average or maximum squared position error bound minimization. Our numerical results show that our low-complexity approach can outperform existing methods that entail higher signaling and computational overhead.
Submitted 2 December, 2020; v1 submitted 4 June, 2019;
originally announced June 2019.
-
Proceedings of the Workshop on Social Robots in Therapy: Focusing on Autonomy and Ethical Challenges
Authors:
Pablo G. Esteban,
Daniel Hernández García,
Hee Rin Lee,
Pauline Chevalier,
Paul Baxter,
Cindy L. Bethel,
Jainendra Shukla,
Joan Oliver,
Domènec Puig,
Jason R. Wilson,
Linda Tickle-Degnen,
Madeleine Bartlett,
Tony Belpaeme,
Serge Thill,
Kim Baraka,
Francisco S. Melo,
Manuela Veloso,
David Becerra,
Maja Matarić,
Eduard Fosch-Villaronga,
Jordi Albo-Canals,
Gloria Beraldo,
Emanuele Menegatti,
Valentina De Tommasi,
Roberto Mancin
, et al. (13 additional authors not shown)
Abstract:
Robot-Assisted Therapy (RAT) has successfully been used in HRI research by including social robots in health-care interventions by virtue of their ability to engage human users in both social and emotional dimensions. Research projects on this topic exist all over the globe in the USA, Europe, and Asia. All of these projects have the overall ambitious goal to increase the well-being of a vulnerable population. Typical work in RAT is performed using remote-controlled robots, a technique called Wizard-of-Oz (WoZ). The robot is usually controlled, unbeknownst to the patient, by a human operator. However, WoZ has been demonstrated to not be a sustainable technique in the long term. Providing the robots with autonomy (while remaining under the supervision of the therapist) has the potential to lighten the therapist's burden, not only in the therapeutic session itself but also in longer-term diagnostic tasks. Therefore, there is a need for exploring several degrees of autonomy in social robots used in therapy. Increasing the autonomy of robots might also bring about a new set of challenges. In particular, there will be a need to answer new ethical questions regarding the use of robots with a vulnerable population, as well as a need to ensure ethically-compliant robot behaviours. Therefore, in this workshop we want to gather findings and explore which degree of autonomy might help to improve health-care interventions and how we can overcome the ethical challenges inherent to it.
Submitted 18 December, 2018;
originally announced December 2018.
-
LOS MIMO Design based on Multiple Optimum Antenna Separations
Authors:
Mario H. Castañeda García,
Marcin Iwanow,
Richard A. Stirling-Gallacher
Abstract:
The use of multiple antennas in a transmit and receive antenna array for MIMO wireless communication allows the spatial degrees of freedom in rich scattering environments to be exploited. However, for line-of-sight (LOS) MIMO channels with uniform linear arrays (ULAs) at the transmitter and receiver, the antenna separations at the transmit and receive array need to be optimized to maximize the spatial degrees of freedom and the channel capacity. In this paper, we first revisit the derivation of the optimum antenna separation at the transmit and receive ULAs in a LOS MIMO system, and provide the general expression for the optimum antenna separation product, which consists of multiple solutions. Although only the solution corresponding to the smallest antenna separation product is usually considered in the literature, we exploit the multiple solutions for a LOS MIMO design over a range of distances between the transmitter and receiver. In particular, we consider the LOS MIMO design in a vehicle-to-vehicle (V2V) communication scenario, over a range of distances between the transmit and receive vehicle.
Submitted 11 September, 2018;
originally announced September 2018.
-
Performance Limits of Single-Anchor Millimeter-Wave Positioning
Authors:
Anastasios Kakkavas,
Mario H. Castañeda García,
Richard A. Stirling-Gallacher,
Josef A. Nossek
Abstract:
The fundamental limits of single-anchor multi-antenna positioning are investigated. Exploiting the structure of the multiple input-multiple output-orthogonal frequency division multiplexing (MIMO-OFDM) channel at millimeter-wave frequencies, we present geometrically intuitive asymptotic expressions for the Fisher information on position, orientation and velocity for large bandwidth and number of antennas. The effects of synchronization errors and mobility are studied and it is shown that non-line-of-sight (NLOS) paths can be used to estimate the synchronization error and drastically improve the positioning performance. We also find that, in the presence of line-of-sight (LOS), mobility has a small impact on the achievable positioning accuracy, but in the NLOS-only scenario it can significantly improve the achievable performance, depending on the variance of the synchronization error. Finally, considering a communication system with device-specific transmission and reception constraints, we compare the positioning accuracy between the downlink and the uplink and show that they are equivalent under the same signal-to-noise ratio (SNR).
Submitted 2 December, 2020; v1 submitted 24 August, 2018;
originally announced August 2018.
-
Multi-Array 5G V2V Relative Positioning: Performance Bounds
Authors:
Anastasios Kakkavas,
Mario H. Castañeda García,
Richard A. Stirling-Gallacher,
Josef A. Nossek
Abstract:
We study the performance bounds of vehicle-to-vehicle (V2V) relative positioning for vehicles with multiple antenna arrays. The Cramér-Rao bound for the estimation of the relative position and the orientation of the Tx vehicle is derived, when angle of arrival (AOA) measurements with or without time-difference of arrival (TDOA) measurements are used. In addition, geometrically intuitive expressions for the corresponding Fisher information are provided. The derived bounds are numerically evaluated for different carrier frequencies, bandwidths and array configurations under different V2V scenarios, i.e. overtaking and platooning. The significance of the AOA and TDOA measurements for position estimation is investigated. The achievable positioning accuracy is then compared with the present requirements of the 3rd Generation Partnership Project (3GPP) 5G New Radio (NR) vehicle-to-everything (V2X) standardization.
Submitted 4 June, 2019; v1 submitted 22 August, 2018;
originally announced August 2018.
-
An Improved Statistic for the Pooled Triangle Test against PRNU-Copy Attack
Authors:
Mauro Barni,
Hector Santoyo Garcia,
Benedetta Tondi
Abstract:
We propose a new statistic to improve the pooled version of the triangle test used to combat the fingerprint-copy counter-forensic attack against PRNU-based camera identification [1]. As opposed to the original version of the test, the new statistic exploits the one-tail nature of the test, weighting positive and negative deviations differently from the expected value of the correlation between the image under analysis and the candidate images, i.e., those images suspected to have been used during the attack. The experimental results confirm the superior performance of the new test, especially when the conditions of the test are challenging, that is, when the number of images used for the fingerprint-copy attack is large and the size of the image under test is small.
Submitted 8 May, 2018;
originally announced May 2018.
-
Simulation of Quantum Circuits via Stabilizer Frames
Authors:
Héctor J. García,
Igor L. Markov
Abstract:
Generic quantum-circuit simulation appears intractable for conventional computers and may be unnecessary because useful quantum circuits exhibit significant structure that can be exploited during simulation. For example, Gottesman and Knill identified an important subclass, called stabilizer circuits, which can be simulated efficiently using group-theory techniques and insights from quantum physics. Realistic circuits enriched with quantum error-correcting codes and fault-tolerant procedures are dominated by stabilizer subcircuits and contain a relatively small number of non-Clifford components. Therefore, we develop new data structures and algorithms that facilitate parallel simulation of such circuits. Stabilizer frames offer more compact storage than previous approaches but require more sophisticated bookkeeping. Our implementation, called Quipu, simulates certain quantum arithmetic circuits (e.g., reversible ripple-carry adders) in polynomial time and space for equal superpositions of $n$-qubits. On such instances, known linear-algebraic simulation techniques, such as the (state-of-the-art) BDD-based simulator QuIDDPro, take exponential time. We simulate quantum Fourier transform and quantum fault-tolerant circuits using Quipu, and the results demonstrate that our stabilizer-based technique empirically outperforms QuIDDPro in all cases. While previous high-performance, structure-aware simulations of quantum circuits were difficult to parallelize, we demonstrate that Quipu can be parallelized with a nontrivial computational speedup.
Submitted 10 December, 2017;
originally announced December 2017.
-
On the Geometry of Stabilizer States
Authors:
Héctor J. García,
Igor L. Markov,
Andrew W. Cross
Abstract:
Large-scale quantum computation is likely to require massive quantum error correction (QEC). QEC codes and circuits are described via the stabilizer formalism, which represents stabilizer states by keeping track of the operators that preserve them. Such states are obtained by stabilizer circuits (consisting of CNOT, Hadamard and Phase gates) and can be represented compactly on conventional computers using $O(n^2)$ bits, where $n$ is the number of qubits. As an additional application, the work by Aaronson and Gottesman suggests the use of superpositions of stabilizer states to represent arbitrary quantum states. To aid in such applications and improve our understanding of stabilizer states, we characterize and count nearest-neighbor stabilizer states, quantify the distribution of angles between pairs of stabilizer states, study succinct stabilizer superpositions and stabilizer bivectors, explore the approximation of non-stabilizer states by single stabilizer states and short linear combinations of stabilizer states, develop an improved inner-product computation for stabilizer states via synthesis of compact canonical stabilizer circuits, propose an orthogonalization procedure for stabilizer states, and evaluate several of these algorithms empirically.
Submitted 20 November, 2017;
originally announced November 2017.
-
A Simple Text Analytics Model To Assist Literary Criticism: comparative approach and example on James Joyce against Shakespeare and the Bible
Authors:
Renato Fabbri,
Luis Henrique Garcia
Abstract:
Literary analysis, criticism or studies is a highly valued field, with dedicated journals and researchers, that remains mostly within the scope of the humanities. Text analytics is the computer-aided process of deriving information from texts. In this article we describe a simple and generic model for performing literary analysis using text analytics. The method relies on statistical measures of: 1) token and sentence sizes and 2) WordNet synset features. These measures are then used in Principal Component Analysis where the texts to be analyzed are observed against Shakespeare and the Bible, regarded as reference literature. The model is validated by analyzing selected works from James Joyce (1882-1941), one of the most important writers of the 20th century. We discuss the consistency of this approach, the reasons why we did not use other techniques (e.g. part-of-speech tagging) and the ways in which the analysis model might be adapted and enhanced.
Submitted 24 October, 2017;
originally announced October 2017.
-
Efficient Inner-product Algorithm for Stabilizer States
Authors:
Hector J. Garcia,
Igor L. Markov,
Andrew W. Cross
Abstract:
Large-scale quantum computation is likely to require massive quantum error correction (QEC). QEC codes and circuits are described via the stabilizer formalism, which represents stabilizer states by keeping track of the operators that preserve them. Such states are obtained by stabilizer circuits (consisting of CNOT, Hadamard and Phase only) and can be represented compactly on conventional computers using Omega(n^2) bits, where n is the number of qubits. Although techniques for the efficient simulation of stabilizer circuits have been studied extensively, techniques for efficient manipulation of stabilizer states are not currently available. To this end, we design new algorithms for: (i) obtaining canonical generators for stabilizer states, (ii) obtaining canonical stabilizer circuits, and (iii) computing the inner product between stabilizer states. Our inner-product algorithm takes O(n^3) time in general, but observes quadratic behavior for many practical instances relevant to QECC (e.g., GHZ states). We prove that each n-qubit stabilizer state has exactly 4(2^n - 1) nearest-neighbor stabilizer states, and verify this claim experimentally using our algorithms. We design techniques for representing arbitrary quantum states using stabilizer frames and generalize our algorithms to compute the inner product between two such frames.
Submitted 7 August, 2013; v1 submitted 24 October, 2012;
originally announced October 2012.
-
High-performance Energy Minimization with Applications to Adiabatic Quantum Computing
Authors:
Hector J. Garcia,
Igor L. Markov
Abstract:
Energy minimization of Ising spin-glasses has played a central role in statistical and solid-state physics, facilitating studies of phase transitions and magnetism. Recent proposals suggest using Ising spin-glasses for non-traditional computing as a way to harness nature's ability to find min-energy configurations, and to take advantage of quantum tunneling to boost combinatorial optimization. Laboratory demonstrations have been unconvincing so far and lack a non-quantum baseline for definitive comparisons. In this work we (i) design and evaluate new computational techniques to simulate natural energy minimization in spin glasses and (ii) explore their application to study design alternatives in quantum adiabatic computers. Unlike previous work, our algorithms are not limited to planar Ising topologies. In one CPU-day, our branch-and-bound algorithm finds ground states on 100 spins, while our local search approximates ground states on 1,000,000 spins. We use this computational tool as a simulator to study the significance of hyper-couplings in the context of recently implemented adiabatic quantum computers.
Submitted 18 June, 2013; v1 submitted 19 December, 2009;
originally announced December 2009.