-
Growing Perspectives: Modelling Embodied Perspective Taking and Inner Narrative Development Using Large Language Models
Authors:
Sabrina Patania,
Luca Annese,
Anna Lambiase,
Anita Pellegrini,
Tom Foulsham,
Azzurra Ruggeri,
Silvia Rossi,
Silvia Serino,
Dimitri Ognibene
Abstract:
Language and embodied perspective taking are essential for human collaboration, yet few computational models address both simultaneously. This work investigates the PerspAct system [1], which integrates the ReAct (Reason and Act) paradigm with Large Language Models (LLMs) to simulate developmental stages of perspective taking, grounded in Selman's theory [2]. Using an extended director task, we evaluate GPT's ability to generate internal narratives aligned with specified developmental stages, and assess how these influence collaborative performance both qualitatively (action selection) and quantitatively (task efficiency). Results show that GPT reliably produces developmentally-consistent narratives before task execution but often shifts towards more advanced stages during interaction, suggesting that language exchanges help refine internal representations. Higher developmental stages generally enhance collaborative effectiveness, while earlier stages yield more variable outcomes in complex contexts. These findings highlight the potential of integrating embodied perspective taking and language in LLMs to better model developmental dynamics and stress the importance of evaluating internal speech during combined linguistic and embodied tasks.
Submitted 15 September, 2025;
originally announced September 2025.
-
Exact Persistent Stochastic Non-Interference
Authors:
Carla Piazza,
Riccardo Romanello,
Sabina Rossi
Abstract:
Persistent Stochastic Non-Interference (PSNI) was introduced to capture a quantitative security property in stochastic process algebras, ensuring that a high-level process does not influence the observable behaviour of a low-level component, as formalised via lumpable bisimulation. In this work, we revisit PSNI from a performance-oriented perspective and propose a new characterisation based on a refined behavioural relation. We introduce weak-exact equivalence, which extends exact equivalence with a relaxed treatment of internal (τ) actions, enabling precise control over quantitative observables while accommodating unobservable transitions. Based on this, we define Exact PSNI (EPSNI), a variant of PSNI characterised via weak-exact equivalence. We show that EPSNI admits the same bisimulation-based and unwinding-style characterisations as PSNI, and enjoys analogous compositionality properties. These results confirm weak-exact equivalence as a robust foundation for reasoning about non-interference in stochastic systems.
Submitted 26 August, 2025;
originally announced August 2025.
-
Who Sees What? Structured Thought-Action Sequences for Epistemic Reasoning in LLMs
Authors:
Luca Annese,
Sabrina Patania,
Silvia Serino,
Tom Foulsham,
Silvia Rossi,
Azzurra Ruggeri,
Dimitri Ognibene
Abstract:
Recent advances in large language models (LLMs) and reasoning frameworks have opened new possibilities for improving the perspective-taking capabilities of autonomous agents. However, tasks that involve active perception, collaborative reasoning, and perspective taking (understanding what another agent can see or knows) pose persistent challenges for current LLM-based systems. This study investigates the potential of structured examples derived from transformed solution graphs generated by the Fast Downward planner to improve the performance of LLM-based agents within a ReAct framework. We propose a structured solution-processing pipeline that generates three distinct categories of examples: optimal goal paths (G-type), informative node paths (E-type), and step-by-step optimal decision sequences contrasting alternative actions (L-type). These solutions are further converted into "thought-action" examples by prompting an LLM to explicitly articulate the reasoning behind each decision. While L-type examples slightly reduce clarification requests and overall action steps, they do not yield consistent improvements. Agents are successful in tasks requiring basic attentional filtering but struggle in scenarios that require mentalising about occluded spaces or weighing the costs of epistemic actions. These findings suggest that structured examples alone are insufficient for robust perspective-taking, underscoring the need for explicit belief tracking, cost modelling, and richer environments to enable socially grounded collaboration in LLM-based agents.
Submitted 20 August, 2025;
originally announced August 2025.
-
Pseudo-likelihood produces associative memories able to generalize, even for asymmetric couplings
Authors:
Francesco D'Amico,
Dario Bocchi,
Luca Maria Del Bono,
Saverio Rossi,
Matteo Negri
Abstract:
Energy-based probabilistic models learned by maximizing the likelihood of the data are limited by the intractability of the partition function. A widely used workaround is to maximize the pseudo-likelihood, which replaces the global normalization with tractable local normalizations. Here we show that, in the zero-temperature limit, a network trained to maximize pseudo-likelihood naturally implements an associative memory: if the training set is small, patterns become fixed-point attractors whose basins of attraction exceed those of any classical Hopfield rule. We quantitatively explain this effect on uncorrelated random patterns. Moreover, we show that, for different structured datasets coming from computer science (random feature model, MNIST), physics (spin glasses) and biology (proteins), as the number of training examples increases the learned network goes beyond memorization, developing meaningful attractors with non-trivial correlations with test examples, thus showing the ability to generalize. Our results therefore reveal that pseudo-likelihood works both as an efficient inference tool and as a principled mechanism for memory and generalization.
Submitted 7 July, 2025;
originally announced July 2025.
-
Scaling Laws for Uncertainty in Deep Learning
Authors:
Mattia Rosso,
Simone Rossi,
Giulio Franzese,
Markus Heinonen,
Maurizio Filippone
Abstract:
Deep learning has recently revealed the existence of scaling laws, demonstrating that model performance follows predictable trends based on dataset and model sizes. Inspired by these findings and fascinating phenomena emerging in the over-parameterized regime, we examine a parallel direction: do similar scaling laws govern predictive uncertainties in deep learning? In identifiable parametric models, such scaling laws can be derived in a straightforward manner by treating model parameters in a Bayesian way. In this case, for example, we obtain $O(1/N)$ contraction rates for epistemic uncertainty with respect to the number of data $N$. However, in over-parameterized models, these guarantees do not hold, leading to largely unexplored behaviors. In this work, we empirically show the existence of scaling laws associated with various measures of predictive uncertainty with respect to dataset and model sizes. Through experiments on vision and language tasks, we observe such scaling laws for in- and out-of-distribution predictive uncertainty estimated through popular approximate Bayesian inference and ensemble methods. Besides the elegance of scaling laws and the practical utility of extrapolating uncertainties to larger data or models, this work provides strong evidence to dispel recurring skepticism against Bayesian approaches: "In many applications of deep learning we have so much data available: what do we need Bayes for?". Our findings show that "so much data" is typically not enough to make epistemic uncertainty negligible.
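As a minimal illustration of the $O(1/N)$ contraction mentioned in the abstract, the sketch below uses a conjugate Gaussian model (unknown mean, known noise variance). This model is an assumption chosen for simplicity and is not taken from the paper; the function name `posterior_variance` and the default variances are hypothetical.

```python
def posterior_variance(n, prior_var=1.0, noise_var=1.0):
    """Posterior variance of a Gaussian mean after n observations.

    Assumes a N(0, prior_var) prior on the mean and i.i.d. Gaussian noise
    with known variance noise_var (an illustrative conjugate model).
    """
    return 1.0 / (1.0 / prior_var + n / noise_var)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        var = posterior_variance(n)
        # n * var approaches noise_var, i.e. the variance shrinks like 1/N.
        print(f"N={n:>6}  posterior variance={var:.6f}  N*variance={n * var:.4f}")
```

In this toy setting the product of N and the posterior variance tends to the noise variance, which is the 1/N rate for an identifiable parametric model. The abstract's point is that no such guarantee is available in the over-parameterized regime.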
Submitted 11 June, 2025;
originally announced June 2025.
-
From predictions to confidence intervals: an empirical study of conformal prediction methods for in-context learning
Authors:
Zhe Huang,
Simone Rossi,
Rui Yuan,
Thomas Hannagan
Abstract:
Transformers have become a standard architecture in machine learning, demonstrating strong in-context learning (ICL) abilities that allow them to learn from the prompt at inference time. However, uncertainty quantification for ICL remains an open challenge, particularly in noisy regression tasks. This paper investigates whether ICL can be leveraged for distribution-free uncertainty estimation, proposing a method based on conformal prediction to construct prediction intervals with guaranteed coverage. While traditional conformal methods are computationally expensive due to repeated model fitting, we exploit ICL to efficiently generate confidence intervals in a single forward pass. Our empirical analysis compares this approach against ridge regression-based conformal methods, showing that conformal prediction with in-context learning (CP with ICL) achieves robust and scalable uncertainty estimates. Additionally, we evaluate its performance under distribution shifts and establish scaling laws to guide model training. These findings bridge ICL and conformal prediction, providing a theoretically grounded and new framework for uncertainty quantification in transformer-based models.
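For readers unfamiliar with conformal prediction, the sketch below shows split conformal prediction around a one-dimensional ridge regressor. It is a simpler, swapped-in variant used only for illustration: it is neither the paper's ICL-based method nor necessarily the ridge-based baseline the paper compares against. All function names, the synthetic data, and the parameter defaults are hypothetical.

```python
import math
import random


def ridge_fit_1d(xs, ys, lam=1.0):
    """Closed-form ridge slope for a 1-D linear model without intercept."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points: only the trivial interval is valid.
        return math.inf
    return sorted(scores)[k - 1]


def split_conformal(train, cal, alpha=0.1, lam=1.0):
    """Return (slope, half_width); the interval is [w*x - q, w*x + q]."""
    w = ridge_fit_1d(train[0], train[1], lam)
    # Nonconformity score: absolute residual on held-out calibration data.
    scores = [abs(y - w * x) for x, y in zip(cal[0], cal[1])]
    return w, conformal_quantile(scores, alpha)


def sample(rng, n, slope=2.0, noise=0.5):
    """Synthetic noisy linear data (an assumption for the demo)."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def empirical_coverage(seed=0, alpha=0.1):
    """Fraction of fresh test points that fall inside their interval."""
    rng = random.Random(seed)
    train, cal, test = sample(rng, 200), sample(rng, 200), sample(rng, 2000)
    w, q = split_conformal(train, cal, alpha)
    hits = sum(1 for x, y in zip(*test) if abs(y - w * x) <= q)
    return hits / len(test[0])


if __name__ == "__main__":
    print(f"empirical coverage: {empirical_coverage():.3f}")
```

Split conformal fits the model once and calibrates on held-out data, so it avoids the repeated refitting that makes full conformal methods expensive, at the cost of using fewer points for training.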
Submitted 22 April, 2025;
originally announced April 2025.
-
Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL
Authors:
Simone Papicchio,
Simone Rossi,
Luca Cagliero,
Paolo Papotti
Abstract:
Large Language Models (LLMs) have shown impressive capabilities in transforming natural language questions about relational databases into SQL queries. Despite recent improvements, small LLMs struggle to handle questions involving multiple tables and complex SQL patterns under a Zero-Shot Learning (ZSL) setting. Supervised Fine-Tuning (SFT) partially compensates for the knowledge deficits in pretrained models but falls short when dealing with queries involving multi-hop reasoning. To bridge this gap, different LLM training strategies to reinforce reasoning capabilities have been proposed, ranging from leveraging a thinking process within ZSL and including reasoning traces in SFT to adopting Reinforcement Learning (RL) strategies. However, the influence of reasoning on Text2SQL performance is still largely unexplored. This paper investigates to what extent LLM reasoning capabilities influence their Text2SQL performance on four benchmark datasets. To this end, it considers the following LLM settings: (1) ZSL, including general-purpose reasoning or not; (2) SFT, with and without task-specific reasoning traces; (3) RL, exploring the use of different rewarding functions, both the established EXecution accuracy (EX) and a mix with fine-grained ones that also account for the precision, recall, and cardinality of partially correct answers; (4) SFT+RL, i.e., a two-stage approach that combines SFT and RL. The results show that general-purpose reasoning under ZSL proves to be ineffective in tackling complex Text2SQL cases. Small LLMs benefit from SFT with reasoning much more than larger ones. RL is generally beneficial across all tested models and datasets. The use of the fine-grained metrics turns out to be the most effective RL strategy. Thanks to RL and the novel Text2SQL rewards, the 7B Qwen-Coder-2.5 model performs on par with 400+ billion-parameter ones (including gpt-4o) on the Bird dataset.
Submitted 27 April, 2025; v1 submitted 21 April, 2025;
originally announced April 2025.
-
Assessing Code Understanding in LLMs
Authors:
Cosimo Laneve,
Alvise Spanò,
Dalila Ressi,
Sabina Rossi,
Michele Bugliesi
Abstract:
We present an empirical evaluation of Large Language Models in code understanding associated with non-trivial, semantic-preserving program transformations such as copy propagation or constant folding. Our findings show that LLMs fail to judge semantic equivalence in approximately 41% of cases when no context is provided and in 29% when given a simple generic context. To improve accuracy, we advocate integrating LLMs with code-optimization tools to enhance training and facilitate more robust program understanding.
Submitted 31 March, 2025;
originally announced April 2025.
-
Enhancing Robot Assistive Behaviour with Reinforcement Learning and Theory of Mind
Authors:
Antonio Andriella,
Giovanni Falcone,
Silvia Rossi
Abstract:
The adaptation to users' preferences and the ability to infer and interpret humans' beliefs and intents, which is known as the Theory of Mind (ToM), are two crucial aspects for achieving effective human-robot collaboration. Despite its importance, very few studies have investigated the impact of adaptive robots with ToM abilities. In this work, we present an exploratory comparative study to investigate how social robots equipped with ToM abilities impact users' performance and perception. We design a two-layer architecture. The Q-learning agent on the first layer learns the robot's higher-level behaviour. On the second layer, a heuristic-based ToM infers the user's intended strategy and is responsible for implementing the robot's assistance, as well as providing the motivation behind its choice. We conducted a user study in a real-world setting, involving 56 participants who interacted with either an adaptive robot capable of ToM, or with a robot lacking such abilities. Our findings suggest that participants in the ToM condition performed better, accepted the robot's assistance more often, and perceived its ability to adapt, predict and recognise their intents to a higher degree. Our preliminary insights could inform future research and pave the way for designing more complex computation architectures for adaptive behaviour with ToM capabilities.
Submitted 11 November, 2024;
originally announced November 2024.
-
Multimodal Coherent Explanation Generation of Robot Failures
Authors:
Pradip Pramanick,
Silvia Rossi
Abstract:
The explainability of a robot's actions is crucial to its acceptance in social spaces. Explaining why a robot fails to complete a given task is particularly important for non-expert users to be aware of the robot's capabilities and limitations. So far, research on explaining robot failures has only considered generating textual explanations, even though several studies have shown the benefits of multimodal ones. However, a simple combination of multiple modalities may lead to semantic incoherence between the information across different modalities, a problem that is not well studied. An incoherent multimodal explanation can be difficult to understand, and it may even become inconsistent with what the robot and the human observe and how they perform reasoning with the observations. Such inconsistencies may lead to wrong conclusions about the robot's capabilities. In this paper, we introduce an approach to generate coherent multimodal explanations by checking the logical coherence of explanations from different modalities, followed by refinements as required. We propose a classification approach for coherence assessment, where we evaluate if an explanation logically follows another. Our experiments suggest that fine-tuning a neural network that was pre-trained to recognize textual entailment performs well for coherence assessment of multimodal explanations. Code & data: https://pradippramanick.github.io/coherent-explain/.
Submitted 1 October, 2024;
originally announced October 2024.
-
Measuring Transparency in Intelligent Robots
Authors:
Georgios Angelopoulos,
Dimitri Lacroix,
Ricarda Wullenkord,
Alessandra Rossi,
Silvia Rossi,
Friederike Eyssel
Abstract:
As robots become increasingly integrated into our daily lives, the need to make them transparent has never been more critical. Yet, despite its importance in human-robot interaction, a standardized measure of robot transparency has been missing until now. This paper addresses this gap by presenting the first comprehensive scale to measure perceived transparency in robotic systems, available in English, German, and Italian. Our approach conceptualizes transparency as a multidimensional construct, encompassing explainability, legibility, predictability, and meta-understanding. The proposed scale was a product of a rigorous three-stage process involving 1,223 participants. First, we generated the items of our scale; second, we conducted an exploratory factor analysis; and third, a confirmatory factor analysis served to validate the factor structure of the newly developed TOROS scale. The final scale encompasses 26 items and comprises three factors: Illegibility, Explainability, and Predictability. TOROS demonstrates high cross-linguistic reliability, inter-factor correlation, model fit, internal consistency, and convergent validity across the three cross-national samples. This empirically validated tool enables the assessment of robot transparency and contributes to the theoretical understanding of this complex construct. By offering a standardized measure, we facilitate consistent and comparable research in human-robot interaction in which TOROS can serve as a benchmark.
Submitted 29 August, 2024;
originally announced August 2024.
-
Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation
Authors:
Yixiao Wang,
Chen Tang,
Lingfeng Sun,
Simone Rossi,
Yichen Xie,
Chensheng Peng,
Thomas Hannagan,
Stefano Sabatini,
Nicola Poerio,
Masayoshi Tomizuka,
Wei Zhan
Abstract:
Diffusion models are promising for joint trajectory prediction and controllable generation in autonomous driving, but they face challenges of inefficient inference steps and high computational demands. To tackle these challenges, we introduce Optimal Gaussian Diffusion (OGD) and Estimated Clean Manifold (ECM) Guidance. OGD optimizes the prior distribution for a small diffusion time $T$ and starts the reverse diffusion process from it. ECM directly injects guidance gradients to the estimated clean manifold, eliminating extensive gradient backpropagation throughout the network. Our methodology streamlines the generative process, enabling practical applications with reduced computational overhead. Experimental validation on the large-scale Argoverse 2 dataset demonstrates our approach's superior performance, offering a viable solution for computationally efficient, high-quality joint trajectory prediction and controllable generation for autonomous driving. Our project webpage is at https://yixiaowang7.github.io/OptTrajDiff_Page/.
Submitted 1 August, 2024;
originally announced August 2024.
-
Vulnerability Detection in Ethereum Smart Contracts via Machine Learning: A Qualitative Analysis
Authors:
Dalila Ressi,
Alvise Spanò,
Lorenzo Benetollo,
Carla Piazza,
Michele Bugliesi,
Sabina Rossi
Abstract:
Smart contracts are central to a myriad of critical blockchain applications, from financial transactions to supply chain management. However, their adoption is hindered by security vulnerabilities that can result in significant financial losses. Most vulnerability detection tools and methods available nowadays leverage either static analysis methods or machine learning. Unfortunately, as valuable as they are, both approaches suffer from limitations that make them only partially effective. In this survey, we analyze the state of the art in machine-learning vulnerability detection for Ethereum smart contracts, by categorizing existing tools and methodologies, evaluating them, and highlighting their limitations. Our critical assessment unveils issues such as restricted vulnerability coverage and dataset construction flaws, providing us with new metrics to overcome the difficulties that restrain a sound comparison of existing solutions. Driven by our findings, we discuss best practices to enhance the accuracy, scope, and efficiency of vulnerability detection in smart contracts. Our guidelines address the known flaws while at the same time opening new avenues for research and development. By shedding light on current challenges and offering novel directions for improvement, we contribute to the advancement of secure smart contract development and blockchain technology as a whole.
Submitted 26 July, 2024;
originally announced July 2024.
-
SpectralZoom: Efficient Segmentation with an Adaptive Hyperspectral Camera
Authors:
Jackson Arnold,
Sophia Rossi,
Chloe Petrosino,
Ethan Mitchell,
Sanjeev J. Koppal
Abstract:
Hyperspectral image segmentation is crucial for many fields such as agriculture, remote sensing, biomedical imaging, battlefield sensing and astronomy. However, the challenge of hyper- and multispectral imaging is its large data footprint. We propose both a novel camera design and a vision transformer-based (ViT) algorithm that reduce both the captured data footprint and the computational load for hyperspectral segmentation. Our camera is able to adaptively sample image regions or patches at different resolutions, instead of capturing the entire hyperspectral cube at one high resolution. Our segmentation algorithm works in concert with the camera, applying ViT-based segmentation only to adaptively selected patches. We show results both in simulation and on a real hardware platform demonstrating both accurate segmentation results and reduced computational burden.
Submitted 6 June, 2024;
originally announced June 2024.
-
Smart Contract Languages: a comparative analysis
Authors:
Massimo Bartoletti,
Lorenzo Benetollo,
Michele Bugliesi,
Silvia Crafa,
Giacomo Dal Sasso,
Roberto Pettinau,
Andrea Pinna,
Mattia Piras,
Sabina Rossi,
Stefano Salis,
Alvise Spanò,
Viacheslav Tkachenko,
Roberto Tonelli,
Roberto Zunino
Abstract:
Smart contracts have played a pivotal role in the evolution of blockchains and Decentralized Applications (DApps). As DApps continue to gain widespread adoption, multiple smart contract languages have been and are being made available to developers, each with its distinctive features, strengths, and weaknesses. In this paper, we examine the smart contract languages used in major blockchain platforms, with the goal of providing a comprehensive assessment of their main properties. Our analysis targets the programming languages rather than the underlying architecture: as a result, while we do consider the interplay between language design and blockchain model, our main focus remains on language-specific features such as usability, programming style, safety and security. To conduct our assessment, we propose an original benchmark which encompasses a wide, yet manageable, spectrum of key use cases that cut across all the smart contract languages under examination.
Submitted 8 August, 2024; v1 submitted 5 April, 2024;
originally announced April 2024.
-
An Early Categorization of Prompt Injection Attacks on Large Language Models
Authors:
Sippo Rossi,
Alisia Marianne Michel,
Raghava Rao Mukkamala,
Jason Bennett Thatcher
Abstract:
Large language models and AI chatbots have been at the forefront of democratizing artificial intelligence. However, the releases of ChatGPT and other similar tools have been followed by growing concerns regarding the difficulty of controlling large language models and their outputs. Currently, we are witnessing a cat-and-mouse game where users attempt to misuse the models with a novel attack called prompt injections. In contrast, the developers attempt to discover the vulnerabilities and block the attacks simultaneously. In this paper, we provide an overview of these emergent threats and present a categorization of prompt injections, which can guide future research on prompt injections and act as a checklist of vulnerabilities in the development of LLM interfaces. Moreover, based on previous literature and our own empirical research, we discuss the implications of prompt injections to LLM end users, developers, and researchers.
Submitted 31 January, 2024;
originally announced February 2024.
-
Training program on sign language: social inclusion through Virtual Reality in ISENSE project
Authors:
Alessia Bisio,
Enrique Yeguas-Bolívar,
Pilar Aparicio-Martínez,
María Dolores Redel-Macías,
Sara Pinzi,
Stefano Rossi,
Juri Taborri
Abstract:
Structured hand gestures that incorporate visual motions and signs are used in sign language. Sign language is a valuable means of daily communication for individuals who are deaf or have speech impairments, but it is still rare among hearing people, and fewer still are capable of understanding it. Within the academic context, parents and teachers play a crucial role in supporting deaf students from childhood by facilitating their learning of sign language. In recent years, among all the teaching tools useful for learning sign language, the use of Virtual Reality (VR) has increased, as it has been demonstrated to improve retention, memory and attention during the learning process. The ISENSE project has been created to assist students with deafness during their academic life by proposing different technological tools for teaching sign language to the hearing community in the academic context. As part of the ISENSE project, this work aims to develop an application for Spanish and Italian sign language recognition that exploits the VR environment to quickly and easily create a comprehensive database of signs and an Artificial Intelligence (AI)-based software to accurately classify and recognize static and dynamic signs: from letters to sentences.
Submitted 15 January, 2024;
originally announced January 2024.
-
Increasing Transparency of Reinforcement Learning using Shielding for Human Preferences and Explanations
Authors:
Georgios Angelopoulos,
Luigi Mangiacapra,
Alessandra Rossi,
Claudia Di Napoli,
Silvia Rossi
Abstract:
The adoption of Reinforcement Learning (RL) in several human-centred applications provides robots with autonomous decision-making capabilities and adaptability based on the observations of the operating environment. In such scenarios, however, the learning process can make robots' behaviours unclear and unpredictable to humans, thus preventing a smooth and effective Human-Robot Interaction (HRI). As a consequence, it becomes crucial to avoid robots performing actions that are unclear to the user. In this work, we investigate whether including human preferences in RL (concerning the actions the robot performs during learning) improves the transparency of a robot's behaviours. For this purpose, a shielding mechanism is added to the RL algorithm to incorporate human preferences and to monitor the learning agent's decisions. We carried out a within-subjects study involving 26 participants to evaluate the robot's transparency in terms of Legibility, Predictability, and Expectability in different settings. Results indicate that considering human preferences during learning improves Legibility with respect to providing only Explanations, and combining human preferences with explanations elucidating the rationale behind the robot's decisions further amplifies transparency. Results also confirm that an increase in transparency leads to an increase in the safety, comfort, and reliability of the robot. These findings show the importance of transparency during learning and suggest a paradigm for robotic applications with humans in the loop.
Submitted 28 November, 2023;
originally announced November 2023.
-
Noninterference Analysis of Reversible Systems: An Approach Based on Branching Bisimilarity
Authors:
Andrea Esposito,
Alessandro Aldini,
Marco Bernardo,
Sabina Rossi
Abstract:
The theory of noninterference supports the analysis of information leakage and the execution of secure computations in multi-level security systems. Classical equivalence-based approaches to noninterference mainly rely on weak bisimulation semantics. We show that this approach is not sufficient to identify potential covert channels in the presence of reversible computations. As illustrated via a database management system example, the activation of backward computations may trigger information flows that are not observable when proceeding in the standard forward direction. To capture the effects of back-and-forth computations, it is necessary to switch to a more expressive semantics, which has been proven to be branching bisimilarity in a previous work by De Nicola, Montanari, and Vaandrager. In this paper we investigate a taxonomy of noninterference properties based on branching bisimilarity along with their preservation and compositionality features, then we compare it with the taxonomy of Focardi and Gorrieri based on weak bisimilarity.
Submitted 21 January, 2025; v1 submitted 27 November, 2023;
originally announced November 2023.
-
Class Balanced Dynamic Acquisition for Domain Adaptive Semantic Segmentation using Active Learning
Authors:
Marc Schachtsiek,
Simone Rossi,
Thomas Hannagan
Abstract:
Domain adaptive active learning is leading the charge in label-efficient training of neural networks. For semantic segmentation, state-of-the-art models jointly use two criteria of uncertainty and diversity to select training labels, combined with a pixel-wise acquisition strategy. However, we show that such methods currently suffer from a class imbalance issue which degrades their performance for larger active learning budgets. We then introduce Class Balanced Dynamic Acquisition (CBDA), a novel active learning method that mitigates this issue, especially in high-budget regimes. The more balanced labels increase minority class performance, which in turn allows the model to outperform the previous baseline by 0.6, 1.7, and 2.4 mIoU for budgets of 5%, 10%, and 20%, respectively. Additionally, the focus on minority classes leads to improvements of the minimum class performance of 0.5, 2.9, and 4.6 IoU respectively. The top-performing model even exceeds the fully supervised baseline, showing that a more balanced label than the entire ground truth can be beneficial.
Submitted 23 November, 2023;
originally announced November 2023.
-
On permutation symmetries in Bayesian neural network posteriors: a variational perspective
Authors:
Simone Rossi,
Ankit Singh,
Thomas Hannagan
Abstract:
The elusive nature of gradient-based optimization in neural networks is tied to their loss landscape geometry, which is poorly understood. However, recent work has brought solid evidence that there is essentially no loss barrier between the local solutions of gradient descent, once accounting for weight-permutations that leave the network's computation unchanged. This raises questions for approximate inference in Bayesian neural networks (BNNs), where we are interested in marginalizing over multiple points in the loss landscape. In this work, we first extend the formalism of marginalized loss barrier and solution interpolation to BNNs, before proposing a matching algorithm to search for linearly connected solutions. This is achieved by aligning the distributions of two independent approximate Bayesian solutions with respect to permutation matrices. We build on the results of Ainsworth et al. (2023), reframing the problem as a combinatorial optimization one, using an approximation to the sum of bilinear assignment problem. We then experiment on a variety of architectures and datasets, finding nearly zero marginalized loss barriers for linearly connected solutions.
Submitted 16 October, 2023;
originally announced October 2023.
-
Proceeding of the 1st Workshop on Social Robots Personalisation At the crossroads between engineering and humanities (CONCATENATE)
Authors:
Imene Tarakli,
Georgios Angelopoulos,
Mehdi Hellou,
Camille Vindolet,
Boris Abramovic,
Rocco Limongelli,
Dimitri Lacroix,
Andrea Bertolini,
Silvia Rossi,
Alessandro Di Nuovo,
Angelo Cangelosi,
Gordon Cheng
Abstract:
Nowadays, robots are expected to interact more physically, cognitively, and socially with people. They should adapt to unpredictable contexts alongside individuals with various behaviours. For this reason, personalisation is a valuable attribute for social robots as it allows them to act according to a specific user's needs and preferences and achieve natural and transparent robot behaviours for humans. If correctly implemented, personalisation could also be the key to the large-scale adoption of social robotics. However, achieving personalisation is arduous as it requires us to expand the boundaries of robotics by taking advantage of the expertise of various domains. Indeed, personalised robots need to analyse and model user interactions while considering their involvement in the adaptive process. It also requires us to address ethical and socio-cultural aspects of personalised HRI to achieve inclusive and diverse interaction and avoid deception and misplaced trust when interacting with the users. At the same time, policymakers need to ensure regulations in view of possible short-term and long-term adaptive HRI. This workshop aims to raise an interdisciplinary discussion on personalisation in robotics. It aims at bringing researchers from different fields together to propose guidelines for personalisation while addressing the following questions: how to define it, how to achieve it, and how it should be guided to fit legal and ethical requirements.
Submitted 23 November, 2023; v1 submitted 10 July, 2023;
originally announced July 2023.
-
AGAR: Attention Graph-RNN for Adaptative Motion Prediction of Point Clouds of Deformable Objects
Authors:
Pedro Gomes,
Silvia Rossi,
Laura Toni
Abstract:
This paper focuses on motion prediction for point cloud sequences in the challenging case of deformable 3D objects, such as human body motion. First, we investigate the challenges caused by deformable shapes and complex motions present in this type of representation, with the ultimate goal of understanding the technical limitations of state-of-the-art models. From this understanding, we propose an improved architecture for point cloud prediction of deformable 3D objects. Specifically, to handle deformable shapes, we propose a graph-based approach that learns and exploits the spatial structure of point clouds to extract more representative features. Then we propose a module able to combine the learned features in an adaptative manner according to the point cloud movements. The proposed adaptative module controls the composition of local and global motions for each point, enabling the network to model complex motions in deformable 3D objects more effectively. We tested the proposed method on the following datasets: MNIST moving digits, the Mixamo human body motions, and the JPEG and CWIPC-SXR real-world dynamic bodies. Simulation results demonstrate that our method outperforms the current baseline methods given its improved ability to model complex movements as well as preserve point cloud shape. Furthermore, we demonstrate the generalizability of the proposed framework for dynamic feature learning by testing the framework for action recognition on the MSRAction3D dataset and achieving results on par with state-of-the-art methods.
Submitted 19 July, 2023;
originally announced July 2023.
-
Continuous-Time Functional Diffusion Processes
Authors:
Giulio Franzese,
Giulio Corallo,
Simone Rossi,
Markus Heinonen,
Maurizio Filippone,
Pietro Michiardi
Abstract:
We introduce Functional Diffusion Processes (FDPs), which generalize score-based diffusion models to infinite-dimensional function spaces. FDPs require a new mathematical framework to describe the forward and backward dynamics, and several extensions to derive practical training objectives. These include infinite-dimensional versions of Girsanov theorem, in order to be able to compute an ELBO, and of the sampling theorem, in order to guarantee that functional evaluations in a countable set of points are equivalent to infinite-dimensional functions. We use FDPs to build a new breed of generative models in function spaces, which do not require specialized network architectures, and that can work with any kind of continuous data. Our results on real data show that FDPs achieve high-quality image generation, using a simple MLP architecture with orders of magnitude fewer parameters than existing diffusion models.
Submitted 18 December, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
-
Explaining Hierarchical Features in Dynamic Point Cloud Processing
Authors:
Pedro Gomes,
Silvia Rossi,
Laura Toni
Abstract:
This paper aims at bringing some light and understanding to the field of deep learning for dynamic point cloud processing. Specifically, we focus on the hierarchical features learning aspect, with the ultimate goal of understanding which features are learned at the different stages of the process and what their meaning is. Last, we bring clarity on how hierarchical components of the network affect the learned features and their importance for a successful learning model. This study is conducted for point cloud prediction tasks, useful for predicting coding applications.
Submitted 30 September, 2022;
originally announced September 2022.
-
An Application of a Runtime Epistemic Probabilistic Event Calculus to Decision-making in e-Health Systems
Authors:
Fabio Aurelio D'Asaro,
Luca Raggioli,
Salim Malek,
Marco Grazioso,
Silvia Rossi
Abstract:
We present and discuss a runtime architecture that integrates sensorial data and classifiers with a logic-based decision-making system in the context of an e-Health system for the rehabilitation of children with neuromotor disorders. In this application, children perform a rehabilitation task in the form of games. The main aim of the system is to derive a set of parameters describing the child's current level of cognitive and behavioral performance (e.g., engagement, attention, task accuracy) from the available sensors and classifiers (e.g., eye trackers, motion sensors, emotion recognition techniques) and take decisions accordingly. These decisions are typically aimed at improving the child's performance by triggering appropriate re-engagement stimuli when their attention is low, or by changing the game or making it more difficult when the child is losing interest in the task because it is too easy. Alongside state-of-the-art techniques for emotion recognition and head pose estimation, we use a runtime variant of a probabilistic and epistemic logic programming dialect of the Event Calculus, known as the Epistemic Probabilistic Event Calculus. In particular, the probabilistic component of this symbolic framework allows for a natural interface with the machine learning techniques. We overview the architecture and its components, and show some of its characteristics through a discussion of a running example and experiments. Under consideration for publication in Theory and Practice of Logic Programming (TPLP).
Submitted 26 September, 2022;
originally announced September 2022.
-
Neural Networks Reduction via Lumping
Authors:
Dalila Ressi,
Riccardo Romanello,
Sabina Rossi,
Carla Piazza
Abstract:
The increasing size of recently proposed Neural Networks makes it hard to implement them on embedded devices, where memory, battery and computational power are a non-trivial bottleneck. For this reason, in recent years the network compression literature has been thriving, and a large number of solutions have been published to reduce both the number of operations and the number of parameters involved in the models. Unfortunately, most of these reduction techniques are actually heuristic methods and usually require at least one re-training step to recover the accuracy. The need for model reduction procedures is also well known in the fields of Verification and Performance Evaluation, where large efforts have been devoted to the definition of quotients that preserve the observable underlying behaviour. In this paper we try to bridge the gap between the most popular and very effective network reduction strategies and formal notions, such as lumpability, introduced for verification and evaluation of Markov Chains. Elaborating on lumpability, we propose a pruning approach that reduces the number of neurons in a network without using any data or fine-tuning, while completely preserving the exact behaviour. By relaxing the constraints on the exact definition of the quotienting method, we can give a formal explanation of some of the most common reduction techniques.
Submitted 15 September, 2022;
originally announced September 2022.
-
Are Deep Learning-Generated Social Media Profiles Indistinguishable from Real Profiles?
Authors:
Sippo Rossi,
Youngjin Kwon,
Odd Harald Auglend,
Raghava Rao Mukkamala,
Matti Rossi,
Jason Thatcher
Abstract:
In recent years, deep learning methods have become increasingly capable of generating near photorealistic pictures and humanlike text up to the point that humans can no longer recognize what is real and what is AI-generated. Concerningly, there is evidence that some of these methods have already been adopted to produce fake social media profiles and content. We hypothesize that these advances have made detecting generated fake social media content in the feed extremely difficult, if not impossible, for the average user of social media. This paper presents the results of an experiment where 375 participants attempted to label real and generated profiles and posts in a simulated social media feed. The results support our hypothesis and suggest that even fully-generated fake profiles with posts written by an advanced text generator are difficult for humans to identify.
Submitted 15 September, 2022;
originally announced September 2022.
-
The Road to a Successful HRI: AI, Trust and ethicS-TRAITS
Authors:
Alessandra Rossi,
Antonio Andriella,
Silvia Rossi,
Anouk van Maris
Abstract:
The aim of this workshop is to foster the exchange of insights on past and ongoing research towards effective and long-lasting collaborations between humans and robots. This workshop will provide a forum for representatives from academia and industry communities to analyse the different aspects of HRI that impact on its success. We particularly focus on AI techniques required to implement autonomous and proactive interactions, on the factors that enhance, undermine, or recover humans' acceptance and trust in robots, and on the potential ethical and legal concerns related to the deployment of such robots in human-centred environments.
Website: https://sites.google.com/view/traits-hri-2022
Submitted 7 June, 2022;
originally announced June 2022.
-
How Much is Enough? A Study on Diffusion Times in Score-based Generative Models
Authors:
Giulio Franzese,
Simone Rossi,
Lixuan Yang,
Alessandro Finamore,
Dario Rossi,
Maurizio Filippone,
Pietro Michiardi
Abstract:
Score-based diffusion models are a class of generative models whose dynamics is described by stochastic differential equations that map noise into data. While recent works have started to lay down a theoretical foundation for these models, an analytical understanding of the role of the diffusion time T is still lacking. Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution; however, a smaller value of T should be preferred for a better approximation of the score-matching objective and higher computational efficiency. Starting from a variational interpretation of diffusion models, in this work we quantify this trade-off, and suggest a new method to improve quality and efficiency of both training and sampling, by adopting smaller diffusion times. Indeed, we show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process. Empirical results support our analysis; for image data, our method is competitive w.r.t. the state-of-the-art, according to standard sample quality metrics and log-likelihood.
Submitted 10 June, 2022;
originally announced June 2022.
-
A ROS Architecture for Personalised HRI with a Bartender Social Robot
Authors:
Alessandra Rossi,
Maria Di Maro,
Antonio Origlia,
Agostino Palmiero,
Silvia Rossi
Abstract:
The BRILLO (Bartending Robot for Interactive Long-Lasting Operations) project has the overall goal of creating an autonomous robotic bartender that can interact with customers while accomplishing its bartending tasks. In such a scenario, the novelty effect that people experience when using an attractive technology is destined to wear off, which negatively affects the success of the service robotics application. For this reason, providing personalised natural interaction while users access the robot's services is of paramount importance for increasing their engagement and, consequently, their loyalty. In this paper, we present the developed three-layer ROS architecture integrating a perception layer managing the processing of different social signals, a decision-making layer for handling multi-party interactions, and an execution layer controlling the behaviour of a complex robot composed of arms and a face. Finally, user modelling through a beliefs layer allows for personalised interaction.
Submitted 15 March, 2022; v1 submitted 13 March, 2022;
originally announced March 2022.
-
Extending 3-DoF Metrics to Model User Behaviour Similarity in 6-DoF Immersive Applications
Authors:
Silvia Rossi,
Irene Viola,
Laura Toni,
Pablo Cesar
Abstract:
Immersive reality technologies, such as Virtual and Augmented Reality, have ushered in a new era of user-centric systems, in which every aspect of the coding-delivery-rendering chain is tailored to the interaction of the users. Understanding the actual interactivity and behaviour of the users is still an open challenge and a key step to enabling such a user-centric system. Our main goal is to extend the applicability of existing behavioural methodologies for studying user navigation to the case of 6 Degrees of Freedom (DoF). Specifically, we first compare the navigation in 6-DoF with its 3-DoF counterpart, highlighting the main differences and novelties. Then, we define new metrics aimed at better modelling behavioural similarities between users in a 6-DoF system. We validate and test our solutions on real navigation paths of users interacting with dynamic volumetric media in 6-DoF Virtual Reality conditions. Our results show that metrics that consider both user position and viewing direction perform better in detecting user similarity while navigating in a 6-DoF system. Having easy-to-use but robust metrics that underpin multiple tools and answer the question "how do we detect if two users look at the same content?" opens the gate to new solutions for a user-centric system.
Submitted 20 June, 2023; v1 submitted 17 December, 2021;
originally announced December 2021.
-
Model Selection for Bayesian Autoencoders
Authors:
Ba-Hien Tran,
Simone Rossi,
Dimitrios Milios,
Pietro Michiardi,
Edwin V. Bonilla,
Maurizio Filippone
Abstract:
We develop a novel method for carrying out model selection for Bayesian autoencoders (BAEs) by means of prior hyper-parameter optimization. Inspired by the common practice of type-II maximum likelihood optimization and its equivalence to Kullback-Leibler divergence minimization, we propose to optimize the distributional sliced-Wasserstein distance (DSWD) between the output of the autoencoder and the empirical data distribution. The advantages of this formulation are that we can estimate the DSWD based on samples and handle high-dimensional problems. We carry out posterior estimation of the BAE parameters via stochastic gradient Hamiltonian Monte Carlo and turn our BAE into a generative model by fitting a flexible Dirichlet mixture model in the latent space. Consequently, we obtain a powerful alternative to variational autoencoders, which are the preferred choice in modern applications of autoencoders for representation learning with uncertainty. We evaluate our approach qualitatively and quantitatively using a vast experimental campaign on a number of unsupervised learning tasks and show that, in small-data regimes where priors matter, our approach provides state-of-the-art results, outperforming multiple competitive baselines.
Submitted 11 June, 2021;
originally announced June 2021.
-
Enhancing human bodies with extra robotic arms and fingers: The Neural Resource Allocation Problem
Authors:
Giulia Dominijanni,
Solaiman Shokur,
Gionata Salvietti,
Sarah Buehler,
Erica Palmerini,
Simone Rossi,
Frederique De Vignemont,
Andrea D'Avella,
Tamar R. Makin,
Domenico Prattichizzo,
Silvestro Micera
Abstract:
The emergence of robot-based body augmentation promises exciting innovations that will inform robotics, human-machine interaction, and wearable electronics. Even though augmentative devices like extra robotic arms and fingers in many ways build on restorative technologies, they introduce unique challenges for bidirectional human-machine collaboration. Can humans adapt and learn to operate a new limb collaboratively with their biological limbs without sacrificing their physical abilities? To successfully achieve robotic body augmentation, we need to ensure that by giving a person an additional (artificial) limb, we are not in fact trading off an existing (biological) one. In this manuscript, we introduce the "Neural Resource Allocation" problem, which distinguishes body augmentation from existing robotics paradigms such as teleoperation and prosthetics. We discuss how to allow the effective and effortless voluntary control of augmentative devices without compromising the voluntary control of the biological body. In reviewing the relevant literature on extra robotic fingers and limbs we critically assess the range of potential solutions available for the "Neural Resource Allocation" problem. For this purpose, we combine multiple perspectives from engineering and neuroscience with considerations from human-machine interaction, sensory-motor integration, ethics and law. Altogether we aim to define common foundations and operating principles for the successful implementation of motor augmentation.
Submitted 31 March, 2021;
originally announced March 2021.
-
I Know What You Would Like to Drink: Benefits and Detriments of Sharing Personal Info with a Bartender Robot
Authors:
Alessandra Rossi,
Vito Giura,
Carmine Di Leva,
Silvia Rossi
Abstract:
This paper introduces benefits and detriments of a robot bartender that is capable of adapting the interaction with human users according to their preferences in drinks, music, and hobbies. We believe that a personalised experience during a human-robot interaction increases the human user's engagement with the robot and that such information will be used by the robot during the interaction. However, this implies that the users need to share several pieces of personal information with the robot. In this paper, we introduce the research topic and our approach to evaluate people's perceptions and consideration of their privacy with a robot. We present a within-subject study in which participants interacted twice with a robot that first had no prior information about the users and then had knowledge of their preferences. We observed that less than 60% of the participants were not concerned about sharing personal information with the robot.
Submitted 2 April, 2021; v1 submitted 24 March, 2021;
originally announced March 2021.
-
The Road to a Successful HRI: AI, Trust and ethicS-TRAITS
Authors:
Antonio Andriella,
Alessandra Rossi,
Silvia Rossi,
Anouk van Maris
Abstract:
The aim of this workshop is to give researchers from academia and industry the possibility to discuss the inter- and multi-disciplinary nature of the relationships between people and robots towards effective and long-lasting collaborations. This workshop will provide a forum for the HRI and robotics communities to explore successful human-robot interaction (HRI) and to analyse the different aspects of HRI that impact its success. A particular focus is on the AI algorithms required to implement autonomous interactions, and the factors that enhance, undermine, or recover humans' trust in robots. Finally, potential ethical and legal concerns, and how they can be addressed, will be considered. Website: https://sites.google.com/view/traits-hri
Submitted 20 April, 2021; v1 submitted 23 March, 2021;
originally announced March 2021.
-
Spatio-temporal Graph-RNN for Point Cloud Prediction
Authors:
Pedro Gomes,
Silvia Rossi,
Laura Toni
Abstract:
In this paper, we propose an end-to-end learning network to predict future frames in a point cloud sequence. As the main novelty, an initial layer learns topological information of point clouds as geometric features, to form representative spatio-temporal neighborhoods. This module is followed by multiple Graph-RNN cells. Each cell learns point dynamics (i.e., RNN states) by processing each point jointly with the spatio-temporal neighbouring points. We tested the network performance with an MNIST dataset of moving digits, a synthetic human body motion dataset, and a JPEG dynamic bodies dataset. Simulation results demonstrate that our method outperforms baseline ones that neglect geometric feature information.
Submitted 22 February, 2021; v1 submitted 15 February, 2021;
originally announced February 2021.
-
All You Need is a Good Functional Prior for Bayesian Deep Learning
Authors:
Ba-Hien Tran,
Simone Rossi,
Dimitrios Milios,
Maurizio Filippone
Abstract:
The Bayesian treatment of neural networks dictates that a prior distribution is specified over their weight and bias parameters. This poses a challenge because modern neural networks are characterized by a large number of parameters, and the choice of these priors has an uncontrolled effect on the induced functional prior, which is the distribution of the functions obtained by sampling the parameters from their prior distribution. We argue that this is a hugely limiting aspect of Bayesian deep learning, and this work tackles this limitation in a practical and effective way. Our proposal is to reason in terms of functional priors, which are easier to elicit, and to "tune" the priors of neural network parameters in a way that they reflect such functional priors. Gaussian processes offer a rigorous framework to define prior distributions over functions, and we propose a novel and robust framework to match their prior with the functional prior of neural networks based on the minimization of their Wasserstein distance. We provide vast experimental evidence that coupling these priors with scalable Markov chain Monte Carlo sampling offers systematically large performance improvements over alternative choices of priors and state-of-the-art approximate Bayesian deep learning approaches. We consider this work a considerable step in the direction of making the long-standing challenge of carrying out a fully Bayesian treatment of neural networks, including convolutional neural networks, a concrete possibility.
Submitted 25 April, 2022; v1 submitted 25 November, 2020;
originally announced November 2020.
-
Feature Selection based on Principal Component Analysis for Underwater Source Localization by Deep Learning
Authors:
Xiaoyu Zhu,
Hefeng Dong,
Pierluigi Salvo Rossi,
Martin Landrø
Abstract:
In this paper, we propose an interpretable feature selection method based on principal component analysis (PCA) and principal component regression (PCR), which can extract important features for underwater source localization by only introducing the source location without other prior information. This feature selection method is combined with a two-step framework for underwater source localization based on the semi-supervised learning scheme. In the framework, the first step utilizes a convolutional autoencoder to extract the latent features from the whole available dataset. The second step performs source localization via an encoder multi-layer perceptron (MLP) trained on a limited labeled portion of the dataset. The proposed approach has been validated on the public dataset SWellEx-96 Event S5. The result shows that the framework has appealing accuracy and robustness on unseen data, especially as the amount of training data gradually decreases. After feature selection, not only does the training stage see a 95% acceleration, but the performance of the framework also becomes more robust on the depth and more accurate when the amount of labeled training data is extremely limited.
Submitted 25 November, 2020;
originally announced November 2020.
-
Exoway: an exoskeleton on actuated wheels
Authors:
D. Abruzzese,
D. Carnevale,
A. Monti,
C. Possieri,
S. Rossi,
M. Sassano,
P. P. Valentini
Abstract:
In this short work we present a low-cost exoskeleton with actuated wheels that allows movements as well as skating-like steps. The simple structure and the actuated wheels minimize the use of motors for locomotion. The structure is stabilized by an active control system that balances it and permits it to be maneuvered by the driver, whose commands are acquired by a dedicated hardware interface.
Submitted 26 October, 2020;
originally announced October 2020.
-
Towards Transparency of TD-RL Robotic Systems with a Human Teacher
Authors:
Marco Matarese,
Silvia Rossi,
Alessandra Sciutti,
Francesco Rea
Abstract:
The high demand for autonomous and flexible HRI implies the necessity of deploying Machine Learning (ML) mechanisms in the robot control. Indeed, the use of ML techniques, such as Reinforcement Learning (RL), makes the robot behaviour, during the learning process, not transparent to the observing user. In this work, we propose an emotional model to improve the transparency in RL tasks for human-robot collaborative scenarios. The architecture we propose supports the RL algorithm with an emotional model able to both receive human feedback and exhibit emotional responses based on the learning process. The model is entirely based on the Temporal Difference (TD) error. The architecture was tested in an isolated laboratory with a simple setup. The results highlight that showing its internal state through an emotional response is enough to make a robot transparent to its human teacher. People also prefer to interact with a responsive robot because they are used to understanding intentions via emotions and social signals.
Submitted 12 May, 2020;
originally announced May 2020.
-
Modeling limited attention in opinion dynamics by topological interactions
Authors:
Francesca Ceragioli,
Paolo Frasca,
Wilbert Samuel Rossi
Abstract:
This work explores models of opinion dynamics with opinion-dependent connectivity. Our starting point is that individuals have limited capabilities to engage in interactions with their peers. Motivated by this observation, we propose a continuous-time opinion dynamics model such that interactions take place with a limited number of peers: we refer to these interactions as topological, as opposed to metric interactions that are postulated in classical bounded-confidence models. We observe that topological interactions produce equilibria that are very robust to perturbations.
Submitted 30 June, 2020; v1 submitted 19 March, 2020;
originally announced March 2020.
-
Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations
Authors:
Simone Rossi,
Markus Heinonen,
Edwin V. Bonilla,
Zheyang Shen,
Maurizio Filippone
Abstract:
Variational inference techniques based on inducing variables provide an elegant framework for scalable posterior estimation in Gaussian process (GP) models. Besides enabling scalability, one of their main advantages over sparse approximations using direct marginal likelihood maximization is that they provide a robust alternative for point estimation of the inducing inputs, i.e. the location of the inducing variables. In this work we challenge the common wisdom that optimizing the inducing inputs in the variational framework yields optimal performance. We show that, by revisiting old model approximations such as the fully-independent training conditionals endowed with powerful sampling-based inference methods, treating both inducing locations and GP hyper-parameters in a Bayesian way can improve performance significantly. Based on stochastic gradient Hamiltonian Monte Carlo, we develop a fully Bayesian approach to scalable GP and deep GP models, and demonstrate its state-of-the-art performance through an extensive experimental campaign across several regression and classification problems.
Submitted 23 February, 2021; v1 submitted 6 March, 2020;
originally announced March 2020.
-
Efficient Approximate Inference with Walsh-Hadamard Variational Inference
Authors:
Simone Rossi,
Sebastien Marmin,
Maurizio Filippone
Abstract:
Variational inference offers scalable and flexible tools to tackle intractable Bayesian inference of modern statistical models like Bayesian neural networks and Gaussian processes. For largely over-parameterized models, however, the over-regularization property of the variational objective makes the application of variational inference challenging. Inspired by the literature on kernel methods, and in particular on structured approximations of distributions of random matrices, this paper proposes Walsh-Hadamard Variational Inference, which uses Walsh-Hadamard-based factorization strategies to reduce model parameterization, accelerate computations, and increase the expressiveness of the approximate posterior beyond fully factorized ones.
Submitted 29 November, 2019;
originally announced December 2019.
-
Emotional Distraction for Children Anxiety Reduction During Vaccination
Authors:
Martina Ruocco,
Marwa Larafa,
Silvia Rossi
Abstract:
Social assistive robots are starting to be widely used in pediatric health-care environments with the aim of distracting and entertaining children, and so of reducing a possible state of anxiety. In this paper, we present some initial results of a study (N=69) conducted in a Health-Vaccines Center, where the distraction role of a social robot, which interacts with a child showing an emotional behavior, is compared with the same robot not showing any emotional social cue. Outcome criteria for the evaluation of the intervention included the parent-reported level of anxiety before, during and after the procedure.
Submitted 3 October, 2019; v1 submitted 11 September, 2019;
originally announced September 2019.
-
Walsh-Hadamard Variational Inference for Bayesian Deep Learning
Authors:
Simone Rossi,
Sebastien Marmin,
Maurizio Filippone
Abstract:
Over-parameterized models, such as DeepNets and ConvNets, form a class of models that are routinely adopted in a wide variety of applications, and for which Bayesian inference is desirable but extremely challenging. Variational inference offers the tools to tackle this challenge in a scalable way and with some degree of flexibility on the approximation, but for over-parameterized models this is challenging due to the over-regularization property of the variational objective. Inspired by the literature on kernel methods, and in particular on structured approximations of distributions of random matrices, this paper proposes Walsh-Hadamard Variational Inference (WHVI), which uses Walsh-Hadamard-based factorization strategies to reduce the parameterization and accelerate computations, thus avoiding over-regularization issues with the variational objective. Extensive theoretical and empirical analyses demonstrate that WHVI yields considerable speedups and model reductions compared to other techniques to carry out approximate inference for over-parameterized models, and ultimately show how advances in kernel methods can be translated into advances in approximate Bayesian inference.
Submitted 23 November, 2020; v1 submitted 27 May, 2019;
originally announced May 2019.
-
Spherical clustering of users navigating 360° content
Authors:
Silvia Rossi,
Francesca De Simone,
Pascal Frossard,
Laura Toni
Abstract:
In Virtual Reality (VR) applications, understanding how users explore the omnidirectional content is important to optimize content creation, to develop user-centric services, or even to detect disorders in medical applications. Clustering users based on their common navigation patterns is a first direction to understand users' behaviour. However, classical clustering techniques fail to identify these common paths, since they are usually focused on minimizing a simple distance metric. In this paper, we argue that minimizing the distance metric does not necessarily guarantee identifying users who experience similar navigation paths in the VR domain. Therefore, we propose a graph-based method to identify clusters of users who are attending the same portion of the spherical content over time. The proposed solution takes into account the spherical geometry of the content and aims at clustering users based on the actual overlap of displayed content among users. Our method is tested on real VR user navigation patterns. Results show that our solution leads to clusters in which at least 85% of the content displayed by one user is shared among the other users belonging to the same cluster.
Submitted 5 May, 2020; v1 submitted 13 November, 2018;
originally announced November 2018.
-
Good Initializations of Variational Bayes for Deep Models
Authors:
Simone Rossi,
Pietro Michiardi,
Maurizio Filippone
Abstract:
Stochastic variational inference is an established way to carry out approximate Bayesian inference for deep models. While there have been effective proposals for good initializations for loss minimization in deep learning, far less attention has been devoted to the issue of initialization of stochastic variational inference. We address this by proposing a novel layer-wise initialization strategy based on Bayesian linear models. The proposed method is extensively validated on regression and classification tasks, including Bayesian DeepNets and ConvNets, showing faster and better convergence compared to alternatives inspired by the literature on initializations for loss minimization.
Submitted 25 January, 2019; v1 submitted 18 October, 2018;
originally announced October 2018.
-
The closed loop between opinion formation and personalised recommendations
Authors:
Wilbert Samuel Rossi,
Jan Willem Polderman,
Paolo Frasca
Abstract:
In online platforms, recommender systems are responsible for directing users to relevant contents. In order to enhance the users' engagement, recommender systems adapt their output to the reactions of the users, who are in turn affected by the recommended contents. In this work, we study a tractable analytical model of a user that interacts with an online news aggregator, with the purpose of making explicit the feedback loop between the evolution of the user's opinion and the personalised recommendation of contents. More specifically, we assume that the user is endowed with a scalar opinion about a certain issue and seeks news about it on a news aggregator: this opinion is influenced by all received news, which are characterized by a binary position on the issue at hand. The user is affected by a confirmation bias, that is, a preference for news that confirm her current opinion. The news aggregator recommends items with the goal of maximizing the number of user's clicks (as a measure of her engagement): in order to fulfil its goal, the recommender has to compromise between exploring the user's preferences and exploiting what it has learned so far. After defining suitable metrics for the effectiveness of the recommender systems (such as the click-through rate) and for its impact on the opinion, we perform both extensive numerical simulations and a mathematical analysis of the model. We find that personalised recommendations markedly affect the evolution of opinions and favor the emergence of more extreme ones: the intensity of these effects is inherently related to the effectiveness of the recommender. We also show that by tuning the amount of randomness in the recommendation algorithm, one can seek a balance between the effectiveness of the recommendation system and its impact on the opinions.
Submitted 9 September, 2019; v1 submitted 12 September, 2018;
originally announced September 2018.
-
Persistent Stochastic Non-Interference
Authors:
Jane Hillston,
Carla Piazza,
Sabina Rossi
Abstract:
In this paper we present an information flow security property for stochastic, cooperating processes expressed as terms of the Performance Evaluation Process Algebra (PEPA). We introduce the notion of Persistent Stochastic Non-Interference (PSNI) based on the idea that every state reachable by a process satisfies a basic Stochastic Non-Interference (SNI) property. The structural operational semantics of PEPA allows us to give two characterizations of PSNI: the first involves a single bisimulation-like equivalence check, while the second is formulated in terms of unwinding conditions. The observation equivalence at the base of our definition relies on the notion of lumpability and ensures that, for a secure process P, the steady state probability of observing the system being in a specific state P' is independent of its possible high level interactions.
Submitted 26 August, 2018;
originally announced August 2018.