-
Collaboration Between Robots, Interfaces and Humans: Practice-Based and Audience Perspectives
Authors:
Anna Savery,
Richard Savery
Abstract:
This paper provides an analysis of a mixed-media experimental musical work that explores the integration of human musical interaction with a newly developed interface for the violin, manipulated by an improvising violinist, interactive visuals, a robotic drummer and an improvised synthesised orchestra. We first present a detailed technical overview of the systems involved including the design and functionality of each component. We then conduct a practice-based review examining the creative processes and artistic decisions underpinning the work, focusing on the challenges and breakthroughs encountered during its development. Through this introspective analysis, we uncover insights into the collaborative dynamics between the human performer and technological agents, revealing the complexities of blending traditional musical expressiveness with artificial intelligence and robotics. To gauge public reception and interpretive perspectives, we conducted an online survey, sharing a video of the performance with a diverse audience. The feedback collected from this survey offers valuable viewpoints on the accessibility, emotional impact, and perceived artistic value of the work. Respondents' reactions underscore the transformative potential of integrating advanced technologies in musical performance, while also highlighting areas for further exploration and refinement.
Submitted 23 July, 2024;
originally announced July 2024.
-
Long-Term, Store-Front Robotics: Interactive Music for Robotic Arm, Caxixi and Frame Drums
Authors:
Richard Savery,
Fouad Sukkar
Abstract:
This paper presents an innovative exploration into the integration of interactive robotic musicianship within a commercial retail environment, specifically through a three-week-long in-store installation featuring a UR3 robotic arm, custom-built frame drums, and an adaptive music generation system. Situated in a prominent storefront in one of the world's largest cities, this project aimed to enhance the shopping experience by creating dynamic, engaging musical interactions that respond to the store's ambient soundscape. Key contributions include the novel application of industrial robotics in artistic expression, the deployment of interactive music to enrich retail ambiance, and the demonstration of continuous robotic operation in a public setting over an extended period. Challenges such as system reliability, variation in musical output, safety in interactive contexts, and brand alignment were addressed to ensure the installation's success. The project not only showcased the technical feasibility and artistic potential of robotic musicianship in retail spaces but also offered insights into the practical implications of such integration, including system reliability, the dynamics of human-robot interaction, and the impact on store operations. This exploration opens new avenues for enhancing consumer retail experiences through the intersection of technology, music, and interactive art, suggesting a future where robotic musicianship contributes meaningfully to public and commercial spaces.
Submitted 23 July, 2024;
originally announced July 2024.
-
Say What? Collaborative Pop Lyric Generation Using Multitask Transfer Learning
Authors:
Naveen Ram,
Tanay Gummadi,
Rahul Bhethanabotla,
Richard J. Savery,
Gil Weinberg
Abstract:
Lyric generation is a popular sub-field of natural language generation that has seen growth in recent years. Pop lyrics are of particular interest due to the genre's unique style and content, in addition to the high level of collaboration that goes on behind the scenes in the professional pop songwriting process. In this paper, we present a collaborative line-level lyric generation system that utilizes transfer learning via the T5 transformer model, which, to date, has not been used to generate pop lyrics. By working and communicating directly with professional songwriters, we develop a model that is able to learn lyrical and stylistic tasks like rhyming, matching line beat requirements, and ending lines with specific target words. Our approach compares favorably to existing methods for multiple datasets and yields positive results from our online studies and interviews with industry songwriters.
Submitted 15 November, 2021;
originally announced November 2021.
-
Musical Prosody-Driven Emotion Classification: Interpreting Vocalists Portrayal of Emotions Through Machine Learning
Authors:
Nicholas Farris,
Brian Model,
Richard Savery,
Gil Weinberg
Abstract:
The task of classifying emotions within a musical track has received widespread attention within the Music Information Retrieval (MIR) community. Music emotion recognition has traditionally relied on the use of acoustic features, verbal features, and metadata-based filtering. The role of musical prosody remains under-explored despite several studies demonstrating a strong connection between prosody and emotion. In this study, we restrict the input of traditional machine learning algorithms to the features of musical prosody. Furthermore, our proposed approach builds upon the prior by classifying emotions under an expanded emotional taxonomy, using the Geneva Wheel of Emotion. We utilize a methodology for individual data collection from vocalists, and personal ground truth labeling by the artists themselves. We found that traditional machine learning algorithms, when limited to the features of musical prosody, (1) achieve high accuracies for a single singer, (2) maintain high accuracy when the dataset is expanded to multiple singers, and (3) achieve high accuracies when trained on a reduced subset of the total features.
Submitted 13 June, 2021; v1 submitted 4 June, 2021;
originally announced June 2021.
-
Shimon the Robot Film Composer and DeepScore: An LSTM for Generation of Film Scores based on Visual Analysis
Authors:
Richard Savery,
Gil Weinberg
Abstract:
Composing for a film requires developing an understanding of the film, its characters and the film aesthetic choices made by the director. We propose using existing visual analysis systems as a core technology for film music generation. We extract film features including main characters and their emotions to develop a computer understanding of the film's narrative arc. This arc is combined with visually analyzed director aesthetic choices including pacing and levels of movement. Two systems are presented, the first using a robotic film composer and marimbist to generate film scores in real-time performance. The second software-based system builds on the results from the robot film composer to create narrative driven film scores.
Submitted 26 October, 2020;
originally announced November 2020.
-
Emotional Musical Prosody: Validated Vocal Dataset for Human Robot Interaction
Authors:
Richard Savery,
Lisa Zahray,
Gil Weinberg
Abstract:
Human collaboration with robotics is dependent on the development of a relationship between human and robot, without which performance and utilization can decrease. Emotion and personality conveyance has been shown to enhance robotic collaborations, with improved human-robot relationships and increased trust. One under-explored way for an artificial agent to convey emotions is through non-linguistic musical prosody. In this work we present a new 4.2 hour dataset of improvised emotional vocal phrases based on the Geneva Emotion Wheel. This dataset has been validated through extensive listening tests and shows promising preliminary results for use in generative systems.
Submitted 9 October, 2020;
originally announced October 2020.
-
Shimon the Rapper: A Real-Time System for Human-Robot Interactive Rap Battles
Authors:
Richard Savery,
Lisa Zahray,
Gil Weinberg
Abstract:
We present a system for real-time lyrical improvisation between a human and a robot in the style of hip hop. Our system takes vocal input from a human rapper, analyzes the semantic meaning, and generates a response that is rapped back by a robot over a musical groove. Previous work with real-time interactive music systems has largely focused on instrumental output, and vocal interactions with robots have been explored, but not in a musical context. Our generative system includes custom methods for censorship, voice, rhythm, rhyming and a novel deep learning pipeline based on phoneme embeddings. The rap performances are accompanied by synchronized robotic gestures and mouth movements. Key technical challenges that were overcome in the system are developing rhymes, performing with low-latency and dataset censorship. We evaluated several aspects of the system through a survey of videos and sample text output. Analysis of comments showed that the overall perception of the system was positive. The model trained on our hip hop dataset was rated significantly higher than our metal dataset in coherence, rhyme quality, and enjoyment. Participants preferred outputs generated by a given input phrase over outputs generated from unknown keywords, indicating that the system successfully relates its output to its input.
Submitted 19 September, 2020;
originally announced September 2020.
-
Emotional Musical Prosody for the Enhancement of Trust in Robotic Arm Communication
Authors:
Richard Savery,
Lisa Zahray,
Gil Weinberg
Abstract:
As robotic arms become prevalent in industry, it is crucial to improve levels of trust from human collaborators. Low levels of trust in human-robot interaction can reduce overall performance and prevent full robot utilization. We investigated the potential benefits of using emotional musical prosody to allow the robot to respond emotionally to the user's actions. We tested participants' responses to interacting with a virtual robot arm that acted as a decision agent, helping participants select the next number in a sequence. We compared results from three versions of the application in a between-group experiment, where the robot had different emotional reactions to the user's input depending on whether the user agreed with the robot and whether the user's choice was correct. In all versions, the robot reacted with emotional gestures. One version used prosody-based emotional audio phrases selected from our dataset of singer improvisations, the second version used audio consisting of a single pitch randomly assigned to each emotion, and the final version used no audio, only gestures. Our results showed no significant difference in the percentage of times users from each group agreed with the robot, and no difference in users' agreement with the robot after it made a mistake. However, participants also took a trust survey following the interaction, and we found that the reported trust ratings of the musical prosody group were significantly higher than those of both the single-pitch and no-audio groups.
Submitted 18 September, 2020;
originally announced September 2020.
-
Mechatronics-Driven Musical Expressivity for Robotic Percussionists
Authors:
Ning Yang,
Richard Savery,
Raghavasimhan Sankaranarayanan,
Lisa Zahray,
Gil Weinberg
Abstract:
Musical expressivity is an important aspect of musical performance for humans as well as robotic musicians. We present a novel mechatronics-driven implementation of Brushless Direct Current (BLDC) motors in a robotic marimba player, named Shimon, designed to improve speed, dynamic range (loudness), and ultimately perceived musical expressivity in comparison to state-of-the-art robotic percussionist actuators. In an objective test of dynamic range, we find that our implementation provides wider and more consistent dynamic range response in comparison with solenoid-based robotic percussionists. Our implementation also outperforms both solenoid and human marimba players in striking speed. In a subjective listening test measuring musical expressivity, our system performs significantly better than a solenoid-based system and is statistically indistinguishable from human performers.
Submitted 29 July, 2020;
originally announced July 2020.
-
A Survey of Robotics and Emotion: Classifications and Models of Emotional Interaction
Authors:
Richard Savery,
Gil Weinberg
Abstract:
As emotion plays a growing role in robotic research, it is crucial to develop methods to analyze and compare the wide range of approaches. To this end we present a survey of 1427 IEEE and ACM publications that include robotics and emotion. This includes broad categorizations of trends in emotion input analysis, robot emotional expression, studies of emotional interaction and models for internal processing. We then focus on 232 papers that present internal processing of emotion, such as using a human's emotion for better interaction or turning environmental stimuli into an emotional drive for robotic path planning. We conducted constant comparison analysis of the 232 papers and arrived at three broad categorization metrics: emotional intelligence, emotional model and implementation, each including two or three subcategories. The subcategories address the algorithm used, emotional mapping, history, the emotional model, emotional categories, the role of emotion, the purpose of emotion and the platform. Our results show a diverse field of study, largely divided by the role of emotion in the system, either for improved interaction or improved robotic performance. We also present multiple future opportunities for research and describe intrinsic challenges common in all publications.
Submitted 29 July, 2020;
originally announced July 2020.
-
Establishing Human-Robot Trust through Music-Driven Robotic Emotion Prosody and Gesture
Authors:
Richard Savery,
Ryan Rose,
Gil Weinberg
Abstract:
As human-robot collaboration opportunities continue to expand, trust becomes ever more important for full engagement and utilization of robots. Affective trust, built on emotional relationships and interpersonal bonds, is particularly critical as it is more resilient to mistakes and increases the willingness to collaborate. In this paper we present a novel model built on music-driven emotional prosody and gestures that encourages the perception of a robotic identity, designed to avoid the uncanny valley. Symbolic musical phrases were generated and tagged with emotional information by human musicians. These phrases controlled a synthesis engine playing back pre-rendered audio samples generated through interpolation of phonemes and electronic instruments. Gestures were also driven by the symbolic phrases, encoding the emotion from the musical phrase to low degree-of-freedom movements. Through a user study we showed that our system was able to accurately portray a range of emotions to the user. We also showed with a significant result that our non-linguistic audio generation achieved an 8% higher mean of average trust than using a state-of-the-art text-to-speech system.
Submitted 11 January, 2020;
originally announced January 2020.
-
Learning from History: Recreating and Repurposing Sister Harriet Padberg's Computer Composed Canon and Free Fugue
Authors:
Richard Savery,
Benjamin Genchel,
Jason Smith,
Anthony Caulkins,
Molly Jones,
Anna Savery
Abstract:
Harriet Padberg wrote Computer-Composed Canon and Free Fugue as part of her 1964 dissertation in Mathematics and Music at Saint Louis University. This program is one of the earliest examples of text-to-music software and algorithmic composition, which are areas of great interest in the present-day field of music technology. This paper aims to analyze the technological innovation, aesthetic design process, and impact of Harriet Padberg's original 1964 thesis as well as the design of a modern recreation and utilization, in order to gain insight into the nature of revisiting older works. Here, we present our open source recreation of Padberg's program with a modern interface and, through its use as an artistic tool by three composers, show how historical works can be effectively used for new creative purposes in contemporary contexts. Not Even One by Molly Jones draws on the historical and social significance of Harriet Padberg by using her program in a piece about the lack of representation of women judges in composition competitions. Brevity by Anna Savery utilizes the original software design as a composition tool, and The Padberg Piano by Anthony Caulkins uses the melodic generation of the original to create a software instrument.
Submitted 9 July, 2019;
originally announced July 2019.