-
SketchColour: Channel Concat Guided DiT-based Sketch-to-Colour Pipeline for 2D Animation
Authors:
Bryan Constantine Sadihin,
Michael Hua Wang,
Shei Pern Chua,
Hang Su
Abstract:
The production of high-quality 2D animation is a highly labor-intensive process, as animators are currently required to draw and color a large number of frames by hand. We present SketchColour, the first sketch-to-colour pipeline for 2D animation built on a diffusion transformer (DiT) backbone. By replacing the conventional U-Net denoiser with a DiT-style architecture and injecting sketch information via lightweight channel-concatenation adapters accompanied by LoRA finetuning, our method natively integrates conditioning without the parameter and memory bloat of a duplicated ControlNet, greatly reducing parameter count and GPU memory usage. Evaluated on the SAKUGA dataset, SketchColour outperforms previous state-of-the-art video colourization methods across all metrics, despite using only half the training data of competing models. Our approach produces temporally coherent animations with minimal artifacts such as colour bleeding or object deformation. Our code is available at: https://bconstantine.github.io/SketchColour .
Submitted 2 July, 2025;
originally announced July 2025.
-
AffectMachine-Pop: A controllable expert system for real-time pop music generation
Authors:
Kat R. Agres,
Adyasha Dash,
Phoebe Chua,
Stefan K. Ehrlich
Abstract:
Music is a powerful medium for influencing listeners' emotional states, and this capacity has driven a surge of research interest in AI-based affective music generation in recent years. Many existing systems, however, are black boxes that are not directly controllable, which makes them less flexible and less adaptive to users. We present AffectMachine-Pop, an expert system capable of generating retro-pop music according to arousal and valence values, which can either be pre-determined or based on a listener's real-time emotion states. To validate the efficacy of the system, we conducted a listening study demonstrating that AffectMachine-Pop is capable of generating affective music at target levels of arousal and valence. The system is tailored for use either as a tool for generating interactive affective music based on user input, or for incorporation into biofeedback or neurofeedback systems to assist users with emotion self-regulation.
Submitted 9 June, 2025;
originally announced June 2025.
-
EmoSign: A Multimodal Dataset for Understanding Emotions in American Sign Language
Authors:
Phoebe Chua,
Cathy Mengying Fang,
Takehiko Ohkawa,
Raja Kushalnagar,
Suranga Nanayakkara,
Pattie Maes
Abstract:
Unlike in spoken languages, where the use of prosodic features to convey emotion is well studied, indicators of emotion in sign language remain poorly understood, creating communication barriers in critical settings. Sign languages present unique challenges as facial expressions and hand movements simultaneously serve both grammatical and emotional functions. To address this gap, we introduce EmoSign, the first sign video dataset containing sentiment and emotion labels for 200 American Sign Language (ASL) videos. We also collect open-ended descriptions of emotion cues. Annotations were done by three Deaf ASL signers with professional interpretation experience. Alongside the annotations, we include baseline models for sentiment and emotion classification. This dataset not only addresses a critical gap in existing sign language research but also establishes a new benchmark for understanding model capabilities in multimodal emotion recognition for sign languages. The dataset is made available at https://huggingface.co/datasets/catfang/emosign.
Submitted 20 May, 2025;
originally announced May 2025.
-
Perspectives on Capturing Emotional Expressiveness in Sign Language
Authors:
Phoebe Chua,
Cathy Mengying Fang,
Yasith Samaradivakara,
Pattie Maes,
Suranga Nanayakkara
Abstract:
Significant advances have been made in our ability to understand and generate emotionally expressive content such as text and speech, yet comparable progress in sign language technologies remains limited. While computational approaches to sign language translation have focused on capturing lexical content, the emotional dimensions of sign language communication remain largely unexplored. Through semi-structured interviews with eight sign language users across Singapore, Sri Lanka and the United States, including both Deaf and hard of hearing (DHH) and hearing signers, we investigate how emotions are expressed and perceived in sign languages. Our findings highlight the role of both manual and non-manual elements in emotional expression, revealing universal patterns as well as individual and cultural variations in how signers communicate emotions. We identify key challenges in capturing emotional nuance for sign language translation, and propose design considerations for developing more emotionally aware sign language technologies. This work contributes to both theoretical understanding of emotional expression in sign language and practical development of interfaces to better serve diverse signing communities.
Submitted 12 May, 2025;
originally announced May 2025.
-
Motion as Emotion: Detecting Affect and Cognitive Load from Free-Hand Gestures in VR
Authors:
Phoebe Chua,
Prasanth Sasikumar,
Yadeesha Weerasinghe,
Suranga Nanayakkara
Abstract:
Affect and cognitive load influence many user behaviors. In this paper, we propose Motion as Emotion, a novel method that utilizes fine differences in hand motion to recognise affect and cognitive load in virtual reality (VR). We conducted a study with 22 participants who used common free-hand gesture interactions to carry out tasks of varying difficulty in VR environments. We find that the affect and cognitive load induced by tasks are associated with significant differences in gesture features such as speed, distance and hand tension. Standard support vector classification (SVC) models could accurately predict two levels (low, high) of valence, arousal and cognitive load from these features. Our results demonstrate the potential of Motion as Emotion as an accurate and reliable method of inferring user affect and cognitive load from free-hand gestures, without needing any additional wearable sensors or modifications to a standard VR headset.
Submitted 19 September, 2024;
originally announced September 2024.
-
Leveraging AI-Generated Emotional Self-Voice to Nudge People towards their Ideal Selves
Authors:
Cathy Mengying Fang,
Phoebe Chua,
Samantha Chan,
Joanne Leong,
Andria Bao,
Pattie Maes
Abstract:
Emotions, shaped by past experiences, significantly influence decision-making and goal pursuit. Traditional cognitive-behavioral techniques for personal development rely on mental imagery to envision ideal selves, but may be less effective for individuals who struggle with visualization. This paper introduces Emotional Self-Voice (ESV), a novel system combining emotionally expressive language models and voice cloning technologies to render customized responses in the user's own voice. We investigate the potential of ESV to nudge individuals towards their ideal selves in a study with 60 participants. Across all three conditions (ESV, text-only, and mental imagination), we observed an increase in resilience, confidence, motivation, and goal commitment, and the ESV condition was perceived as uniquely engaging and personalized. We discuss the implications of designing generated self-voice systems as a personalized behavioral intervention for different scenarios.
Submitted 9 April, 2025; v1 submitted 17 September, 2024;
originally announced September 2024.
-
AffectMachine-Classical: A novel system for generating affective classical music
Authors:
Kat R. Agres,
Adyasha Dash,
Phoebe Chua
Abstract:
This work introduces a new music generation system, called AffectMachine-Classical, that is capable of generating affective classical music in real time. AffectMachine was designed to be incorporated into biofeedback systems (such as brain-computer interfaces) to help users become aware of, and ultimately mediate, their own dynamic affective states. That is, this system was developed for music-based MedTech to support real-time emotion self-regulation in users. We provide an overview of the rule-based, probabilistic system architecture, describing the main aspects of the system and how they are novel. We then present the results of a listener study that was conducted to validate the ability of the system to reliably convey target emotions to listeners. The findings indicate that AffectMachine-Classical is very effective in communicating various levels of Arousal ($R^2 = .96$) to listeners, and is also quite convincing in terms of Valence ($R^2 = .90$). Future work will embed AffectMachine-Classical into biofeedback systems, to leverage the efficacy of the affective music for emotional well-being in listeners.
Submitted 10 April, 2023;
originally announced April 2023.
-
Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses
Authors:
Phoebe Chua,
Dimos Makris,
Dorien Herremans,
Gemma Roig,
Kat Agres
Abstract:
Although media content is increasingly produced, distributed, and consumed in multiple combinations of modalities, how individual modalities contribute to the perceived emotion of a media item remains poorly understood. In this paper we present MusicVideos (MuVi), a novel dataset for affective multimedia content analysis to study how the auditory and visual modalities contribute to the perceived emotion of media. The data were collected by presenting music videos to participants in three conditions: music, visual, and audiovisual. Participants annotated the music videos for valence and arousal over time, as well as the overall emotion conveyed. We present detailed descriptive statistics for key measures in the dataset and the results of feature importance analyses for each condition. Finally, we propose a novel transfer learning architecture to train Predictive models Augmented with Isolated modality Ratings (PAIR) and demonstrate the potential of isolated modality ratings for enhancing multimodal emotion recognition. Our results suggest that perceptions of arousal are influenced primarily by auditory information, while perceptions of valence are more subjective and can be influenced by both visual and auditory information. The dataset is made publicly available.
Submitted 19 February, 2022;
originally announced February 2022.
-
Standard Cell Library Evaluation with Multiple lithography-compliant verification and Improved Synopsys Pin Access Checking Utility
Authors:
Yongfu Li,
Wan Chia Ang,
Chin Hui Lee,
Kok Peng Chua,
Yoong Seang Jonathan Ong,
Chiu Wing Colin Hui
Abstract:
While standard cell layouts are drawn with minimum design rules to maximize the benefit of design area shrinkage, the complicated design rules have caused difficulties with signal routes accessing the pins in standard cell layouts. As a result, it has become a great challenge for physical layout designers to design a standard cell layout that is optimized for area, power, timing, signal integrity, and printability. Multiple design iterations are required to consider pin accessibility during standard cell layout to increase the number of feasible solutions available to the router. In this work, we demonstrate several improvements to the Synopsys PAC methodology, such as reducing the number of cells required for each Synopsys 'testcell' with the same cell abutment condition and increasing the complexity of the pin connection for better pin accessibility evaluation. We also recommend additional constraints to improve the probability of detecting pin accessibility issues. We also integrate other physical verification methods to assess the design rule compliance and the printability of standard cells. We hope that the easy-to-use utility enables layout engineers to perform the verification, simplifying the verification methodology.
Submitted 27 May, 2018;
originally announced May 2018.
-
Multiple-Lithography-Compliant Verification for Standard Cell Library Development Flow
Authors:
Yongfu Li,
Wan Chia Ang,
Chin Hui Lee,
Kok Peng Chua,
Yoong Seang Jonathan Ong,
Chiu Wing Colin Hui
Abstract:
Starting from 22-nm, a standard cell must be designed to be fully lithography-compliant, which includes Design Rule Check, Design-for-Manufacturability and Double-Patterning compliance. It has become a great challenge for physical layout designers to provide a fully lithography-compliant standard cell layout that is optimized for area, power, timing, signal integrity, and yield. This challenge is further exacerbated with abutted single- and multiple-height standard cells. At present, different foundries and library vendors have different approaches for fully lithography-compliant library preparation and validation. To the best of our knowledge, no single tool integrates all types of lithography-compliance checks in a standard cell library validation flow. In this work, we demonstrate multiple lithography-compliant verification for the standard cell library development flow. The validation flow and detailed algorithm implementation are explained to help engineers achieve fully lithography-compliant standard cell libraries. An area-efficient standard cell placement methodology is also discussed to validate the issues arising from standard cell abutment.
Submitted 27 May, 2018;
originally announced May 2018.
-
Constraining the Synopsys Pin Access Checker Utility for Improved Standard Cells Library Verification Flow
Authors:
Yongfu Li,
Chin Hui Lee,
Wan Chia Ang,
Kok Peng Chua,
Yoong Seang Jonathan Ong,
Chiu Wing Colin Hui
Abstract:
While standard cell layouts are drawn with minimum design rules for maximum benefit of design area shrinkage, the complicated design rules begin to cause difficulties with signal routes accessing the pins in standard cell layouts. Multiple design iterations are required to resolve routing issues, thus increasing the runtime and the overall chip area. To optimize chip performance, power, and area (PPA) and improve routability, it is necessary to consider pin accessibility during the standard cell development phase so that each cell is designed to maximize the number of feasible pin-access solutions available to the router. As part of the Synopsys IC Compiler Library Preparation Reference Methodology, the Synopsys Pin Access Checker (PAC) reports DRC violations associated with the standard cell. Based on Synopsys PAC's methodology, we demonstrate several methods to improve the probability of detecting pin accessibility issues, such as reducing the number of cells required for each Synopsys 'testcell', increasing the complexity of the pin connectivity assignment, and recommending router constraints.
Submitted 25 May, 2018;
originally announced May 2018.