-
On the Linguistic and Computational Requirements for Creating Face-to-Face Multimodal Human-Machine Interaction
Authors:
João Ranhel,
Cacilda Vilela de Lima
Abstract:
In this study, conversations between humans and avatars are linguistically, organizationally, and structurally analyzed, focusing on what is necessary for creating face-to-face multimodal interfaces for machines. We videorecorded thirty-four human-avatar interactions, performed complete linguistic microanalysis on video excerpts, and marked all the occurrences of multimodal actions and events. Statistical inference was applied to the data, allowing us to comprehend not only how often multimodal actions occur but also how multimodal events are distributed between the speaker (emitter) and the listener (recipient). We also observed the distribution of multimodal occurrences for each modality. The data show evidence that double-loop feedback is established during a face-to-face conversation. This led us to propose that knowledge from Conversation Analysis (CA), cognitive science, and Theory of Mind (ToM), among others, should be incorporated into the body of knowledge used to describe human-machine multimodal interactions. Face-to-face interfaces require a control layer in addition to the multimodal fusion layer. This layer has to organize the flow of conversation, integrate the social context into the interaction, and make plans concerning 'what' and 'how' to progress in the interaction. This higher level is best understood if we incorporate insights from CA and ToM into the interface system.
Submitted 24 November, 2022;
originally announced November 2022.
-
Guidelines for creating man-machine multimodal interfaces
Authors:
João Ranhel,
Cacilda Vilela
Abstract:
Understanding details of human multimodal interaction can elucidate many aspects of the type of information processing machines must perform to interact with humans. This article gives an overview of recent findings from Linguistics regarding the organization of conversation in turns, adjacency pairs, (dis)preferred responses, (self)repairs, etc. We also describe how multiple modalities of signs interfere with each other, modifying meanings. Then, we propose an abstract algorithm that describes how a machine can implement a double-feedback system that can reproduce a human-like face-to-face interaction by processing various signs, such as verbal, prosodic, facial expressions, gestures, etc. Multimodal face-to-face interactions enrich the exchange of information between agents, mainly because these agents are active all the time, emitting and interpreting signs simultaneously. This article is not about an untested new computational model. Instead, it translates findings from Linguistics into guidelines for the design of multimodal man-machine interfaces. The algorithm presented, drawn from Linguistics, is a description of how human face-to-face interactions work. The linguistic findings reported here are the first steps towards the integration of multimodal communication. Some developers involved in interface design keep working on isolated models for interpreting text, grammar, gestures, and facial expressions, neglecting the interweaving of these signs. In contrast, for linguists working on state-of-the-art multimodal integration, interpreting modalities separately leads to an incomplete interpretation, if not to a misunderstanding, of the information. The algorithm proposed herein is intended to guide man-machine interface designers who want to integrate multimodal components into face-to-face interactions that are as close as possible to those performed between humans.
Submitted 6 August, 2020; v1 submitted 29 January, 2019;
originally announced January 2019.
-
A Draft Memory Model on Spiking Neural Assemblies
Authors:
João Ranhel,
João H. Albuquerque,
Bruno P. M. Azevedo,
Nathalia M. Cunha,
Pedro J. Ishimaru
Abstract:
A draft memory model (DM) for neural networks with spike propagation delay (SNNwD) is described. The novelty of this approach is that the DM learns immediately, with stimuli presented once, without synaptic weight changes, and without an external learning algorithm. The basis of this model is to trap spikes within neural loops. In order to construct the DM we developed two functional blocks, also described herein. The decoder block receives input from a single spike source and connects it to one among many outputs. The selector block operates in the opposite direction, receiving many spike sources and connecting one of them to a single output. We carried out proofs of concept by testing the DM on a prime number classification task. This activation-based memory can be used as immediate and short-term memory.
Submitted 26 March, 2016;
originally announced March 2016.
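The decoder and selector blocks described in the abstract above behave, at the routing level, like a demultiplexer and a multiplexer. The following is a minimal sketch of that routing idea only, under stated assumptions: spikes are represented as the integers 1 (spike) and 0 (silence), and the function names `decoder` and `selector` and the `select` index are hypothetical. It does not model neurons, propagation delays, or the spike-trapping loops of the DM, and it is not the authors' implementation.

```python
from typing import List, Sequence

Spike = int  # assumption: 1 means a spike, 0 means silence


def decoder(spike: Spike, select: int, n_outputs: int) -> List[Spike]:
    """Route one input spike source to exactly one of n_outputs lines."""
    if not 0 <= select < n_outputs:
        raise ValueError("select must index one of the outputs")
    outputs = [0] * n_outputs   # all other outputs stay silent
    outputs[select] = spike     # the chosen output carries the input
    return outputs


def selector(spikes: Sequence[Spike], select: int) -> Spike:
    """Route one of many input spike sources to a single output."""
    if not 0 <= select < len(spikes):
        raise ValueError("select must index one of the inputs")
    return spikes[select]


if __name__ == "__main__":
    lines = decoder(1, select=2, n_outputs=4)
    print(lines)                       # one active line out of four
    print(selector(lines, select=2))   # the same spike, recovered
```

Chaining the two functions with the same `select` value passes the input through unchanged, which is the sense in which the selector "operates in the opposite direction" of the decoder.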
-
Simulation of Color Blindness and a Proposal for Using Google Glass as Color-correcting Tool
Authors:
H. M. de Oliveira,
J. Ranhel,
R. B. A. Alves
Abstract:
The human visual color response is driven by specialized cells called cones, which exist in three types, viz. R, G, and B. Software is developed to simulate how color images are displayed for different types of color blindness. Given the color deficiency associated with a user, it generates a preview of the rainbow (in the visible range, from red to violet) and shows, side by side with a color image provided as input, the corresponding colorblind display. The idea is to apply image processing after image acquisition to enable better perception of colors by the colorblind. Examples of pseudo-correction are shown for the case of Protanopia (red blindness). The system is adapted to the screen of an iPad or a cellphone, on which the colorblind user views the camera image, processed to show color detail previously imperceptible to the naked eye. As a prospect, wearable computer glasses could be manufactured to provide corrected image playback. The approach can also provide augmented reality for human vision by adding UV or IR responses as a new feature of Google Glass.
Submitted 12 February, 2015;
originally announced February 2015.