-
Error-Tolerant Exact Query Learning of Finite Set Partitions with Same-Cluster Oracle
Authors:
Adela Frances DePavia,
Olga Medrano Martín del Campo,
Erasmo Tani
Abstract:
This paper initiates the study of active learning for exact recovery of partitions exclusively through access to a same-cluster oracle in the presence of bounded adversarial error. We first highlight a novel connection between learning partitions and correlation clustering. Then we use this connection to build a Rényi-Ulam style analytical framework for this problem, and prove upper and lower bounds on its worst-case query complexity. Further, we bound the expected performance of a relevant randomized algorithm. Finally, we study the relationship between adaptivity and query complexity for this problem and related variants.
Submitted 16 June, 2023; v1 submitted 22 May, 2023;
originally announced May 2023.
-
Data Fusion in Neuromarketing: Multimodal Analysis of Biosignals, Lifecycle Stages, Current Advances, Datasets, Trends, and Challenges
Authors:
Mario Quiles Pérez,
Enrique Tomás Martínez Beltrán,
Sergio López Bernal,
Eduardo Horna Prat,
Luis Montesano Del Campo,
Lorenzo Fernández Maimó,
Alberto Huertas Celdrán
Abstract:
The primary goal of any company is to increase its profits by improving both the quality of its products and how they are advertised. In this context, neuromarketing seeks to enhance the promotion of products and generate greater acceptance among potential buyers. Traditionally, neuromarketing studies have relied on a single biosignal to obtain feedback from presented stimuli. However, thanks to new devices and technological advances in this area of knowledge, recent trends indicate a shift towards the fusion of diverse biosignals. An example is the usage of electroencephalography for understanding the impact of an advertisement at the neural level and visual tracking to identify the stimuli that induce such impacts. This emerging pattern determines which biosignals to employ for achieving specific neuromarketing objectives. Furthermore, the fusion of data from multiple sources demands advanced processing methodologies. Despite these complexities, there is a lack of literature that adequately collates and organizes the various data sources and the applied processing techniques for the research objectives pursued. To address these challenges, the current paper conducts a comprehensive analysis of the objectives, biosignals, and data processing techniques employed in neuromarketing research. This study provides both the technical definition and a graphical distribution of the elements under review. Additionally, it presents a categorization based on research objectives and provides an overview of the combinatory methodologies employed. After this, the paper examines primary public datasets designed for neuromarketing research together with others whose main purpose is not neuromarketing but that can be used for this purpose. Ultimately, this work provides a historical perspective on the evolution of techniques across various phases over recent years and enumerates key lessons learned.
Submitted 21 August, 2023; v1 submitted 30 August, 2022;
originally announced September 2022.
-
The MAIEI Learning Community Report
Authors:
Brittany Wills,
Christina Isaicu,
Heather von Stackelberg,
Lujain Ibrahim,
Matthew Hutson,
Mitchel Fleming,
Nanditha Narayanamoorthy,
Samuel Curtis,
Shreyasha Paudel,
Sofia Trejo,
Tiziana Zevallos,
Victoria Martín del Campo,
Wilson Lee
Abstract:
This is a labor of the Learning Community cohort that was convened by MAIEI in Winter 2021 to work through and discuss important research issues in the field of AI ethics from a multidisciplinary lens. The community came together, supported by facilitators from the MAIEI staff, to vigorously debate and explore the nuances of issues like bias, privacy, disinformation, accountability, and more, especially examining them from the perspective of industry, civil society, academia, and government.
The outcome of these discussions is reflected in the report that you are reading now - an exploration of a variety of issues with deep-dive, critical commentary on what has been done, what worked and what didn't, and what remains to be done so that we can meaningfully move forward in addressing the societal challenges posed by the deployment of AI systems.
The chapters titled "Design and Techno-isolationism", "Facebook and the Digital Divide: Perspectives from Myanmar, Mexico, and India", "Future of Work", and "Media & Communications & Ethical Foresight" will hopefully provide you with novel lenses to explore this domain beyond the usual tropes that are covered in the domain of AI ethics.
Submitted 10 November, 2021;
originally announced December 2021.
-
An adaptive 3D virtual learning environment for training software developers in scrum
Authors:
Ezequiel Scott,
Marcelo Campo
Abstract:
Scrum is one of the most used frameworks for agile software development because of its potential improvements in productivity, quality, and client satisfaction. Academia has also focussed on teaching Scrum practices to prepare students to face common software engineering challenges and facilitate their insertion in professional contexts. Furthermore, advances in learning technologies currently offer many virtual learning environments to enhance learning in many ways. Their capability to consider individual learner preferences has led to a shift towards more personalised training approaches, requiring that the environments adapt themselves to the learner. We propose an adaptive approach for training developers in Scrum, including an adaptive virtual learning environment based on Felder's learning style theory. Although still preliminary, our findings show that students who used the environment and received instruction matching their preferences obtained slightly higher learning gains than students who received instruction different from the one they preferred. We also noticed less variability in the learning gains of students who received instruction matching their preferences. The relevance of this work goes beyond the impact on learning gains since it describes how adaptive virtual learning environments can be used in the domain of Software Engineering.
Submitted 9 November, 2021;
originally announced November 2021.
-
Combinatorial and computational investigations of Neighbor-Joining bias
Authors:
Ruth Davidson,
Abraham Martin del Campo
Abstract:
The Neighbor-Joining algorithm is a popular distance-based phylogenetic method that computes a tree metric from a dissimilarity map arising from biological data. Realizing dissimilarity maps as points in Euclidean space, the algorithm partitions the input space into polyhedral regions indexed by the combinatorial type of the trees returned. A full combinatorial description of these regions has not been found yet; different sequences of Neighbor-Joining agglomeration events can produce the same combinatorial tree, therefore associating multiple geometric regions to the same algorithmic output. We resolve this confusion by defining agglomeration orders on trees, leading to a bijection between distinct regions of the output space and weighted Motzkin paths. As a result, we give a formula for the number of polyhedral regions depending only on the number of taxa. We conclude with a computational comparison between these polyhedral regions, to unveil biases introduced in any implementation of the algorithm.
Submitted 16 September, 2020; v1 submitted 18 July, 2020;
originally announced July 2020.
-
Band-limited Soft Actor Critic Model
Authors:
Miguel Campo,
Zhengxing Chen,
Luke Kung,
Kittipat Virochsiri,
Jianyu Wang
Abstract:
Soft Actor Critic (SAC) algorithms show remarkable performance in complex simulated environments. A key element of SAC networks is entropy regularization, which prevents the SAC actor from optimizing against fine grained features, oftentimes transient, of the state-action value function. This results in better sample efficiency during early training. We take this idea one step further by artificially bandlimiting the target critic spatial resolution through the addition of a convolutional filter. We derive the closed form solution in the linear case and show that bandlimiting reduces the interdependency between the low and high frequency components of the state-action value approximation, allowing the critic to learn faster. In experiments, the bandlimited SAC outperformed the classic twin-critic SAC in a number of Gym environments, and displayed more stability in returns. We derive novel insights about SAC by adding a stochastic noise disturbance, a technique that is increasingly being used to learn robust policies that transfer well to the real world counterparts.
Submitted 19 June, 2020;
originally announced June 2020.
-
Convolutional Collaborative Filter Network for Video Based Recommendation Systems
Authors:
Cheng-Kang Hsieh,
Miguel Campo,
Abhinav Taliyan,
Matt Nickens,
Mitkumar Pandya,
JJ Espinoza
Abstract:
This analysis explores the temporal sequencing of objects in a movie trailer. Temporal sequencing of objects in a movie trailer (e.g., a long shot of an object vs intermittent short shots) can convey information about the type of movie, plot of the movie, role of the main characters, and the filmmakers' cinematographic choices. When combined with historical customer data, sequencing analysis can be used to improve predictions of customer behavior. For example, a customer who buys tickets to a new movie may have seen movies in the past that contained similar sequences. To explore object sequencing in movie trailers, we propose a video convolutional network to capture actions and scenes that are predictive of customers' preferences. The model learns the specific nature of sequences for different types of objects (e.g., cars vs faces), and the role of sequences in predicting customers' future behavior. We show how such a temporal-aware model outperforms simple feature pooling methods proposed in our previous works and, importantly, demonstrate the additional model explainability allowed by such a model.
Submitted 22 October, 2018; v1 submitted 18 October, 2018;
originally announced October 2018.
-
Competitive Analysis System for Theatrical Movie Releases Based on Movie Trailer Deep Video Representation
Authors:
Miguel Campo,
Cheng-Kang Hsieh,
Matt Nickens,
JJ Espinoza,
Abhinav Taliyan,
Julie Rieger,
Jean Ho,
Bettina Sherick
Abstract:
Audience discovery is an important activity at major movie studios. Deep models that use convolutional networks to extract frame-by-frame features of a movie trailer and represent it in a form that is suitable for prediction are now possible thanks to the availability of pre-built feature extractors trained on large image datasets. Using these pre-built feature extractors, we are able to process hundreds of publicly available movie trailers, extract frame-by-frame low-level features (e.g., a face, an object, etc.) and create video-level representations. We use the video-level representations to train a hybrid Collaborative Filtering model that combines video features with historical movie attendance records. The trained model not only makes accurate attendance and audience predictions for existing movies, but also successfully profiles new movies six to eight months prior to their release.
Submitted 12 July, 2018;
originally announced July 2018.
-
Collaborative Metric Learning Recommendation System: Application to Theatrical Movie Releases
Authors:
Miguel Campo,
JJ Espinoza,
Julie Rieger,
Abhinav Taliyan
Abstract:
Product recommendation systems are important for major movie studios during the movie greenlight process and as part of machine learning personalization pipelines. Collaborative Filtering (CF) models have proved to be effective at powering recommender systems for online streaming services with explicit customer feedback data. CF models do not perform well in scenarios in which feedback data is not available, in cold start situations like new product launches, and in situations with markedly different customer tiers (e.g., high frequency customers vs. casual customers). Generative natural language models that create useful theme-based representations of an underlying corpus of documents can be used to represent new product descriptions, like new movie plots. When combined with CF, they have been shown to increase performance in cold start situations. Outside of those cases in which explicit customer feedback is available, though, recommender engines must rely on binary purchase data, which materially degrades performance. Fortunately, purchase data can be combined with product descriptions to generate meaningful representations of products and customer trajectories in a convenient product space in which proximity represents similarity. Learning to measure the distance between points in this space can be accomplished with a deep neural network that trains on customer histories and on dense vectorizations of product descriptions. We developed a system based on Collaborative (Deep) Metric Learning (CML) to predict the purchase probabilities of new theatrical releases. We trained and evaluated the model using a large dataset of customer histories, and tested the model for a set of movies that were released outside of the training window. Initial experiments show gains relative to models that do not train on collaborative preferences.
Submitted 28 February, 2018;
originally announced March 2018.
-
Big IoT and social networking data for smart cities: Algorithmic improvements on Big Data Analysis in the context of RADICAL city applications
Authors:
Evangelos Psomakelis,
Fotis Aisopos,
Antonios Litke,
Konstantinos Tserpes,
Magdalini Kardara,
Pablo Martínez Campo
Abstract:
In this paper we present a SOA (Service Oriented Architecture)-based platform, enabling the retrieval and analysis of big datasets stemming from social networking (SN) sites and Internet of Things (IoT) devices, collected by smart city applications and socially-aware data aggregation services. A large set of city applications in the areas of Participating Urbanism, Augmented Reality and Sound-Mapping is being applied throughout participating cities, resulting in sets of millions of user-generated events and online SN reports fed into the RADICAL platform. Moreover, we study the application of data analytics such as sentiment analysis to the combined IoT and SN data saved into an SQL database, further investigating algorithms and configurations to minimize delays in dataset processing and results retrieval.
Submitted 2 July, 2016;
originally announced July 2016.