-
Emotion-aware Personalized Music Recommendation with a Heterogeneity-aware Deep Bayesian Network
Authors:
Erkang Jing,
Yezheng Liu,
Yidong Chai,
Shuo Yu,
Longshun Liu,
Yuanchun Jiang,
Yang Wang
Abstract:
Music recommender systems play a critical role in music streaming platforms by providing users with music that they are likely to enjoy. Recent studies have shown that user emotions can influence users' preferences for music moods. However, existing emotion-aware music recommender systems (EMRSs) explicitly or implicitly assume that users' actual emotional states expressed through identical emotio…
▽ More
Music recommender systems play a critical role in music streaming platforms by providing users with music that they are likely to enjoy. Recent studies have shown that user emotions can influence users' preferences for music moods. However, existing emotion-aware music recommender systems (EMRSs) explicitly or implicitly assume that users' actual emotional states expressed through identical emotional words are homogeneous. They also assume that users' music mood preferences are homogeneous under the same emotional state. In this article, we propose four types of heterogeneity that an EMRS should account for: emotion heterogeneity across users, emotion heterogeneity within a user, music mood preference heterogeneity across users, and music mood preference heterogeneity within a user. We further propose a Heterogeneity-aware Deep Bayesian Network (HDBN) to model these assumptions. The HDBN mimics a user's decision process of choosing music with four components: personalized prior user emotion distribution modeling, posterior user emotion distribution modeling, user grouping, and Bayesian neural network-based music mood preference prediction. We constructed two datasets, called EmoMusicLJ and EmoMusicLJ-small, to validate our method. Extensive experiments demonstrate that our method significantly outperforms baseline approaches on metrics of HR, Precision, NDCG, and MRR. Ablation studies and case studies further validate the effectiveness of our HDBN. The source code and datasets are available at https://github.com/jingrk/HDBN.
△ Less
Submitted 29 November, 2024; v1 submitted 20 June, 2024;
originally announced June 2024.
-
OVIR-3D: Open-Vocabulary 3D Instance Retrieval Without Training on 3D Data
Authors:
Shiyang Lu,
Haonan Chang,
Eric Pu Jing,
Abdeslam Boularias,
Kostas Bekris
Abstract:
This work presents OVIR-3D, a straightforward yet effective method for open-vocabulary 3D object instance retrieval without using any 3D data for training. Given a language query, the proposed method is able to return a ranked set of 3D object instance segments based on the feature similarity of the instance and the text query. This is achieved by a multi-view fusion of text-aligned 2D region prop…
▽ More
This work presents OVIR-3D, a straightforward yet effective method for open-vocabulary 3D object instance retrieval without using any 3D data for training. Given a language query, the proposed method is able to return a ranked set of 3D object instance segments based on the feature similarity of the instance and the text query. This is achieved by a multi-view fusion of text-aligned 2D region proposals into 3D space, where the 2D region proposal network could leverage 2D datasets, which are more accessible and typically larger than 3D datasets. The proposed fusion process is efficient as it can be performed in real-time for most indoor 3D scenes and does not require additional training in 3D space. Experiments on public datasets and a real robot show the effectiveness of the method and its potential for applications in robot navigation and manipulation.
△ Less
Submitted 6 November, 2023;
originally announced November 2023.
-
Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs
Authors:
Haonan Chang,
Kowndinya Boyalakuntla,
Shiyang Lu,
Siwei Cai,
Eric Jing,
Shreesh Keskar,
Shijie Geng,
Adeeb Abbas,
Lifeng Zhou,
Kostas Bekris,
Abdeslam Boularias
Abstract:
We present an Open-Vocabulary 3D Scene Graph (OVSG), a formal framework for grounding a variety of entities, such as object instances, agents, and regions, with free-form text-based queries. Unlike conventional semantic-based object localization approaches, our system facilitates context-aware entity localization, allowing for queries such as ``pick up a cup on a kitchen table" or ``navigate to a…
▽ More
We present an Open-Vocabulary 3D Scene Graph (OVSG), a formal framework for grounding a variety of entities, such as object instances, agents, and regions, with free-form text-based queries. Unlike conventional semantic-based object localization approaches, our system facilitates context-aware entity localization, allowing for queries such as ``pick up a cup on a kitchen table" or ``navigate to a sofa on which someone is sitting". In contrast to existing research on 3D scene graphs, OVSG supports free-form text input and open-vocabulary querying. Through a series of comparative experiments using the ScanNet dataset and a self-collected dataset, we demonstrate that our proposed approach significantly surpasses the performance of previous semantic-based localization techniques. Moreover, we highlight the practical application of OVSG in real-world robot navigation and manipulation experiments.
△ Less
Submitted 27 September, 2023;
originally announced September 2023.
-
Identifying Introductions in Podcast Episodes from Automatically Generated Transcripts
Authors:
Elise Jing,
Kristiana Schneck,
Dennis Egan,
Scott A. Waterman
Abstract:
As the volume of long-form spoken-word content such as podcasts explodes, many platforms desire to present short, meaningful, and logically coherent segments extracted from the full content. Such segments can be consumed by users to sample content before diving in, as well as used by the platform to promote and recommend content. However, little published work is focused on the segmentation of spo…
▽ More
As the volume of long-form spoken-word content such as podcasts explodes, many platforms desire to present short, meaningful, and logically coherent segments extracted from the full content. Such segments can be consumed by users to sample content before diving in, as well as used by the platform to promote and recommend content. However, little published work is focused on the segmentation of spoken-word content, where the errors (noise) in transcripts generated by automatic speech recognition (ASR) services poses many challenges. Here we build a novel dataset of complete transcriptions of over 400 podcast episodes, in which we label the position of introductions in each episode. These introductions contain information about the episodes' topics, hosts, and guests, providing a valuable summary of the episode content, as it is created by the authors. We further augment our dataset with word substitutions to increase the amount of available training data. We train three Transformer models based on the pre-trained BERT and different augmentation strategies, which achieve significantly better performance compared with a static embedding model, showing that it is possible to capture generalized, larger-scale structural information from noisy, loosely-organized speech data. This is further demonstrated through an analysis of the models' inner architecture. Our methods and dataset can be used to facilitate future work on the structure-based segmentation of spoken-word content.
△ Less
Submitted 9 December, 2021; v1 submitted 13 October, 2021;
originally announced October 2021.
-
Characterizing Partisan Political Narrative Frameworks about COVID-19 on Twitter
Authors:
Elise Jing,
Yong-Yeol Ahn
Abstract:
The COVID-19 pandemic is a global crisis that has been testing every society and exposing the critical role of local politics in crisis response. In the United States, there has been a strong partisan divide between the Democratic and Republican party's narratives about the pandemic which resulted in polarization of individual behaviors and divergent policy adoption across regions. As shown in thi…
▽ More
The COVID-19 pandemic is a global crisis that has been testing every society and exposing the critical role of local politics in crisis response. In the United States, there has been a strong partisan divide between the Democratic and Republican party's narratives about the pandemic which resulted in polarization of individual behaviors and divergent policy adoption across regions. As shown in this case, as well as in most major social issues, strongly polarized narrative frameworks facilitate such narratives. To understand polarization and other social chasms, it is critical to dissect these diverging narratives. Here, taking the Democratic and Republican political social media posts about the pandemic as a case study, we demonstrate that a combination of computational methods can provide useful insights into the different contexts, framing, and characters and relationships that construct their narrative frameworks which individual posts source from. Leveraging a dataset of tweets from elite politicians in the U.S., we found that the Democrats' narrative tends to be more concerned with the pandemic as well as financial and social support, while the Republicans discuss more about other political entities such as China. We then perform an automatic framing analysis to characterize the ways in which they frame their narratives, where we found that the Democrats emphasize the government's role in responding to the pandemic, and the Republicans emphasize the roles of individuals and support for small businesses. Finally, we present a semantic role analysis that uncovers the important characters and relationships in their narratives as well as how they facilitate a membership categorization process. Our findings concretely expose the gaps in the "elusive consensus" between the two parties. Our methodologies may be applied to computationally study narratives in various domains.
△ Less
Submitted 13 October, 2021; v1 submitted 11 March, 2021;
originally announced March 2021.
-
FrameAxis: Characterizing Microframe Bias and Intensity with Word Embedding
Authors:
Haewoon Kwak,
Jisun An,
Elise Jing,
Yong-Yeol Ahn
Abstract:
Framing is a process of emphasizing a certain aspect of an issue over the others, nudging readers or listeners towards different positions on the issue even without making a biased argument. {Here, we propose FrameAxis, a method for characterizing documents by identifying the most relevant semantic axes ("microframes") that are overrepresented in the text using word embedding. Our unsupervised app…
▽ More
Framing is a process of emphasizing a certain aspect of an issue over the others, nudging readers or listeners towards different positions on the issue even without making a biased argument. {Here, we propose FrameAxis, a method for characterizing documents by identifying the most relevant semantic axes ("microframes") that are overrepresented in the text using word embedding. Our unsupervised approach can be readily applied to large datasets because it does not require manual annotations. It can also provide nuanced insights by considering a rich set of semantic axes. FrameAxis is designed to quantitatively tease out two important dimensions of how microframes are used in the text. \textit{Microframe bias} captures how biased the text is on a certain microframe, and \textit{microframe intensity} shows how actively a certain microframe is used. Together, they offer a detailed characterization of the text. We demonstrate that microframes with the highest bias and intensity well align with sentiment, topic, and partisan spectrum by applying FrameAxis to multiple datasets from restaurant reviews to political news.} The existing domain knowledge can be incorporated into FrameAxis {by using custom microframes and by using FrameAxis as an iterative exploratory analysis instrument.} Additionally, we propose methods for explaining the results of FrameAxis at the level of individual words and documents. Our method may accelerate scalable and sophisticated computational analyses of framing across disciplines.
△ Less
Submitted 22 July, 2021; v1 submitted 20 February, 2020;
originally announced February 2020.
-
Sameness Entices, but Novelty Enchants in Fanfiction Online
Authors:
Elise Jing,
Simon DeDeo,
Devin Robert Wright,
Yong-Yeol Ahn
Abstract:
Cultural evolution is driven by how we choose what to consume and share with others. A common belief is that the cultural artifacts that succeed are ones that balance novelty and conventionality. This balance theory suggests that people prefer works that are familiar, but not so familiar as to be boring; novel, but not so novel as to violate the expectations of their genre. We test this idea using…
▽ More
Cultural evolution is driven by how we choose what to consume and share with others. A common belief is that the cultural artifacts that succeed are ones that balance novelty and conventionality. This balance theory suggests that people prefer works that are familiar, but not so familiar as to be boring; novel, but not so novel as to violate the expectations of their genre. We test this idea using a large dataset of fanfiction. We apply a multiple regression model and a generalized additive model to examine how the recognition a work receives varies with its novelty, estimated through a Latent Dirichlet Allocation topic model, in the context of existing works. We find the opposite pattern of what the balance theory predicts$\unicode{x2014}$overall success decline almost monotonically with novelty and exhibits a U-shaped, instead of an inverse U-shaped, curve. This puzzle is resolved by teasing out two competing forces: sameness attracts the mass whereas novelty provides enjoyment. Taken together, even though the balance theory holds in terms of expressed enjoyment, the overall success can show the opposite pattern due to the dominant role of sameness to attract the audience. Under these two forces, cultural evolution may have to work against inertia$\unicode{x2014}$the appetite for consuming the familiar$\unicode{x2014}$and may resemble a punctuated equilibrium, marked by occasional leaps.
△ Less
Submitted 15 September, 2023; v1 submitted 16 April, 2019;
originally announced April 2019.
-
Global labor flow network reveals the hierarchical organization and dynamics of geo-industrial clusters in the world economy
Authors:
Jaehyuk Park,
Ian Wood,
Elise Jing,
Azadeh Nematzadeh,
Souvik Ghosh,
Michael Conover,
Yong-Yeol Ahn
Abstract:
Groups of firms often achieve a competitive advantage through the formation of geo-industrial clusters. Although many exemplary clusters, such as Hollywood or Silicon Valley, have been frequently studied, systematic approaches to identify and analyze the hierarchical structure of the geo-industrial clusters at the global scale are rare. In this work, we use LinkedIn's employment histories of more…
▽ More
Groups of firms often achieve a competitive advantage through the formation of geo-industrial clusters. Although many exemplary clusters, such as Hollywood or Silicon Valley, have been frequently studied, systematic approaches to identify and analyze the hierarchical structure of the geo-industrial clusters at the global scale are rare. In this work, we use LinkedIn's employment histories of more than 500 million users over 25 years to construct a labor flow network of over 4 million firms across the world and apply a recursive network community detection algorithm to reveal the hierarchical structure of geo-industrial clusters. We show that the resulting geo-industrial clusters exhibit a stronger association between the influx of educated-workers and financial performance, compared to existing aggregation units. Furthermore, our additional analysis of the skill sets of educated-workers supplements the relationship between the labor flow of educated-workers and productivity growth. We argue that geo-industrial clusters defined by labor flow provide better insights into the growth and the decline of the economy than other common economic units.
△ Less
Submitted 19 March, 2019; v1 submitted 12 February, 2019;
originally announced February 2019.