-
Text-based automatic personality prediction: A bibliographic review
Authors:
Ali-Reza Feizi-Derakhshi,
Mohammad-Reza Feizi-Derakhshi,
Majid Ramezani,
Narjes Nikzad-Khasmakhi,
Meysam Asgari-Chenaghlu,
Taymaz Akan,
Mehrdad Ranjbar-Khadivi,
Elnaz Zafarni-Moattar,
Zoleikha Jahanbakhsh-Naghadeh
Abstract:
Personality detection is an old topic in psychology, and Automatic Personality Prediction (or Perception) (APP) is the automated (computational) forecasting of personality from different types of human-generated or exchanged content (such as text, speech, image, video). The principal objective of this study is to offer a shallow (overall) review of natural language processing approaches to APP since 2010. With the advent of deep learning, followed by transfer learning and pre-trained models in NLP, APP has become a hot research topic; in this review, methods are therefore categorized into three groups: pre-trained-independent, pre-trained-model-based, and multimodal approaches. Also, to achieve a comprehensive comparison, reported results are presented by dataset.
Submitted 5 September, 2022; v1 submitted 4 October, 2021;
originally announced October 2021.
-
Phraseformer: Multimodal Key-phrase Extraction using Transformer and Graph Embedding
Authors:
Narjes Nikzad-Khasmakhi,
Mohammad-Reza Feizi-Derakhshi,
Meysam Asgari-Chenaghlu,
Mohammad-Ali Balafar,
Ali-Reza Feizi-Derakhshi,
Taymaz Rahkar-Farshi,
Majid Ramezani,
Zoleikha Jahanbakhsh-Nagadeh,
Elnaz Zafarani-Moattar,
Mehrdad Ranjbar-Khadivi
Abstract:
Background: Keyword extraction is a popular research topic in natural language processing. Keywords are terms that describe the most relevant information in a document. The main problem researchers face is how to efficiently and accurately extract the core keywords from a document. Although previous keyword extraction approaches have used text and graph features, there is a lack of models that can properly learn and combine these features.
Methods: In this paper, we develop a multimodal key-phrase extraction approach, namely Phraseformer, using transformer and graph embedding techniques. In Phraseformer, each keyword candidate is represented by a vector that is the concatenation of the text and structure learning representations. Phraseformer takes advantage of recent research such as BERT and ExEm to preserve both representations. Phraseformer also treats key-phrase extraction as a sequence labeling problem solved as a classification task.
Results: We analyze the performance of Phraseformer on three datasets, Inspec, SemEval 2010, and SemEval 2017, using F1-score. We also investigate the performance of different classifiers with Phraseformer on the Inspec dataset. Experimental results demonstrate the effectiveness of Phraseformer on all three datasets. Additionally, the Random Forest classifier achieves the highest F1-score among all classifiers.
Conclusions: Because the combination of BERT and ExEm is more meaningful and better represents the semantics of words, Phraseformer significantly outperforms single-modality methods.
Submitted 9 June, 2021;
originally announced June 2021.
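The two ideas in the Phraseformer Methods paragraph, concatenating a text vector with a graph vector for each candidate and reading key-phrases off a sequence of labels, can be illustrated with a minimal sketch. The vectors below are toy values rather than BERT or ExEm outputs, the BIO tag scheme is an assumption (the abstract says only "sequence labeling"), and both function names are hypothetical.

```python
from typing import List, Sequence


def concat_representation(text_vec: Sequence[float],
                          graph_vec: Sequence[float]) -> List[float]:
    """Join a text embedding and a graph embedding into one feature vector."""
    return list(text_vec) + list(graph_vec)


def decode_bio(tokens: Sequence[str], tags: Sequence[str]) -> List[str]:
    """Collect phrases from BIO tags: B starts a phrase, I continues it, O ends it."""
    phrases: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Toy values standing in for a 3-d text vector and a 2-d graph vector.
features = concat_representation([0.1, 0.2, 0.3], [0.9, 0.8])

# Hypothetical per-token labels, as a classifier over the features might emit.
tokens = ["graph", "embedding", "improves", "keyword", "extraction"]
tags = ["B", "I", "O", "B", "I"]
phrases = decode_bio(tokens, tags)
```

In a full pipeline the per-token labels would come from a classifier trained on the concatenated vectors; here they are hard-coded to keep the sketch self-contained.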
-
Covid-Transformer: Detecting COVID-19 Trending Topics on Twitter Using Universal Sentence Encoder
Authors:
Meysam Asgari-Chenaghlu,
Narjes Nikzad-Khasmakhi,
Shervin Minaee
Abstract:
The novel coronavirus disease (also known as COVID-19) has led to a pandemic, impacting more than 200 countries across the globe. With its global impact, COVID-19 has become a major concern of people almost everywhere, and a large number of tweets about COVID-19-related topics are therefore coming from every corner of the world. In this work, we analyze these tweets and detect the trending topics and major concerns of people on Twitter, which can help us better understand the situation and devise better plans. More specifically, we propose a model based on the Universal Sentence Encoder to detect the main topics of tweets in recent months. We use the Universal Sentence Encoder to derive the semantic representation and similarity of tweets. We then feed the sentence embeddings and their similarities to the K-means clustering algorithm to group semantically similar tweets. After that, a summary of each cluster is obtained using a deep-learning-based text summarization algorithm, which can uncover the underlying topics of each cluster. Through experimental results, we show that our model can detect very informative topics by processing a large number of tweets at the sentence level (which can preserve the overall meaning of the tweets). Since this framework places no restriction on the data distribution, it can be used to detect trending topics on any other social media platform and in contexts other than COVID-19. Experimental results show the superiority of our proposed approach over other baselines, including TF-IDF and latent Dirichlet allocation (LDA).
Submitted 19 September, 2020; v1 submitted 8 September, 2020;
originally announced September 2020.
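The clustering step of the Covid-Transformer abstract (embed each tweet, then group the embeddings with K-means) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the 2-d vectors are toy stand-ins for sentence embeddings, the initialization (first k points) is chosen only for determinism, and the function names are hypothetical. Vectors are normalized first because, on unit vectors, Euclidean distance ranks neighbors in the same order as cosine similarity.

```python
import math
from typing import List, Sequence


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], k: int, iterations: int = 20) -> List[int]:
    """Lloyd's algorithm; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy embeddings: tweets 0 and 2 point one way, tweets 1 and 3 another.
embeddings = [normalize(v) for v in
              [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9]]]
labels = kmeans(embeddings, k=2)
```

With real data, the embeddings would come from a sentence encoder and each resulting cluster would then be passed to a summarizer, as the abstract describes.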
-
BERTERS: Multimodal Representation Learning for Expert Recommendation System with Transformer
Authors:
N. Nikzad-Khasmakhi,
M. A. Balafar,
M. Reza Feizi-Derakhshi,
Cina Motamed
Abstract:
The objective of an expert recommendation system is to trace a set of candidates' expertise and preferences, recognize their expertise patterns, and identify experts. In this paper, we introduce a multimodal classification approach for expert recommendation systems (BERTERS). In our proposed system, the modalities are derived from text (articles published by candidates) and graph (their co-author connections) information. BERTERS converts text into a vector using Bidirectional Encoder Representations from Transformers (BERT). In addition, a graph representation technique called ExEm is used to extract the features of candidates from the co-author network. The final representation of a candidate is the concatenation of these vectors and other features, and a classifier is then built on this concatenation. This multimodal approach can be used in both the academic community and community question answering. To verify the effectiveness of BERTERS, we analyze its performance on multi-label classification and visualization tasks.
Submitted 30 June, 2020;
originally announced July 2020.
-
Multimodal price prediction
Authors:
Aidin Zehtab-Salmasi,
Ali-Reza Feizi-Derakhshi,
Narjes Nikzad-Khasmakhi,
Meysam Asgari-Chenaghlu,
Saeideh Nabipour
Abstract:
Price prediction is an example of a forecasting task in data science: it analyzes data and predicts the cost of new products. The goal of this research is to develop an approach for predicting the price of a cellphone based on its specifications. To this end, five deep learning models are proposed to predict the price range of a cellphone: one unimodal and four multimodal approaches. The multimodal methods predict prices based on the graphical and non-graphical features of cellphones that have an important effect on their valuation. To evaluate the efficiency of the proposed methods, a cellphone dataset has been gathered from GSMArena. The experimental results show an F1-score of 88.3%, which confirms that multimodal learning leads to more accurate predictions than state-of-the-art techniques.
Submitted 2 April, 2021; v1 submitted 9 July, 2020;
originally announced July 2020.
-
Automatic Personality Prediction; an Enhanced Method Using Ensemble Modeling
Authors:
Majid Ramezani,
Mohammad-Reza Feizi-Derakhshi,
Mohammad-Ali Balafar,
Meysam Asgari-Chenaghlu,
Ali-Reza Feizi-Derakhshi,
Narjes Nikzad-Khasmakhi,
Mehrdad Ranjbar-Khadivi,
Zoleikha Jahanbakhsh-Nagadeh,
Elnaz Zafarani-Moattar,
Taymaz Rahkar-Farshi
Abstract:
Human personality is significantly represented by the words a person uses in speech or writing. As a consequence of the spread of information infrastructures (specifically the Internet and social media), human communication has shifted notably away from face-to-face communication. Generally, Automatic Personality Prediction (or Perception) (APP) is the automated forecasting of personality from different types of human-generated or exchanged content (such as text, speech, image, and video). The major objective of this study is to enhance the accuracy of APP from text. To this end, we suggest five new APP methods: term frequency vector-based, ontology-based, enriched ontology-based, latent semantic analysis (LSA)-based, and deep learning-based (BiLSTM) methods. These methods serve as base models that complement one another to enhance APP accuracy through ensemble modeling (stacking), with a hierarchical attention network (HAN) as the meta-model. The results show that ensemble modeling enhances the accuracy of APP.
Submitted 8 June, 2022; v1 submitted 9 July, 2020;
originally announced July 2020.
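Stacking, as named in the abstract above, means training a meta-model on the outputs of the base models. The sketch below illustrates only that mechanism: it swaps the paper's hierarchical attention network for a simple perceptron, uses three hypothetical base-model scores per text instead of five, and all numbers are invented toy values. In practice the base-model scores fed to the meta-model are usually produced on held-out folds to avoid leakage.

```python
from typing import List, Sequence, Tuple


def train_perceptron(features: Sequence[Sequence[float]],
                     labels: Sequence[int],
                     epochs: int = 100) -> Tuple[List[float], float]:
    """Fit a perceptron meta-model on stacked base-model scores."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            error = y - (1 if score > 0 else 0)
            if error:
                # Standard perceptron update, applied only on mistakes.
                weights = [w + error * v for w, v in zip(weights, x)]
                bias += error
    return weights, bias


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Return 1 if the weighted base-model scores exceed the threshold."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score > 0 else 0


# Each row: scores from three hypothetical base models for one text
# (probability that the author is high on some trait). Toy values.
meta_features = [
    [0.9, 0.8, 0.7],
    [0.2, 0.1, 0.3],
    [0.8, 0.6, 0.9],
    [0.1, 0.3, 0.2],
]
labels = [1, 0, 1, 0]

weights, bias = train_perceptron(meta_features, labels)
predictions = [predict(weights, bias, row) for row in meta_features]
```

The design point is that the meta-model never sees the raw text, only the base models' outputs, so it learns how much to trust each base model.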
-
A Model to Measure the Spread Power of Rumors
Authors:
Zoleikha Jahanbakhsh-Nagadeh,
Mohammad-Reza Feizi-Derakhshi,
Majid Ramezani,
Taymaz Akan,
Meysam Asgari-Chenaghlu,
Narjes Nikzad-Khasmakhi,
Ali-Reza Feizi-Derakhshi,
Mehrdad Ranjbar-Khadivi,
Elnaz Zafarani-Moattar,
Mohammad-Ali Balafar
Abstract:
With technologies that have democratized the production and reproduction of information, a significant portion of the posts exchanged daily on social media is contaminated by rumors. Despite extensive research on rumor detection and verification, the problem of calculating the spread power of rumors has not yet been considered. To address this research gap, the present study seeks a model to calculate the Spread Power of Rumor (SPR) as a function of content-based features in two categories: False Rumor (FR) and True Rumor (TR). For this purpose, the theory of Allport and Postman is adopted, which claims that importance and ambiguity are the key variables in rumor-mongering and the power of rumor. In total, 42 content features in two categories, "importance" (28 features) and "ambiguity" (14 features), are introduced to compute SPR. The proposed model is evaluated on two datasets, Twitter and Telegram. The results showed that (i) the spread power of False Rumor documents is rarely greater than that of True Rumors; (ii) there is a significant difference between the SPR means of the two groups, False Rumor and True Rumor; and (iii) SPR as a criterion can have a positive impact on distinguishing False Rumors from True Rumors.
Submitted 17 June, 2022; v1 submitted 18 February, 2020;
originally announced February 2020.
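Allport and Postman's classic formulation treats rumor strength as roughly the product of importance and ambiguity (R ≈ i × a). A minimal sketch of a score built in that spirit follows. It is an illustration only: averaging the features within each category and multiplying the two averages is an assumption made here, not the formula of the paper, and the feature values and the function name are hypothetical.

```python
from statistics import mean
from typing import Sequence


def spread_power(importance_features: Sequence[float],
                 ambiguity_features: Sequence[float]) -> float:
    """Toy rumor score: mean importance times mean ambiguity (assumed aggregation)."""
    importance = mean(importance_features)
    ambiguity = mean(ambiguity_features)
    return importance * ambiguity


# Two toy importance scores (mean 0.75) and two toy ambiguity scores (mean 0.5).
spr = spread_power([0.5, 1.0], [0.5, 0.5])

# A multiplicative form means a rumor with no ambiguity has no spread power,
# however important its topic.
no_ambiguity = spread_power([1.0, 1.0], [0.0, 0.0])
```

The multiplicative form is the notable design choice: either factor at zero drives the score to zero, which an additive combination would not do.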
-
ExEm: Expert Embedding using dominating set theory with deep learning approaches
Authors:
N. Nikzad-Khasmakhi,
M. A. Balafar,
M. Reza Feizi-Derakhshi,
Cina Motamed
Abstract:
A collaborative network is a social network composed of experts who cooperate with each other to fulfill a specific goal. Analyzing this network yields meaningful information about the expertise of these experts and their subject areas. To perform the analysis, graph embedding techniques have emerged as an effective and promising tool. Graph embedding attempts to represent graph nodes as low-dimensional vectors. In this paper, we propose a graph embedding method, called ExEm, that uses dominating-set theory and deep learning approaches to capture node representations. ExEm finds the dominating nodes of the collaborative network and constructs intelligent random walks that comprise at least two dominating nodes. One dominating node should appear at the beginning of each sampled path to characterize the local neighborhood, while the second dominating node reflects global structure information. To learn the node embeddings, ExEm exploits three embedding methods: Word2vec, fastText, and the concatenation of the two. The final result is a set of low-dimensional expert vectors, called expert embeddings. The extracted expert embeddings can be used in many applications. To extend these embeddings to expert recommendation systems, we introduce a novel strategy that uses expert vectors to calculate experts' scores and recommend experts. Finally, we conduct extensive experiments to validate the effectiveness of ExEm by assessing its performance on multi-label classification, link prediction, and recommendation tasks, using common datasets and data we collected by crawling a large number of Scopus author profiles. The experiments show that ExEm outperforms the baselines, especially in dense networks.
Submitted 22 January, 2021; v1 submitted 16 January, 2020;
originally announced January 2020.
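The walk-sampling idea in the ExEm abstract (find dominating nodes, then sample random walks that start at one dominating node and contain at least two) can be sketched as follows. This is a minimal illustration under stated assumptions: the greedy heuristic is a common approximation for dominating sets and not necessarily the one the authors use, rejection sampling is just one simple way to enforce the two-node condition, the graph is a toy example with no isolated nodes, and all names are hypothetical.

```python
import random
from typing import Dict, List, Optional, Set

Graph = Dict[str, List[str]]


def greedy_dominating_set(graph: Graph) -> Set[str]:
    """Greedy approximation: repeatedly pick the node covering most uncovered nodes."""
    uncovered = set(graph)
    dominating: Set[str] = set()
    while uncovered:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(graph),
                   key=lambda n: len(({n} | set(graph[n])) & uncovered))
        dominating.add(best)
        uncovered -= {best} | set(graph[best])
    return dominating


def sample_walk(graph: Graph, dominating: Set[str], start: str, length: int,
                rng: random.Random, max_tries: int = 100) -> Optional[List[str]]:
    """Sample a fixed-length walk from `start`; keep it only if it visits
    at least two distinct dominating nodes. Assumes every node has a neighbor."""
    for _ in range(max_tries):
        walk = [start]
        while len(walk) < length:
            walk.append(rng.choice(graph[walk[-1]]))
        if len(set(walk) & dominating) >= 2:
            return walk
    return None


# Toy co-author graph as adjacency lists (undirected).
graph: Graph = {
    "a": ["b", "c"],
    "b": ["a"],
    "c": ["a", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
dominating = greedy_dominating_set(graph)
walk = sample_walk(graph, dominating, start="a", length=4, rng=random.Random(0))
```

Walks sampled this way would then be treated as sentences and passed to a Word2vec-style model to learn the node vectors, as the abstract describes.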