-
Responsible AI: Gender bias assessment in emotion recognition
Authors:
Artem Domnich,
Gholamreza Anbarjafari
Abstract:
Rapid development of artificial intelligence (AI) systems amplifies many concerns in society. These AI algorithms inherit various biases from humans because of their opaque operational flow, which makes their use increasingly problematic. As a result, researchers have started to address the issue by investigating more deeply in the direction of Responsible and Explainable AI. Among the variety of applications of AI, facial expression recognition might not be the most important one, yet it is considered a valuable part of human-AI interaction. The evolution of facial expression recognition from feature-based methods to deep learning has drastically improved the quality of such algorithms. This research work studies gender bias in deep learning methods for facial expression recognition by training six distinct neural networks and analysing them for the presence of bias according to three definitions of fairness. The main outcomes show which models are gender biased, which are not, and how the gender of a subject affects recognition of their emotions. More biased neural networks show a bigger accuracy gap in emotion recognition between male and female test sets. Furthermore, this trend holds for true positive and false positive rates. In addition, due to the nature of the research, we can observe which types of emotions are better classified for men and which for women. Since the topic of biases in facial expression recognition is not well studied, the scope for continuing this research is extensive and may comprise detailed analysis of state-of-the-art methods as well as targeting other biases.
Submitted 21 March, 2021;
originally announced March 2021.
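The abstract above compares accuracy, true positive rate and false positive rate between male and female test sets. A minimal sketch of how such per-group gaps could be computed follows; the function names, the one-vs-rest treatment of a single emotion class, and the toy labels are all assumptions for illustration and are not taken from the paper, whose three fairness definitions are not named in the abstract.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups, positive):
    """Return {group: (accuracy, tpr, fpr)}, treating `positive` as the target class."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0, "correct": 0, "n": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        c["n"] += 1
        c["correct"] += truth == pred
        if truth == positive:
            c["tp" if pred == positive else "fn"] += 1
        else:
            c["fp" if pred == positive else "tn"] += 1
    rates = {}
    for group, c in counts.items():
        pos = c["tp"] + c["fn"]
        neg = c["fp"] + c["tn"]
        rates[group] = (
            c["correct"] / c["n"],
            c["tp"] / pos if pos else float("nan"),
            c["fp"] / neg if neg else float("nan"),
        )
    return rates


def gaps(rates, group_a, group_b):
    """Absolute differences in (accuracy, TPR, FPR) between two groups."""
    return tuple(abs(a - b) for a, b in zip(rates[group_a], rates[group_b]))


# Hypothetical toy labels, not data from the paper.
y_true = ["happy", "sad", "happy", "sad", "happy", "sad", "happy", "sad"]
y_pred = ["happy", "sad", "happy", "happy", "sad", "sad", "happy", "sad"]
groups = ["f", "f", "f", "f", "m", "m", "m", "m"]
rates = group_rates(y_true, y_pred, groups, positive="happy")
```

In this toy example both groups have the same accuracy, yet their true positive and false positive rates for the "happy" class differ, which is why the rates are reported separately rather than relying on accuracy alone.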
-
Ensemble approach for detection of depression using EEG features
Authors:
Egils Avots,
Klavs Jermakovs,
Maie Bachmann,
Laura Paeske,
Cagri Ozcinar,
Gholamreza Anbarjafari
Abstract:
Depression is a public health issue which severely affects one's well-being and causes negative social and economic effects for society. To raise awareness of these problems, this publication aims to determine whether long-lasting effects of depression can be detected from electroencephalographic (EEG) signals. The article contains an accuracy comparison for SVM, LDA, NB, kNN and D3 binary classifiers which were trained using linear (relative band powers, APV, SASI) and non-linear (HFD, LZC, DFA) EEG features. The age- and gender-matched dataset consisted of 10 healthy subjects and 10 subjects with a depression diagnosis at some point in their lifetime. Several of the proposed feature selection and classifier combinations reached an accuracy of 90%, where all models were evaluated using 10-fold cross-validation and averaged over 100 repetitions with random sample permutations.
Submitted 7 March, 2021;
originally announced March 2021.
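The evaluation protocol in the abstract above (10-fold cross-validation averaged over 100 repetitions with random sample permutations) can be sketched as follows. This is only an illustration of the splitting and averaging logic: the function names, the round-robin fold assignment and the `evaluate` callback are assumptions, and the paper's actual classifiers and features are not reproduced.

```python
import random


def repeated_kfold(n_samples, n_folds=10, n_repeats=100, seed=0):
    """Yield (repeat, train_idx, test_idx) with a fresh random permutation per repeat."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal the shuffled indices round-robin so fold sizes differ by at most one.
        folds = [order[i::n_folds] for i in range(n_folds)]
        for i, test_idx in enumerate(folds):
            train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
            yield repeat, train_idx, test_idx


def mean_accuracy(evaluate, n_samples, **kwargs):
    """Average the score returned by evaluate(train_idx, test_idx) over all folds and repeats."""
    scores = [evaluate(train, test) for _, train, test in repeated_kfold(n_samples, **kwargs)]
    return sum(scores) / len(scores)
```

With 20 subjects and 10 folds, each test fold holds two subjects, so a single fold's accuracy can only be 0%, 50% or 100%; averaging over many random permutations smooths out that coarseness.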
-
Comprehensive Empirical Evaluation of Deep Learning Approaches for Session-based Recommendation in E-Commerce
Authors:
Mohamed Maher,
Perseverance Munga Ngoy,
Aleksandrs Rebriks,
Cagri Ozcinar,
Josue Cuevas,
Rajasekhar Sanagavarapu,
Gholamreza Anbarjafari
Abstract:
Boosting sales of e-commerce services is guaranteed once users find more items matching their interests in a short time. Consequently, recommendation systems have become a crucial part of any successful e-commerce service. Although various recommendation techniques could be used in e-commerce, a considerable amount of attention has been drawn to session-based recommendation systems in recent years. This growing interest is due to the security concerns in collecting personalized user behavior data, especially after the recent general data protection regulations. In this work, we present a comprehensive evaluation of the state-of-the-art deep learning approaches used in session-based recommendation. In session-based recommendation, a recommendation system relies on the sequence of events made by a user within the same session to predict and endorse other items that are more likely to correlate with the user's preferences. Our extensive experiments investigate baseline techniques (e.g., nearest neighbors and pattern mining algorithms) and deep learning approaches (e.g., recurrent neural networks, graph neural networks, and attention-based networks). Our evaluations show that advanced neural-based models and session-based nearest neighbor algorithms outperform the baseline techniques in most of the scenarios. However, we found that these models suffer more in the case of long sessions when there is drift in user interests, and when there is not enough data to model different items correctly during training. Our study suggests that using hybrid models of different approaches combined with baseline algorithms could lead to substantial results in session-based recommendations, depending on dataset characteristics. We also discuss the drawbacks of current session-based recommendation algorithms and further open research directions in this field.
Submitted 17 October, 2020;
originally announced October 2020.
-
ChaLearn Looking at People: IsoGD and ConGD Large-scale RGB-D Gesture Recognition
Authors:
Jun Wan,
Chi Lin,
Longyin Wen,
Yunan Li,
Qiguang Miao,
Sergio Escalera,
Gholamreza Anbarjafari,
Isabelle Guyon,
Guodong Guo,
Stan Z. Li
Abstract:
The ChaLearn large-scale gesture recognition challenge has been run twice in two workshops in conjunction with the International Conference on Pattern Recognition (ICPR) 2016 and International Conference on Computer Vision (ICCV) 2017, attracting more than 200 teams around the world. This challenge has two tracks, focusing on isolated and continuous gesture recognition, respectively. This paper describes the creation of both benchmark datasets and analyzes the advances in large-scale gesture recognition based on these two datasets. We discuss the challenges of collecting large-scale ground-truth annotations of gesture recognition, and provide a detailed analysis of the current state-of-the-art methods for large-scale isolated and continuous gesture recognition based on RGB-D video sequences. In addition to recognition rate and mean Jaccard index (MJI) as evaluation metrics used in our previous challenges, we also introduce the corrected segmentation rate (CSR) metric to evaluate the performance of temporal segmentation for continuous gesture recognition. Furthermore, we propose a bidirectional long short-term memory (Bi-LSTM) baseline method, determining the video division points based on the skeleton points extracted by a convolutional pose machine (CPM). Experiments demonstrate that the proposed Bi-LSTM outperforms the state-of-the-art methods with a relative improvement of 8.1% (from 0.8917 to 0.9639) of CSR.
Submitted 28 July, 2019;
originally announced July 2019.
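The mean Jaccard index (MJI) named in the abstract above measures the overlap between predicted and ground-truth gesture frames. The sketch below is a simplified, frame-label version written for illustration: the per-frame label representation, the `background` marker and the choice to average over the classes present in the ground truth are assumptions, and the challenge's official scoring script may differ in detail.

```python
def sequence_jaccard(gt_frames, pred_frames, background=None):
    """Average per-class frame overlap (intersection over union) for one sequence."""
    if len(gt_frames) != len(pred_frames):
        raise ValueError("ground truth and prediction must cover the same frames")
    classes = {label for label in gt_frames if label != background}
    if not classes:
        return 0.0
    total = 0.0
    for label in classes:
        gt = {t for t, l in enumerate(gt_frames) if l == label}
        pred = {t for t, l in enumerate(pred_frames) if l == label}
        total += len(gt & pred) / len(gt | pred)
    return total / len(classes)


def mean_jaccard(sequences):
    """Mean of sequence_jaccard over (gt_frames, pred_frames) pairs."""
    scores = [sequence_jaccard(gt, pred) for gt, pred in sequences]
    return sum(scores) / len(scores)


# Hypothetical six-frame sequence with two gestures, "a" then "b".
gt = ["a", "a", "a", "b", "b", "b"]
pred = ["a", "a", "b", "b", "b", "b"]
score = mean_jaccard([(gt, pred)])
```

Here the predicted boundary is one frame early, so gesture "a" overlaps 2 of 3 frames and gesture "b" overlaps 3 of 4, giving a sequence score of 17/24.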
-
On the effect of age perception biases for real age regression
Authors:
Julio C. S. Jacques Junior,
Cagri Ozcinar,
Marina Marjanovic,
Xavier Baró,
Gholamreza Anbarjafari,
Sergio Escalera
Abstract:
Automatic age estimation from facial images represents an important task in computer vision. This paper analyses the effect of gender, age, ethnicity, makeup and expression attributes of faces as sources of bias to improve deep apparent age prediction. Following recent works where it is shown that apparent age labels benefit real age estimation, rather than direct real-to-real age regression, our main contribution is the integration, in an end-to-end architecture, of face attributes for apparent age prediction with an additional loss for real age regression. Experimental results on the APPA-REAL dataset indicate that the proposed network successfully takes advantage of the adopted attributes to improve both apparent and real age estimation. Our model outperformed a state-of-the-art architecture proposed to separately address apparent and real age regression. Finally, we present preliminary results and discussion of a proof-of-concept application using the proposed model to regress the apparent age of an individual based on the gender of an external observer.
Submitted 20 February, 2019;
originally announced February 2019.
-
A Study of Language and Classifier-independent Feature Analysis for Vocal Emotion Recognition
Authors:
Fatemeh Noroozi,
Marina Marjanovic,
Angelina Njegus,
Sergio Escalera,
Gholamreza Anbarjafari
Abstract:
Every speech signal carries implicit information about emotions, which can be extracted by speech processing methods. In this paper, we propose an algorithm for extracting features that are independent of the spoken language and the classification method, so as to obtain comparatively good recognition performance on different languages regardless of the employed classification method. The proposed algorithm is composed of three stages. In the first stage, we propose a feature ranking method analyzing the state-of-the-art voice quality features. In the second stage, we propose a method for finding the subset of the common features for each language and classifier. In the third stage, we compare our approach with the recognition rate of the state-of-the-art filter methods. We use three databases with different languages, namely, Polish, Serbian and English. Three different classifiers, namely, nearest neighbour, support vector machine and gradient descent neural network, are also employed. It is shown that our method for selecting the most significant language-independent and method-independent features in many cases outperforms state-of-the-art filter methods.
Submitted 14 November, 2018;
originally announced November 2018.
-
From 2D to 3D Geodesic-based Garment Matching
Authors:
Meysam Madadi,
Egils Avots,
Sergio Escalera,
Jordi Gonzalez,
Xavier Baro,
Gholamreza Anbarjafari
Abstract:
A new approach for 2D to 3D garment retexturing is proposed based on Gaussian mixture models and thin plate splines (TPS). An automatically segmented garment of an individual is matched to a new source garment and rendered, resulting in augmented images in which the target garment has been retextured by using the texture of the source garment. We divide the problem into garment boundary matching based on Gaussian mixture models and then interpolate inner points using surface topology extracted through geodesic paths, which leads to a more realistic result than standard approaches. We evaluated and compared our system quantitatively by mean square error (MSE) and qualitatively using the mean opinion score (MOS), showing the benefits of the proposed methodology on our gathered dataset.
Submitted 21 September, 2018;
originally announced September 2018.
-
Doubly Attentive Transformer Machine Translation
Authors:
Hasan Sait Arslan,
Mark Fishel,
Gholamreza Anbarjafari
Abstract:
In this paper, a doubly attentive transformer machine translation model (DATNMT) is presented in which a doubly-attentive transformer decoder naturally incorporates spatial visual features obtained via pretrained convolutional neural networks, bridging the gap between image captioning and translation. In this framework, the transformer decoder learns to attend to source-language words and parts of an image independently by means of two separate attention components in an Enhanced Multi-Head Attention Layer of the doubly attentive transformer, as it generates words in the target language. We find that the proposed model can effectively exploit not just the scarce multimodal machine translation data, but also large general-domain text-only machine translation corpora, or image-text image captioning corpora. The experimental results show that the proposed doubly-attentive transformer decoder performs better than a single-decoder transformer model, and gives state-of-the-art results in the English-German multimodal machine translation task.
Submitted 30 July, 2018;
originally announced July 2018.
-
3D Scanning: A Comprehensive Survey
Authors:
Morteza Daneshmand,
Ahmed Helmi,
Egils Avots,
Fatemeh Noroozi,
Fatih Alisinanoglu,
Hasan Sait Arslan,
Jelena Gorbova,
Rain Eric Haamer,
Cagri Ozcinar,
Gholamreza Anbarjafari
Abstract:
This paper provides an overview of 3D scanning methodologies and technologies proposed in the existing scientific and industrial literature. Throughout the paper, various types of the related techniques are reviewed, which consist, mainly, of close-range, aerial, structure-from-motion and terrestrial photogrammetry, and mobile, terrestrial and airborne laser scanning, as well as time-of-flight, structured-light and phase-comparison methods, along with comparative and combinational studies, the latter being intended to help make a clearer distinction on the relevance and reliability of the possible choices. Moreover, outlier detection and surface fitting procedures are discussed concisely, which are necessary post-processing stages.
Submitted 23 January, 2018;
originally announced January 2018.
-
Survey on Emotional Body Gesture Recognition
Authors:
Fatemeh Noroozi,
Ciprian Adrian Corneanu,
Dorota Kamińska,
Tomasz Sapiński,
Sergio Escalera,
Gholamreza Anbarjafari
Abstract:
Automatic emotion recognition has become a trending research topic in the past decade. While works based on facial expressions or speech abound, recognizing affect from body gestures remains a less explored topic. We present a new comprehensive survey hoping to boost research in the field. We first introduce emotional body gestures as a component of what is commonly known as "body language" and comment on general aspects such as gender differences and culture dependence. We then define a complete framework for automatic emotional body gesture recognition. We introduce person detection and comment on static and dynamic body pose estimation methods both in RGB and 3D. We then comment on the recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures. We also discuss multi-modal approaches that combine speech or face with body gestures for improved emotion recognition. While pre-processing methodologies (e.g. human detection and pose estimation) are nowadays mature technologies fully developed for robust large-scale analysis, we show that for emotion recognition the quantity of labelled data is scarce, there is no agreement on clearly defined output spaces, and the representations are shallow and largely based on naive geometrical representations.
Submitted 23 January, 2018;
originally announced January 2018.
-
3D Face Reconstruction with Region Based Best Fit Blending Using Mobile Phone for Virtual Reality Based Social Media
Authors:
Gholamreza Anbarjafari,
Rain Eric Haamer,
Iiris Lusi,
Toomas Tikk,
Lembit Valgma
Abstract:
The use of virtual reality (VR) is exponentially increasing and, due to that, many researchers have started to work on developing new VR-based social media. For this purpose it is important that avatars which look like their users can be easily generated by accessible devices, such as a mobile phone. In this paper, we propose a novel method of recreating a 3D human face model from image or video data captured with a phone camera. The method focuses more on model shape than texture in order to make the face recognizable. We detect 68 facial feature points and use them to separate a face into four regions. For each area the best fitting models are found. These are then combined and further morphed in order to restore the original facial proportions. We also present a method of texturing the resulting model, where the aforementioned feature points are used to generate a texture for the resulting model.
Submitted 12 December, 2017;
originally announced January 2018.
-
Automatic Recognition of Facial Displays of Unfelt Emotions
Authors:
Kaustubh Kulkarni,
Ciprian Adrian Corneanu,
Ikechukwu Ofodile,
Sergio Escalera,
Xavier Baro,
Sylwia Hyniewska,
Juri Allik,
Gholamreza Anbarjafari
Abstract:
Humans modify their facial expressions in order to communicate their internal states and sometimes to mislead observers regarding their true emotional states. Evidence in experimental psychology shows that discriminative facial responses are short and subtle. This suggests that such behavior would be easier to distinguish when captured in high resolution at an increased frame rate. We propose SASE-FE, the first dataset of facial expressions that are either congruent or incongruent with underlying emotion states. We show that overall the problem of recognizing whether facial movements are expressions of authentic emotions or not can be successfully addressed by learning spatio-temporal representations of the data. For this purpose, we propose a method that aggregates features along fiducial trajectories in a deeply learnt space. Performance of the proposed model shows that on average it is easier to distinguish among genuine facial expressions of emotion than among unfelt facial expressions of emotion, and that certain emotion pairs such as contempt and disgust are more difficult to distinguish than the rest. Furthermore, the proposed methodology improves state-of-the-art results on the CK+ and OULU-CASIA datasets for video emotion recognition, and achieves competitive results when classifying facial action units on the BP4D dataset.
Submitted 9 January, 2018; v1 submitted 13 July, 2017;
originally announced July 2017.
-
Image Resolution Enhancement by Using Interpolation Followed by Iterative Back Projection
Authors:
Pejman Rasti,
Hasan Demirel,
Gholamreza Anbarjafari
Abstract:
In this paper, we propose a new super resolution technique based on interpolation followed by registration using iterative back projection (IBP). Low resolution images are interpolated and the interpolated images are then registered in order to generate a sharper high resolution image. The proposed technique has been tested on Lena, Elaine, Pepper, and Baboon. The quantitative peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) results, as well as the visual results, show the superiority of the proposed technique over the conventional and state-of-the-art image super resolution techniques. For Lena's image, the PSNR is 6.52 dB higher than that of bicubic interpolation.
Submitted 3 January, 2016;
originally announced January 2016.
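The abstract above reports its gain in terms of peak signal-to-noise ratio (PSNR). For reference, the standard PSNR definition can be sketched as below; the flattened pixel-list input and the default peak value of 255 (8-bit images) are assumptions made for this illustration and say nothing about how the paper's experiments were implemented.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized, flattened sequences of pixel values."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 6.52 dB corresponds to an MSE that is lower by a factor of 10^(6.52/10), which is about 4.5.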
-
HSI based colour image equalization using iterative nth root and nth power
Authors:
Gholamreza Anbarjafari
Abstract:
In this paper, an equalization technique for colour images is introduced. The method is based on the nth root and nth power equalization approach, but with optimization of the mean of the image in different colour channels such as RGB and HSI. The performance of the proposed method has been measured by means of peak signal-to-noise ratio. The proposed algorithm has been compared with conventional histogram equalization, and the visual and quantitative experimental results show that the proposed method outperforms histogram equalization.
Submitted 31 December, 2014;
originally announced January 2015.
-
Face recognition using color local binary pattern from mutually independent color channels
Authors:
Gholamreza Anbarjafari
Abstract:
In this paper, a high performance face recognition system is proposed, based on local binary pattern (LBP) using the probability distribution functions (PDF) of pixels in different mutually independent color channels, which are robust to frontal homogeneous illumination and planar rotation. The illumination of faces is enhanced by using a state-of-the-art technique based on discrete wavelet transform (DWT) and singular value decomposition (SVD). After equalization, face images are segmented by use of local Successive Mean Quantization Transform (SMQT) followed by a skin color based face detection system. Kullback-Leibler Distance (KLD) between the concatenated PDFs of a given face obtained by LBP and the concatenated PDFs of each face in the database is used as a metric in the recognition process. Various decision fusion techniques have been used in order to improve the recognition rate. The proposed system has been tested on the FERET, HP, and Bosphorus face databases. The proposed system is compared with conventional and state-of-the-art techniques. The recognition rate obtained using the FVF approach for the FERET database is 99.78%, compared with 79.60% and 68.80% for conventional gray scale LBP and Principal Component Analysis (PCA) based face recognition techniques, respectively.
Submitted 31 December, 2014;
originally announced January 2015.
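The abstract above combines local binary patterns with a Kullback-Leibler distance between probability distributions. The sketch below illustrates the two generic building blocks only: a basic 8-neighbour LBP code for a 3x3 patch and a KL divergence between two histograms. The neighbour ordering, the `>=` threshold convention and the epsilon smoothing are assumptions for this illustration; the paper's colour-channel handling, illumination enhancement and decision fusion are not reproduced.

```python
import math


def lbp_code(patch):
    """8-bit LBP code for a 3x3 patch given as three rows of intensities."""
    center = patch[1][1]
    # Clockwise neighbours starting at the top-left corner (an arbitrary but fixed order).
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return sum(1 << bit for bit, value in enumerate(neighbours) if value >= center)


def normalise(hist, eps=1e-10):
    """Turn counts into a probability distribution, smoothing empty bins with eps."""
    smoothed = [h + eps for h in hist]
    total = sum(smoothed)
    return [s / total for s in smoothed]


def kl_divergence(p_hist, q_hist):
    """KL divergence D(P || Q) in nats between two histograms with matching bins."""
    p = normalise(p_hist)
    q = normalise(q_hist)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

In a recognition setting of this kind, the histogram of LBP codes from a probe face would be compared against each gallery histogram and the smallest divergence taken as the match. Note that KL divergence is not symmetric, so the order of the two arguments matters.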