Unsupervised learning of fundamental emotional states via word embeddings

M. Mazzoleni, G. Maroni and F. Previdi

This paper presents a novel approach for the detection of emotional states from textual data. The emotions considered are those known as Ekman’s basic emotions (Anger, Disgust, Sadness, Happiness, Fear, Surprise). The method is completely unsupervised and is based on the concept of word embeddings. This technique represents a single word as a vector, giving a mathematical representation of the word’s semantics. The focus of the work is to assign a percentage of each of the aforementioned emotions to short sentences. The method has been tested on a collection of Twitter messages and on the SemEval 2007 news headlines dataset. After preprocessing, the entire sentence is expressed as the mean of the vectors of the words that compose it. The sentence representation is finally compared with each emotion’s word vector to find the emotion that is most representative of the sentence.
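
The pipeline described above (average the word vectors of a sentence, then compare the result with each emotion's word vector) can be illustrated with a minimal sketch. Several choices here are assumptions made only for illustration and are not taken from the paper: the tiny 3-dimensional `TOY_EMBEDDINGS` table is invented, only three of the six emotions are included, whitespace tokenization stands in for the preprocessing steps, cosine similarity is used as the comparison measure, and percentages are obtained by clipping negative similarities to zero and normalizing them to sum to 100.

```python
import math

# Toy 3-dimensional embeddings, invented purely for illustration.
# A real system would load pretrained word vectors instead.
TOY_EMBEDDINGS = {
    "anger": [1.0, 0.0, 0.0],
    "happiness": [0.0, 1.0, 0.0],
    "sadness": [0.0, 0.0, 1.0],
    "furious": [0.9, 0.1, 0.2],
    "crowd": [0.5, 0.3, 0.3],
    "great": [0.1, 0.9, 0.1],
    "day": [0.2, 0.6, 0.2],
}
# Subset of Ekman's basic emotions, kept small for the example.
EMOTIONS = ["anger", "happiness", "sadness"]


def mean_vector(tokens, embeddings):
    """Average the vectors of the tokens found in the embedding table."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None  # no known word in the sentence
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity (assumed comparison measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def emotion_percentages(sentence, embeddings=TOY_EMBEDDINGS, emotions=EMOTIONS):
    """Return a percentage per emotion for a short sentence."""
    # Lowercasing and whitespace splitting stand in for real preprocessing.
    tokens = sentence.lower().split()
    sentence_vec = mean_vector(tokens, embeddings)
    if sentence_vec is None:
        return {e: 0.0 for e in emotions}
    # Assumed normalization: clip negative similarities, then scale to 100.
    sims = {e: max(0.0, cosine(sentence_vec, embeddings[e])) for e in emotions}
    total = sum(sims.values())
    if total == 0.0:
        return {e: 0.0 for e in emotions}
    return {e: 100.0 * s / total for e, s in sims.items()}


if __name__ == "__main__":
    for text in ("furious crowd", "great day"):
        scores = emotion_percentages(text)
        print(text, {e: round(p, 1) for e, p in scores.items()})
```

With these toy vectors, the sentence "furious crowd" receives its largest share for anger and "great day" for happiness. Words missing from the embedding table are simply skipped, which is one possible way of handling out-of-vocabulary tokens.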