Authors:
Abhinay Pandya
and
Mourad Oussalah
Affiliation:
University of Oulu, Finland
Keyword(s):
Sentiment Analysis, Deep Learning, Information Retrieval, Text Mining.
Related
Ontology
Subjects/Areas/Topics:
Artificial Intelligence
;
Computational Intelligence
;
Evolutionary Computing
;
Knowledge Discovery and Information Retrieval
;
Knowledge-Based Systems
;
Machine Learning
;
Mining Text and Semi-Structured Data
;
Soft Computing
;
Symbolic Systems
;
Web Mining
Abstract:
Unsupervised learning of distributed representations (word embeddings) obviates the need for task-specific feature engineering for various NLP applications. However, such representations learned from massive text datasets do not faithfully represent finer semantic information in the feature space required by specific applications. This is owing to the fact that (a) models learning such representations ignore the linguistic structure of the sentences, (b) they fail to capture \textit{polysemous} usages of the words, and (c) they ignore pre-existing semantic information from manually-created ontologies. In this paper, we propose three semantics-based distributed representations of words and phrases as features for message polarity classification: Sentiment-Specific Multi-Word Expressions Embeddings(SSMWE) are sentiment encoded distributed representations of \textit{multi-word expressions (MWEs)}; Sense-Disambiguated Word Embeddings(SDWE) are sense-specific distributed representations
of words; and WordNet embeddings(WNE) are distributed representations of hypernym and hyponym of the correct sense of a given word. We examine the effects of these features incorporated in a convolutional neural network(CNN) model for evaluation on the SemEval benchmarked dataset. Our approach of using these novel features yields 14.24\% improvement in the macro-averaged F1 score on SemEval datasets over existing methods. While we have shown promising results in twitter sentiment classification, we believe that the method is general enough to be applied to many NLP applications where finer semantic analysis is required.
(More)