What is Sentiment Analysis?

Sentiment Analysis is a field of Natural Language Processing (NLP) concerned with building systems that can extract opinions from natural language. NLP aims to create pipelines that can understand language the way humans do. Sentiment analysis is one of the most basic problems in NLP and is usually one of the first problems that students face in a Natural Language Processing course.

Why Sentiment Analysis?

From mining opinions in product reviews to forecasting stock prices by studying tweets, sentiment analysis has a very wide range of applications. Because the problem is so intuitive, sentiment analysis also serves as a starting point for many other pipelines in what we call Natural Language Understanding.

From an instructor’s point of view, sentiment analysis contains everything that a Data Scientist working in NLP should be aware of. Sentence processing and all the common models/architectures used in NLP can be covered under the umbrella of Sentiment Analysis.

Types of Sentiment Analysis

Sentiment Analysis is essentially a classification problem. While it covers a wide variety of problems, the most common types can be broadly divided as follows:

  1. Polarity Detection: Determining the polarity of the sentence, that is, positive, negative or neutral. Sometimes the classification is more fine-grained, for example very positive, positive, neutral, negative and very negative.
  2. Emotion Detection: Detecting the emotion of the speaker from the sentence, for example happy, sad or angry.
  3. Intent Detection: Detecting not only what is said in the sentence but also the intent behind it.
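
To make the classification framing concrete, here is a minimal sketch of polarity detection using a hand-written word list. The lexicon, its weights and the score thresholds are assumptions made up for this illustration; they do not come from any published sentiment lexicon, and real systems usually learn such signals from labelled data.

```python
# Toy lexicon: the words and weights are illustrative assumptions,
# not taken from any published sentiment lexicon.
LEXICON = {
    "good": 1, "great": 2, "love": 2, "excellent": 2,
    "bad": -1, "poor": -1, "terrible": -2, "hate": -2,
}


def polarity_score(sentence: str) -> int:
    """Sum the lexicon weights of the words in the sentence."""
    words = sentence.lower().split()
    # Strip trailing punctuation so that "great!" matches "great".
    return sum(LEXICON.get(word.strip(".,!?"), 0) for word in words)


def classify_polarity(sentence: str) -> str:
    """Three-class polarity: positive, negative or neutral."""
    score = polarity_score(sentence)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_fine_grained(sentence: str) -> str:
    """Five-class variant; the thresholds are arbitrary choices for this sketch."""
    score = polarity_score(sentence)
    if score >= 2:
        return "very positive"
    if score == 1:
        return "positive"
    if score == 0:
        return "neutral"
    if score == -1:
        return "negative"
    return "very negative"


print(classify_polarity("I love this phone, it is great!"))
print(classify_fine_grained("The screen is good."))
```

A word list like this ignores negation and context, so "not good" would still be scored as positive. That limitation is one reason the order of words matters in later stages of the pipeline.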

Basic Pipeline

Let us first talk about feature extraction from raw text. Not all of the input provided to a sentiment analysis system is useful. While recent Deep Learning models have encouraged shifting all feature engineering into the models themselves, NLP practitioners still prefer cleaning up the input before passing it through any pipeline.
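
As a rough illustration of what cleaning up the input can look like, the sketch below lowercases the text, removes URLs, user mentions and punctuation, and drops a few very common words. The function name `clean_text`, the regular expressions and the tiny stop-word list are all assumptions chosen for this example; which steps are actually useful depends on the data and the model.

```python
import re
from typing import List

# A tiny stop-word list for illustration only; it is an assumption,
# and real pipelines typically use a much larger list.
STOP_WORDS = {"a", "an", "the", "is", "it", "to", "and", "of", "this"}

URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"@\w+")
NON_LETTER_PATTERN = re.compile(r"[^a-z\s]")


def clean_text(text: str) -> List[str]:
    """Return a list of cleaned, lowercased tokens."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)         # drop links
    text = MENTION_PATTERN.sub(" ", text)     # drop @user mentions
    text = NON_LETTER_PATTERN.sub(" ", text)  # drop digits and punctuation
    return [token for token in text.split() if token not in STOP_WORDS]


print(clean_text("@shop This phone is AMAZING!!! See https://example.com/review"))
```

Note that the stop-word list deliberately leaves out "not": removing negation words would throw away information that sentiment analysis depends on.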

After converting words into numerical features, sentiment analysis becomes similar to a time series problem. This is because the words used in a sentence are related to each other, and the order in which they appear in the sentence matters too. In recent times, LSTM-based Deep Learning models have been highly successful in sentiment analysis.
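
The point about word order can be seen with a small sketch. Below, each word is mapped to an integer id (one simple way of turning words into numbers, assumed here for illustration), and two sentences with opposite meanings are compared. The helper names and the example sentences are made up for this example.

```python
from collections import Counter


def build_vocabulary(sentences):
    """Give each distinct word an integer id in order of first appearance.

    Id 0 is reserved for words that are not in the vocabulary.
    """
    vocab = {}
    for sentence in sentences:
        for word in sentence.split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def encode_sequence(sentence, vocab):
    """Order-preserving encoding: one id per word, in sentence order."""
    return [vocab.get(word, 0) for word in sentence.split()]


def bag_of_words(sentence):
    """Order-free encoding: only how often each word occurs."""
    return Counter(sentence.split())


a = "the food was good not bad"
b = "the food was bad not good"
vocab = build_vocabulary([a, b])

print(encode_sequence(a, vocab))  # [1, 2, 3, 4, 5, 6]
print(encode_sequence(b, vocab))  # [1, 2, 3, 6, 5, 4]
print(bag_of_words(a) == bag_of_words(b))  # True
```

The two sentences contain exactly the same words, so a model that only counts words cannot tell them apart. The id sequences differ, which is the kind of input a sequence model such as an LSTM consumes, usually after the ids are mapped to learned vectors.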

What’s next?

One of the biggest challenges for sentiment analysis is capturing the context in which a sentence is presented, along with its tone. Sarcasm is one of the biggest problems that common sentiment analysis systems face. Improving the ability to understand context is something researchers are currently working on.
