Multi-theme Sentiment Analysis using Quantified Contextual Valence Shifters

1 Multi-theme Sentiment Analysis using Quantified Contextual Valence Shifters
Hongkun Yu, Jingbo Shang, Meichun Hsu, Malú Castellanos, Jiawei Han
Presented by Jingbo Shang, University of Illinois at Urbana-Champaign
Oct 26, 2016, CIKM 2016

2 Outline
- Observations and Definitions
- Methodology: MTSA
- Performance Study and Experimental Results
- Conclusions and Future Work

3 Observation I - Multi-Theme
- Review examples
- Observation: a sentiment word may express different polarities in different themes.

4 What is a theme?
- Review examples
- "Theme" is a very general concept. It could be:
  - different aspects of a product, e.g., service and environment for restaurants;
  - different categories of review target, e.g., horror movies and romantic movies.

5 Theme - Formal Definition
- The themes in each review $r$ are represented by a vector $\theta_r$, where $\theta_{ri}$ is the weight of theme $i$ in the review $r$.
- We assume such descriptors are given.
[Table: example theme descriptors; column labels: Aspects, Battery, Queue, Screen, Camera, Documents]

6 Observation II - Shifter
- Review examples
- Observation: the presence of contextual valence shifters may change a word's polarity.

7 What is a shifter?
- Review examples
- 3 types
  - Negation: not
  - Intensifier: very
  - Diminisher: slightly

8 Shifter - Formal Definition
- Assumption: shifters are theme-invariant.
  - The sentiment-shifting effect of the shifter $w$ is quantified as $f_w \in \mathbb{R}$.
  - $S_w$ represents the sentiment polarity score of the word $w$.
- Assumption: product rule
  - $s_{\text{shifter},w} = f_{\text{shifter}} \cdot S_w$
- Examples
  - not happy $= f_{\text{not}} \cdot S_{\text{happy}}$
  - very happy $= f_{\text{very}} \cdot S_{\text{happy}}$
  - possibly happy $= f_{\text{possibly}} \cdot S_{\text{happy}}$
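
The product rule above can be tried out with a few lines of Python. This is only a minimal sketch: the shifter effects are the Yelp values listed on the "Example Shifter Effects" slide, while the word scores in `WORD_SCORE` and the helper name `shifted_score` are hypothetical and chosen for illustration.

```python
# Shifter effects f_w: Yelp values from the "Example Shifter Effects" slide.
SHIFTER_EFFECT = {"not": -0.52, "very": 1.96, "slightly": 0.18}

# Hypothetical word polarity scores S_w (assumed for illustration only).
WORD_SCORE = {"happy": 1.0, "disappointed": -1.0}


def shifted_score(shifter: str, word: str) -> float:
    """Product rule: s_{shifter,w} = f_shifter * S_w."""
    return SHIFTER_EFFECT[shifter] * WORD_SCORE[word]


for shifter in ("not", "very", "slightly"):
    print(f"{shifter} happy -> {shifted_score(shifter, 'happy'):+.2f}")
```

With these assumed scores, a negation flips the sign, an intensifier scales the score up, and a diminisher shrinks it toward zero.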

10 Methodology - What is MTSA?
- A data-driven approach
  - Input: a review corpus D, the sentiment label (polarity or score) of each review, and the theme descriptor θ
- A unified word-level sentiment analysis model
  - Multi-theme: theme embeddings and word embeddings capture the different sentiment polarities of the same word in different themes.
  - Shifter: automatically discover the sentiment-changing patterns and quantify their effects.

11 Methodology - Multi-theme
- [Observation] A sentiment word may express different polarities in different themes.
- The sentiment polarity for word $j$ in theme $i$: $s_{ij} = p_i^T q_j$
  - $p_i$: theme $i$'s embedding vector
  - $q_j$: word $j$'s embedding vector
- A document $d$ is a bag of words
  - $W_{dj}$ is the occurrence of the word $j$ in the document $d$
  - Normalizations such as TF-IDF may be applied
- $s_d = \sum_i \sum_j \theta_{di} W_{dj} \, p_i^T q_j$
- Feature-based Matrix Factorization [2]
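
The document score above is a bilinear form, which a short NumPy sketch makes concrete. This is not the authors' implementation: the function name `document_score`, the matrix names `P` and `Q`, the dimensions, and the random embeddings are all assumptions made for illustration.

```python
import numpy as np


def document_score(theta_d, w_d, P, Q):
    """Compute s_d = sum_i sum_j theta_d[i] * w_d[j] * (P[i] . Q[j])."""
    theme_word_scores = P @ Q.T  # s_ij, shape (num_themes, vocab_size)
    return float(theta_d @ theme_word_scores @ w_d)


# Toy sizes and random embeddings (assumed, not learned from data).
rng = np.random.default_rng(0)
num_themes, vocab_size, dim = 3, 5, 4
P = rng.normal(size=(num_themes, dim))   # one row per theme embedding p_i
Q = rng.normal(size=(vocab_size, dim))   # one row per word embedding q_j

theta_d = np.array([0.7, 0.2, 0.1])          # theme weights of document d
w_d = np.array([2.0, 0.0, 1.0, 0.0, 1.0])    # word occurrences in document d

score = document_score(theta_d, w_d, P, Q)
print(score)
```

Writing the double sum as two matrix products avoids explicit loops over themes and words; the result is the same as summing term by term.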

12 Methodology - Shifter
- [Observation] The presence of contextual valence shifters may change a word's polarity.
- Theme-invariant sentiment words
  - The polarities $s_{ij}$ are consistent among almost all themes.
- Learn $f$ based on theme-invariant sentiment words
  - A logistic regression problem
  - Find the context of shifters; mask the sentiments of common sentiment words; infer the effect of shifters

13 Methodology - Shifter
- Example
  - Original
    - "very disappointed in the customer service": s([very, disappointed, service, ...])
    - "I do not love the flavor": s([do, not, love, ...])
  - Masked by shifters
    - "very disappointed in the customer service": s([very, service, ...]), shifted term $f_{\text{very}} \cdot s_{\text{disappointed}}$
    - "I do not love the flavor": s([do, not, ...]), shifted term $f_{\text{not}} \cdot s_{\text{love}}$
  - Learn shifters' effect values: very is an intensifier, not is a negation
- Theme-invariant sentiment words: disappointed (-) and love (+)
- Find the context of shifters (sliding window)
- Infer the effect of shifters (a logistic regression problem)
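
The masking step can be sketched as follows. This is a simplified illustration, not the method from the talk: the lexicons `SHIFTERS` and `SENTIMENT_WORDS`, the function name `mask_shifted_words`, the window size, and the choice to look only at tokens after the shifter are all assumptions.

```python
# Hypothetical lexicons, taken from the two slide examples.
SHIFTERS = {"not", "very"}
SENTIMENT_WORDS = {"disappointed": -1, "love": +1}  # theme-invariant words and signs


def mask_shifted_words(tokens, window=2):
    """Mask sentiment words that follow a shifter within `window` tokens.

    Returns (remaining_tokens, pairs), where pairs lists the
    (shifter, sentiment_word) combinations whose effect is to be learned.
    """
    masked_positions = set()
    pairs = []
    for i, token in enumerate(tokens):
        if token not in SHIFTERS:
            continue
        # Slide a window over the tokens right after the shifter.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            if tokens[j] in SENTIMENT_WORDS and j not in masked_positions:
                masked_positions.add(j)
                pairs.append((token, tokens[j]))
                break
    remaining = [t for k, t in enumerate(tokens) if k not in masked_positions]
    return remaining, pairs


print(mask_shifted_words("very disappointed in the customer service".split()))
print(mask_shifted_words("i do not love the flavor".split()))
```

Each extracted pair, together with the known sign of the sentiment word and the review label, would then serve as one training signal for the shifter's effect value in the logistic regression step.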

14 Methodology - MTSA
- Iterative learning process
  - Fix shifter effects -> learn theme and word embeddings
    - Feature-based Matrix Factorization
  - Fix theme and word embeddings -> learn shifter effects
    - Logistic Regression
- Additional challenges
  - "not very" is not simply "not" combined with "very"
  - "not good" vs. "bad"
- Our solution: phrase mining techniques [1]
  - not_very as a phrase shifter
  - not_good as a sentiment phrase
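
Once phrases such as not_very and not_good are known, they can be treated as single tokens. The sketch below shows only that merging step with a greedy left-to-right match; it does not implement the phrase mining of [1], and the phrase table `PHRASES` and the function name `merge_phrases` are hypothetical.

```python
# Hypothetical phrase table; in the talk, phrases come from phrase mining [1].
PHRASES = {
    ("not", "very"): "not_very",  # phrase shifter
    ("not", "good"): "not_good",  # sentiment phrase
}


def merge_phrases(tokens):
    """Greedily replace known two-word phrases with a single token."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PHRASES:
            merged.append(PHRASES[pair])
            i += 2  # skip both words of the phrase
        else:
            merged.append(tokens[i])
            i += 1
    return merged


print(merge_phrases("the food was not very good".split()))
print(merge_phrases("not good at all".split()))
```

After merging, not_very gets its own effect value instead of inheriting the product of the effects of "not" and "very", which matches the separate entries for "not", "very", and "not very" in the Yelp shifter list.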

16 Experimental Settings
- Dataset statistics
- Theme descriptor
  - Yelp & IMDB: LDA implementation in MALLET [4], 20 topics.
  - RT: a biterm topic model (BTM) [3] for short text, 5 topics.
  - Note: RT reviews are too short for LDA to estimate the posterior topic distributions.

17 Multi-Theme Verification
- Polarities of the same sentiment words in different themes
  - Words: cozy, prepared, cheap, cash, boring, old
[Table: polarity of each word by theme; columns: Cozy, Prepared, Cheap, Cash, Boring, Old; rows: Restaurant, Automotive, Shopping, Drink & Bar, Gym]

18 Shifter Learning Quality
- Human evaluation design
  - Given a review and selected shifter-modified sentiment words, check whether the sentiment after modification is correct.
- Typical error caused by overfitting: "they were actually really good"
  - Bi-gram: actually good =
  - Ours: actually good =
- The intraclass correlation of the 4 human judges is high enough to show agreement.

19 Example Shifter Effects (Yelp)
- Good negation: $f_{\text{not}} < 0.5$
  never: -1.33, not so: -1.00, not even: -0.75, not: -0.52, not very: -0.48, not really: -0.39, none: -0.27, no: -0.22, only: -0.18, not that: -0.13, nothing really:
- Good diminisher: $0.0 < f_{\text{slightly}} < 1.0$
  could: 0.12, reasonably: 0.17, few: 0.18, slightly: 0.18, nothing that: 0.18, felt: 0.22, before: 0.22, not overly: 0.25, would only: 0.25, than: 0.27, somehow: 0.28
- Good intensifier: $f_{\text{very}} > 1.0$
  completely: 2.59, more than: 2.42, absolutely: 2.33, extremely: 2.33, really: 2.25, not only: 2.23, some really: 2.17, far: 2.15, particularly: 2.13, simply: 2.12, too: 2.06, excessively: 2.02, certainly: 2.00, most: 2.00, very: 1.96

20 Explainable Sentiment Analysis

21 Sentiment Classification
- Evaluate binary classification accuracy
- All datasets are close to balanced
- Result: accuracy is not substantially improved, especially on Yelp & IMDB. Why?

22 Sentiment Classification - Discussion
- The instances are ranked by the ratio (number of shifters / number of tokens), from high to low.
- As the ratio increases, shifters make up a larger portion of the review and the gain from modeling shifter effects grows.
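
The ranking described above is straightforward to reproduce. The following is a minimal sketch under stated assumptions: the shifter set `SHIFTERS`, whitespace tokenization, and the function names `shifter_ratio` and `rank_by_shifter_ratio` are hypothetical, not taken from the talk.

```python
# Hypothetical shifter lexicon; the talk learns shifters from data.
SHIFTERS = {"not", "very", "never", "really"}


def shifter_ratio(tokens):
    """Return (number of shifters) / (number of tokens); 0.0 for empty input."""
    if not tokens:
        return 0.0
    return sum(token in SHIFTERS for token in tokens) / len(tokens)


def rank_by_shifter_ratio(reviews):
    """Sort reviews by shifter ratio, from high to low."""
    return sorted(reviews, key=lambda review: shifter_ratio(review.split()), reverse=True)


reviews = [
    "great place",
    "the food was very good indeed",
    "not good",
]
for review in rank_by_shifter_ratio(reviews):
    print(f"{shifter_ratio(review.split()):.2f}  {review}")
```

Evaluating accuracy on the top-ranked slice of such a list is one way to check how the gain from modeling shifters changes with the ratio.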

23 Sentiment Classification - Discussion
- From a statistical perspective
  - Over 93% of reviews contain shifters.
  - The portion of words (serving as features) adjusted in each review is 7.2/87 in the Yelp dataset and 10.5/122.8 in the IMDB dataset.
- From a semantic perspective
  - Long reviews contain many mentions of similar sentiment, e.g., people mention "not happy" and "unhappy" in the same review.
- Conclusion
  - Shifters may not play an important role in long-document classification, but they will be more effective for shorter text or at the sentence level.
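
As a quick worked calculation, and assuming the two figures are the average number of adjusted words over the average number of words per review, the adjusted share is below 9% in both datasets:

```latex
\frac{7.2}{87} \approx 0.083 \;(8.3\%) \quad \text{(Yelp)}, \qquad
\frac{10.5}{122.8} \approx 0.086 \;(8.6\%) \quad \text{(IMDB)}
```

So even though most reviews contain shifters, fewer than one in ten bag-of-words features is affected in a typical review.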

25 Conclusions and Future Work
- Conclusions
  - Discovered shifters with quantified effects help people better understand reviews.
  - Multi-theme classifiers and shifter discovery are beneficial to sentiment analysis.
  - Shifters offer only limited power to boost sentiment classification for long reviews, in accordance with the literature.
- Future work
  - Beyond bag-of-words feature representations
  - Linguistic grammar to distinguish shifters

26 References
- [1] Liu, Jialu, et al. "Mining quality phrases from massive text corpora." Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. ACM.
- [2] Shang, Jingbo, et al. "A Parallel and Efficient Algorithm for Learning to Match." 2014 IEEE International Conference on Data Mining. IEEE.
- [3] Yan, Xiaohui, et al. "A biterm topic model for short texts." Proceedings of the 22nd International Conference on World Wide Web. ACM.
- [4] McCallum, Andrew Kachites. "Mallet: A machine learning for language toolkit." (2002).

27 Q&A
Thanks!

28 Sentiment Classification - Iterative Refinement
