Bag of Words Meets Bags of Popcorn


1 Sentiment Analysis via Natural Language Processing. Tarleton State University. July 16, 2015

2 Data Description

3 Sentiment Score tf-idf NDSI AFINN List

Word          Score
invincible      2
mirthful        3
flops          -2
hypocritical   -2
upset          -2
overlooked     -1
hooligans      -2
welcome         2
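
Scoring a review with the AFINN list amounts to summing the scores of the words it contains. A minimal Python sketch, assuming a toy 'afinn' dict built from the rows above (the real list maps a few thousand words to integer scores between -5 and 5):

afinn = {"invincible": 2, "mirthful": 3, "flops": -2, "hypocritical": -2,
         "upset": -2, "overlooked": -1, "hooligans": -2, "welcome": 2}

def afinn_score(review):
    """Sum the AFINN scores of all scored words in a review."""
    tokens = review.lower().split()
    return sum(afinn.get(tok, 0) for tok in tokens)

print(afinn_score("a welcome surprise, not the upset flops we expected"))  # 2 - 2 - 2 = -2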

4 Sentiment Score tf-idf NDSI [Figure: score probability densities by sentiment] Bayes classifier, AUC = 0.770
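
The reported AUC can be reproduced from per-review scores and true labels alone; a sketch with scikit-learn's roc_auc_score (toy numbers, not the deck's data):

from sklearn.metrics import roc_auc_score

labels = [1, 0, 1, 1, 0, 0]           # true sentiment (1 = positive)
scores = [3, -2, 1, 0, -4, 1]         # per-review AFINN sums, as above
print(roc_auc_score(labels, scores))  # 0.833 on this toy data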

5 Sentiment Score tf-idf NDSI Count up the number of times each term occurs in a review [Table: term-count matrix; rows Review 1-6, columns no, good, war, great, bad, ...]
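
A sketch of building such a term-count matrix with scikit-learn (an assumption; any tokenizer plus counter would do, and get_feature_names_out requires a recent scikit-learn):

from sklearn.feature_extraction.text import CountVectorizer

reviews = ["no good", "great war movie", "bad, just bad"]  # toy reviews
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(reviews)       # rows = reviews, columns = terms
print(vectorizer.get_feature_names_out())   # the term vocabulary
print(X.toarray())                          # counts per review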

6 Sentiment Score tf-idf NDSI Count up the number of times each term occurs in a review, using the AFINN list as the starting vocabulary [term-count table as on slide 5]

8 Sentiment Score tf-idf NDSI Random Forest Classifier, AUC = 0.886
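
A sketch of the classifier-plus-AUC step with scikit-learn; the feature matrix and labels here are random stand-ins, not the deck's review data:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(200, 20))   # stand-in term-count matrix
y = (X[:, 0] > X[:, 1]).astype(int)      # stand-in sentiment labels

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[:150], y[:150])
scores = clf.predict_proba(X[150:])[:, 1]   # P(positive) per held-out review
print(roc_auc_score(y[150:], scores))       # area under the ROC curve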

9-11 Sentiment Score tf-idf NDSI Term Frequency-Inverse Document Frequency
Compare a term's relevance in a document to the inverse of its relevance in a collection of documents.
The more frequently a term occurs in a document, the more relevant it is to that document.
The more frequently a term occurs across a collection of documents, the less relevant it is to each document in the collection.

12-14 Sentiment Score tf-idf NDSI Term Frequency-Inverse Document Frequency

$\mathrm{tf}(t, d) = n(t \in d)$

$\mathrm{idf}(t, D) = \log \frac{|D|}{|\{d \in D : t \in d\}|}$

$\mathrm{tfidf}(t, d, D) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t, D)$
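
The three formulas translate directly into code. A small sketch over toy tokenized documents:

import math

docs = [["good", "movie"], ["bad", "movie"], ["good", "good", "film"]]

def tf(t, d):        # tf(t, d) = n(t in d)
    return d.count(t)

def idf(t, D):       # idf(t, D) = log(|D| / |{d in D : t in d}|)
    return math.log(len(D) / sum(1 for d in D if t in d))

def tfidf(t, d, D):  # tfidf(t, d, D) = tf(t, d) * idf(t, D)
    return tf(t, d) * idf(t, D)

print(tfidf("good", docs[2], docs))  # 2 * log(3/2), about 0.81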

15 Sentiment Score tf-idf NDSI Term Frequency-Inverse Document Frequency, AUC = 0.883

16-17 Sentiment Score tf-idf NDSI Feature Extraction A priori feature extraction tends to perform poorly for simple analyses; it's typically better to learn features from the data themselves.

18 Sentiment Score tf-idf NDSI Term Frequency

Word    Frequency
movie     125,307
film      113,054
one        77,447
like       59,147
just       53,132
good       43,279
...

19 Sentiment Score tf-idf NDSI Difference in Term Frequencies

Word    Freq (Pos)  Freq (Neg)  Difference
movie     18,139      23,668       5,529
bad        1,830       7,089       5,259
great      6,294       2,601       3,693
just       7,098      10,535       3,437
even       4,899       7,604       2,705
worst        246       2,436       2,190

20 Sentiment Score tf-idf NDSI Normalized Difference Sentiment Index

$\mathrm{NDSI}(t) := \frac{|n(t \mid 1) - n(t \mid 0)|}{n(t \mid 1) + n(t \mid 0)}$

where $n(t \mid 1)$ and $n(t \mid 0)$ count the term's occurrences in positive and negative reviews.

21 Sentiment Score tf-idf NDSI Normalized Difference Sentiment Index (with smoothing constant $\alpha$)

$\mathrm{NDSI}(t) := \frac{|(n(t \mid 1) + \alpha) - (n(t \mid 0) + \alpha)|}{(n(t \mid 1) + \alpha) + (n(t \mid 0) + \alpha)}$

22 Sentiment Score tf-idf NDSI Normalized Difference Sentiment Index

$\mathrm{NDSI}(t) := \frac{|n(t \mid 1) - n(t \mid 0)|}{n(t \mid 1) + n(t \mid 0) + 2\alpha}$

24 Sentiment Score tf-idf NDSI Normalized Difference Sentiment Index

Word    Freq (Pos)  Freq (Neg)  Difference  NDSI
worst        246       2,436       2,190     ...
waste         94       1,351       1,257     ...
poorly       ...         ...         ...     ...
lame         ...         ...         ...     ...
awful        159       1,441       1,282     ...
mess         ...         ...         ...     ...
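
The smoothed NDSI is a one-liner given per-class term counts; the value of alpha below is an arbitrary choice, since the deck does not state the one it used:

def ndsi(n_pos, n_neg, alpha=25):
    """Smoothed NDSI from a term's counts in positive and negative reviews."""
    return abs(n_pos - n_neg) / (n_pos + n_neg + 2 * alpha)

print(ndsi(246, 2436))  # 'worst': 2190 / (2682 + 50), about 0.80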

25 Sentiment Score tf-idf NDSI Normalized Difference Sentiment Index, AUC = 0.919; tf-idf, AUC = 0.904

26 Word2vec Doc2vec Word Vectors Vector representation of words: word $w_i \mapsto \vec{v}_i = [v_{i1}, v_{i2}, \ldots, v_{iN}] \in V \subset \mathbb{R}^N$. Relative word meanings are reflected in the vector representations.

27-29 Word2vec Doc2vec Word Vectors
Distributional hypothesis: two words appear in similar contexts iff they share similar meaning.
Context similarity: if two words appear in similar contexts, then their vector representations are similar, i.e. $P(\vec{v}_i \mid c) \approx P(\vec{v}_j \mid c) \implies \vec{v}_i \approx \vec{v}_j$.
Distributional hypothesis + context similarity: if two words share similar meaning, then their vector representations are similar.

30 Word2vec Doc2vec Word Vectors

31 Word2vec Doc2vec One-Word Contexts (Bigrams) Multinomial logistic (softmax) regression:

$P(\vec{v}_j \mid \vec{v}_i) = \frac{e^{\vec{\beta}_j \cdot \vec{v}_i}}{\sum_{k=1}^{|V|} e^{\vec{\beta}_k \cdot \vec{v}_i}}$

Generalizable to multi-word contexts.
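
A numerical sketch of this softmax over a toy vocabulary, with the beta vectors stacked as rows of a matrix:

import numpy as np

def context_probs(v_i, B):
    """P(v_j | v_i) via softmax of the scores beta_j . v_i (rows of B)."""
    scores = B @ v_i
    e = np.exp(scores - scores.max())  # shift for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
B = rng.normal(size=(5, 3))    # 5 vocabulary words, 3-dimensional vectors (toy)
v_i = rng.normal(size=3)
print(context_probs(v_i, B))   # nonnegative, sums to 1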

32-35 Word2vec Doc2vec Word Similarity What do we mean by similar? Cosine similarity:

$\mathrm{sim}(\vec{v}_i, \vec{v}_j) = \frac{\vec{v}_i \cdot \vec{v}_j}{\|\vec{v}_i\| \, \|\vec{v}_j\|}$

In [16]: model.most_similar('physics')
Out[16]: [(u'quantum', ...), (u'laws', ...), (u'scientific', ...), (u'engineering', ...), (u'gravity', ...), (u'theory', ...), (u'mechanics', ...)]
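
Cosine similarity itself is two lines of numpy; gensim's most_similar essentially ranks the whole vocabulary by this quantity:

import numpy as np

def cos_sim(v, w):
    """Cosine of the angle between two word vectors."""
    return v @ w / (np.linalg.norm(v) * np.linalg.norm(w))

v, w = np.array([1.0, 2.0, 0.0]), np.array([2.0, 4.0, 0.0])
print(cos_sim(v, w))  # 1.0: parallel vectors, maximal similarity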

36-38 Word2vec Doc2vec Analogies Relative meanings and word relationships are preserved in the vector representations.

MAN : KING :: WOMAN : ???

$\vec{v}_{king} - \vec{v}_{man} \approx \vec{x} - \vec{v}_{woman}$

39-40 Word2vec Doc2vec MAN : KING :: WOMAN : ???

In [17]: model.most_similar(positive = ['king', 'woman'], negative = ['man'])
Out[17]: [(u'queen', ...), (u'princess', ...), (u'arthur', ...), (u'mistress', ...), (u'france', ...), (u'lion', ...), (u'throne', ...), (u'kong', ...), (u'kingdom', ...), (u'prince', ...)]
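
Roughly what most_similar is doing under the hood: rank every other word by cosine similarity to v_king - v_man + v_woman. A self-contained sketch with hand-made 2-D vectors:

import numpy as np

def cos_sim(v, w):
    return v @ w / (np.linalg.norm(v) * np.linalg.norm(w))

def analogy(vectors, a, b, c):
    """Rank words by cosine similarity to v_b - v_a + v_c, excluding the inputs."""
    target = vectors[b] - vectors[a] + vectors[c]
    sims = {w: cos_sim(target, v) for w, v in vectors.items() if w not in (a, b, c)}
    return sorted(sims, key=sims.get, reverse=True)

vectors = {"man": np.array([1.0, 0.0]), "king": np.array([1.0, 1.0]),
           "woman": np.array([0.0, 0.0]), "queen": np.array([0.0, 1.0])}
print(analogy(vectors, "man", "king", "woman"))  # ['queen']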

41-46 Word2vec Doc2vec Document Vectors
Combine word vectors into one document vector: $f(\{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k\}) = \vec{d}$.
Document vectors live in the same space as word vectors, e.g. bad $\approx$ not good, i.e. $\vec{v}_{bad} \approx \vec{d}_{not\ good}$.
Syntax trees
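
Averaging is the simplest choice of f (doc2vec instead learns the document vector jointly with the word vectors); a sketch:

import numpy as np

def doc_vector(tokens, vectors):
    """One simple f: average the word vectors of the document's tokens."""
    vs = [vectors[t] for t in tokens if t in vectors]
    return np.mean(vs, axis=0)

vectors = {"good": np.array([1.0, 1.0]), "bad": np.array([-1.0, -1.0]),
           "movie": np.array([0.0, 1.0])}
print(doc_vector(["good", "movie"], vectors))  # [0.5 1. ]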

47 Word2vec Doc2vec Syntax Trees

[S [IP [IP [NP All models] [VP are wrong,]] but [IP [NP some models] [VP are useful.]]]]
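
The same tree in executable form, using NLTK's Tree (assumes nltk is installed; labels as on the slide):

from nltk import Tree

s = "(S (IP (IP (NP All models) (VP are wrong)) but (IP (NP some models) (VP are useful))))"
Tree.fromstring(s).pretty_print()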

48 Word2vec Doc2vec References
Bag of Words Meets Bags of Popcorn. Kaggle.
AFINN list. F. Nielsen. The Technical University of Denmark. details.php?id=6010
Tf-idf Weighting. C. Manning, P. Raghavan, H. Schütze. The Stanford NLP Group.
Deep Learning for NLP (Without Magic). R. Socher, Y. Bengio, C. Manning. The Stanford NLP Group.
word2vec Parameter Learning Explained. X. Rong. arXiv:1411.2738v1.

Thank you!
