RECSM Summer School: Facebook + Topic Models. github.com/pablobarbera/big-data-upf
RECSM Summer School: Facebook + Topic Models
Pablo Barberá
School of International Relations, University of Southern California
pablobarbera.com / Networked Democracy Lab
Course website: github.com/pablobarbera/big-data-upf
Collecting Facebook data
Facebook only allows access to public page data through the Graph API:
1. Posts on public pages
2. Likes, reactions, comments, replies...
Some public user data (gender, location) was available through previous versions of the API (not anymore)
Access to other (anonymized) data used in published studies requires permission from Facebook
R library: Rfacebook
Overview of text as data methods
[Figure: Fig. 1 in Grimmer and Stewart (2013), a taxonomy of text-as-data methods, annotated in the slides with examples: entity recognition (events, quotes, locations, names...), supervised classification (cosine similarity, Naive Bayes, and other ML methods), unsupervised classification (mixture models), and models with covariates (sLDA, STM)]
Latent Dirichlet allocation (LDA)
Topic models are powerful tools for exploring large data sets and for making inferences about the content of documents
[Figure: example from Sontag and Roy (NYU, Cambridge), "Complexity of Inference in Latent Dirichlet Allocation": documents as mixtures of topics such as politics (president, obama, washington), religion (hindu, judaism, ethics, buddhism), and sports (baseball, soccer, basketball, football); for a new document, inference recovers its distribution over topics, e.g. weather .50, finance .49, sports .01]
Many applications in information retrieval, document summarization, and classification
Almost all uses of topic models (e.g., for unsupervised learning, information retrieval, classification) require probabilistic inference: what is this document about?
LDA is one of the simplest and most widely used topic models
Latent Dirichlet Allocation
Document = random mixture over latent topics
Topic = distribution over n-grams
Probabilistic model with 3 steps:
1. Choose θ_i ∼ Dirichlet(α)
2. Choose β_k ∼ Dirichlet(δ)
3. For each word m in document i:
   Choose a topic z_im ∼ Multinomial(θ_i)
   Choose a word w_im ∼ Multinomial(β_k), with k = z_im
where:
α = parameter of the Dirichlet prior on the distribution of topics over documents
θ_i = topic distribution for document i
δ = parameter of the Dirichlet prior on the distribution of words over topics
β_k = word distribution for topic k
Latent Dirichlet Allocation
Key parameters:
1. θ = matrix of dimensions N documents by K topics, where θ_ik is the probability that document i belongs to topic k (each row sums to 1); e.g. assuming K = 5:
[Table: example θ matrix, rows Document 1, Document 2, ..., Document N, columns T1-T5]
2. β = matrix of dimensions K topics by M words, where β_km is the probability that word m belongs to topic k (each row sums to 1); e.g. assuming M = 6:
[Table: example β matrix, rows Topic 1, Topic 2, ..., Topic K, columns W1-W6]
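The three-step generative process above can be simulated directly. The sketch below is a minimal illustration in plain Python (function names and hyperparameter values are illustrative, not from the slides): it draws each Dirichlet variate by normalizing Gamma draws, then samples a topic and a word for every token.

```python
import random

def sample_dirichlet(alpha, size, rng):
    """Dirichlet(alpha) draw via normalized Gamma variates."""
    g = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(g)
    return [x / total for x in g]

def generate_corpus(n_docs, doc_len, n_topics, vocab_size,
                    alpha=0.1, delta=0.01, seed=42):
    rng = random.Random(seed)
    # Step 2: one word distribution beta_k per topic
    beta = [sample_dirichlet(delta, vocab_size, rng) for _ in range(n_topics)]
    docs, thetas = [], []
    for _ in range(n_docs):
        # Step 1: topic distribution theta_i for this document
        theta = sample_dirichlet(alpha, n_topics, rng)
        words = []
        for _ in range(doc_len):
            # Step 3: topic assignment z, then a word from topic z
            z = rng.choices(range(n_topics), weights=theta)[0]
            w = rng.choices(range(vocab_size), weights=beta[z])[0]
            words.append(w)
        docs.append(words)
        thetas.append(theta)
    return docs, thetas, beta

docs, thetas, beta = generate_corpus(n_docs=5, doc_len=20,
                                     n_topics=3, vocab_size=50)
```

Estimation runs this process in reverse: given only `docs`, infer θ and β.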
Validation
From Quinn et al, AJPS, 2010:
1. Semantic validity
Do the topics identify coherent groups of tweets that are internally homogeneous, and are related to each other in a meaningful way?
2. Convergent/discriminant construct validity
Do the topics match existing measures where they should match? Do they depart from existing measures where they should depart?
3. Predictive validity
Does variation in topic usage correspond with expected events?
4. Hypothesis validity
Can topic variation be used effectively to test substantive hypotheses?
Example: open-ended survey responses
Bauer, Barberá et al, Political Behavior, 2016
Data: General Social Survey (2008) in Germany
Responses to the questions: "Would you please tell me what you associate with the term left?" and "Would you please tell me what you associate with the term right?"
Open-ended questions minimize priming and potential interviewer effects
Sparse Additive Generative model instead of LDA (more coherent topics for short text)
K = 4 topics for each question
Example: open-ended survey responses
Bauer, Barberá et al, Political Behavior, 2016.
Example: topics in US legislators' tweets
Data: 651,116 tweets sent by US legislators from January 2013 to December 2014
2,920 documents = 730 days × 2 chambers × 2 parties
Why aggregate? Applications that aggregate by author or day outperform tweet-level analyses (Hong and Davison, 2010)
K = 100 topics (more on this later)
Validation:
Choosing the number of topics
"Choosing K is one of the most difficult questions in unsupervised learning" (Grimmer and Stewart, 2013, p. 19)
We chose K = 100 based on cross-validated model fit
[Figure: cross-validated log-likelihood and perplexity, expressed as a ratio with respect to the worst value, as a function of the number of topics]
BUT: there is often a negative relationship between the best-fitting model and the substantive information provided.
Grimmer and Stewart propose choosing K based on substantive fit.
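Perplexity is a monotone transform of the held-out log-likelihood, exp(−log-likelihood / number of tokens), so lower is better. A minimal sketch of choosing K by cross-validated fit (the held-out log-likelihood values below are hypothetical, not the ones behind the slide's figure):

```python
import math

def perplexity(log_likelihood, n_tokens):
    """Held-out perplexity: exp(-loglik / token count). Lower is better."""
    return math.exp(-log_likelihood / n_tokens)

# Hypothetical held-out log-likelihoods for candidate values of K
heldout = {20: -61500.0, 50: -58200.0, 100: -57400.0, 150: -57600.0}
n_tokens = 10000

scores = {k: perplexity(ll, n_tokens) for k, ll in heldout.items()}
best_k = min(scores, key=scores.get)   # K with lowest perplexity
```

With these (made-up) numbers the fit criterion picks K = 100; the slide's caveat still applies, since the best-fitting K is not necessarily the most interpretable one.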
Extensions of LDA
1. Structural topic model (Roberts et al, 2014, AJPS)
2. Dynamic topic model (Blei and Lafferty, 2006, ICML; Quinn et al, 2010, AJPS)
3. Hierarchical topic model (Griffiths and Tenenbaum, 2004, NIPS; Grimmer, 2010, PA)
Why?
Substantive reasons: incorporate specific elements of the data-generating process into estimation
Statistical reasons: structure can lead to better topics.
Structural topic model
Prevalence: the prior on the mixture over topics is now document-specific, and can be a function of covariates (documents with similar covariates will tend to be about the same topics)
Content: the distribution over words is now document-specific and can be a function of covariates (documents with similar covariates will tend to use similar words to refer to the same topic)
Dynamic topic model
Source: Blei, Modeling Science
Wordfish (Slapin and Proksch, 2008, AJPS)
Goal: unsupervised scaling of ideological positions
Ideology of politician i, θ_i, is a position on a latent scale.
Word usage is drawn from a Poisson-IRT model:
W_im ∼ Poisson(λ_im)
λ_im = exp(α_i + ψ_m + β_m θ_i)
where:
α_i is the loquaciousness of politician i
ψ_m is the frequency of word m
β_m is the discrimination parameter of word m
Estimation using the EM algorithm.
Identification:
Unit variance restriction for θ_i
Choose politicians a and b such that θ_a > θ_b
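The rate equation above is easy to check numerically. A minimal sketch (all parameter values are hypothetical; only the functional form λ_im = exp(α_i + ψ_m + β_m θ_i) comes from the slides) showing that a word with a positive discrimination parameter has a higher expected count for the politician with the higher ideology score:

```python
import math

def wordfish_rate(alpha_i, psi_m, beta_m, theta_i):
    """Expected word count: lambda_im = exp(alpha_i + psi_m + beta_m * theta_i)."""
    return math.exp(alpha_i + psi_m + beta_m * theta_i)

# Two hypothetical politicians at opposite ends of the latent scale
theta = {"a": 1.0, "b": -1.0}   # identification: theta_a > theta_b
alpha = {"a": 0.0, "b": 0.0}    # equal loquaciousness, to isolate ideology
psi_m = 1.0                     # baseline (log) frequency of word m
beta_m = 0.8                    # positive discrimination: a "right-coded" word

rate_a = wordfish_rate(alpha["a"], psi_m, beta_m, theta["a"])
rate_b = wordfish_rate(alpha["b"], psi_m, beta_m, theta["b"])
# rate_a > rate_b: politician a is expected to use word m more often
```

Words with β_m near zero (e.g. procedural vocabulary) carry no ideological signal; estimation of θ is driven by the high-discrimination words.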
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More informationSum-Product Networks. STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017
Sum-Product Networks STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017 Introduction Outline What is a Sum-Product Network? Inference Applications In more depth
More informationCombining Generative and Discriminative Methods for Pixel Classification with Multi-Conditional Learning
Combining Generative and Discriminative Methods for Pixel Classification with Multi-Conditional Learning B. Michael Kelm Interdisciplinary Center for Scientific Computing University of Heidelberg michael.kelm@iwr.uni-heidelberg.de
More informationLanguage Information Processing, Advanced. Topic Models
Language Information Processing, Advanced Topic Models mcuturi@i.kyoto-u.ac.jp Kyoto University - LIP, Adv. - 2011 1 Today s talk Continue exploring the representation of text as histogram of words. Objective:
More informationDistinguish between different types of scenes. Matching human perception Understanding the environment
Scene Recognition Adriana Kovashka UTCS, PhD student Problem Statement Distinguish between different types of scenes Applications Matching human perception Understanding the environment Indexing of images
More informationCOMP 551 Applied Machine Learning Lecture 5: Generative models for linear classification
COMP 55 Applied Machine Learning Lecture 5: Generative models for linear classification Instructor: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp55 Unless otherwise noted, all material
More informationText mining and natural language analysis. Jefrey Lijffijt
Text mining and natural language analysis Jefrey Lijffijt PART I: Introduction to Text Mining Why text mining The amount of text published on paper, on the web, and even within companies is inconceivably
More informationTopic modeling with more confidence: a theory and some algorithms
Topic modeling with more confidence: a theory and some algorithms Long Nguyen Department of Statistics Department of EECS University of Michigan, Ann Arbor Pacific-Asia Knowledge Discovery and Data Mining,
More informationTitle Document Clustering. Issue Date Right.
NAOSITE: Nagasaki University's Ac Title Author(s) Comparing LDA with plsi as a Dimens Document Clustering Masada, Tomonari; Kiyasu, Senya; Mi Citation Lecture Notes in Computer Science, Issue Date 2008-03
More informationIntroduction to Machine Learning
Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 20: Expectation Maximization Algorithm EM for Mixture Models Many figures courtesy Kevin Murphy s
More informationA Continuous-Time Model of Topic Co-occurrence Trends
A Continuous-Time Model of Topic Co-occurrence Trends Wei Li, Xuerui Wang and Andrew McCallum Department of Computer Science University of Massachusetts 140 Governors Drive Amherst, MA 01003-9264 Abstract
More informationProbabilistic Topic Models Tutorial: COMAD 2011
Probabilistic Topic Models Tutorial: COMAD 2011 Indrajit Bhattacharya Assistant Professor Dept of Computer Sc. & Automation Indian Institute Of Science, Bangalore My Background Interests Topic Models Probabilistic
More informationUniversity of Washington Department of Electrical Engineering EE512 Spring, 2006 Graphical Models
University of Washington Department of Electrical Engineering EE512 Spring, 2006 Graphical Models Jeff A. Bilmes Lecture 1 Slides March 28 th, 2006 Lec 1: March 28th, 2006 EE512
More informationClick Prediction and Preference Ranking of RSS Feeds
Click Prediction and Preference Ranking of RSS Feeds 1 Introduction December 11, 2009 Steven Wu RSS (Really Simple Syndication) is a family of data formats used to publish frequently updated works. RSS
More informationParametric Mixture Models for Multi-Labeled Text
Parametric Mixture Models for Multi-Labeled Text Naonori Ueda Kazumi Saito NTT Communication Science Laboratories 2-4 Hikaridai, Seikacho, Kyoto 619-0237 Japan {ueda,saito}@cslab.kecl.ntt.co.jp Abstract
More informationNaïve Bayes classification
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationAdditive Regularization of Topic Models for Topic Selection and Sparse Factorization
Additive Regularization of Topic Models for Topic Selection and Sparse Factorization Konstantin Vorontsov 1, Anna Potapenko 2, and Alexander Plavin 3 1 Moscow Institute of Physics and Technology, Dorodnicyn
More informationRecent Advances in Bayesian Inference Techniques
Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian
More informationLatent Topic Models for Hypertext
Latent Topic Models for Hypertext Amit Gruber School of CS and Eng. The Hebrew University Jerusalem 9194 Israel amitg@cs.huji.ac.il Michal Rosen-Zvi IBM Research Laboratory in Haifa Haifa University, Mount
More informationCollaborative User Clustering for Short Text Streams
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17) Collaborative User Clustering for Short Text Streams Shangsong Liang, Zhaochun Ren, Emine Yilmaz, and Evangelos Kanoulas
More informationText Mining for Economics and Finance Supervised Learning
Text Mining for Economics and Finance Supervised Learning Stephen Hansen Text Mining Lecture 7 1 / 48 Introduction Supervised learning is the problem of predicting a response variable (i.e. dependent variable)
More informationDEKDIV: A Linked-Data-Driven Web Portal for Learning Analytics Data Enrichment, Interactive Visualization, and Knowledge Discovery
DEKDIV: A Linked-Data-Driven Web Portal for Learning Analytics Data Enrichment, Interactive Visualization, and Knowledge Discovery Yingjie Hu, Grant McKenzie, Jiue-An Yang, Song Gao, Amin Abdalla, and
More informationLarge-Scale Feature Learning with Spike-and-Slab Sparse Coding
Large-Scale Feature Learning with Spike-and-Slab Sparse Coding Ian J. Goodfellow, Aaron Courville, Yoshua Bengio ICML 2012 Presented by Xin Yuan January 17, 2013 1 Outline Contributions Spike-and-Slab
More informationApplying Latent Dirichlet Allocation to Group Discovery in Large Graphs
Lawrence Livermore National Laboratory Applying Latent Dirichlet Allocation to Group Discovery in Large Graphs Keith Henderson and Tina Eliassi-Rad keith@llnl.gov and eliassi@llnl.gov This work was performed
More informationIncorporating Domain Knowledge into Topic Modeling via Dirichlet Forest Priors
Incorporating Domain Knowledge into Topic Modeling via Dirichlet Forest Priors David Andrzejewski, Xiaojin Zhu, Mark Craven University of Wisconsin Madison ICML 2009 Andrzejewski (Wisconsin) Dirichlet
More informationDecoupling Sparsity and Smoothness in the Discrete Hierarchical Dirichlet Process
Decoupling Sparsity and Smoothness in the Discrete Hierarchical Dirichlet Process Chong Wang Computer Science Department Princeton University chongw@cs.princeton.edu David M. Blei Computer Science Department
More informationNeural Networks for Web Page Classification Based on Augmented PCA
Neural Networks for Web Page Classification Based on Augmented PCA Ali Selamat and Sigeru Omatu Division of Computer and Systems Sciences, Graduate School of Engineering, Osaka Prefecture University, Sakai,
More informationClass-Specific Simplex-Latent Dirichlet Allocation for Image Classification
Class-Specific Simplex-Latent Dirichlet Allocation for Image Classification Mandar Dixit, Nikhil Rasiwasia, Nuno Vasconcelos Department of Electrical and Computer Engineering University of California,
More informationClass-Specific Simplex-Latent Dirichlet Allocation for Image Classification
3 IEEE International Conference on Computer Vision Class-Specific Simplex-Latent Dirichlet Allocation for Image Classification Mandar Dixit, Nikhil Rasiwasia, Nuno Vasconcelos Department of Electrical
More informationLanguage Models. CS6200: Information Retrieval. Slides by: Jesse Anderton
Language Models CS6200: Information Retrieval Slides by: Jesse Anderton What s wrong with VSMs? Vector Space Models work reasonably well, but have a few problems: They are based on bag-of-words, so they
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Introduction Edps/Psych/Stat/ 584 Applied Multivariate Statistics Carolyn J Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN c Board of Trustees,
More informationExpectation-Propagation for the Generative Aspect Model
Expectation-Propagation for the Generative Aspect Model Thomas Minka Department of Statistics Carnegie Mellon University Pittsburgh, PA 15213 USA minkastat.cmu.edu John Lafferty School of Computer Science
More informationMining Newsgroups Using Networks Arising From Social Behavior by Rakesh Agrawal et al. Presented by Will Lee
Mining Newsgroups Using Networks Arising From Social Behavior by Rakesh Agrawal et al. Presented by Will Lee wwlee1@uiuc.edu September 28, 2004 Motivation IR on newsgroups is challenging due to lack of
More informationBehavioral Data Mining. Lecture 2
Behavioral Data Mining Lecture 2 Autonomy Corp Bayes Theorem Bayes Theorem P(A B) = probability of A given that B is true. P(A B) = P(B A)P(A) P(B) In practice we are most interested in dealing with events
More informationGaussian Mixture Model
Case Study : Document Retrieval MAP EM, Latent Dirichlet Allocation, Gibbs Sampling Machine Learning/Statistics for Big Data CSE599C/STAT59, University of Washington Emily Fox 0 Emily Fox February 5 th,
More informationDimension Reduction (PCA, ICA, CCA, FLD,
Dimension Reduction (PCA, ICA, CCA, FLD, Topic Models) Yi Zhang 10-701, Machine Learning, Spring 2011 April 6 th, 2011 Parts of the PCA slides are from previous 10-701 lectures 1 Outline Dimension reduction
More information