Improving Diversity in Ranking using Absorbing Random Walks

Improving Diversity in Ranking using Absorbing Random Walks. Andrew B. Goldberg, with Xiaojin Zhu, Jurgen Van Gael, and David Andrzejewski. Department of Computer Sciences, University of Wisconsin, Madison; Department of Engineering, University of Cambridge. SIGIR IDR Workshop, July 23, 2009. Originally appeared at NAACL-HLT 2007.

Problem Description. Given a set of items (e.g., sentences, news articles, people), rank them so that highly ranked items are important but not too similar to each other. [Figure: toy documents and a query, contrasting a bad (redundant) ranking with a good (diverse) ranking.]

Our Contribution. Previous work [MMR (Carbonell and Goldstein, 1998) and CSIS (Radev, 2000)] treats centrality and diversity separately using heuristics, with a post-processing step to eliminate redundancy. Our approach (GRASSHOPPER) uses absorbing Markov chain random walks to rank based on centrality and diversity simultaneously.

Diversity in Ranking can be Important. Examples: information retrieval (do not want near-identical articles); extractive text summarization (do not want near-identical sentences); social network analysis (do not want people all from the same group).

Our Algorithm. GRASSHOPPER = Graph Random-walk with Absorbing StateS that HOPs among PEaks for Ranking. Combines centrality, diversity, and a prior ranking.

Inputs to GRASSHOPPER. Three inputs: (1) Graph W: relationships between the items to rank. (2) Prior ranking r (as a probability vector): prior knowledge of item importance; uniform if no information is available. (3) Weight λ ∈ [0, 1] to balance the two.

The Graph W. To rank n items, we use an n × n weight matrix W, where w_ij is the weight on the edge from item i to item j. Directed or undirected. Non-negative weights (w_ij = 0 if i and j are not connected). Self edges are OK. Examples: cosine similarity between documents i and j; number of phone calls i made to j; etc.
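As an illustrative sketch of building such a W from text, here is a cosine-similarity graph over toy term-count vectors (the vectors and the 0.3 sparsity threshold are invented for this example, not values from the talk):

```python
import numpy as np

# Hypothetical term-count vectors, one row per document (invented for illustration).
X = np.array([[2.0, 1.0, 0.0],
              [2.0, 0.0, 1.0],
              [0.0, 1.0, 3.0]])

Xn = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-normalize rows
W = Xn @ Xn.T                                      # pairwise cosine similarities
np.fill_diagonal(W, 0.0)                           # drop self edges (allowed, but optional)
W[W < 0.3] = 0.0                                   # threshold small weights for sparsity
```

A symmetric W like this gives an undirected graph; inherently asymmetric weights (e.g., phone calls from i to j) would give a directed one.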

The Prior Ranking r. Optional; used for re-ranking. Probability vector r = (r_1, ..., r_n) with r_i ≥ 0 and Σ_i r_i = 1, where r_i = P(picking item i as the most important item). More control than a ranking: (0.1, 0.7, 0.2) vs. (0.3, 0.37, 0.33). Uniform if no prior ranking. Examples: (IR) document relevance scores for the query; (summarization) position of the sentence; etc.

Finding the First Item. Same as PageRank (Page et al., 1998): a teleporting random walk. Raw transition matrix P: P_ij = w_ij / Σ_k w_ik. Teleporting transition matrix P̃ = λP + (1 − λ)1r^T. Stationary distribution π = P̃^T π; the first item is g_1 = argmax_i π_i. But PageRank does not address diversity at all. (Note: the first item can instead be specified by the user.)
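A minimal sketch of this step via power iteration (the toy similarity matrix, uniform prior, and λ = 0.9 are assumptions for illustration, not values from the talk):

```python
import numpy as np

def stationary(W, r, lam=0.9, iters=10_000, tol=1e-12):
    """Stationary distribution of the teleporting walk:
    P_ij = w_ij / sum_k w_ik, then P~ = lam * P + (1 - lam) * 1 r^T."""
    n = len(r)
    P = W / W.sum(axis=1, keepdims=True)            # raw transition matrix
    P_tel = lam * P + (1 - lam) * np.outer(np.ones(n), r)
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):                          # power iteration
        nxt = pi @ P_tel                            # pi^T <- pi^T P~
        done = np.abs(nxt - pi).max() < tol
        pi = nxt
        if done:
            break
    return pi

# Toy graph: items 0 and 1 are near-duplicates, item 2 is an outlier.
W = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
r = np.full(3, 1.0 / 3.0)        # uniform prior ranking
pi = stationary(W, r)
g1 = int(np.argmax(pi))          # first ranked item
```

As the next slide illustrates, π alone concentrates on the densest region, which is exactly the diversity problem GRASSHOPPER goes on to address.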

Finding the First Item: Toy Example. [Figure: data drawn from 3 Gaussians; bar height = stationary probability; g_1 marks the maximum.] Problem: the stationary distribution lacks diversity.

Ranking the Remaining Items. Idea: turn already-ranked items into absorbing states. The random walk ends when it reaches an absorbing state, so there is no longer a stationary distribution. Instead, rank by the expected number of visits per walk.

Absorbing States Illustrated. [Figure: initial random walk with no absorbing states; the node chosen as the first ranked item is highlighted.]

Absorbing States Illustrated. [Figure: absorbing random walk after ranking the first item; the walk ends at the absorbing state. Similar states get visited less; dissimilar states get visited more.]

Ranking the Remaining Items. GRASSHOPPER hops! [Figure: stationary probabilities vs. expected number of visits before absorption on the toy data; g_1, g_2, g_3 land on different peaks.] States similar to absorbing states get fewer visits, so the next item ranked is diverse w.r.t. the already ranked items.

Expected Number of Visits. For each g ∈ G (the set of absorbing states), set P_gg = 1 and all other entries of row g to 0. Rearrange, with absorbing states first and non-absorbing states second:

P = [ I  0 ]
    [ R  Q ]

Fundamental matrix N = (I − Q)^{-1}. N_ij is the expected number of visits to j before absorption if the random walk starts from i. Average over starting states: v = N^T 1 / (n − |G|), so v_j is the expected number of visits to j regardless of start. Next ranked item: g_{|G|+1} = argmax_i v_i.
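Putting the slide's formulas into code (the 3-state toy chain is invented for illustration):

```python
import numpy as np

def expected_visits(P, absorbed):
    """Given transition matrix P and the set of already-ranked (absorbing)
    states, return (v, next_item): v_j = expected visits to each remaining
    state before absorption, averaged over a uniform start, using the
    fundamental matrix N = (I - Q)^{-1} and v = N^T 1 / (n - |G|)."""
    n = P.shape[0]
    rest = [i for i in range(n) if i not in absorbed]
    Q = P[np.ix_(rest, rest)]                   # transitions among non-absorbing states
    N = np.linalg.inv(np.eye(len(rest)) - Q)    # fundamental matrix
    v = N.sum(axis=0) / len(rest)               # column sums = average over starting states
    return v, rest[int(np.argmax(v))]

# Toy chain: state 0 is already ranked (absorbing); from 1 the walk goes to
# 0 or 2 with equal probability; from 2 it goes to 1 or stays at 2.
P = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.5]])
v, nxt = expected_visits(P, {0})
```

Here the walk lingers at state 2 (it has a self-loop and no direct edge to the absorbing state), so state 2 accumulates more expected visits and is ranked next.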

Discussions. Similar to the heuristic "cluster, then take the centers in turn", but no actual clustering is performed, and there is no need to know a priori how many clusters exist. Fast computation via the matrix inversion lemma: the initial fundamental matrix is computed with a real inverse; follow-up inverses are derived using simple updates.
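One way such an update can work (a sketch of the linear-algebra idea, not necessarily the paper's exact formulation): when state g becomes absorbing, deleting row/column g of Q, the new fundamental matrix over the remaining states is a rank-one (Schur complement) update of the old one, N'_ij = N_ij − N_ig N_gj / N_gg, costing O(n^2) instead of a fresh O(n^3) inverse.

```python
import numpy as np

def shrink_fundamental(N, g):
    """Fundamental matrix after state g becomes absorbing (row/col g of Q
    deleted), via a rank-one update of N = (I - Q)^{-1}."""
    keep = [i for i in range(N.shape[0]) if i != g]
    return N[np.ix_(keep, keep)] - np.outer(N[keep, g], N[g, keep]) / N[g, g]

# Check the identity on a small substochastic Q (values invented).
Q = np.array([[0.1, 0.3, 0.2],
              [0.2, 0.1, 0.4],
              [0.3, 0.2, 0.1]])
N = np.linalg.inv(np.eye(3) - Q)
N_fast = shrink_fundamental(N, 1)                               # cheap update
N_direct = np.linalg.inv(np.eye(2) - Q[np.ix_([0, 2], [0, 2])]) # recomputed from scratch
```

The update is always well defined here because N_gg, a diagonal entry of a fundamental matrix, is at least 1.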

NLP Experiments. Multi-document extractive text summarization on the 2004 Document Understanding Conference (DUC) data sets. Items: all sentences in all documents on a topic. Graph W with sparse (thresholded) cosine edges. Prior ranking based on sentence position p_i: r_i ∝ p_i^α. λ and α were tuned on DUC 2003 data.

Results on DUC 2004 Tasks. Evaluation based on the ROUGE-1 metric (Lin and Hovy, 2003). Task 2: between the 1st and 2nd of 34 DUC competitors. Task 4a: between the 5th and 6th of 11. Task 4b: between the 2nd and 3rd of 11.

Social Network Experiments. Goal: rank prominent comedy stars from diverse countries, using actors from films released in 2000–2006.

Movie Count Ranking              GRASSHOPPER Ranking
  1. Ben Stiller (USA)             1. Ben Stiller (USA)
  2. Anthony Anderson (USA)        2. Anthony Anderson (USA)
  3. Eddie Murphy (USA)            3. Johnny Knoxville (USA)
  ...                              ...
 12. Gerard Depardieu (France)     8. Gerard Depardieu (France)
 30. Mads Mikkelsen (Denmark)     13. Mads Mikkelsen (Denmark)
 62. Til Schweiger (Germany)      27. Til Schweiger (Germany)
161. Tadanobu Asano (Japan)       34. Tadanobu Asano (Japan)
287. Kjell Bergqvist (Sweden)     44. Kjell Bergqvist (Sweden)

Social Network Analysis Results. [Figure: country coverage and movie coverage as a function of k (number of actors), comparing GRASSHOPPER against MOVIECOUNT and RANDOM baselines.] GRASSHOPPER highly ranks actors representing many diverse countries who starred in many different movies.

Conclusion. GRASSHOPPER: ranking by centrality + diversity + prior in a unified framework of absorbing random walks. Download code: http://www.cs.wisc.edu/~jerryzhu/pub/grasshopper.m