Some graph optimization problems in data mining. P. Van Dooren, CESAME, Univ. catholique Louvain based on work in collaboration with the group on


1 Some graph optimization problems in data mining P. Van Dooren, CESAME, Univ. catholique Louvain based on work in collaboration with the group on University of Chicago, October 16, 2012

2 Leuven: call density over 6 months (Lambiotte et al., Phys. Rev., 2008)

3 Brussels: call density over 6 months (Lambiotte et al., Phys. Rev., 2008)

4 Ref: Melchior, Eng. Thesis, UCL

5 Outline of the talk: Reputation systems; Application to the MovieLens database; Similarity matrix of two graphs; Application to synonym extraction; Concluding remarks

6 What is a reputation system? Movielens

7 Motivation: detecting dishonest participants in auction systems; removing spammers in on-line review databases (MovieLens); giving a grade (reputation) to web raters; evaluating the trust of nodes in peer-to-peer systems

8 Reputation of raters and objects. Given a bipartite graph with n raters and m objects and votes on the edges, what should be the reputation of these n+m items? Example (graph and matrix form): raters r1, r2, r3 and objects o1, o2, with the votes collected in the matrix X. Characterize the reputation f of the raters and r of the objects.

9 Reputation of raters and objects. Belief divergence = variance. Figure labels: f? with $f_1 = 4.6$, $f_2 = 4.2$, $f_3 = \dots$; r?

10 Reputation of raters and objects. Belief divergence = variance. Figure labels: f? with $f_1 = 4.6$, $f_2 = 4.2$, $f_3 = \dots$; r?

11 Reputation of raters and objects. Belief divergence = variance. Figure labels (after convergence): f? with $f_1 = \dots$, $f_2 = 4.8$, $f_3 = \dots$; r?

12 Our approach. Assume that every rater evaluates all objects with a vote in [0,1], and that X and f > 0 are the voting matrix and the raters' reputation vector. The object's reputation vector r is the weighted sum of the votes. The rater's reputation f depends on the discrepancy with the other votes. There is a unique pair of vectors r and f satisfying these formulas when d [...]. Ref: de Kerchove and Van Dooren, SIAM News, 2008

13 Nonlinear iteration. These two formulas define an iteration in which the voting matrix may be dynamic, changing at each step. If the matrix X is fixed, we can prove the following. Theorem: if d > m, the iteration converges to the unique fixed point, which gives the reputations r of the objects and f(r) of the raters.
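The formulas themselves did not survive on this slide, so the following is only a minimal sketch of one iteration that fits the description: r is the f-weighted average of the votes, and each rater's weight is d minus the squared distance between that rater's votes and r. The exact form of the discrepancy term, the default d = m + 1, and the name `reputation_iteration` are assumptions, not the authors' stated formulas.

```python
import numpy as np


def reputation_iteration(X, d=None, tol=1e-12, max_iter=1000):
    """Sketch of a reputation iteration on an n x m voting matrix X in [0, 1].

    ASSUMPTION: the rater weight is f_i = d - ||x_i - r||^2; the slide only
    says that f "depends on the discrepancy with the other votes".
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    if d is None:
        d = m + 1.0  # ASSUMPTION: any value d > m, as required by the theorem
    f = np.ones(n)               # start with equal trust in every rater
    r = X.T @ f / f.sum()        # weighted average of the votes
    for _ in range(max_iter):
        # Squared distance of each rater's votes to the current reputations.
        f = d - np.sum((X - r) ** 2, axis=1)
        r_new = X.T @ f / f.sum()
        converged = np.linalg.norm(r_new - r) < tol
        r = r_new
        if converged:
            break
    f = d - np.sum((X - r) ** 2, axis=1)
    return r, f


# Hypothetical toy data: three raters who roughly agree and one outlier.
X_toy = np.array([[0.8, 0.2],
                  [0.9, 0.1],
                  [0.8, 0.3],
                  [0.0, 1.0]])
r_toy, f_toy = reputation_iteration(X_toy)
```

Because the votes lie in [0,1] and r is a convex combination of them, each squared distance is at most m, so any d > m keeps all weights positive in this sketch.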

14 Cost function. If d > m, the fixed point of our iteration corresponds to the minimum of the following cost function defined on the unit hypercube $[0,1]^m$. E.g. for m = 2, the energy function looks like (for d > 2 and for d = 1.5)

15 Convergence ... and one iteration step corresponds to a steepest descent step (with a particular step size), and this converges monotonically to $r^*$ since we have $\|r^{k+1} - r^k\|_2$ ...

16 The data set consists of 100,000 ratings (1-5) from 943 users on 1682 movies. Each user has rated at least 20 movies. The data was collected through the MovieLens web site (movielens.umn.edu) during a seven-month period. 237 spammers (always scoring 1, except for their unique best friend, which receives the maximum score of 5) are added (+25%). The mean (left) is less robust than our iteration (middle), which also gives good results for the raters' reputations (right). Convergence of the spammer separation after steps 1, 2 and ∞

17 Some remarks. Strengths: linear complexity (in the number of votes); applicable to any graph and with any rating matrix; can be dynamic (varying matrix $X_k$); reputations for the raters; robust against attackers and spammers. Further study: choice of the function; stability for the dynamic case; mixing raters and objects.

18 Similarity matrix of two arbitrary graphs. For A and B the adjacency matrices of the two graphs, S solves $\rho S = A S B^T + A^T S B$. This matrix can be obtained as the fixed point of a power method (linear convergence). Ref: Blondel et al., SIAM Rev., 2004

19 Similarity matrix of two arbitrary graphs. For A and B the adjacency matrices of the two graphs, S solves $\rho S = A S B^T + A^T S B$. Element $S_{54}$ says how similar node 5 of A is to node 4 of B.

20 Similarity matrix of two arbitrary graphs. For A and B the adjacency matrices of the two graphs, S solves $\rho S = A S B^T + A^T S B$. Element $S_{43}$ says how similar node 4 of A is to node 3 of B.

21 Similarity matrix of two arbitrary graphs. For A and B the adjacency matrices of the two graphs, S solves $\rho S = A S B^T + A^T S B$. Two nodes are similar if their parents and children are similar. Such a recursive definition leads to an eigenvector equation.

22 Algorithm? The (normalized) sequence $Z_{k+1} = (A Z_k B^T + A^T Z_k B) / \|A Z_k B^T + A^T Z_k B\|_F$ has two fixed points $Z_{even}$ and $Z_{odd}$ for every $Z_0 > 0$. Similarity matrix: $S = \lim_{k \to \infty} Z_{2k}$ with $Z_0 = \mathbf{1}$ (the all-ones matrix). $S_{i,j}$ is the similarity score between $V_i(A)$ and $V_j(B)$. With $z_k = \mathrm{vec}(Z_k)$, this is equivalent to $z_{k+1} = (B \otimes A + B^T \otimes A^T) z_k / \|(B \otimes A + B^T \otimes A^T) z_k\|_2$, which is the power method on $M = B \otimes A + B^T \otimes A^T$.
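The normalized iteration on this slide can be sketched in a few lines of NumPy. This is a minimal dense-matrix illustration, assuming both graphs have at least one edge (so the normalization never divides by zero). The function names and the fixed number of steps are illustrative choices, not part of the talk. `kron_operator` builds the matrix M of the equivalent power method, using column-major vec.

```python
import numpy as np


def similarity_matrix(A, B, n_even_steps=100):
    """Even-iterate limit of Z <- (A Z B^T + A^T Z B) / ||.||_F, from Z_0 = 1."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    Z = np.ones((A.shape[0], B.shape[0]))
    Z /= np.linalg.norm(Z)            # Frobenius norm for a 2-D array
    for _ in range(2 * n_even_steps): # an even number of steps
        Z = A @ Z @ B.T + A.T @ Z @ B
        Z /= np.linalg.norm(Z)
    return Z


def kron_operator(A, B):
    """M = B (x) A + B^T (x) A^T, acting on the column-major vec(Z)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return np.kron(B, A) + np.kron(B.T, A.T)


# Toy example: the directed path graph 1 -> 2 -> 3 compared with itself.
P3 = np.array([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]], dtype=float)
S_path = similarity_matrix(P3, P3)

# One un-normalized step, computed in matrix form and in vec form.
Z_check = np.arange(1.0, 10.0).reshape(3, 3)
step_matrix = P3 @ Z_check @ P3.T + P3.T @ Z_check @ P3
step_vec = kron_operator(P3, P3) @ Z_check.flatten(order="F")
```

For realistic graph sizes one would keep A and B sparse and never form M explicitly; the Kronecker form is shown only to make the power-method interpretation concrete.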

23 Some properties. S satisfies $\rho S = A S B^T + A^T S B$ with $\rho = \|A S B^T + A^T S B\|_F$. It is the nonnegative fixed point S of largest 1-norm. It solves the optimization problem $\max \langle A S B^T + A^T S B, S \rangle$ subject to $\|S\|_F = 1$. It is an extension of Kleinberg's HITS method. Linear convergence (power method for sparse M).

24 The dictionary graph. Nodes = words present in the dictionary: 112,169 nodes. Edge (u,v) if v appears in the definition of u: 1,398,424 edges. Average of 12 edges per node. Ref: Blondel et al., SIAM Rev., 2004

25 Neighborhood graph. The neighborhood graph is the subset of vertices used for finding synonyms: it contains all parents and children of the node. Figure: neighborhood graph of "likely". Central uses this sub-graph to rank synonyms automatically: rank each node in the graph by its similarity to node c in the path graph b → c → e. Ref: Blondel et al., SIAM Rev., 2004
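As an illustration of the two steps on this slide, here is a minimal sketch that extracts the neighborhood graph of a word from a toy dictionary and scores each node by its similarity to the middle node of a three-node path graph. The toy dictionary, the function names, and the fixed number of steps are all hypothetical. The scoring reuses the normalized similarity iteration from the earlier slides on small dense matrices.

```python
import numpy as np


def neighborhood_graph(definitions, word):
    """Sub-graph on `word`, its children and its parents.

    `definitions` maps each word u to the set of words v used in its
    definition, i.e. the edges (u, v) of the dictionary graph.
    """
    children = set(definitions.get(word, ()))
    parents = {u for u, vs in definitions.items() if word in vs}
    nodes = sorted({word} | children | parents)
    index = {u: i for i, u in enumerate(nodes)}
    A = np.zeros((len(nodes), len(nodes)))
    for u in nodes:
        for v in definitions.get(u, ()):
            if v in index:            # keep only edges inside the sub-graph
                A[index[u], index[v]] = 1.0
    return nodes, A


def central_scores(A, n_even_steps=50):
    """Similarity of every node of A to the middle node of a 3-node path."""
    P = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 0]], dtype=float)
    Z = np.ones((A.shape[0], 3))
    for _ in range(2 * n_even_steps):
        Z = A @ Z @ P.T + A.T @ Z @ P
        Z /= np.linalg.norm(Z)        # assumes the sub-graph has an edge
    return Z[:, 1]                    # column of the middle path node


# Hypothetical toy dictionary, for illustration only.
toy_definitions = {
    "likely": {"probable", "possible"},
    "probable": {"likely"},
    "apt": {"likely"},
    "possible": {"probable"},
    "cat": {"animal"},
}
toy_nodes, toy_A = neighborhood_graph(toy_definitions, "likely")
toy_scores = central_scores(toy_A)
ranking = [toy_nodes[i] for i in np.argsort(-toy_scores)]
```

Sorting the nodes by decreasing score gives the candidate synonym list; in practice the query word itself would be dropped from that list.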

26 Disappear
      Vectors   Central     ArcRank        Wordnet     Microsoft
  1   vanish    vanish      epidemic       vanish      vanish
  2   wear      pass        disappearing   go away     cease to exist
  3   die       die         port           end         fade away
  4   sail      wear        dissipate      finish      die out
  5   faint     faint       cease          terminate   go
  6   light     fade        eat            cease       evaporate
  7   port      sail        gradually                  wane
  8   absorb    light       instrumental               expire
  9   appear    dissipate   darkness                   withdraw
 10   cease     cease       efface                     pass away
 Mark
 Std Dev
Vectors, Central and ArcRank are automatic; Wordnet and Microsoft Word are manual.

27 Sugar
      Vectors       Central    ArcRank       Wordnet            Microsoft
  1   juice         cane       granulation   sweetening         darling
  2   starch        starch     shrub         sweetener          baby
  3   cane          sucrose    sucrose       carbohydrate       honey
  4   milk          milk       preserve      saccharide         dear
  5   molasses      sweet      honeyed       organic compound   love
  6   sucrose       dextrose   property      saccarify          dearest
  7   wax           molasses   sorghum       sweeten            beloved
  8   root          juice      grocer        dulcify            precious
  9   crystalline   glucose    acetate       edulcorate         pet
 10   confection    lactose    saccharine    dulcorate          babe
 Mark
 Std Dev

28 Figure: the constraint sets $\|S\|_F = 1$ and $U^T U = V^T V = I_k$

29 Optimization problems. The fixed point of $\rho S = A S B^T + A^T S B$, $\rho = \|A S B^T + A^T S B\|_F$, corresponds to $\max \langle A S B^T + A^T S B, S \rangle$ subject to $\|S\|_F = 1$. The fixed point of $U \Sigma V^T = \Pi_{opt}(A U V^T B^T + A^T U V^T B)$ corresponds to $\max \langle A U V^T B^T + A^T U V^T B, U V^T \rangle$ subject to $U^T U = V^T V = I_k$. This is no longer an eigenvalue problem, but it can be computed using iterative techniques with linear complexity per step.

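30 Projected correlation. $\max \langle A U V^T B^T + A^T U V^T B, U V^T \rangle$ subject to $U^T U = V^T V = I_k$ is also equivalent to $\max \langle U^T A U, V^T B V \rangle$ subject to $U^T U = V^T V = I_k$. $U^T A U$ and $V^T B V$ can be viewed as k×k Rayleigh quotients. Linearly converging iteration (truncated SVD): $U_{k+1} \Sigma_{k+1} V_{k+1}^T + U_\perp \Sigma_\perp V_\perp^T = A U_k V_k^T B^T + A^T U_k V_k^T B + s U_k V_k^T$, where the second term on the left collects the part of the SVD that is discarded.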
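The truncated-SVD iteration on this slide can be sketched as follows for small dense matrices. The random orthonormal start, the default shift s = 2‖A‖₂‖B‖₂, the fixed iteration count, and the function names are assumptions made for this illustration; the slide does not specify them. With that shift, the shifted objective is convex in $U V^T$, so in this sketch the objective should not decrease from one step to the next.

```python
import numpy as np


def correlation(A, B, U, V):
    """Objective <U^T A U, V^T B V> (Frobenius inner product)."""
    return float(np.sum((U.T @ A @ U) * (V.T @ B @ V)))


def projected_correlation(A, B, k, s=None, n_iter=200, seed=0):
    """Sketch of U+ S+ V+^T = top-k SVD of A U V^T B^T + A^T U V^T B + s U V^T."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    rng = np.random.default_rng(seed)
    # ASSUMPTION: random orthonormal starting point.
    U, _ = np.linalg.qr(rng.standard_normal((A.shape[0], k)))
    V, _ = np.linalg.qr(rng.standard_normal((B.shape[0], k)))
    if s is None:
        # ASSUMPTION: shift chosen as twice the product of spectral norms.
        s = 2.0 * np.linalg.norm(A, 2) * np.linalg.norm(B, 2)
    history = [correlation(A, B, U, V)]
    for _ in range(n_iter):
        X = U @ V.T
        M = A @ X @ B.T + A.T @ X @ B + s * X
        P, _, Qt = np.linalg.svd(M, full_matrices=False)
        U, V = P[:, :k], Qt[:k, :].T  # keep the k dominant singular pairs
        history.append(correlation(A, B, U, V))
    return U, V, history


# Toy graphs: a directed 4-cycle and a directed path on 5 nodes.
A_cyc = np.roll(np.eye(4), 1, axis=1)
B_path = np.diag(np.ones(4), k=1)
U_opt, V_opt, hist = projected_correlation(A_cyc, B_path, k=2)
```

The dense product `U @ V.T` is formed here only for readability; for large sparse graphs one would apply A and B to the thin factors U and V instead.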

31 Correlation of graphs. Graphs with similar structure: the correlation is nearly optimal. Ref: Fraikin, Nesterov, Van Dooren, LAA, 2007

32 Some remarks. Optimization is on large sparse graphs. The complexity of one iteration step is linear in the number of nodes in both graphs. We have methods with linear convergence (a power-like method and a gradient-like method). We have Newton-like methods with manifold constraints ($U^T U = V^T V = I_k$). Extensions to colored nodes and edges.

Ref: Blondel et al., A Measure of Similarity between Graph Vertices: Applications to Synonym Extraction and Web Searching, SIAM Review, Vol. 46, No. 4, pp. 647-666, 2004.