Learning a Degree-Augmented Distance Metric from a Network. Bert Huang, U of Maryland; Blake Shaw, Foursquare; Tony Jebara, Columbia U

1 Learning a Degree-Augmented Distance Metric from a Network. Bert Huang, U of Maryland; Blake Shaw, Foursquare; Tony Jebara, Columbia U. Beyond Mahalanobis: Supervised Large-Scale Learning of Similarity, NIPS Workshop, Sierra Nevada, Dec. 16, 2011

2 Motivation: Similarity in Networks. [Figure: example network with labeled nodes.] Homophily occurs in natural networks: neighbors are similar. Learning must account for the structural nature of networks.

3 Outline: Problem formulation; Structure Preserving Metric Learning; Degree Distributional Metric Learning; Analysis; Experiments

4 Problem Formulation. Adjacency matrix $A \in \mathbb{B}^{n \times n}$ + node features $X \in \mathbb{R}^{d \times n}$. Given: node feature matrix X and adjacency matrix A. Learn the inherent distance metric related to the homophily of the network.

5 Structure Preserving Metric Learning. Connectivity algorithms: k-nn, b-matching, MST, $\epsilon$-neighborhoods. Distances are structure-preserving if the connectivity algorithm outputs the true connectivity [SJ09]. Parameterize distances: $D_M(x_i, x_j) = (x_i - x_j)^\top M (x_i - x_j)$
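A minimal sketch of the two ideas on this slide, in plain Python: the parameterized distance $D_M(x_i, x_j) = (x_i - x_j)^\top M (x_i - x_j)$, and a check of whether k-nn under that distance reproduces the observed adjacency. The helper names (`mahalanobis`, `knn_graph`, `is_structure_preserving`), the toy points, and the row-per-node layout (the slides store X as d x n) are illustrative assumptions, not code from the talk.

```python
def mahalanobis(M, x, y):
    """D_M(x, y) = (x - y)^T M (x - y), with M given as a nested d x d list."""
    diff = [a - b for a, b in zip(x, y)]
    d = len(diff)
    return sum(diff[r] * M[r][c] * diff[c] for r in range(d) for c in range(d))


def knn_graph(points, M, k):
    """Directed k-nn connectivity: node i links to its k nearest nodes under D_M."""
    n = len(points)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: mahalanobis(M, points[i], points[j]))
        for j in others[:k]:
            A[i][j] = 1
    return A


def is_structure_preserving(points, M, A_true, k):
    """True when k-nn under D_M reproduces the observed adjacency exactly."""
    return knn_graph(points, M, k) == A_true


# Toy data (hypothetical): nodes pair up only if the second feature is ignored.
points = [[0.0, 0.0], [0.1, 5.0], [3.0, 0.1], [3.1, 5.1]]
A_true = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
identity = [[1.0, 0.0], [0.0, 1.0]]    # plain Euclidean metric
first_only = [[1.0, 0.0], [0.0, 0.0]]  # metric that ignores the second feature
```

On this toy data, `is_structure_preserving(points, identity, A_true, 1)` is false, while the same call with `first_only` is true: changing M alone is enough to make the 1-nn graph match the observed links.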

6 Structured Prediction Motivation. Constraints: the true A must score higher than any feasible adjacency matrix $\tilde{A}$. Frobenius regularization -> SVM. Cutting plane doesn't scale: requires iterating SDP and separation oracle. Relaxation: (optional) drop the PSD constraint; only consider small changes to A.

7 Stochastic SPML (k-nn). Consider only changes along node-neighbor-impostor triplets $T = \{(i, j, k) \mid A_{ij} = 1, A_{ik} = 0\}$. [Figure: graphs $A$ and $A^{(ijk)}$ on nodes i, j, k.] Difference between scores is only along triplet edges. Objective: $\frac{\lambda}{2}\|M\|_F^2 + \frac{1}{|T|}\sum_{(i,j,k) \in T} h(D_M(x_i, x_j) - D_M(x_i, x_k) + 1)$. Randomly sample triplets and follow stochastic subgradients. (Periodically project to PSD.)
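The triplet update can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: h is taken to be the hinge max(0, ·), the step size 1/(λt) is a Pegasos-style choice, the names `spml_step` and `train_spml` are hypothetical, and the periodic PSD projection is omitted because it needs an eigendecomposition (for example via `numpy.linalg.eigh`, clipping negative eigenvalues to zero).

```python
import random


def dist(M, x, y):
    """D_M(x, y) = (x - y)^T M (x - y), with M as a nested d x d list."""
    diff = [a - b for a, b in zip(x, y)]
    d = len(diff)
    return sum(diff[r] * M[r][c] * diff[c] for r in range(d) for c in range(d))


def outer(x, y):
    """(x - y)(x - y)^T, the gradient of D_M(x, y) with respect to M."""
    diff = [a - b for a, b in zip(x, y)]
    return [[a * b for b in diff] for a in diff]


def spml_step(M, xi, xj, xk, lam, eta):
    """One subgradient step on lam/2 ||M||_F^2 + max(0, D(i,j) - D(i,k) + 1)."""
    d = len(M)
    active = dist(M, xi, xj) - dist(M, xi, xk) + 1.0 > 0.0
    Cj, Ck = outer(xi, xj), outer(xi, xk)
    new_M = []
    for r in range(d):
        row = []
        for c in range(d):
            grad = lam * M[r][c]              # regularizer term
            if active:                        # hinge term, only when violated
                grad += Cj[r][c] - Ck[r][c]
            row.append(M[r][c] - eta * grad)
        new_M.append(row)
    return new_M


def train_spml(points, A, lam=0.1, steps=1000, seed=0):
    """Sample (node, neighbor, impostor) triplets uniformly and take steps.

    No PSD projection is performed here (omitted in this sketch).
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    triplets = [(i, j, k) for i in range(n) for j in range(n) for k in range(n)
                if len({i, j, k}) == 3 and A[i][j] == 1 and A[i][k] == 0]
    M = [[0.0] * d for _ in range(d)]
    for t in range(1, steps + 1):
        i, j, k = rng.choice(triplets)
        M = spml_step(M, points[i], points[j], points[k], lam, 1.0 / (lam * t))
    return M
```

Each step costs O(d^2) because it only touches the two outer products for the sampled triplet, independent of the number of nodes.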

8 Out-of-Sample Extension. Connectivity algorithm is fixed: structural parameters must be known, e.g., k = degree of training nodes. What degree should new nodes have? Feature-dependent degree preference functions.

9 Degree Distributional Metric Learning. Simultaneously learn feature-dependent degree preference functions such that the connectivity algorithm maximizes $F(A \mid X; M, S) = -\sum_{ij} A_{ij} D_M(x_i, x_j) + \sum_i g(c[i] \mid x_i; S)$. Linear degree preference score: $g(k \mid x; S) = \sum_{k'=1}^{k} x^\top s_{k'}$. Regularizing with $\|S\|_F^2$, DDML can also be a structural SVM.
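To make the score concrete, here is a small plain-Python sketch that evaluates F for a given graph, reading the edge term as a negative distance (so that maximizing F favors short edges) and the degree term as the linear preference $g(k \mid x; S) = \sum_{k'=1}^{k} x^\top s_{k'}$. The function names, the use of row sums of A as the degrees c[i], and the toy numbers are assumptions made for illustration.

```python
def dist(M, x, y):
    """D_M(x, y) = (x - y)^T M (x - y), with M as a nested d x d list."""
    diff = [a - b for a, b in zip(x, y)]
    d = len(diff)
    return sum(diff[r] * M[r][c] * diff[c] for r in range(d) for c in range(d))


def degree_pref(k, x, S):
    """g(k | x; S): sum of x^T s_{k'} for k' = 1..k; S[k' - 1] stores s_{k'}.

    len(S) plays the role of the maximum allowed degree.
    """
    return sum(sum(a * b for a, b in zip(x, S[kp])) for kp in range(k))


def ddml_score(A, points, M, S):
    """F(A | X; M, S) = -sum_ij A_ij D_M(x_i, x_j) + sum_i g(c[i] | x_i; S)."""
    n = len(points)
    edge_term = sum(A[i][j] * dist(M, points[i], points[j])
                    for i in range(n) for j in range(n))
    degrees = [sum(A[i]) for i in range(n)]  # assumption: c[i] is the row sum
    pref_term = sum(degree_pref(degrees[i], points[i], S) for i in range(n))
    return -edge_term + pref_term


# Toy example (hypothetical values): three 1-D nodes, maximum degree 2.
points = [[1.0], [2.0], [4.0]]
M = [[1.0]]
S = [[0.5], [-1.0]]                              # s_1 and s_2
A_true = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]       # edge between nodes 0 and 1
A_swapped = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]    # edge moved to nodes 0 and 2
```

Here `ddml_score(A_true, points, M, S)` is -0.5 and `ddml_score(A_swapped, points, M, S)` is -15.5, so the observed graph outscores the perturbed one, which is the ordering the learning constraints ask for.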

10 Stochastic DDML. Triplet-based loss function: $\min_{M,S} \frac{\lambda}{2}(\|M\|^2 + \|S\|^2) + \frac{1}{|T|}\sum_{(i,j,k) \in T} h(F(A \mid X; M, S) - F(A^{(ijk)} \mid X; M, S) + 1)$. [Figure: graphs $A$ and $A^{(ijk)}$ on nodes i, j, k.] Score difference cancels except for four quantities: $F(A \mid X; M, S) - F(A^{(ijk)} \mid X; M, S) = D_M(x_i, x_k) - D_M(x_i, x_j) + x_k^\top s_{(c[k]+1)} - x_j^\top s_{(c[j]-1)}$. (Project toward concave degree prefs.)

11 Learner Running Time. Limit the maximum degree so the number of parameters for the degree preference function is constant. Subgradient computation: $O(d^2)$ for SPML and $O(d^2 + c_{\max})$ for DDML. Learner reduces to the PEGASOS algorithm (Shalev-Shwartz et al. '07) on a one-class SVM: $O(\frac{1}{\lambda\epsilon})$ time for $\epsilon$-convergence.

12 Link Prediction from DDML. For concave degree preferences, the connectivity algorithm reduces to an $O(N^3)$ combinatorial algorithm (Huang & Jebara, '09). Or rank edges by degree-augmented distance. [Figure: example graph on nodes a, b, c, d; candidate edge (b, c) labeled with $D_M(b, c)$, $g(b, 2)$, $g(b, 1)$, $g(c, 3)$, $g(c, 2)$.]

13 Experiments. Wikipedia: word counts and hyperlinks. Facebook: status, gender, major, dorm, year, and friendship links [Traud et al., '11]. Randomly hold out 20% of nodes and incident edges for testing.
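The hold-out protocol can be sketched as below. This is one plausible reading of "hold out 20% of nodes and incident edges": every edge touching a held-out node goes to the test set. The function name `holdout_split`, the edge-list representation, and the seed are assumptions for illustration.

```python
import random


def holdout_split(n_nodes, edges, test_fraction=0.2, seed=0):
    """Pick test nodes at random; edges touching any test node become test edges."""
    rng = random.Random(seed)
    nodes = list(range(n_nodes))
    rng.shuffle(nodes)
    n_test = int(round(test_fraction * n_nodes))
    test_nodes = set(nodes[:n_test])
    train_edges = [(u, v) for u, v in edges
                   if u not in test_nodes and v not in test_nodes]
    test_edges = [(u, v) for u, v in edges
                  if u in test_nodes or v in test_nodes]
    return test_nodes, train_edges, test_edges
```

Splitting by node rather than by edge means the learned metric is evaluated on nodes it never saw, which matches the out-of-sample setting discussed earlier in the talk.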

14 Experiment Results. [Figure: ROC curves (true positive rate vs. false positive rate) on Philosophy Concepts for DDML, SPML, Euclidean, RTM, SVM, and Random.] [Table: AUC with columns n, m, d, Euclid., RTM, SVM, SPML, DDML. Wikipedia rows: Graph Theory, Philosophy Concepts, Search Engines, Philosophy Crawl (100k, 4m). Facebook rows: Harvard, MIT, Stanford, Columbia. Values not recoverable.]
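For reference, a minimal sketch of how an AUC figure for link prediction can be computed, assuming AUC here means the area under the ROC curve shown on the slide. Candidate edges are scored (for instance by negative learned distance, an assumption), and AUC is the probability that a true edge outscores a non-edge, with ties counted as one half. The function name `auc` and the sample numbers are illustrative only.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four candidate edges; label 1 marks a true edge.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
```

The pairwise form is quadratic in the number of candidates, so it suits small checks; a sort-based computation would be the usual choice at the scale of the larger graphs.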

15 Summary + Open Problems + Thanks! SPML: metrics consistent with structural behavior of networks. DDML: explicit degree preference functions that are feature-dependent. Linear constraints and Frobenius regularization: constant time convergence. Open problems: other applications of structure-preserving metric; more natural regularizer?; stopping criterion; large-scale prediction.
