Spectral Method and Regularized MLE Are Both Optimal for Top-K Ranking
|
|
- Pierce Parker
- 6 years ago
- Views:
Transcription
1 Spectral Method and Regularized MLE Are Both Optimal for Top-K Ranking Yuxin Chen Electrical Engineering, Princeton University Joint work with Jianqing Fan, Cong Ma and Kaizheng Wang
2 Ranking A fundamental problem in a wide range of contexts web search, recommendation systems, admissions, sports competitions, voting,... PageRank figure credit: Dzenan Hamzic Top-K ranking 2/ 21
3 Rank aggregation from pairwise comparisons pairwise comparisons for ranking top tennis players figure credit: Bozóki, Csató, Temesi Top-K ranking 3/ 21
4 Parametric models Assign latent score to each of n items w = [w 1,, w n] w i :preferencescore i: rank Top-K ranking 4/ 21
5 Parametric models Assign latent score to each of n items w = [w 1,, w n] w i :preferencescore i: rank This work: Bradley-Terry-Luce (logistic) model P {item j beats item i} = wi + w j Other models: Thurstone model, low-rank model,... Top-K ranking 4/ 21 w j
6 Typical ranking procedures Estimate latent scores rank items based on score estimates Top-K ranking 5/ 21
7 Top-K ranking Estimate latent scores rank items based on score estimates Goal: identify the set of top-k items under minimal sample size Top-K ranking 5/ 21
8 Model: random sampling Comparison graph: Erdős Rényi graph G G(n, p) For each (i, j) G, obtain L paired comparisons y (l) wj ind. 1, with prob. w i,j = i +w j 1 l L 0, else Top-K ranking 6/ 21
9 Model: random sampling Comparison graph: Erdős Rényi graph G G(n, p) For each (i, j) G, obtain L paired comparisons y i,j = 1 L L l=1 y (l) i,j (sufficient statistic) Top-K ranking 6/ 21
10 Prior art mean square error for estimating scores top-k ranking accuracy Spectral method MLE Spectral MLE Negahban et al. 12 Negahban et al. 12 Hajek et al. 14 Chen & Suh. 15 Top-K ranking 7/ 21
11 Prior art a meta metric mean square error for estimating scores top-k ranking accuracy Spectral method MLE Spectral MLE Negahban et al. 12 Negahban et al. 12 Hajek et al. 14 Chen & Suh. 15 Top-K ranking 7/ 21
12 Small l 2 loss high ranking accuracy Top-K ranking 8/ 21
13 Small l 2 loss high ranking accuracy Top-K ranking 8/ 21
14 Small l 2 loss high ranking accuracy These two estimates have same l 2 loss, but output different rankings Top-K ranking 8/ 21
15 Small l 2 loss high ranking accuracy These two estimates have same l 2 loss, but output different rankings Need to control entrywise error! Top-K ranking 8/ 21
16 Optimality? Is spectral method or MLE alone optimal for top-k ranking? Top-K ranking 9/ 21
17 Optimality? Is spectral method or MLE alone optimal for top-k ranking? Partial answer (Jang et al 16): spectral method works if comparison graph is sufficiently dense Top-K ranking 9/ 21
18 Optimality? Is spectral method or MLE alone optimal for top-k ranking? Partial answer (Jang et al 16): spectral method works if comparison graph is sufficiently dense This work: affirmative answer for both methods + entire regime }{{} inc. sparse graphs Top-K ranking 9/ 21
19 Spectral method (Rank Centrality) Negahban, Oh, Shah 12 Construct a probability transition matrix P, whose off-diagonal entries obey { yi,j, if (i, j) G P i,j 0, if (i, j) / G Return score estimate as leading left eigenvector of P Top-K ranking 10/ 21
20 Rationale behind spectral method In large-sample case, P P, whose off-diagonal entries obey wj Pi,j w, if (i, j) G i +w j 0, if (i, j) / G Stationary distribution of } reversible {{} P check detailed balance π [w 1, w 2,..., w n] }{{} true score Top-K ranking 11/ 21
21 (i,j) G Regularized MLE Negative log-likelihood L (w) := { } w i w j y j,i log + (1 y j,i ) log w i + w j w i + w j L (w) becomes convex after reparametrization: w θ = [θ 1,, θ n ], θ i = log w i Top-K ranking 12/ 21
22 (i,j) G Regularized MLE Negative log-likelihood L (w) := { } w i w j y j,i log + (1 y j,i ) log w i + w j w i + w j L (w) becomes convex after reparametrization: w θ = [θ 1,, θ n ], θ i = log w i (Regularized MLE) minimize θ L λ (θ) := L (θ) λ θ 2 2 choose λ np log n L Top-K ranking 12/ 21
23 Main result comparison graph G(n, p); sample size pn2 L Theorem 1 (Chen, Fan, Ma, Wang 17) When p & logn n, both spectral method and regularized MLE achieve optimal sample complexity for top-k ranking! Top-K ranking 13/ 21
24 mple size Jang et al 16 K 1/4 log n Main result relatively dense Our work / optimal sample size Jang 1/4 log n Our work / optimal sample size Jang et al 16 K n easible sample size n Our work / optimal sample size separation: K : score separation achievable by both methods infeasible feasible Jang et al 16 sample size Our work / optimal sample size Jan 1/4 log n separation: K : score separation w g et al 16 K w(k) n (K+1) K := : score separation kw k nking 14/ 21 Top-K ranking 13/ 21
25 Comparison with Jang et al 16 log n Jang et al 16: spectral method controls entrywise error if p n }{{} relatively dense Top-K ranking 14/ 21
26 Comparison with Jang et al 16 log n Jang et al 16: spectral method controls entrywise error if p n }{{} relatively dense sample size Our work / optimal sample size Jang et al 16 1/4 log n n K : score separation Top-K ranking 14/ 21
27 top-k ranking accuracy Empirical top-k ranking accuracy Spectral Method Regularized MLE " K : score separation n = 200, p = 0.25, L = 20 Top-K ranking 15/ 21
28 Optimal control of entrywise error w 1 w 2 w 3 w K w K+1 w 1 w 2 w 3 w K K z } { {z} w K+1 {z} < 1 2 K < 1 2 K true score score estimates z Theorem 2 Suppose p log n n and sample size n log n. Then with high prob., 2 K the estimates w returned by both methods obey (up to global scaling) w w w < 1 2 K Top-K ranking 16/ 21
29 Key ingredient: leave-one-out analysis For each 1 m n, introduce leave-one-out estimate w (m) m n m =) w (m) n y =[y i,j ] 1applei,japplen Top-K ranking 17/ 21
30 Key ingredient: leave-one-out analysis For each 1 m n, introduce leave-one-out estimate w (m) statistical independence stability Top-K ranking 17/ 21
31 Exploit statistical independence m n m =) w (m) n y =[y i,j ] 1applei,japplen leave-one-out estimate w (m) = all data related to mth item Top-K ranking 18/ 21
32 Leave-one-out stability leave-one-out estimate w (m) true estimate w Top-K ranking 19/ 21
33 Leave-one-out stability leave-one-out estimate w (m) true estimate w Spectral method: eigenvector perturbation bound π π π π(p P ) π spectral-gap new Davis-Kahan bound for probability transition matrices }{{} asymmetric Top-K ranking 19/ 21
34 Leave-one-out stability leave-one-out estimate w (m) true estimate w Spectral method: eigenvector perturbation bound π π π π(p P ) π spectral-gap new Davis-Kahan bound for probability transition matrices }{{} asymmetric MLE: local strong convexity θ θ 2 L λ (θ; ŷ) 2 strong convexity parameter Top-K ranking 19/ 21
35 A small sample of related works Parametric models Ford 57 Hunter 04 Negahban, Oh, Shah 12 Rajkumar, Agarwal 14 Hajek, Oh, Xu 14 Chen, Suh 15 Rajkumar, Agarwal 16 Jang, Kim, Suh, Oh 16 Suh, Tan, Zhao 17 Non-parametric models Shah, Wainwright 15 Shah, Balakrishnan, Guntuboyina, Wainwright 16 Chen, Gopi, Mao, Schneider 17 Leave-one-out analysis El Karoui, Bean, Bickel, Lim, Yu 13 Zhong, Boumal 17 Abbe, Fan, Wang, Zhong 17 Top-K ranking 20/ 21
36 Summary Spectral method Regularized MLE Optimal sample complexity Linear-time computational complexity Novel entrywise perturbation analysis for spectral method and convex optimization Paper: Spectral method and regularized MLE are both optimal for top-k ranking, Y. Chen, J. Fan, C. Ma, K. Wang, arxiv: , 2017 Top-K ranking 21/ 21
ELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018
ELE 538B: Mathematics of High-Dimensional Data Spectral methods Yuxin Chen Princeton University, Fall 2018 Outline A motivating application: graph clustering Distance and angles between two subspaces Eigen-space
More informationRanking from Pairwise Comparisons
Ranking from Pairwise Comparisons Sewoong Oh Department of ISE University of Illinois at Urbana-Champaign Joint work with Sahand Negahban(MIT) and Devavrat Shah(MIT) / 9 Rank Aggregation Data Algorithm
More informationRecent Advances in Ranking: Adversarial Respondents and Lower Bounds on the Bayes Risk
Recent Advances in Ranking: Adversarial Respondents and Lower Bounds on the Bayes Risk Vincent Y. F. Tan Department of Electrical and Computer Engineering, Department of Mathematics, National University
More informationInformation Recovery from Pairwise Measurements
Information Recovery from Pairwise Measurements A Shannon-Theoretic Approach Yuxin Chen, Changho Suh, Andrea Goldsmith Stanford University KAIST Page 1 Recovering data from correlation measurements A large
More informationo A set of very exciting (to me, may be others) questions at the interface of all of the above and more
Devavrat Shah Laboratory for Information and Decision Systems Department of EECS Massachusetts Institute of Technology http://web.mit.edu/devavrat/www (list of relevant references in the last set of slides)
More informationStatistical and Computational Phase Transitions in Planted Models
Statistical and Computational Phase Transitions in Planted Models Jiaming Xu Joint work with Yudong Chen (UC Berkeley) Acknowledgement: Prof. Bruce Hajek November 4, 203 Cluster/Community structure in
More informationSimple, Robust and Optimal Ranking from Pairwise Comparisons
Journal of Machine Learning Research 8 (208) -38 Submitted 4/6; Revised /7; Published 4/8 Simple, Robust and Optimal Ranking from Pairwise Comparisons Nihar B. Shah Machine Learning Department and Computer
More informationMatrix estimation by Universal Singular Value Thresholding
Matrix estimation by Universal Singular Value Thresholding Courant Institute, NYU Let us begin with an example: Suppose that we have an undirected random graph G on n vertices. Model: There is a real symmetric
More informationRobust Principal Component Analysis
ELE 538B: Mathematics of High-Dimensional Data Robust Principal Component Analysis Yuxin Chen Princeton University, Fall 2018 Disentangling sparse and low-rank matrices Suppose we are given a matrix M
More informationThe Projected Power Method: An Efficient Algorithm for Joint Alignment from Pairwise Differences
The Projected Power Method: An Efficient Algorithm for Joint Alignment from Pairwise Differences Yuxin Chen Emmanuel Candès Department of Statistics, Stanford University, Sep. 2016 Nonconvex optimization
More informationJointly Clustering Rows and Columns of Binary Matrices: Algorithms and Trade-offs
Jointly Clustering Rows and Columns of Binary Matrices: Algorithms and Trade-offs Jiaming Xu Joint work with Rui Wu, Kai Zhu, Bruce Hajek, R. Srikant, and Lei Ying University of Illinois, Urbana-Champaign
More informationBreaking the 1/ n Barrier: Faster Rates for Permutation-based Models in Polynomial Time
Proceedings of Machine Learning Research vol 75:1 6, 2018 31st Annual Conference on Learning Theory Breaking the 1/ n Barrier: Faster Rates for Permutation-based Models in Polynomial Time Cheng Mao Massachusetts
More informationStatistical ranking problems and Hodge decompositions of graphs and skew-symmetric matrices
Statistical ranking problems and Hodge decompositions of graphs and skew-symmetric matrices Lek-Heng Lim and Yuan Yao 2008 Workshop on Algorithms for Modern Massive Data Sets June 28, 2008 (Contains joint
More informationLikelihood Ratio Test in High-Dimensional Logistic Regression Is Asymptotically a Rescaled Chi-Square
Likelihood Ratio Test in High-Dimensional Logistic Regression Is Asymptotically a Rescaled Chi-Square Yuxin Chen Electrical Engineering, Princeton University Coauthors Pragya Sur Stanford Statistics Emmanuel
More informationLearning from Comparisons and Choices
Journal of Machine Learning Research 9 08-95 Submitted 0/7; Revised 7/8; Published 8/8 Sahand Negahban Statistics Department Yale University Sewoong Oh Department of Industrial and Enterprise Systems Engineering
More informationBAGUS: Bayesian Regularization for Graphical Models with Unequal Shrinkage
BAGUS: Bayesian Regularization for Graphical Models with Unequal Shrinkage Lingrui Gan, Naveen N. Narisetty, Feng Liang Department of Statistics University of Illinois at Urbana-Champaign Problem Statement
More informationMinimax-optimal Inference from Partial Rankings
Minimax-optimal Inference from Partial Rankings Bruce Hajek UIUC b-hajek@illinois.edu Sewoong Oh UIUC swoh@illinois.edu Jiaming Xu UIUC jxu8@illinois.edu Abstract This paper studies the problem of rank
More informationNonconvex Methods for Phase Retrieval
Nonconvex Methods for Phase Retrieval Vince Monardo Carnegie Mellon University March 28, 2018 Vince Monardo (CMU) 18.898 Midterm March 28, 2018 1 / 43 Paper Choice Main reference paper: Solving Random
More informationProperties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation
Properties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation Adam J. Rothman School of Statistics University of Minnesota October 8, 2014, joint work with Liliana
More informationarxiv: v1 [cs.lg] 30 Dec 2015
Simple, Robust and Optimal Ranking from Pairwise Comparisons Nihar B. Shah nihar@eecs.berkeley.edu Martin J. Wainwright, wainwrig@berkeley.edu arxiv:5.08949v [cs.lg] 30 Dec 05 Department of EECS, and Department
More informationGaussian Graphical Models and Graphical Lasso
ELE 538B: Sparsity, Structure and Inference Gaussian Graphical Models and Graphical Lasso Yuxin Chen Princeton University, Spring 2017 Multivariate Gaussians Consider a random vector x N (0, Σ) with pdf
More informationHodgeRank on Random Graphs
Outline Motivation Application Discussions School of Mathematical Sciences Peking University July 18, 2011 Outline Motivation Application Discussions 1 Motivation Crowdsourcing Ranking on Internet 2 HodgeRank
More informationEstimation of large dimensional sparse covariance matrices
Estimation of large dimensional sparse covariance matrices Department of Statistics UC, Berkeley May 5, 2009 Sample covariance matrix and its eigenvalues Data: n p matrix X n (independent identically distributed)
More informationCommunity detection in stochastic block models via spectral methods
Community detection in stochastic block models via spectral methods Laurent Massoulié (MSR-Inria Joint Centre, Inria) based on joint works with: Dan Tomozei (EPFL), Marc Lelarge (Inria), Jiaming Xu (UIUC),
More informationSYNC-RANK: ROBUST RANKING, CONSTRAINED RANKING AND RANK AGGREGATION VIA EIGENVECTOR AND SDP SYNCHRONIZATION
SYNC-RANK: ROBUST RANKING, CONSTRAINED RANKING AND RANK AGGREGATION VIA EIGENVECTOR AND SDP SYNCHRONIZATION MIHAI CUCURINGU Abstract. We consider the classic problem of establishing a statistical ranking
More informationRandom hyperplane tessellations and dimension reduction
Random hyperplane tessellations and dimension reduction Roman Vershynin University of Michigan, Department of Mathematics Phenomena in high dimensions in geometric analysis, random matrices and computational
More informationCombinatorial Hodge Theory and a Geometric Approach to Ranking
Combinatorial Hodge Theory and a Geometric Approach to Ranking Yuan Yao 2008 SIAM Annual Meeting San Diego, July 7, 2008 with Lek-Heng Lim et al. Outline 1 Ranking on networks (graphs) Netflix example
More informationCOR-OPT Seminar Reading List Sp 18
COR-OPT Seminar Reading List Sp 18 Damek Davis January 28, 2018 References [1] S. Tu, R. Boczar, M. Simchowitz, M. Soltanolkotabi, and B. Recht. Low-rank Solutions of Linear Matrix Equations via Procrustes
More informationInformation Recovery from Pairwise Measurements: A Shannon-Theoretic Approach
Information Recovery from Pairwise Measurements: A Shannon-Theoretic Approach Yuxin Chen Statistics Stanford University Email: yxchen@stanfordedu Changho Suh EE KAIST Email: chsuh@kaistackr Andrea J Goldsmith
More informationCombinatorial Types of Tropical Eigenvector
Combinatorial Types of Tropical Eigenvector arxiv:1105.55504 Ngoc Mai Tran Department of Statistics, UC Berkeley Joint work with Bernd Sturmfels 2 / 13 Tropical eigenvalues and eigenvectors Max-plus: (R,,
More informationHigh-dimensional graphical model selection: Practical and information-theoretic limits
1 High-dimensional graphical model selection: Practical and information-theoretic limits Martin Wainwright Departments of Statistics, and EECS UC Berkeley, California, USA Based on joint work with: John
More informationNew ways of dimension reduction? Cutting data sets into small pieces
New ways of dimension reduction? Cutting data sets into small pieces Roman Vershynin University of Michigan, Department of Mathematics Statistical Machine Learning Ann Arbor, June 5, 2012 Joint work with
More informationA physical model for efficient rankings in networks
A physical model for efficient rankings in networks Daniel Larremore Assistant Professor Dept. of Computer Science & BioFrontiers Institute March 5, 2018 CompleNet danlarremore.com @danlarremore The idea
More informationGoogle Page Rank Project Linear Algebra Summer 2012
Google Page Rank Project Linear Algebra Summer 2012 How does an internet search engine, like Google, work? In this project you will discover how the Page Rank algorithm works to give the most relevant
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize/navigate it? First try: Human curated Web directories Yahoo, DMOZ, LookSmart
More informationSQL-Rank: A Listwise Approach to Collaborative Ranking
SQL-Rank: A Listwise Approach to Collaborative Ranking Liwei Wu Depts of Statistics and Computer Science UC Davis ICML 18, Stockholm, Sweden July 10-15, 2017 Joint work with Cho-Jui Hsieh and James Sharpnack
More informationHigh-dimensional graphical model selection: Practical and information-theoretic limits
1 High-dimensional graphical model selection: Practical and information-theoretic limits Martin Wainwright Departments of Statistics, and EECS UC Berkeley, California, USA Based on joint work with: John
More informationApproximating a single component of the solution to a linear system
Approximating a single component of the solution to a linear system Christina E. Lee, Asuman Ozdaglar, Devavrat Shah celee@mit.edu asuman@mit.edu devavrat@mit.edu MIT LIDS 1 How do I compare to my competitors?
More informationParameter estimation in linear Gaussian covariance models
Parameter estimation in linear Gaussian covariance models Caroline Uhler (IST Austria) Joint work with Piotr Zwiernik (UC Berkeley) and Donald Richards (Penn State University) Big Data Reunion Workshop
More informationRegularized Estimation of High Dimensional Covariance Matrices. Peter Bickel. January, 2008
Regularized Estimation of High Dimensional Covariance Matrices Peter Bickel Cambridge January, 2008 With Thanks to E. Levina (Joint collaboration, slides) I. M. Johnstone (Slides) Choongsoon Bae (Slides)
More informationLink Mining PageRank. From Stanford C246
Link Mining PageRank From Stanford C246 Broad Question: How to organize the Web? First try: Human curated Web dictionaries Yahoo, DMOZ LookSmart Second try: Web Search Information Retrieval investigates
More informationSync-Rank: Robust Ranking, Constrained Ranking and Rank Aggregation via Eigenvector and SDP Synchronization
JOURNAL OF L A T E X CLASS FILES, 216 1 Sync-Rank: Robust Ranking, Constrained Ranking and Rank Aggregation via Eigenvector and SDP Synchronization Mihai Cucuringu E-mail: mihai@math.ucla.edu Abstract
More informationECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference
ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Low-rank matrix recovery via nonconvex optimization Yuejie Chi Department of Electrical and Computer Engineering Spring
More informationHigh-dimensional covariance estimation based on Gaussian graphical models
High-dimensional covariance estimation based on Gaussian graphical models Shuheng Zhou Department of Statistics, The University of Michigan, Ann Arbor IMA workshop on High Dimensional Phenomena Sept. 26,
More informationCertifying the Global Optimality of Graph Cuts via Semidefinite Programming: A Theoretic Guarantee for Spectral Clustering
Certifying the Global Optimality of Graph Cuts via Semidefinite Programming: A Theoretic Guarantee for Spectral Clustering Shuyang Ling Courant Institute of Mathematical Sciences, NYU Aug 13, 2018 Joint
More informationAlgebraic Representation of Networks
Algebraic Representation of Networks 0 1 2 1 1 0 0 1 2 0 0 1 1 1 1 1 Hiroki Sayama sayama@binghamton.edu Describing networks with matrices (1) Adjacency matrix A matrix with rows and columns labeled by
More informationLab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018
Lab 8: Measuring Graph Centrality - PageRank Monday, November 5 CompSci 531, Fall 2018 Outline Measuring Graph Centrality: Motivation Random Walks, Markov Chains, and Stationarity Distributions Google
More informationSparse Learning and Distributed PCA. Jianqing Fan
w/ control of statistical errors and computing resources Jianqing Fan Princeton University Coauthors Han Liu Qiang Sun Tong Zhang Dong Wang Kaizheng Wang Ziwei Zhu Outline Computational Resources and Statistical
More informationSUPPLEMENTARY MATERIALS TO THE PAPER: ON THE LIMITING BEHAVIOR OF PARAMETER-DEPENDENT NETWORK CENTRALITY MEASURES
SUPPLEMENTARY MATERIALS TO THE PAPER: ON THE LIMITING BEHAVIOR OF PARAMETER-DEPENDENT NETWORK CENTRALITY MEASURES MICHELE BENZI AND CHRISTINE KLYMKO Abstract This document contains details of numerical
More informationDecoupled Collaborative Ranking
Decoupled Collaborative Ranking Jun Hu, Ping Li April 24, 2017 Jun Hu, Ping Li WWW2017 April 24, 2017 1 / 36 Recommender Systems Recommendation system is an information filtering technique, which provides
More informationRanking from Crowdsourced Pairwise Comparisons via Matrix Manifold Optimization
Ranking from Crowdsourced Pairwise Comparisons via Matrix Manifold Optimization Jialin Dong ShanghaiTech University 1 Outline Introduction FourVignettes: System Model and Problem Formulation Problem Analysis
More informationCS 277: Data Mining. Mining Web Link Structure. CS 277: Data Mining Lectures Analyzing Web Link Structure Padhraic Smyth, UC Irvine
CS 277: Data Mining Mining Web Link Structure Class Presentations In-class, Tuesday and Thursday next week 2-person teams: 6 minutes, up to 6 slides, 3 minutes/slides each person 1-person teams 4 minutes,
More informationHigh-dimensional statistics: Some progress and challenges ahead
High-dimensional statistics: Some progress and challenges ahead Martin Wainwright UC Berkeley Departments of Statistics, and EECS University College, London Master Class: Lecture Joint work with: Alekh
More informationCOMMUNITY DETECTION IN SPARSE NETWORKS VIA GROTHENDIECK S INEQUALITY
COMMUNITY DETECTION IN SPARSE NETWORKS VIA GROTHENDIECK S INEQUALITY OLIVIER GUÉDON AND ROMAN VERSHYNIN Abstract. We present a simple and flexible method to prove consistency of semidefinite optimization
More informationRecovering any low-rank matrix, provably
Recovering any low-rank matrix, provably Rachel Ward University of Texas at Austin October, 2014 Joint work with Yudong Chen (U.C. Berkeley), Srinadh Bhojanapalli and Sujay Sanghavi (U.T. Austin) Matrix
More informationRank Aggregation via Hodge Theory
Rank Aggregation via Hodge Theory Lek-Heng Lim University of Chicago August 18, 2010 Joint work with Xiaoye Jiang, Yuao Yao, Yinyu Ye L.-H. Lim (Chicago) HodgeRank August 18, 2010 1 / 24 Learning a Scoring
More informationOn the Spectra of General Random Graphs
On the Spectra of General Random Graphs Fan Chung Mary Radcliffe University of California, San Diego La Jolla, CA 92093 Abstract We consider random graphs such that each edge is determined by an independent
More informationFast and Robust Phase Retrieval
Fast and Robust Phase Retrieval Aditya Viswanathan aditya@math.msu.edu CCAM Lunch Seminar Purdue University April 18 2014 0 / 27 Joint work with Yang Wang Mark Iwen Research supported in part by National
More informationEstimators based on non-convex programs: Statistical and computational guarantees
Estimators based on non-convex programs: Statistical and computational guarantees Martin Wainwright UC Berkeley Statistics and EECS Based on joint work with: Po-Ling Loh (UC Berkeley) Martin Wainwright
More informationRisk and Noise Estimation in High Dimensional Statistics via State Evolution
Risk and Noise Estimation in High Dimensional Statistics via State Evolution Mohsen Bayati Stanford University Joint work with Jose Bento, Murat Erdogdu, Marc Lelarge, and Andrea Montanari Statistical
More informationCS281 Section 4: Factor Analysis and PCA
CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we
More informationA Spectral Algorithm for Ranking using Seriation
A Spectral Algorithm for Ranking using Seriation Alex d Aspremont, CNRS & ENS, Paris. with Fajwel Fogel, Milan Vojnovic, Ecole Polytechnique, Microsoft Research. Alex d Aspremont Simons Institute, Berkeley,
More informationCalculating Web Page Authority Using the PageRank Algorithm
Jacob Miles Prystowsky and Levi Gill Math 45, Fall 2005 1 Introduction 1.1 Abstract In this document, we examine how the Google Internet search engine uses the PageRank algorithm to assign quantitatively
More informationHigh-dimensional regression:
High-dimensional regression: How to pick the objective function in high-dimension UC Berkeley March 11, 2013 Joint work with Noureddine El Karoui, Peter Bickel, Chingwhay Lim, and Bin Yu 1 / 12 Notation.
More informationActive Ranking from Pairwise Comparisons and when Parametric Assumptions Don t Help
Active Ranking from Pairwise Comparisons and when Parametric Assumptions Don t Help Reinhard Heckel Nihar B. Shah Kannan Ramchandran Martin J. Wainwright, Department of Statistics, and Department of Electrical
More informationData science with multilayer networks: Mathematical foundations and applications
Data science with multilayer networks: Mathematical foundations and applications CDSE Days University at Buffalo, State University of New York Monday April 9, 2018 Dane Taylor Assistant Professor of Mathematics
More informationPseudocode for calculating Eigenfactor TM Score and Article Influence TM Score using data from Thomson-Reuters Journal Citations Reports
Pseudocode for calculating Eigenfactor TM Score and Article Influence TM Score using data from Thomson-Reuters Journal Citations Reports Jevin West and Carl T. Bergstrom November 25, 2008 1 Overview There
More informationData Mining Techniques
Data Mining Techniques CS 622 - Section 2 - Spring 27 Pre-final Review Jan-Willem van de Meent Feedback Feedback https://goo.gl/er7eo8 (also posted on Piazza) Also, please fill out your TRACE evaluations!
More informationa Short Introduction
Collaborative Filtering in Recommender Systems: a Short Introduction Norm Matloff Dept. of Computer Science University of California, Davis matloff@cs.ucdavis.edu December 3, 2016 Abstract There is a strong
More informationGoogle PageRank. Francesco Ricci Faculty of Computer Science Free University of Bozen-Bolzano
Google PageRank Francesco Ricci Faculty of Computer Science Free University of Bozen-Bolzano fricci@unibz.it 1 Content p Linear Algebra p Matrices p Eigenvalues and eigenvectors p Markov chains p Google
More informationLecture 14: Random Walks, Local Graph Clustering, Linear Programming
CSE 521: Design and Analysis of Algorithms I Winter 2017 Lecture 14: Random Walks, Local Graph Clustering, Linear Programming Lecturer: Shayan Oveis Gharan 3/01/17 Scribe: Laura Vonessen Disclaimer: These
More informationInformation in a Two-Stage Adaptive Optimal Design
Information in a Two-Stage Adaptive Optimal Design Department of Statistics, University of Missouri Designed Experiments: Recent Advances in Methods and Applications DEMA 2011 Isaac Newton Institute for
More informationPermutation-invariant regularization of large covariance matrices. Liza Levina
Liza Levina Permutation-invariant covariance regularization 1/42 Permutation-invariant regularization of large covariance matrices Liza Levina Department of Statistics University of Michigan Joint work
More informationGraph Detection and Estimation Theory
Introduction Detection Estimation Graph Detection and Estimation Theory (and algorithms, and applications) Patrick J. Wolfe Statistics and Information Sciences Laboratory (SISL) School of Engineering and
More informationDictionary Learning Using Tensor Methods
Dictionary Learning Using Tensor Methods Anima Anandkumar U.C. Irvine Joint work with Rong Ge, Majid Janzamin and Furong Huang. Feature learning as cornerstone of ML ML Practice Feature learning as cornerstone
More informationDesigning Information Devices and Systems I Spring 2016 Elad Alon, Babak Ayazifar Homework 12
EECS 6A Designing Information Devices and Systems I Spring 06 Elad Alon, Babak Ayazifar Homework This homework is due April 6, 06, at Noon Homework process and study group Who else did you work with on
More informationLecture 1 and 2: Introduction and Graph theory basics. Spring EE 194, Networked estimation and control (Prof. Khan) January 23, 2012
Lecture 1 and 2: Introduction and Graph theory basics Spring 2012 - EE 194, Networked estimation and control (Prof. Khan) January 23, 2012 Spring 2012: EE-194-02 Networked estimation and control Schedule
More informationExtreme eigenvalues of Erdős-Rényi random graphs
Extreme eigenvalues of Erdős-Rényi random graphs Florent Benaych-Georges j.w.w. Charles Bordenave and Antti Knowles MAP5, Université Paris Descartes May 18, 2018 IPAM UCLA Inhomogeneous Erdős-Rényi random
More informationInference For High Dimensional M-estimates. Fixed Design Results
: Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and
More informationGraph Helmholtzian and Rank Learning
Graph Helmholtzian and Rank Learning Lek-Heng Lim NIPS Workshop on Algebraic Methods in Machine Learning December 2, 2008 (Joint work with Xiaoye Jiang, Yuan Yao, and Yinyu Ye) L.-H. Lim (NIPS 2008) Graph
More informationA Note on Google s PageRank
A Note on Google s PageRank According to Google, google-search on a given topic results in a listing of most relevant web pages related to the topic. Google ranks the importance of webpages according to
More informationLink Analysis Ranking
Link Analysis Ranking How do search engines decide how to rank your query results? Guess why Google ranks the query results the way it does How would you do it? Naïve ranking of query results Given query
More informationQuick Tour of Linear Algebra and Graph Theory
Quick Tour of Linear Algebra and Graph Theory CS224w: Social and Information Network Analysis Fall 2012 Yu Wayne Wu Based on Borja Pelato s version in Fall 2011 Matrices and Vectors Matrix: A rectangular
More informationThe Theory behind PageRank
The Theory behind PageRank Mauro Sozio Telecom ParisTech May 21, 2014 Mauro Sozio (LTCI TPT) The Theory behind PageRank May 21, 2014 1 / 19 A Crash Course on Discrete Probability Events and Probability
More informationGraph Partitioning Using Random Walks
Graph Partitioning Using Random Walks A Convex Optimization Perspective Lorenzo Orecchia Computer Science Why Spectral Algorithms for Graph Problems in practice? Simple to implement Can exploit very efficient
More informationReconstruction in the Generalized Stochastic Block Model
Reconstruction in the Generalized Stochastic Block Model Marc Lelarge 1 Laurent Massoulié 2 Jiaming Xu 3 1 INRIA-ENS 2 INRIA-Microsoft Research Joint Centre 3 University of Illinois, Urbana-Champaign GDR
More informationCollaborative topic models: motivations cont
Collaborative topic models: motivations cont Two topics: machine learning social network analysis Two people: " boy Two articles: article A! girl article B Preferences: The boy likes A and B --- no problem.
More informationScaling Neighbourhood Methods
Quick Recap Scaling Neighbourhood Methods Collaborative Filtering m = #items n = #users Complexity : m * m * n Comparative Scale of Signals ~50 M users ~25 M items Explicit Ratings ~ O(1M) (1 per billion)
More informationhttps://goo.gl/kfxweg KYOTO UNIVERSITY Statistical Machine Learning Theory Sparsity Hisashi Kashima kashima@i.kyoto-u.ac.jp DEPARTMENT OF INTELLIGENCE SCIENCE AND TECHNOLOGY 1 KYOTO UNIVERSITY Topics:
More informationApproximation Algorithms
Approximation Algorithms Chapter 26 Semidefinite Programming Zacharias Pitouras 1 Introduction LP place a good lower bound on OPT for NP-hard problems Are there other ways of doing this? Vector programs
More informationWeighted Least Squares I
Weighted Least Squares I for i = 1, 2,..., n we have, see [1, Bradley], data: Y i x i i.n.i.d f(y i θ i ), where θ i = E(Y i x i ) co-variates: x i = (x i1, x i2,..., x ip ) T let X n p be the matrix of
More informationSpectral Graph Theory and You: Matrix Tree Theorem and Centrality Metrics
Spectral Graph Theory and You: and Centrality Metrics Jonathan Gootenberg March 11, 2013 1 / 19 Outline of Topics 1 Motivation Basics of Spectral Graph Theory Understanding the characteristic polynomial
More informationNon-convex Robust PCA: Provable Bounds
Non-convex Robust PCA: Provable Bounds Anima Anandkumar U.C. Irvine Joint work with Praneeth Netrapalli, U.N. Niranjan, Prateek Jain and Sujay Sanghavi. Learning with Big Data High Dimensional Regime Missing
More informationPageRank. Ryan Tibshirani /36-662: Data Mining. January Optional reading: ESL 14.10
PageRank Ryan Tibshirani 36-462/36-662: Data Mining January 24 2012 Optional reading: ESL 14.10 1 Information retrieval with the web Last time we learned about information retrieval. We learned how to
More informationComputational and Statistical Aspects of Statistical Machine Learning. John Lafferty Department of Statistics Retreat Gleacher Center
Computational and Statistical Aspects of Statistical Machine Learning John Lafferty Department of Statistics Retreat Gleacher Center Outline Modern nonparametric inference for high dimensional data Nonparametric
More informationOn pairwise comparison matrices that can be made consistent by the modification of a few elements
Noname manuscript No. (will be inserted by the editor) On pairwise comparison matrices that can be made consistent by the modification of a few elements Sándor Bozóki 1,2 János Fülöp 1,3 Attila Poesz 2
More informationHigh dimensional Ising model selection
High dimensional Ising model selection Pradeep Ravikumar UT Austin (based on work with John Lafferty, Martin Wainwright) Sparse Ising model US Senate 109th Congress Banerjee et al, 2008 Estimate a sparse
More informationStatistical Analysis of Online News
Statistical Analysis of Online News Laurent El Ghaoui EECS, UC Berkeley Information Systems Colloquium, Stanford University November 29, 2007 Collaborators Joint work with and UCB students: Alexandre d
More informationLecture III & IV: The Mathemaics of Data
Lecture III & IV: The Mathemaics of Data Rank aggregation via Hodge theory and nuclear norm minimization Lek-Heng Lim University of California, Berkeley December 22, 2009 Joint work with David Gleich,
More informationStatistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number of Clusters and Submatrices
Journal of Machine Learning Research 17 (2016) 1-57 Submitted 8/14; Revised 5/15; Published 4/16 Statistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number
More information