Recent Advances in Ranking: Adversarial Respondents and Lower Bounds on the Bayes Risk


1 Recent Advances in Ranking: Adversarial Respondents and Lower Bounds on the Bayes Risk
Vincent Y. F. Tan
Department of Electrical and Computer Engineering and Department of Mathematics, National University of Singapore
McMaster University, June 15, 2018

2 Outline
1 Introduction to Statistical Models for Ranking
2 Fundamental Limits of Top-K Ranking with Adversaries
3 Lower Bounds on the Bayes Risk of a Bayesian BTL Model

7 Ranking
A fundamental problem in a wide range of contexts.
Applications: web search, recommendation systems, social choice, sports competitions, voting, etc.
Many efforts have been devoted to developing ranking algorithms.
A variety of statistical models has been introduced for evaluating ranking algorithms.

9 Ranking: An Example and Difficulties
Example: Web search.
$n = 10^9$ websites $\Rightarrow \binom{n}{2} \approx n^2/2 = 5 \times 10^{17}$ comparisons.
Do we really need $\Theta(n^2)$ comparisons?

13 Large-Scale Ranking
Suppose that we want a total ordering and pairwise comparisons are provided at random (probabilistically). This indeed requires $\Theta(n^2)$ comparisons:
there is no way to identify the ordering between items 1 and 2 without a direct comparison, i.e., that comparison must be made w.p. 1.
The situation is worse with noisy data.
We adopt a Shannon-theoretic approach in our analyses.

15 Top-K Ranking Usually Suffices

17 Statistical Model for Top-K Ranking: Part I
Adopt the Bradley-Terry-Luce (BTL) model, in which there is an underlying unknown score vector $w = (w_1, \ldots, w_n) \in \mathbb{R}^n_{++}$, where $w_i$ is the likeability of movie $i$.
Decide which items to compare via a comparison graph.

21 Statistical Model for Top-K Ranking: Part II
The outcome of the comparison between items 1 and 2 is
$$Y_{12} = \mathbb{I}\{\text{item } 1 \succ \text{item } 2\} \sim \mathrm{Bern}\Big(\frac{w_1}{w_1 + w_2}\Big).$$
E.g., if $w_1 = 0.9$ and $w_2 = 0.1$, then item 1 beats item 2 w.p. 90%.
We have $L$ independent copies $Y_{ij}^{(1)}, \ldots, Y_{ij}^{(L)}$ for each observed edge $\{i, j\} \in \mathcal{E}$ of the observation graph.
Goal: determine fundamental limits on $L$ (as a function of $n$ and other parameters) so that recovery of the top-$K$ set is successful.
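The data-generating process just described is easy to simulate. The following minimal sketch (not from the talk; all names and parameter choices are illustrative) samples $L$ comparisons per edge of an Erdős-Rényi observation graph under the BTL model.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_btl(w, p, L, rng):
    """Sample L pairwise comparisons per edge of an Erdos-Renyi
    observation graph under the BTL model. Returns a dict mapping each
    observed edge (i, j), i < j, to L Bernoulli outcomes (1 = i beat j)."""
    n = len(w)
    data = {}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:            # edge {i, j} observed w.p. p
                q = w[i] / (w[i] + w[j])    # BTL win probability of i over j
                data[(i, j)] = (rng.random(L) < q).astype(int)
    return data

# Illustrative parameters: p >~ log(n)/n keeps the graph connected w.h.p.
n, L = 50, 20
w = 0.5 ** np.arange(n)                     # hypothetical score vector
data = simulate_btl(w, p=5 * np.log(n) / n, L=L, rng=rng)
print(len(data), "of", n * (n - 1) // 2, "pairs observed")
```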

22 Outline (Part 2: Fundamental Limits of Top-K Ranking with Adversaries)

24 Top-K Ranking with Adversaries
Joint work with Changho Suh (KAIST) and Renbo Zhao (NUS).
C. Suh, VYFT and R. Zhao, Adversarial Top-K Ranking, IEEE Trans. on Inf. Theory, Apr 2017

30 Ranking with Adversaries: Crowdsourced Setting
A faithful respondent reports
$$Y_{ij} \sim \mathrm{Bern}\Big(\frac{w_i}{w_i + w_j}\Big).$$
Spammers provide answers in an adversarial manner:
$$Y_{ij} \sim \mathrm{Bern}\Big(\frac{1/w_i}{1/w_i + 1/w_j}\Big) = \mathrm{Bern}\Big(\frac{w_j}{w_i + w_j}\Big).$$
Mixing the two populations, with η the fraction of non-adversaries:
$$Y_{ij} \sim \mathrm{Bern}\Big(\eta\, \frac{w_i}{w_i + w_j} + (1 - \eta)\, \frac{w_j}{w_i + w_j}\Big).$$
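This mixed response distribution is straightforward to simulate; a small illustrative sketch (not from the talk):

```python
import numpy as np

def sample_adversarial(w_i, w_j, eta, L, rng):
    """Sample L comparisons of items i and j in the adversarial BTL
    model: each response is faithful w.p. eta and adversarial w.p.
    1 - eta, giving the mixed win probability on the slide above."""
    q = w_i / (w_i + w_j)                    # faithful win probability
    q_mix = eta * q + (1 - eta) * (1 - q)    # 1 - q = w_j / (w_i + w_j)
    return (rng.random(L) < q_mix).astype(int)

rng = np.random.default_rng(1)
y = sample_adversarial(0.9, 0.1, eta=0.8, L=100_000, rng=rng)
print(y.mean())   # ~ 0.8 * 0.9 + 0.2 * 0.1 = 0.74
```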

34 Related Work: Crowdsourced BTL [Chen et al. 13]
Given an observed pair, each sample has a different distribution:
$$Y_{ij}^{(l)} \sim \mathrm{Bern}\Big(\eta_l\, \frac{w_i}{w_i + w_j} + (1 - \eta_l)\, \frac{1/w_i}{1/w_i + 1/w_j}\Big),$$
where $\eta_l$ is a quality parameter of measurement $l$.
This subsumes our adversarial BTL model as a special case, when all quality parameters are the same.
The authors developed a ranking algorithm, but without theoretical guarantees.
The model is more difficult to analyze as there are many more parameters.
1 X. Chen, P. N. Bennett, K. Collins-Thompson, and E. Horvitz, Pairwise ranking aggregation in a crowdsourced setting, in WSDM, 2013

36 Goal of Adversarial Top-K Ranking
Erdős-Rényi comparison graph: each pair of items is compared independently with probability $p$.

40 Contribution #1: 1/2 < η < 1 known
η = fraction of non-adversaries; $\Delta_K := (w_K - w_{K+1})/w_{\max}$ is the score separation.
[Plot: sample complexity as a function of $\Delta_K$.]
Our sample complexity characterization is minimax optimal.
It is achieved by a nearly-linear time algorithm.
The case η = 1 was studied by Chen and Suh (2015).
2 Y. Chen and C. Suh, Spectral MLE: Top-K rank aggregation from pairwise comparisons, in ICML 2015

41 Contribution #1: 1/2 < η < 1 known - Experimental Results for $n = 1000$ and $K = 10$
[Plot of the empirical sample complexity $\hat{S} = \binom{n}{2}\, \hat{p}\, L_{\mathrm{ave}}$ and its normalized version $\hat{S}_{\mathrm{norm}} = \hat{S} \big/ \frac{n \log n}{\Delta_K^2}$.]

45 Contribution #2: 1/2 < η < 1 unknown
η = fraction of non-adversaries.
[Plot: sample complexity as a function of $\Delta_K$, showing a regime handled by a polynomial time algorithm, an infeasible regime, and a gap marked ?? in between.]

49 Optimality
Minimax optimality: construct worst-case score vectors.
Translation to M-ary hypothesis testing: construction of multiple hypotheses.
Information-theoretic ideas applied to statistical learning.

53 Optimality: Tools
Construction of $M := \min\{K, n - K\} + 1 \lesssim n/2$ hypotheses:
$$\Pr(\sigma([K]) = S) = \frac{1}{M}, \quad \text{for } S = \{2, \ldots, K\} \cup \{i\},\ i = 1, K + 1, \ldots, n.$$
Bound the mutual information between the permutation $\sigma$ and $Z$, the erased version of the samples $Y_{ij}^{(l)}$:
$$I(\sigma; Z) \leq \frac{p}{M^2} \sum_{\sigma_1, \sigma_2 \in \mathcal{M}} \sum_{l=1}^{L} \sum_{i<j} D\Big(P_{Y_{ij}^{(l)} \mid \sigma_1} \,\Big\|\, P_{Y_{ij}^{(l)} \mid \sigma_2}\Big).$$
Bound the divergence using the reverse Pinsker inequality; here is where $\Delta_K$ comes in:
$$\sum_{i<j} D\Big(P_{Y_{ij}^{(l)} \mid \sigma_1} \,\Big\|\, P_{Y_{ij}^{(l)} \mid \sigma_2}\Big) \lesssim n\, (2\eta - 1)^2\, \Delta_K^2.$$
Finally, apply Fano's inequality.
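Not on the slide, but for intuition the three steps chain together as follows (constants suppressed; a sketch consistent with the threshold on the summary slide below). Fano's inequality gives $P_e \geq 1 - \frac{I(\sigma; Z) + \log 2}{\log M}$ with $M \asymp n$ hypotheses, and combining the two displayed bounds gives $I(\sigma; Z) \lesssim p L\, n (2\eta - 1)^2 \Delta_K^2$; hence reliable recovery forces
$$p L\, n\, (2\eta - 1)^2\, \Delta_K^2 \gtrsim \log n \quad\Longleftrightarrow\quad \binom{n}{2}\, p\, L \gtrsim \frac{n \log n}{(2\eta - 1)^2\, \Delta_K^2}.$$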

56 Ranking Algorithm for η Known: Part I
Scores determine the ranking.
Adopt a two-step approach.

58 Ranking Algorithm for η Known: Part II
Key Message: Small MSE $\Longrightarrow$ Small $\ell_\infty$ Error of $\hat{w}$ $\Longrightarrow$ High Ranking Accuracy

64 How to ensure small MSE for η = 1?
Recall η = 1 (no adversaries).
$L$ independent copies $Y_{ij}^{(1)}, Y_{ij}^{(2)}, \ldots, Y_{ij}^{(L)}$.
As $L$ grows, the empirical frequencies converge:
$$\frac{1}{L} \sum_{l=1}^{L} Y_{ij}^{(l)} \longrightarrow \frac{w_i}{w_i + w_j}.$$
Detailed balance equation:
$$\pi_i\, \frac{w_j}{w_i + w_j} = \pi_j\, \frac{w_i}{w_i + w_j},$$
where $\pi := [\pi_1, \pi_2, \ldots, \pi_n]$ is the stationary distribution of the chain.
The stationary distribution converges to $w$ (up to constant scaling), i.e., $\lim_{L \to \infty} \pi(L) = \alpha w$.
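A sketch of the resulting spectral step, in the spirit of Rank Centrality (not verbatim from the talk; the matrix Ybar of empirical win frequencies and the normalization d_max are illustrative choices):

```python
import numpy as np

def stationary_scores(Ybar, d_max, iters=10_000):
    """Build a Markov chain whose transition probability i -> j is
    Ybar[j, i] / d_max (the empirical frequency with which j beats i)
    and return its stationary distribution, which by the detailed
    balance equation on the slide is proportional to w.

    Ybar[i, j]: empirical fraction of comparisons i wins over j
    (0 for unobserved pairs); d_max: at least the max graph degree."""
    n = Ybar.shape[0]
    P = Ybar.T / d_max                         # off-diagonal transitions
    np.fill_diagonal(P, 0.0)
    np.fill_diagonal(P, 1.0 - P.sum(axis=1))   # lazy self-loops
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):                     # power iteration
        pi = pi @ P
    return pi / pi.sum()
```

Since the transition probability from $i$ to $j$ is proportional to the frequency with which $j$ beats $i$, detailed balance pins the stationary distribution to $\pi \propto w$, exactly as on the slide.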

69 How to ensure small MSE for η ∈ (1/2, 1]?
Arbitrary η ∈ (1/2, 1] (adversaries present).
$L$ independent copies $Y_{ij}^{(1)}, Y_{ij}^{(2)}, \ldots, Y_{ij}^{(L)}$.
Redefine the Markov chain: we instead have the convergence
$$\frac{1}{L} \sum_{l=1}^{L} Y_{ij}^{(l)} \longrightarrow \eta\, \frac{w_i}{w_i + w_j} + (1 - \eta)\, \frac{w_j}{w_i + w_j} = (2\eta - 1)\, \frac{w_i}{w_i + w_j} + (1 - \eta).$$
Redefine shifted samples with range scaled by $2\eta - 1$:
$$\tilde{Y}_{ij} = \frac{1}{2\eta - 1} \left[ \frac{1}{L} \sum_{l=1}^{L} Y_{ij}^{(l)} - (1 - \eta) \right] \longrightarrow \frac{w_i}{w_i + w_j}.$$
Construct the Markov chain with transition probabilities $\{\tilde{Y}_{ij}\}$.
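In code, the debiasing step is one line (a sketch under the same illustrative names as above):

```python
import numpy as np

def debias(Ybar, eta):
    """Shift and rescale empirical win frequencies so that, for a known
    fraction eta of faithful respondents, the result converges to
    w_i / (w_i + w_j) as L grows (the tilde-Y of the slide above)."""
    return (Ybar - (1.0 - eta)) / (2.0 * eta - 1.0)
```

The debiased matrix can then be fed into the stationary-distribution computation sketched after slide 64.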

71 Ranking Algorithm for η Known: Summary
Using several concentration inequalities (Hoeffding, Bernstein, Tropp, etc.), we can show that
$$\text{sample size} = \binom{n}{2}\, p\, L \gtrsim \frac{n \log n}{(2\eta - 1)^2\, \Delta_K^2} \;\Longrightarrow\; \text{Feasible Top-}K\text{ Ranking}.$$

75 What if η is unknown?
The adversarial BTL model is a mixture model.
Obtaining global optimality guarantees for mixture model problems is difficult in general.
Recent developments (tensor methods): Jain and Oh3 and Anandkumar et al.4 Key idea: exact 2nd and 3rd moments yield sufficient statistics.
Our setting: we can obtain estimates of the 2nd and 3rd moments $\Longrightarrow$ we can estimate η.
3 P. Jain and S. Oh, Learning mixtures of discrete product distributions using spectral decompositions, in COLT, 2014
4 A. Anandkumar, R. Ge, D. Hsu, S. M. Kakade, and M. Telgarsky, Tensor decompositions for learning latent variable models, JMLR, 2014

79 Unknown η: High-Level Algorithm: Part I
1 Turn the weights into distribution vectors
$$\pi_0 = \Big[ \frac{w_i}{w_i + w_j}\;\; \frac{w_j}{w_i + w_j}\;\; \cdots \Big]^T,$$
with $\pi_1$ the analogous vector for adversarial answers (the roles of $w_i$ and $w_j$ swapped).
2 Estimate moments. The ground-truth moment matrix and tensor are
$$M_2 := \eta\, \pi_0 \otimes \pi_0 + (1 - \eta)\, \pi_1 \otimes \pi_1, \qquad M_3 := \eta\, \pi_0 \otimes \pi_0 \otimes \pi_0 + (1 - \eta)\, \pi_1 \otimes \pi_1 \otimes \pi_1.$$
3 Solve a least squares problem (a Frobenius-norm fit of a matrix $Z \in \mathbb{R}^{I_2 \times I_2}$ to the estimated moments $\hat{M}_2$ and the third-moment samples $Y^{(t)}$) to obtain $\hat{G}$.
4 Find the leading eigenvalue $\lambda_1(\hat{G})$ of $\hat{G}$, which is related to η as $\hat{\eta} = \lambda_1(\hat{G})^2$.
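The tensor-method idea is easiest to see on a clean synthetic instance. The sketch below is my illustration, not the paper's algorithm: it replaces the least-squares step with direct whitening and uses generic three-view data, recovering the mixing weight η from empirical second and third moments via the whitening-plus-tensor-power-method recipe of Anandkumar et al.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in: a two-component mixture of product distributions
# with three conditionally i.i.d. one-hot views per sample. All sizes,
# names and the mixing weight below are illustrative.
d, N, eta = 6, 200_000, 0.8
mu = np.stack([rng.dirichlet(np.ones(d)) for _ in range(2)])  # mu_0, mu_1
h = (rng.random(N) >= eta).astype(int)       # component 0 w.p. eta

def one_hot_rows(P):
    """One categorical draw per row of the probability matrix P."""
    u = rng.random(len(P))[:, None]
    c = (u > np.cumsum(P, axis=1)).sum(axis=1)
    X = np.zeros(P.shape)
    X[np.arange(len(P)), c] = 1.0
    return X

X1, X2, X3 = one_hot_rows(mu[h]), one_hot_rows(mu[h]), one_hot_rows(mu[h])

# Empirical 2nd/3rd moments: M2 ~ eta mu0 mu0^T + (1-eta) mu1 mu1^T, etc.
M2 = X1.T @ X2 / N
M3 = np.einsum('ni,nj,nk->ijk', X1, X2, X3, optimize=True) / N

# Whiten with the top-2 eigenspace of M2, then run the tensor power
# method with deflation on the 2x2x2 whitened tensor: its robust
# eigenvalues are lambda_i = weight_i^(-1/2).
vals, vecs = np.linalg.eigh((M2 + M2.T) / 2)
W = vecs[:, -2:] / np.sqrt(vals[-2:])        # W^T M2 W ~ I_2
T = np.einsum('ijk,ia,jb,kc->abc', M3, W, W, W, optimize=True)

def tensor_power(T, iters=500):
    v = rng.standard_normal(T.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = np.einsum('abc,b,c->a', T, v, v)
        v /= np.linalg.norm(v)
    return np.einsum('abc,a,b,c->', T, v, v, v), v

lam1, v1 = tensor_power(T)
lam2, _ = tensor_power(T - lam1 * np.einsum('a,b,c->abc', v1, v1, v1))
weights = sorted([1 / lam1**2, 1 / lam2**2])
print("estimated eta ~", weights[-1])        # close to 0.8
```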

81 Unknown η: High-Level Algorithm: Part II
How does the quality of the estimation of η affect the overall sample complexity?

85 Tradeoff Between $|\hat{\eta} - \eta|$ and Sample Complexity
With a very careful analysis, we can derive a meta-lemma: to ensure $|\hat{\eta} - \eta| \leq \epsilon$, it suffices that
$$\text{sample size} = \binom{n}{2}\, p\, L \gtrsim \frac{n \log^2 n}{\epsilon^2}.$$
This implies a tradeoff:
$\hat{\eta} \approx \eta$ implies $\hat{w} \approx w$, but the sample size goes up;
a cruder $\hat{\eta}$ lets the sample size go down, but then $\hat{w}$ drifts away from $w$.
Find a sweet spot to show that
$$\text{sample size} \gtrsim \frac{n \log^2 n}{(2\eta - 1)^4\, \Delta_K^4} \;\Longrightarrow\; \text{Feasible Top-}K\text{ ranking}.$$

90 Conclusion for Adversarial Top-K Ranking
Explored a top-K ranking problem in an adversarial setting.
Characterized the exact order-wise optimal sample complexity for the η-known case.
Established an upper bound on the sample complexity for the η-unknown case.
Developed computationally efficient algorithms for both cases (using state-of-the-art tensor methods for the η-unknown case).
C. Suh, VYFT and R. Zhao, Adversarial Top-K Ranking, IEEE Trans. on Inf. Theory, Apr 2017

91 Outline (Part 3: Lower Bounds on the Bayes Risk of a Bayesian BTL Model)

93 Lower Bounds on the Risk of a Bayesian BTL Model
Joint work with Mine Alsan (NUS) and Ranjitha Prasad (TCS Innovation Labs, Delhi).
M. Alsan, R. Prasad and VYFT, Lower Bounds on the Bayes Risk of the Bayesian BTL Model with Applications to Comparison Graphs, IEEE J. on Sel. Topics of Sig. Proc., Oct 2018

96 Summary of Contributions
We study the fundamental performance limits of ranking algorithms in the Bradley-Terry-Luce model within a Bayesian framework:
1 Derive lower bounds on the Bayes risk of estimators:
- a family of information-theoretic lower bounds for norm-based distortion functions $\|\cdot\|_r^r$, for any $r \geq 1$;
- the Bayesian Cramér-Rao bound for the MSE, i.e., $r = 2$.
2 Explore optimal comparison graph structures to design experiments minimizing distortion.

100 Recall the BTL Model
BTL model: to each item $i \in [n]$, associate a skill parameter $w_i \in \mathbb{R}_{++}$ s.t.
$$P_{ij} := \Pr[\text{item } i \succ \text{item } j] = \frac{w_i}{w_i + w_j}.$$
Instead of top-$K$ ranking, we now want to estimate the vector $w := (w_1, \ldots, w_n) \in \mathbb{R}^n_{++}$.
Given $m = \sum_{(i,j): i \neq j} m_{ij} \in \mathbb{N}$ independent pairwise comparisons, we count:
1 $m_{ij}$: number of pairwise comparisons between items $i$ and $j$,
2 $b_{ij}$: number of comparisons in which $i$ is preferred over $j$.
We write $M := \{m_{ij}\} \in \mathbb{N}^{n \times n}$ and $B := \{b_{ij}\} \in \mathbb{N}^{n \times n}$.

102 Induced Probabilities by the BTL Model
We assume that the matrix $M = \{m_{ij}\}$ is fixed a priori. The BTL model induces the following distributions:
1 For fixed $m_{ij}$: $p(b_{ij} \mid w_i, w_j) = \mathrm{Bin}(b_{ij}; m_{ij}, P_{ij})$.
2 For fixed $M$: $p(B \mid w) = \prod_{(i,j): i < j} \mathrm{Bin}(b_{ij}; m_{ij}, P_{ij})$.

105 Bayesian BTL Model
Adopt the Bayesian BTL framework of Caron & Doucet5:
1 Prior distribution: assign $p(w_i) = \mathrm{Gam}(w_i; \alpha_i, \beta_i)$ to each item $i \in [n]$, where $\alpha := \{\alpha_i\}_{i=1}^n, \beta := \{\beta_i\}_{i=1}^n \in \mathbb{R}^n_{++}$.
2 Latent random variables: introduce $Z := \{Z_{ij}\} \in \mathbb{R}^{n \times n}$ with
$$Z_{ij} = Z_{ji} := \sum_{s=1}^{m_{ij}} \min\{Y_{si}, Y_{sj}\}, \quad \text{for } i, j \in [n] \text{ such that } i < j,$$
where $Y_{si} \sim \mathrm{Exp}(w_i)$ and $Y_{sj} \sim \mathrm{Exp}(w_j)$ are such that $P_{ij} = \Pr[Y_{si} < Y_{sj}]$.
This is known as the Thurstonian interpretation of the BTL model.
5 F. Caron and A. Doucet, Efficient Bayesian Inference for Generalized Bradley-Terry Models, in JCGS, 2012

109 Induced Probabilities by the Bayesian BTL Model
1 Prior:
$$p(w) = \prod_{i=1}^{n} p(w_i) = \prod_{i=1}^{n} \mathrm{Gam}(w_i; \alpha_i, \beta_i).$$
2 Prior $\times$ Likelihood:
$$p(w, B) = p(w)\, p(B \mid w) = \prod_{i=1}^{n} \mathrm{Gam}(w_i; \alpha_i, \beta_i) \prod_{i<j} \mathrm{Bin}\Big(b_{ij}; m_{ij}, \frac{w_i}{w_i + w_j}\Big).$$
3 Latent variable:
$$p(Z_{ij} \mid w_i, w_j) = \mathrm{Gam}(Z_{ij}; m_{ij}, w_i + w_j).$$
4 Posterior:
$$p(w \mid B, Z) = \prod_{i=1}^{n} \mathrm{Gam}(w_i; \alpha_i + b_i, \beta_i + Z_i), \quad \text{where } b_i := \sum_{j \neq i} b_{ij} \text{ and } Z_i := \sum_{j \neq i} Z_{ij}.$$
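The conjugacy in items 3 and 4 immediately yields a two-block Gibbs sampler of the kind Caron & Doucet exploit. A minimal sketch (illustrative code, not theirs):

```python
import numpy as np

rng = np.random.default_rng(3)

def gibbs_btl(Mcmp, B, alpha, beta, iters=2000, burn=500):
    """Gibbs sampler for the Bayesian BTL model, alternating the two
    conjugate conditionals on this slide:
        Z_ij | w     ~ Gam(m_ij, w_i + w_j)
        w_i  | B, Z  ~ Gam(alpha_i + b_i, beta_i + Z_i)
    Mcmp[i, j] = m_ij (symmetric), B[i, j] = b_ij. Returns the
    posterior mean of w. (numpy's gamma takes a scale parameter,
    hence the 1/rate arguments.)"""
    b = B.sum(axis=1)                        # b_i = sum_{j != i} b_ij
    w = alpha / beta                         # start at the prior mean
    samples = []
    for t in range(iters):
        rate = w[:, None] + w[None, :]       # w_i + w_j
        Z = np.where(Mcmp > 0,
                     rng.gamma(np.maximum(Mcmp, 1e-12), 1.0 / rate), 0.0)
        Z = np.triu(Z, 1)
        Z = Z + Z.T                          # enforce Z_ij = Z_ji
        w = rng.gamma(alpha + b, 1.0 / (beta + Z.sum(axis=1)))
        if t >= burn:
            samples.append(w)
    return np.mean(samples, axis=0)

# Tiny illustration: 3 items, 10 games per pair, true scores (3, 2, 1).
n = 3
Mcmp = 10.0 * (np.ones((n, n)) - np.eye(n))
w_true = np.array([3.0, 2.0, 1.0])
P = w_true[:, None] / (w_true[:, None] + w_true[None, :])
Bu = np.triu(rng.binomial(Mcmp.astype(int), P), 1)   # b_ij for i < j
B = Bu + np.tril(Mcmp - Bu.T, -1)                    # b_ji = m_ij - b_ij
print(gibbs_btl(Mcmp, B, alpha=np.ones(n), beta=np.ones(n)))
```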

113 Bayes Risk
For any $r \geq 1$, we define a family of Bayes risks for estimating $w$:
1 from only $B$:
$$R_B := \inf_{\phi} \mathbb{E}\big[ \| w - \phi(B) \|_r^r \big],$$
where $\phi(B)$ is an estimator of $w$;
2 from $B$ and the latent variable $Z$:
$$R_B^* := \inf_{\phi^*} \mathbb{E}\big[ \| w - \phi^*(B, Z) \|_r^r \big],$$
where $\phi^*(B, Z)$ is an estimator of $w$.
Clearly $R_B^* \leq R_B$.

119 Bayesian Network of All Variables
[Graphical model relating all of the variables:]
$w_i \sim \mathrm{Gam}(w_i; \alpha_i, \beta_i)$: prior on $w_i$.
$P_{ij} = \frac{w_i}{w_i + w_j}$: BTL model.
$Y_{si} \sim \mathrm{Exp}(w_i)$: latent arrival times.
$b_{ij} \sim \mathrm{Bin}(b_{ij}; m_{ij}, P_{ij})$: number of times $i$ beats $j$ out of $m_{ij}$ games.
$Z_{ij} = \sum_{s=1}^{m_{ij}} \min\{Y_{si}, Y_{sj}\}$: latent variables.
$\phi(B)$ and $\phi^*(B, Z)$: functions used to estimate $w$.

124 General Lower Bounds on the Bayes Risk
For $r = 2$, one can compute the Bayesian Cramér-Rao bound on $R_B^*$.
We compute a family of information-theoretic lower bounds:
1 Theorem 3 of Xu and Raginsky6 reads: for any $r \geq 1$,
$$R_B^* \geq \frac{n}{re} \Big( V_n\, \Gamma\Big(1 + \frac{n}{r}\Big) \Big)^{-r/n} \exp\Big[ -\frac{r}{n} \big( I(w; B, Z) - h(w) \big) \Big],$$
where $V_n$ is the volume of the unit ball in $(\mathbb{R}^n, \|\cdot\|_r)$.
2 Using Stirling's approximation, we upper bound $I(w; B, Z) - h(w) = \mathbb{E}[\log p(w \mid B, Z)]$.
6 A. Xu and M. Raginsky, Information-Theoretic Lower Bounds on Bayes Risk in Decentralized Estimation, in IEEE Trans. on Inf. Theory, 2017

127 Family of Information-Theoretic Lower Bounds
Theorem. For all $i \in [n]$, let
$$m_i := \frac{1}{2} \sum_{j \neq i} m_{ij} \qquad \text{(half the total number of games $i$ plays).}$$
Then the Bayes risk is asymptotically lower bounded by
$$R_B^* \geq \frac{n}{re} \Big( V_n\, \Gamma\Big(1 + \frac{n}{r}\Big) \Big)^{-r/n} \exp\big[ -\tfrac{r}{n}\, E(B, \alpha, \beta) \big],$$
where
$$E(B, \alpha, \beta) := \sum_{i=1}^{n} \Big( -\frac{1}{2} \log(2\pi) + \log \beta_i - \psi(\alpha_i) + \frac{1}{2} \log(\alpha_i + m_i) \Big).$$

130 Information-Theoretic Lower Bounds
Take $\alpha_i = \alpha$ and $\beta_i = \beta$ for each $i \in [n]$, and let $\bar{m} = m/n$ denote the per-item number of games.
For the $L_1$ norm ($r = 1$):
$$R_B \geq \sqrt{\frac{\pi}{2}}\, \exp\big[ -(\log \beta - \psi(\alpha) + 1) \big]\, \frac{n}{\sqrt{\alpha + \bar{m}}}.$$
For the squared $L_2$ norm ($r = 2$):
$$R_B \geq \exp\big[ -2(\log \beta - \psi(\alpha)) - 1 \big]\, \frac{n}{\alpha + \bar{m}}.$$
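These two closed forms are trivial to evaluate numerically. The snippet below (illustrative, and assuming the reconstructed forms above with equal per-item allocation $\bar{m} = m/n$) reproduces the qualitative decay seen in the figures that follow:

```python
import numpy as np
from scipy.special import digamma

def it_lower_bounds(n, alpha, beta, m):
    """Evaluate the two specialized lower bounds on this slide as a
    function of the total number of games m, assuming equal per-item
    allocation m_bar = m / n."""
    m_bar = m / n
    c = np.log(beta) - digamma(alpha)
    l1 = np.sqrt(np.pi / 2) * np.exp(-(c + 1)) * n / np.sqrt(alpha + m_bar)
    l2 = np.exp(-2 * c - 1) * n / (alpha + m_bar)
    return l1, l2

# Parameters matching the figures below: n = 100, alpha = 5, beta = alpha*n - 1.
for m in (10**3, 10**4, 10**5):
    print(m, it_lower_bounds(100, 5.0, 5.0 * 100 - 1, m))
```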

131 Performance of Lower Bounds: $L_1$ error
[Figure: $L_1$ error of the EM algorithm vs. the information-theoretic lower bound, as a function of sample size, for $n = 100$, $\alpha = 5$ and $\beta = \alpha n - 1$.]

132 Performance of Lower Bounds: MSE (squared $L_2$ error)
[Figure: $L_2$ error of the EM algorithm, the IT lower bound and the BCRB, as a function of sample size, for $n = 100$, $\alpha = 5$ and $\beta = \alpha n - 1$.]

134 Application to General Comparison Graphs
Given a fixed budget of $m = \sum_{i \neq j} m_{ij}$ games, how should we allocate the games among the $n$ players so as to minimize the bounds?
Corollary (Optimal Connected Graphs). Regular connected graphs are optimal!

137 Application to General Comparison Graphs
Proof: Minimizing the lower bound is equivalent to maximizing
$$f(\{m_i\}_{i \in [n]}) := \sum_{i=1}^{n} \frac{1}{2} \log(\alpha_i + m_i)$$
subject to $\sum_{i=1}^{n} m_i = m$ and $m_i \in \mathbb{N}$.
The solution is given by a water-filling formula:
$$m_i^* = [\mu - \alpha_i]_+, \quad i \in [n],$$
where $\mu > 0$ is chosen such that $\sum_{i=1}^{n} [\mu - \alpha_i]_+ = m$.
But when $\alpha_i = \alpha$ for all $i$, the $m_i^*$ are all equal.
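A small numeric illustration of the water-filling allocation (hypothetical values; bisection over the water level μ, ignoring the integrality constraint):

```python
import numpy as np

def water_fill(alpha, m, iters=100):
    """Continuous water-filling allocation m_i = [mu - alpha_i]_+ with
    sum_i m_i = m, found by bisection on the water level mu (round to
    integers as needed for the combinatorial constraint)."""
    alpha = np.asarray(alpha, dtype=float)
    lo, hi = 0.0, alpha.max() + m + 1.0
    for _ in range(iters):
        mu = (lo + hi) / 2
        if np.maximum(mu - alpha, 0.0).sum() < m:
            lo = mu
        else:
            hi = mu
    return np.maximum((lo + hi) / 2 - alpha, 0.0)

print(water_fill([1.0, 2.0, 3.0, 4.0, 5.0], m=20))  # -> [6, 5, 4, 3, 2]
```

As the example shows, items with larger prior shape $\alpha_i$ receive fewer games, consistent with the next slide.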

138 The Gamma Distribution with Fixed β = 1
Larger $\alpha_i$ $\Longrightarrow$ greater belief that $w_i$ is large $\Longrightarrow$ fewer games $m_i$ that $i$ plays with the others.
[Figure: Gamma densities $\mathrm{Gam}(w; \alpha, 1)$ for $\alpha = 1, 2, 3, 4, 5$.]

141 Application to Comparison Graphs: Restricted to Trees
Corollary (Optimal Tree Graphs).
1 Best: minimizes the (lower bound on the) Bayes risk.
2 Worst: maximizes the (lower bound on the) Bayes risk.
[The specific best and worst tree structures were shown pictorially on the slide.]

143 Application to Comparison Graphs: Restricted to Trees
Proof for the star: maximizing the lower bound on the Bayes risk is equivalent to
$$\min_{m:\, \sum_i m_i = m}\; g(m) := \frac{1}{2} \log\Big(\alpha + \frac{m}{2}\Big) + \frac{1}{2} \sum_{i \neq 1} \log(\alpha + m_i),$$
where $m = \{m_i\}_{i \in [n]}$ and $i = 1$ is the central node.
Shift part of the weight of an edge $m_{1j} > 0$, for $j \neq 1$, to create a new edge with weight $m_{ji}$ such that $i \neq 1$. One can show that
$$\frac{\partial g(m_1, \ldots, m_n)}{\partial m_i} > 0,$$
implying that $f$ is increased by the new configuration.

144 Effect of Tree Graph Structure on IT Bound
[Figure: IT lower bounds on the MSE vs. sample size for the fully connected graph, the star graph, the chain graph and a random tree, for $n = 100$, $\alpha = 5$, $\beta = \alpha n - 1$.]

145 Effect of Tree Graph Structure on BCRB
[Figure: BCRB vs. sample size for the fully connected graph, the star graph, the chain graph and a random tree, for $n = 100$, $\alpha = 5$, $\beta = \alpha n - 1$.]

150 Final Remarks
We also derived lower bounds for the home-field advantage scenario:
$$P_{ij} = \begin{cases} Q_{ij} := \dfrac{\theta \lambda_i}{\theta \lambda_i + \lambda_j}, & \text{if $i$ is home}, \\[2mm] \bar{Q}_{ij} := \dfrac{\lambda_i}{\lambda_i + \theta \lambda_j}, & \text{if $j$ is home}, \end{cases}$$
where $\theta \in \mathbb{R}_{++}$ models the strength of the advantage ($\theta > 1$).
Future work:
Matching information-theoretic upper bounds.
Other questions related to comparing graph structures, e.g., "Does the fully-connected graph outperform a simple cycle?"
M. Alsan, R. Prasad and VYFT, Lower Bounds on the Bayes Risk of the Bayesian BTL Model with Applications to Comparison Graphs, IEEE J. on Sel. Topics of Sig. Proc., Oct 2018


More information

Supplementary Material for: Spectral Unsupervised Parsing with Additive Tree Metrics

Supplementary Material for: Spectral Unsupervised Parsing with Additive Tree Metrics Supplementary Material for: Spectral Unsupervised Parsing with Additive Tree Metrics Ankur P. Parikh School of Computer Science Carnegie Mellon University apparikh@cs.cmu.edu Shay B. Cohen School of Informatics

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Matrix Data: Clustering: Part 2 Instructor: Yizhou Sun yzsun@ccs.neu.edu October 19, 2014 Methods to Learn Matrix Data Set Data Sequence Data Time Series Graph & Network

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

Ranking from Crowdsourced Pairwise Comparisons via Matrix Manifold Optimization

Ranking from Crowdsourced Pairwise Comparisons via Matrix Manifold Optimization Ranking from Crowdsourced Pairwise Comparisons via Matrix Manifold Optimization Jialin Dong ShanghaiTech University 1 Outline Introduction FourVignettes: System Model and Problem Formulation Problem Analysis

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

PMR Learning as Inference

PMR Learning as Inference Outline PMR Learning as Inference Probabilistic Modelling and Reasoning Amos Storkey Modelling 2 The Exponential Family 3 Bayesian Sets School of Informatics, University of Edinburgh Amos Storkey PMR Learning

More information

RETRIEVAL MODELS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

RETRIEVAL MODELS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS RETRIEVAL MODELS Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Retrieval models Boolean model Vector space model Probabilistic

More information

Approximate Inference in Practice Microsoft s Xbox TrueSkill TM

Approximate Inference in Practice Microsoft s Xbox TrueSkill TM 1 Approximate Inference in Practice Microsoft s Xbox TrueSkill TM Daniel Hernández-Lobato Universidad Autónoma de Madrid 2014 2 Outline 1 Introduction 2 The Probabilistic Model 3 Approximate Inference

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Fall 2013 Full points may be obtained for correct answers to eight questions. Each numbered question (which may have several parts) is worth

More information

Directed Probabilistic Graphical Models CMSC 678 UMBC

Directed Probabilistic Graphical Models CMSC 678 UMBC Directed Probabilistic Graphical Models CMSC 678 UMBC Announcement 1: Assignment 3 Due Wednesday April 11 th, 11:59 AM Any questions? Announcement 2: Progress Report on Project Due Monday April 16 th,

More information

8.1 Concentration inequality for Gaussian random matrix (cont d)

8.1 Concentration inequality for Gaussian random matrix (cont d) MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration

More information

Econometrics I, Estimation

Econometrics I, Estimation Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the

More information

Generalization theory

Generalization theory Generalization theory Daniel Hsu Columbia TRIPODS Bootcamp 1 Motivation 2 Support vector machines X = R d, Y = { 1, +1}. Return solution ŵ R d to following optimization problem: λ min w R d 2 w 2 2 + 1

More information

Bregman Divergences for Data Mining Meta-Algorithms

Bregman Divergences for Data Mining Meta-Algorithms p.1/?? Bregman Divergences for Data Mining Meta-Algorithms Joydeep Ghosh University of Texas at Austin ghosh@ece.utexas.edu Reflects joint work with Arindam Banerjee, Srujana Merugu, Inderjit Dhillon,

More information

Variational inference

Variational inference Simon Leglaive Télécom ParisTech, CNRS LTCI, Université Paris Saclay November 18, 2016, Télécom ParisTech, Paris, France. Outline Introduction Probabilistic model Problem Log-likelihood decomposition EM

More information

Machine Learning CSE546 Sham Kakade University of Washington. Oct 4, What about continuous variables?

Machine Learning CSE546 Sham Kakade University of Washington. Oct 4, What about continuous variables? Linear Regression Machine Learning CSE546 Sham Kakade University of Washington Oct 4, 2016 1 What about continuous variables? Billionaire says: If I am measuring a continuous variable, what can you do

More information

Online Forest Density Estimation

Online Forest Density Estimation Online Forest Density Estimation Frédéric Koriche CRIL - CNRS UMR 8188, Univ. Artois koriche@cril.fr UAI 16 1 Outline 1 Probabilistic Graphical Models 2 Online Density Estimation 3 Online Forest Density

More information

CS Lecture 18. Topic Models and LDA

CS Lecture 18. Topic Models and LDA CS 6347 Lecture 18 Topic Models and LDA (some slides by David Blei) Generative vs. Discriminative Models Recall that, in Bayesian networks, there could be many different, but equivalent models of the same

More information

Theory of Stochastic Processes 8. Markov chain Monte Carlo

Theory of Stochastic Processes 8. Markov chain Monte Carlo Theory of Stochastic Processes 8. Markov chain Monte Carlo Tomonari Sei sei@mist.i.u-tokyo.ac.jp Department of Mathematical Informatics, University of Tokyo June 8, 2017 http://www.stat.t.u-tokyo.ac.jp/~sei/lec.html

More information

Learning Mixtures of Random Utility Models

Learning Mixtures of Random Utility Models Learning Mixtures of Random Utility Models Zhibing Zhao and Tristan Villamil and Lirong Xia Rensselaer Polytechnic Institute 110 8th Street, Troy, NY, USA {zhaoz6, villat2}@rpi.edu, xial@cs.rpi.edu Abstract

More information

MODULE -4 BAYEIAN LEARNING

MODULE -4 BAYEIAN LEARNING MODULE -4 BAYEIAN LEARNING CONTENT Introduction Bayes theorem Bayes theorem and concept learning Maximum likelihood and Least Squared Error Hypothesis Maximum likelihood Hypotheses for predicting probabilities

More information

Feedback Capacity of a Class of Symmetric Finite-State Markov Channels

Feedback Capacity of a Class of Symmetric Finite-State Markov Channels Feedback Capacity of a Class of Symmetric Finite-State Markov Channels Nevroz Şen, Fady Alajaji and Serdar Yüksel Department of Mathematics and Statistics Queen s University Kingston, ON K7L 3N6, Canada

More information

Strong Converse Theorems for Classes of Multimessage Multicast Networks: A Rényi Divergence Approach

Strong Converse Theorems for Classes of Multimessage Multicast Networks: A Rényi Divergence Approach Strong Converse Theorems for Classes of Multimessage Multicast Networks: A Rényi Divergence Approach Silas Fong (Joint work with Vincent Tan) Department of Electrical & Computer Engineering National University

More information

Chapter 16. Structured Probabilistic Models for Deep Learning

Chapter 16. Structured Probabilistic Models for Deep Learning Peng et al.: Deep Learning and Practice 1 Chapter 16 Structured Probabilistic Models for Deep Learning Peng et al.: Deep Learning and Practice 2 Structured Probabilistic Models way of using graphs to describe

More information

The PAC Learning Framework -II

The PAC Learning Framework -II The PAC Learning Framework -II Prof. Dan A. Simovici UMB 1 / 1 Outline 1 Finite Hypothesis Space - The Inconsistent Case 2 Deterministic versus stochastic scenario 3 Bayes Error and Noise 2 / 1 Outline

More information

Lecture 15. Probabilistic Models on Graph

Lecture 15. Probabilistic Models on Graph Lecture 15. Probabilistic Models on Graph Prof. Alan Yuille Spring 2014 1 Introduction We discuss how to define probabilistic models that use richly structured probability distributions and describe how

More information

Graphical Models and Kernel Methods

Graphical Models and Kernel Methods Graphical Models and Kernel Methods Jerry Zhu Department of Computer Sciences University of Wisconsin Madison, USA MLSS June 17, 2014 1 / 123 Outline Graphical Models Probabilistic Inference Directed vs.

More information

Graphical Models for Collaborative Filtering

Graphical Models for Collaborative Filtering Graphical Models for Collaborative Filtering Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Sequence modeling HMM, Kalman Filter, etc.: Similarity: the same graphical model topology,

More information

Information Complexity and Applications. Mark Braverman Princeton University and IAS FoCM 17 July 17, 2017

Information Complexity and Applications. Mark Braverman Princeton University and IAS FoCM 17 July 17, 2017 Information Complexity and Applications Mark Braverman Princeton University and IAS FoCM 17 July 17, 2017 Coding vs complexity: a tale of two theories Coding Goal: data transmission Different channels

More information

Bayesian Contextual Multi-armed Bandits

Bayesian Contextual Multi-armed Bandits Bayesian Contextual Multi-armed Bandits Xiaoting Zhao Joint Work with Peter I. Frazier School of Operations Research and Information Engineering Cornell University October 22, 2012 1 / 33 Outline 1 Motivating

More information

The Method of Types and Its Application to Information Hiding

The Method of Types and Its Application to Information Hiding The Method of Types and Its Application to Information Hiding Pierre Moulin University of Illinois at Urbana-Champaign www.ifp.uiuc.edu/ moulin/talks/eusipco05-slides.pdf EUSIPCO Antalya, September 7,

More information

Bayesian Machine Learning - Lecture 7

Bayesian Machine Learning - Lecture 7 Bayesian Machine Learning - Lecture 7 Guido Sanguinetti Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh gsanguin@inf.ed.ac.uk March 4, 2015 Today s lecture 1

More information

arxiv: v1 [stat.ml] 1 Jan 2019

arxiv: v1 [stat.ml] 1 Jan 2019 CONVERGENCE RATES OF GRADIENT DESCENT AND MM ALGORITHMS FOR GENERALIZED BRADLEY-TERRY MODELS arxiv:1901.00150v1 [stat.ml] 1 Jan 2019 By Milan Vojnovic, Seyoung Yun and Kaifang Zhou London School of Economics

More information

Volodymyr Kuleshov ú Arun Tejasvi Chaganty ú Percy Liang. May 11, 2015

Volodymyr Kuleshov ú Arun Tejasvi Chaganty ú Percy Liang. May 11, 2015 Tensor Factorization via Matrix Factorization Volodymyr Kuleshov ú Arun Tejasvi Chaganty ú Percy Liang Stanford University May 11, 2015 Kuleshov, Chaganty, Liang (Stanford University) Tensor Factorization

More information

Matrix estimation by Universal Singular Value Thresholding

Matrix estimation by Universal Singular Value Thresholding Matrix estimation by Universal Singular Value Thresholding Courant Institute, NYU Let us begin with an example: Suppose that we have an undirected random graph G on n vertices. Model: There is a real symmetric

More information

1-bit Matrix Completion. PAC-Bayes and Variational Approximation

1-bit Matrix Completion. PAC-Bayes and Variational Approximation : PAC-Bayes and Variational Approximation (with P. Alquier) PhD Supervisor: N. Chopin Bayes In Paris, 5 January 2017 (Happy New Year!) Various Topics covered Matrix Completion PAC-Bayesian Estimation Variational

More information

Stratégies bayésiennes et fréquentistes dans un modèle de bandit

Stratégies bayésiennes et fréquentistes dans un modèle de bandit Stratégies bayésiennes et fréquentistes dans un modèle de bandit thèse effectuée à Telecom ParisTech, co-dirigée par Olivier Cappé, Aurélien Garivier et Rémi Munos Journées MAS, Grenoble, 30 août 2016

More information

A physical model for efficient rankings in networks

A physical model for efficient rankings in networks A physical model for efficient rankings in networks Daniel Larremore Assistant Professor Dept. of Computer Science & BioFrontiers Institute March 5, 2018 CompleNet danlarremore.com @danlarremore The idea

More information

Permuation Models meet Dawid-Skene: A Generalised Model for Crowdsourcing

Permuation Models meet Dawid-Skene: A Generalised Model for Crowdsourcing Permuation Models meet Dawid-Skene: A Generalised Model for Crowdsourcing Ankur Mallick Electrical and Computer Engineering Carnegie Mellon University amallic@andrew.cmu.edu Abstract The advent of machine

More information

Lecture 3: More on regularization. Bayesian vs maximum likelihood learning

Lecture 3: More on regularization. Bayesian vs maximum likelihood learning Lecture 3: More on regularization. Bayesian vs maximum likelihood learning L2 and L1 regularization for linear estimators A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting

More information

Lecture : Probabilistic Machine Learning

Lecture : Probabilistic Machine Learning Lecture : Probabilistic Machine Learning Riashat Islam Reasoning and Learning Lab McGill University September 11, 2018 ML : Many Methods with Many Links Modelling Views of Machine Learning Machine Learning

More information

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision The Particle Filter Non-parametric implementation of Bayes filter Represents the belief (posterior) random state samples. by a set of This representation is approximate. Can represent distributions that

More information

Econ 2148, spring 2019 Statistical decision theory

Econ 2148, spring 2019 Statistical decision theory Econ 2148, spring 2019 Statistical decision theory Maximilian Kasy Department of Economics, Harvard University 1 / 53 Takeaways for this part of class 1. A general framework to think about what makes a

More information

High-dimensional graphical model selection: Practical and information-theoretic limits

High-dimensional graphical model selection: Practical and information-theoretic limits 1 High-dimensional graphical model selection: Practical and information-theoretic limits Martin Wainwright Departments of Statistics, and EECS UC Berkeley, California, USA Based on joint work with: John

More information

Introduction to graphical models: Lecture III

Introduction to graphical models: Lecture III Introduction to graphical models: Lecture III Martin Wainwright UC Berkeley Departments of Statistics, and EECS Martin Wainwright (UC Berkeley) Some introductory lectures January 2013 1 / 25 Introduction

More information

Models of collective inference

Models of collective inference Models of collective inference Laurent Massoulié (Microsoft Research-Inria Joint Centre) Mesrob I. Ohannessian (University of California, San Diego) Alexandre Proutière (KTH Royal Institute of Technology)

More information

1 Overview. 2 Learning from Experts. 2.1 Defining a meaningful benchmark. AM 221: Advanced Optimization Spring 2016

1 Overview. 2 Learning from Experts. 2.1 Defining a meaningful benchmark. AM 221: Advanced Optimization Spring 2016 AM 1: Advanced Optimization Spring 016 Prof. Yaron Singer Lecture 11 March 3rd 1 Overview In this lecture we will introduce the notion of online convex optimization. This is an extremely useful framework

More information

Introduction to Bayesian Learning

Introduction to Bayesian Learning Course Information Introduction Introduction to Bayesian Learning Davide Bacciu Dipartimento di Informatica Università di Pisa bacciu@di.unipi.it Apprendimento Automatico: Fondamenti - A.A. 2016/2017 Outline

More information

Multi-armed bandit models: a tutorial

Multi-armed bandit models: a tutorial Multi-armed bandit models: a tutorial CERMICS seminar, March 30th, 2016 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions)

More information

Approximate Inference

Approximate Inference Approximate Inference Simulation has a name: sampling Sampling is a hot topic in machine learning, and it s really simple Basic idea: Draw N samples from a sampling distribution S Compute an approximate

More information

Bayes methods for categorical data. April 25, 2017

Bayes methods for categorical data. April 25, 2017 Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,

More information

Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1)

Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1) Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1) Detection problems can usually be casted as binary or M-ary hypothesis testing problems. Applications: This chapter: Simple hypothesis

More information