
1 Research Program. Romain COUILLET. CentraleSupélec, Université Paris-Sud 11. 11 February 2016. 1 / 38

2 Outline Curriculum Vitae Research Project : Learning in Large Dimensions Axis 1 : Robust Estimation in Large Dimensions Axis 2 : Classification in Large Dimensions Axis 3 : Random Matrices and Neural Networks Axis 4 : Graphs 2 / 38

3 Curriculum Vitae/ 3/38 Outline Curriculum Vitae Research Project : Learning in Large Dimensions Axis 1 : Robust Estimation in Large Dimensions Axis 2 : Classification in Large Dimensions Axis 3 : Random Matrices and Neural Networks Axis 4 : Graphs 3 / 38

4 Curriculum Vitae/ 4/38 Education and Professional Experience
Professional Experience
- Full Professor since Jan., CentraleSupélec, Gif-sur-Yvette, France. Telecom Department, LANEAS group, Division Signals & Stats.
Diplomas
- Habilitation à Diriger des Recherches (Feb.). Place : University Paris-Saclay, France. Topic : Robust Estimation in the Large Random Matrix Regime.
- PhD in Physics (Nov.). Place : CentraleSupélec, Gif-sur-Yvette, France. Topic : Application of Random Matrix Theory to Future Wireless Flexible Networks. Advisor : Mérouane Debbah.
- Engineer and Master Diplomas (Mar.). Place : Telecom ParisTech, Paris, France. Grade : Very Good (Très Bien). Topic : Communications, embedded systems, computer science.
4 / 38

5 Curriculum Vitae/ 5/38 Teaching Activities and Research Projects
Teaching Activities
- ENS Cachan, Cachan, France, since 2013. Courses : Master MVA, 18 hrs/year.
- CentraleSupélec, Gif-sur-Yvette, France, since 2011. Courses : PhD level, 18 hrs/year; Master SAR, research seminars, 24 hrs/year; undergraduate, lectures + practical courses, 70 hrs/year. Advising : interns, undergraduate projects, 8/year.
Research : Projects
- HUAWEI RMTin5G, 100% (PI); ANR RMT4GRAPH, 100% (PI); ERC MORE, 50%; ANR DIONISOS, 25%.
Research : Community Life
- Special session organizations : 4. IEEE Senior Member since 2015. IEEE SPTM technical committee member since 2014. IEEE TSP Associate Editor since 2015. Member of GRETSI.
5 / 38

6 Curriculum Vitae/ 6/38 PhD students
- Axel MÜLLER (research engineer at HUAWEI Labs, Paris). Subject : Random matrix models for multi-cellular wireless communications. Advising : 50%, with M. Debbah (CentraleSupélec). Publications : 3 articles in IEEE-JSTSP, -TIT, -TSP, 5 IEEE conferences. Awards : 1 best student paper award.
- Julia VINOGRADOVA (postdoc at Linköping University, Sweden). Subject : Random matrix theory applied to detection and estimation in antenna arrays. Advising : 50%, with W. Hachem (Telecom ParisTech). Publications : 2 articles in IEEE-TSP, 2 IEEE conferences.
- Azary ABBOUD (postdoc at INRIA, France). Subject : Distributed optimization in smart grids. Advising : 33%, with M. Debbah and H. Siguerdidjane (CentraleSupélec). Publications : 1 article in IEEE-TSP, 1 IEEE conference.
- Gil KATZ. Subject : Interactive communications for distributed computation. Advising : 33%, with M. Debbah and P. Piantanida (CentraleSupélec). Publications : 1 IEEE conference.
- Hafiz TIOMOKO ALI. Subject : Random matrices in machine learning. Advising : 100%. Publications : 2 IEEE conferences.
6 / 38

7 Curriculum Vitae/ 7/38 Research Activities Publication Record (as of February 1st, 2016)
Publications : Books : 1, Chapters : 3, Journals : 36, Conferences : 53, Patents : 4.
Citations : 1256 (five best : 282, 204, 84, 49, 33). Indices : h-index : 17, i10-index : 25.
Subjects : Mathematics : random matrix theory, statistics. Applications : machine learning, signal processing, communications.
Figure : Journal articles per area (Telecom, Signal, Maths/stats, Learning; including submitted).
7 / 38

8 Curriculum Vitae/ 8/38 Research Activities Prizes and Awards
- IEEE Senior Member, 2016
- CNRS Bronze Medal (section INS2I), 2013
- IEEE ComSoc Outstanding Young Researcher Award (EMEA region), 2013
- EEA/GdR ISIS/GRETSI PhD thesis award, 2011
Paper Awards
- Second prize of the IEEE Australia Council Student Paper Contest, 2013
- Best Student Paper Award Finalist of the IEEE Asilomar Conference, 2011
- Best Student Paper Award of the ValueTools Conference
8 / 38

9 Research Project : Learning in Large Dimensions/ 9/38 Outline Curriculum Vitae Research Project : Learning in Large Dimensions Axis 1 : Robust Estimation in Large Dimensions Axis 2 : Classification in Large Dimensions Axis 3 : Random Matrices and Neural Networks Axis 4 : Graphs 9 / 38

10 Research Project : Learning in Large Dimensions/ 10/38 Context
The BigData Challenge
1. dramatic increase in the dimension (and number) of data
2. importance of outlying and missing data
3. data heterogeneity
Limitations of classical tools
1. classical statistics limited by the "small but numerous data" hypothesis
2. techniques relying on poorly robust empirical estimates, or on robust approaches that are barely usable
3. a divide between :
   - powerful model-based techniques (signal processing approach)
   - ad-hoc data-based techniques (machine learning/statistics approach)
Our approach
1. develop methods and mathematical tools to handle large and numerous datasets
2. revisit robust statistics in large dimensions
3. (for lack of a better approach) better understand and improve ad-hoc techniques on simple but large dimensional models.
10 / 38

22 Research Project : Learning in Large Dimensions/Axis 1 : Robust Estimation in Large Dimensions 11/38 Outline Curriculum Vitae Research Project : Learning in Large Dimensions Axis 1 : Robust Estimation in Large Dimensions Axis 2 : Classification in Large Dimensions Axis 3 : Random Matrices and Neural Networks Axis 4 : Graphs 11 / 38

23 Research Project : Learning in Large Dimensions/Axis 1 : Robust Estimation in Large Dimensions 12/38 Axis 1 : Robust Estimation in Large Dimensions
Baseline Scenario : x_1, ..., x_n ∈ R^p i.i.d. with E[x_1] = 0, E[x_1 x_1^T] = C_p, but
- potentially heavy tailed
- existence of outliers
Several Estimators for C_p
- ML estimator in the Gaussian case : the sample covariance matrix
    \hat{C}_p = \frac{1}{n} \sum_{i=1}^n x_i x_i^T
  very practical, used in many methods, but very sensitive to outliers
- [Huber 67 ; Maronna 76] Robust Estimators (outliers or heavy tails)
    \hat{C}_p = \frac{1}{n} \sum_{i=1}^n u\left( \frac{1}{p} x_i^T \hat{C}_p^{-1} x_i \right) x_i x_i^T
- [Pascal 13 ; Chen 11] Regularized Versions for Large Data (all n, p)
    \hat{C}_p(\rho) = (1-\rho) \frac{1}{n} \sum_{i=1}^n \frac{x_i x_i^T}{\frac{1}{p} x_i^T \hat{C}_p^{-1}(\rho) x_i} + \rho I_p
12 / 38
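These estimators are only implicitly defined (Ĉ_p appears on both sides of its defining equation) and are computed in practice by fixed-point iteration. The following is a minimal numpy sketch of both iterations; the weight function u(t) = (1+α)/(α+t), the stopping rule and all parameter values are illustrative assumptions rather than choices made in the slides.

import numpy as np

def maronna_estimator(X, alpha=0.2, n_iter=50, tol=1e-6):
    # Fixed-point iteration for Maronna's M-estimator of scatter.
    # X is the (n, p) data matrix with rows x_i; u(t) = (1+alpha)/(alpha+t)
    # is one classical choice of weight function (an assumption, not from the slides).
    n, p = X.shape
    C = np.eye(p)
    for _ in range(n_iter):
        d = np.einsum('ij,jk,ik->i', X, np.linalg.inv(C), X) / p  # (1/p) x_i^T C^{-1} x_i
        C_new = (X * ((1 + alpha) / (alpha + d))[:, None]).T @ X / n
        if np.linalg.norm(C_new - C) < tol * np.linalg.norm(C):
            return C_new
        C = C_new
    return C

def regularized_estimator(X, rho, n_iter=50):
    # Fixed point for C(rho) = (1-rho) (1/n) sum_i x_i x_i^T / ((1/p) x_i^T C(rho)^{-1} x_i) + rho I_p.
    n, p = X.shape
    C = np.eye(p)
    for _ in range(n_iter):
        d = np.einsum('ij,jk,ik->i', X, np.linalg.inv(C), X) / p
        C = (1 - rho) * (X / d[:, None]).T @ X / n + rho * np.eye(p)
    return C

For the elliptical setting of the next slides one would, for instance, draw τ_i ~ Γ(.5, 2) and x_i = √τ_i w_i with w_i ~ N(0, C_p), and compare the spectra of the sample covariance matrix and of these robust estimates.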

28 Research Project : Learning in Large Dimensions/Axis 1 : Robust Estimation in Large Dimensions 13/38 Axis 1 : Robust Estimation in Large Dimensions
Problem and Objectives
- Ill-understood estimators, difficult to use (implicit definition of Ĉ_p)
- Only known results are for fixed p and n : not appropriate in BigData.
- We thus need to :
  - study Ĉ_p as n, p → ∞
  - exploit the double-concentration effect to better understand Ĉ_p.
Results and Perspectives
- Asymptotic approximation of Ĉ_p by a tractable equivalent model
- Second order statistics (CLT type) for Ĉ_p
- Study of elliptical cases, outliers, regularized or not
- Applications :
  - radar array processing (impulsiveness due to clutter)
  - financial data processing
- Joint mean and covariance estimation
- Study of robust regression
- More generally, deeper study of iterative methods in large dimensions (such as AMP).
13 / 38

34 Research Project : Learning in Large Dimensions/Axis 1 : Robust Estimation in Large Dimensions 14/38 Axis 1 : Robust Estimation in Large Dimensions
Theorem ([Couillet,Pascal,Silverstein 15] Maronna Estimator)
For x_i = \sqrt{\tau_i} w_i, with τ_i impulsive and w_i orthogonal and isotropic, \|w_i\| = \sqrt{p},
    \| \hat{C}_p - \hat{S}_p \| \to 0  almost surely, in spectral norm,
where
    \hat{C}_p = \frac{1}{n} \sum_{i=1}^n u\left( \frac{1}{p} x_i^T \hat{C}_p^{-1} x_i \right) x_i x_i^T,
    \hat{S}_p = \frac{1}{n} \sum_{i=1}^n v(\tau_i \gamma_p)\, x_i x_i^T,
with v(t) similar to u(t) and γ_p the unique solution of
    1 = \frac{1}{n} \sum_{i=1}^n \frac{\gamma_p v(\tau_i \gamma_p)}{1 + c\, \gamma_p v(\tau_i \gamma_p)}.
14 / 38

35 Research Project : Learning in Large Dimensions/Axis 1 : Robust Estimation in Large Dimensions 15/38 Axis 1 : Robust Estimation in Large Dimensions
Consequences
- The difficult object Ĉ_p is made tractable thanks to Ŝ_p
- Analysis and optimization become possible by replacing Ĉ_p with Ŝ_p.
Figure : Eigenvalues of Ĉ_p, eigenvalues of Ŝ_p, and limiting density, for n = 2500, p = 500, C_p = diag(I_125, 3 I_125, 10 I_250), τ_i ~ Γ(.5, 2) i.i.d.
15 / 38

41 Research Project : Learning in Large Dimensions/Axis 1 : Robust Estimation in Large Dimensions 16/38 Axis 1 : Robust Estimation in Large Dimensions
Application to Detection in Radars
Hypothesis testing under impulsive noise : purely noisy inputs x_1, ..., x_n, x_i = \sqrt{\tau_i} w_i, and a new datum
    y = \sqrt{\tau} w  (H_0),    y = s + \sqrt{\tau} w  (H_1).
Robust detector T_p(ρ), given by T_p(ρ) ≷_{H_0}^{H_1} γ, where
    T_p(\rho) = \frac{\left| y^* \hat{C}_p^{-1}(\rho)\, s \right|}{\sqrt{y^* \hat{C}_p^{-1}(\rho)\, y}\, \sqrt{s^* \hat{C}_p^{-1}(\rho)\, s}},
    \hat{C}_p(\rho) = (1-\rho) \frac{1}{n} \sum_{i=1}^n \frac{x_i x_i^*}{\frac{1}{p} x_i^* \hat{C}_p^{-1}(\rho) x_i} + \rho I_p.
Objectives
- performance analysis
- find the optimal regularization parameter ρ.
16 / 38
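A minimal sketch of how the statistic T_p(ρ) can be evaluated numerically, reusing the same fixed-point iteration for the regularized scatter estimate; the data model, the choice ρ = 0.5 and the real-valued (rather than complex) setting are illustrative assumptions.

import numpy as np

def regularized_scatter(X, rho, n_iter=50):
    # Fixed point for C_p(rho), built from the pure-noise samples x_1, ..., x_n (rows of X).
    n, p = X.shape
    C = np.eye(p)
    for _ in range(n_iter):
        d = np.einsum('ij,jk,ik->i', X, np.linalg.inv(C), X) / p
        C = (1 - rho) * (X / d[:, None]).T @ X / n + rho * np.eye(p)
    return C

def detector_statistic(y, s, X, rho):
    # T_p(rho) = |y^T C^{-1} s| / sqrt((y^T C^{-1} y)(s^T C^{-1} s)).
    Cinv = np.linalg.inv(regularized_scatter(X, rho))
    return abs(y @ Cinv @ s) / np.sqrt((y @ Cinv @ y) * (s @ Cinv @ s))

# Toy usage under H0: impulsive noise x_i = sqrt(tau_i) w_i, steering vector s = p^{-1/2} [1, ..., 1]^T.
rng = np.random.default_rng(0)
p, n = 20, 40
X = np.sqrt(rng.gamma(0.5, 2.0, size=n))[:, None] * rng.standard_normal((n, p))
s = np.ones(p) / np.sqrt(p)
y = np.sqrt(rng.gamma(0.5, 2.0)) * rng.standard_normal(p)
print("T_p(rho=0.5) =", detector_statistic(y, s, X, rho=0.5))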

45 Research Project : Learning in Large Dimensions/Axis 1 : Robust Estimation in Large Dimensions 17/38 Axis 1 : Robust Estimation in Large Dimensions
Figure : False alarm rate P(\sqrt{p}\, T_p(\rho) > \gamma) as a function of ρ, for p = 20 (left) and p = 100 (right), with s = p^{-1/2} [1, ..., 1]^T, [C_p]_{ij} = 0.7^{|i-j|}, p/n = 1/2. Curves : theory, empirical estimator, detector.
17 / 38

46 Research Project : Learning in Large Dimensions/Axis 1 : Robust Estimation in Large Dimensions 18/38 Axis 1 : Robust Estimation in Large Dimensions
Figure : False alarm rate P(T_p(\hat{\rho}_p) > \Gamma) as a function of Γ, with \hat{\rho}_p the best estimated ρ, for p = 20 and p = 100, s = p^{-1/2} [1, ..., 1]^T, p/n = 1/2 and [C_p]_{ij} = 0.7^{|i-j|}. Curves : theory, detector.
18 / 38

47 Research Project : Learning in Large Dimensions/Axis 1 : Robust Estimation in Large Dimensions 19/38 Axis 1 : Robust Estimation in Large Dimensions
Theoretical Results (clickable title links)
- R. Couillet, M. McKay, "Large Dimensional Analysis and Optimization of Robust Shrinkage Covariance Matrix Estimators", Elsevier Journal of Multivariate Analysis, vol. 131.
- R. Couillet, F. Pascal, J. W. Silverstein, "The Random Matrix Regime of Maronna's M-estimator with elliptically distributed samples", Elsevier Journal of Multivariate Analysis, vol. 139.
- D. Morales-Jimenez, R. Couillet, M. McKay, "Large Dimensional Analysis of Robust M-Estimators of Covariance with Outliers", IEEE Transactions on Signal Processing, vol. 63, no. 21.
- R. Couillet, A. Kammoun, F. Pascal, "Second order statistics of robust estimators of scatter. Application to GLRT detection for elliptical signals", Elsevier Journal of Multivariate Analysis, vol. 143.
Applications (clickable title links)
- R. Couillet, A. Kammoun, F. Pascal, "Second order statistics of robust estimators of scatter. Application to GLRT detection for elliptical signals", Elsevier Journal of Multivariate Analysis, vol. 143.
- R. Couillet, "Robust spiked random matrices and a robust G-MUSIC estimator", Elsevier Journal of Multivariate Analysis, vol. 140.
19 / 38

48 Research Project : Learning in Large Dimensions/Axis 1 : Robust Estimation in Large Dimensions 20/38 Axis 1 : Robust Estimation in Large Dimensions
- L. Yang, R. Couillet, M. McKay, "A Robust Statistics Approach to Minimum Variance Portfolio Optimization", IEEE Transactions on Signal Processing, vol. 63, no. 24.
- A. Kammoun, R. Couillet, F. Pascal, M.-S. Alouini, "Optimal Design of the Adaptive Normalized Matched Filter Detector", submitted to IEEE Transactions on Information Theory, 2015, arXiv preprint.
20 / 38

49 Research Project : Learning in Large Dimensions/Axis 2 : Classification in Large Dimensions 21/38 Outline Curriculum Vitae Research Project : Learning in Large Dimensions Axis 1 : Robust Estimation in Large Dimensions Axis 2 : Classification in Large Dimensions Axis 3 : Random Matrices and Neural Networks Axis 4 : Graphs 21 / 38

50 Research Project : Learning in Large Dimensions/Axis 2 : Classification in Large Dimensions 22/38 Axis 2 : Classification in Large Dimensions
Baseline Scenario : x_1, ..., x_n ∈ R^p belonging to k classes C_1, ..., C_k to be identified
- in a supervised manner : many labelled data (e.g., support vector machines)
- in an unsupervised manner : no labelled data (e.g., kernel spectral clustering)
- in a semi-supervised manner : (few) labelled data (e.g., the harmonic function method).
Spectral Algorithms
- Data are often not linearly separable
- Numerous methods are based on kernel matrices K ∈ R^{n×n}, with (for instance) K_ij = f(‖x_i − x_j‖²) and f some (often decreasing) function
- Spectral methods consist in :
  - extracting the dominant eigenvectors of K (spectral clustering)
  - solving an optimization problem based on K (support vector machines)
  - computing a linear functional of K (semi-supervised methods).
22 / 38
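A minimal, fully numpy sketch of kernel spectral clustering in the spirit described above, for two classes; the Gaussian kernel f(t) = exp(−t/(2p)) and all parameters are illustrative choices (for k > 2 classes one would run k-means on the k dominant eigenvectors instead of taking a sign).

import numpy as np

def normalized_kernel(X, f=None):
    # K_ij = f(||x_i - x_j||^2) and its symmetric normalization D^{-1/2} K D^{-1/2}.
    n, p = X.shape
    if f is None:
        f = lambda t: np.exp(-t / (2 * p))  # Gaussian kernel, an illustrative choice
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0)
    K = f(dist2)
    d = K.sum(axis=1)
    return K / np.sqrt(np.outer(d, d))

# Toy usage: two Gaussian classes differing in their means.
rng = np.random.default_rng(0)
p, n = 100, 200
mu = 2 / np.sqrt(p)
X = np.vstack([rng.standard_normal((n // 2, p)) + mu,
               rng.standard_normal((n // 2, p)) - mu])
eigval, eigvec = np.linalg.eigh(normalized_kernel(X))
labels = (eigvec[:, -2] > 0).astype(int)  # second dominant eigenvector carries the classes
print("cluster sizes:", np.bincount(labels))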

55 Research Project : Learning in Large Dimensions/Axis 2 : Classification in Large Dimensions 23/38 Axis 2 : Classification in Large Dimensions
Problems and Objectives
- The matrix K is very difficult to analyze for small dimensional x_i (even Gaussian)
- Only a qualitative understanding of the tools, which are difficult to optimize.
We need here to :
- study K as n, p → ∞
- exploit the double concentration to better understand K
- deduce the quantitative performance of learning methods
- improve the performance as well as the methods themselves.
Results and Perspectives
- Development of tools for the large dimensional analysis of kernel matrices
- Thorough analysis of spectral clustering performance on (large) Gaussian mixtures
- Optimization for subspace clustering (new approach, patent pending with HUAWEI)
- Generalization to the semi-supervised case
- Study of support vector machines in this context
- Generalization to more realistic models, deeper comparison with real datasets.
23 / 38

60 Research Project : Learning in Large Dimensions/Axis 2 : Classification in Large Dimensions 24/38 Axis 2 : Classification in Large Dimensions
Model : consider the Laplacian matrix (core of the Ng-Jordan-Weiss algorithm)
    L = n D^{-1/2} K D^{-1/2} - n \frac{D^{1/2} 1_n 1_n^T D^{1/2}}{1_n^T D 1_n}
where D = diag(K 1_n) and, for x_i ∈ C_a, x_i ~ N(μ_a, \frac{1}{p} C_a), with |C_a| = n_a.
Theorem ([Couillet,Benaych 16] Equivalent to Laplacian Matrix)
As n, p → ∞, under appropriate hypotheses, ‖L − L̂‖ → 0 almost surely, with
    \hat{L} = 2 \frac{f'(\tau)}{f(\tau)} \left[ \frac{1}{p} P W^T W P + U B U^T \right] + \alpha(\tau) I_n
where τ = \frac{2}{np} \sum_a n_a \mathrm{tr}\, C_a, W = [w_1, ..., w_n] (with x_i = μ_a + w_i), P = I_n - \frac{1}{n} 1_n 1_n^T,
    U = \left[ \frac{1}{\sqrt{p}} J, \Phi, \psi \right],
and B is a block matrix whose upper-left block is
    B_{11} = M^T M + \left( \frac{5 f'(\tau)}{8 f(\tau)} - \frac{f''(\tau)}{2 f'(\tau)} \right) t t^T + \frac{f''(\tau)}{f'(\tau)} T + p \frac{f(\tau) \alpha(\tau)}{2 f'(\tau) n} 1_k 1_k^T.
Important Notations :
- \frac{1}{\sqrt{p}} J = [j_1, ..., j_k], with j_a the canonical vector of class C_a
- M = [\mu_1^\circ, ..., \mu_k^\circ], with \mu_a^\circ = \mu_a - \sum_{b=1}^k \frac{n_b}{n} \mu_b
- t = \left[ \frac{1}{\sqrt{p}} \mathrm{tr}\, C_1^\circ, ..., \frac{1}{\sqrt{p}} \mathrm{tr}\, C_k^\circ \right]^T, with C_a^\circ = C_a - \sum_{b=1}^k \frac{n_b}{n} C_b
- T = \left\{ \frac{1}{p} \mathrm{tr}\, C_a^\circ C_b^\circ \right\}_{a,b=1}^k.
24 / 38
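As a complement, a short sketch of how the matrix L of the theorem is formed from a kernel matrix K, following the definition reconstructed above; the kernel f(t) = exp(−t/2) and the data used here are purely illustrative.

import numpy as np

def laplacian_L(K):
    # L = n D^{-1/2} K D^{-1/2} - n (D^{1/2} 1_n 1_n^T D^{1/2}) / (1_n^T D 1_n), with D = diag(K 1_n).
    n = K.shape[0]
    d = K.sum(axis=1)
    return n * K / np.sqrt(np.outer(d, d)) - n * np.outer(np.sqrt(d), np.sqrt(d)) / d.sum()

# Illustrative usage on data scaled so that ||x_i - x_j||^2 = O(1).
rng = np.random.default_rng(1)
n, p = 300, 150
X = rng.standard_normal((n, p)) / np.sqrt(p)
sq = np.sum(X ** 2, axis=1)
K = np.exp(-np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0) / 2)  # f(t) = exp(-t/2)
print("four largest eigenvalues of L:", np.linalg.eigvalsh(laplacian_L(K))[-4:])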

66 Research Project : Learning in Large Dimensions/Axis 2 : Classification in Large Dimensions 25/38 Axis 2 : Classification in Large Dimensions
Consequences :
- thorough understanding of the eigenvector structure
- important consequences for the kernel choice (which depends on the derivatives f^{(l)}(τ)).
Application to real data : the MNIST database (μ_a, C_a evaluated from the full database).
Figure : Four leading eigenvectors of D^{-1/2} K D^{-1/2} for the MNIST dataset (red), the equivalent Gaussian model (black), and the asymptotic results (blue).
25 / 38

72 Research Project : Learning in Large Dimensions/Axis 2 : Classification in Large Dimensions 26/38 Axis 2 : Classification in Large Dimensions
Figure : 2D plots of the eigenvectors of L (eigenvector 2 vs. eigenvector 1, and eigenvector 3 vs. eigenvector 2), MNIST database. Theoretical 1-σ and 2-σ standard deviations in blue. Classes 0, 1, 2 in colors.
26 / 38

73 Research Project : Learning in Large Dimensions/Axis 3 : Random Matrices and Neural Networks 27/38 Outline Curriculum Vitae Research Project : Learning in Large Dimensions Axis 1 : Robust Estimation in Large Dimensions Axis 2 : Classification in Large Dimensions Axis 3 : Random Matrices and Neural Networks Axis 4 : Graphs 27 / 38

74 Research Project : Learning in Large Dimensions/Axis 3 : Random Matrices and Neural Networks 28/38 Axis 3 : Random Matrices and Neural Networks
Baseline Scenario : study of large recurrent neural networks (RNN and ESN)
- dynamical model x_t = S(W x_{t-1} + m u_t + η ε_t) with
  - an n-node network with connectivity W ∈ R^{n×n}
  - activation function S
  - internal noise ε_t (essentially a biological model)
- readout ω ∈ R^n, the only trained part (depth-1 NN), obtained by LS regression
    ω = (X X^T)^{-1} X r
  for a T-long training pair (u, r), u, r ∈ R^T, and X = [x_1, ..., x_T].
Performance Measures : quadratic errors in training and testing
- memory, training : MSE = ‖r − X^T ω‖² for the training pair (u, r), u, r ∈ R^T
- generalization, test : MSE = ‖r̂ − X̂^T ω‖² for test pairs (û, r̂), û, r̂ ∈ R^{T̂}, with ω = ω(u, r).
28 / 38
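A minimal simulation sketch of this setup: a fixed reservoir W = σZ with Z orthogonal (Haar-like), noisy state updates, and a readout trained by least squares; the one-step-ahead prediction of a noisy sine (standing in for the Mackey-Glass series used later) and all hyper-parameter values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
n, T, T_test = 200, 400, 400
sigma, eta = 0.9, 1e-2

Z, _ = np.linalg.qr(rng.standard_normal((n, n)))   # Haar-like orthogonal connectivity
W = sigma * Z
m = rng.standard_normal(n) / np.sqrt(n)            # input weights

def run_reservoir(u, S=np.tanh):
    # x_t = S(W x_{t-1} + m u_t + eta eps_t); returns X = [x_1, ..., x_T] of shape (n, T).
    x, X = np.zeros(n), np.zeros((n, len(u)))
    for t, ut in enumerate(u):
        x = S(W @ x + m * ut + eta * rng.standard_normal(n))
        X[:, t] = x
    return X

u_all = np.sin(0.1 * np.arange(T + T_test + 1)) + 0.01 * rng.standard_normal(T + T_test + 1)
u_train, r_train = u_all[:T], u_all[1:T + 1]        # one-step-ahead prediction task
u_test, r_test = u_all[T:T + T_test], u_all[T + 1:T + T_test + 1]

X = run_reservoir(u_train)
omega = np.linalg.solve(X @ X.T, X @ r_train)       # omega = (X X^T)^{-1} X r
X_test = run_reservoir(u_test)
print("train MSE:", np.mean((r_train - X.T @ omega) ** 2))
print("test  MSE:", np.mean((r_test - X_test.T @ omega) ** 2))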

79 Research Project : Learning in Large Dimensions/Axis 3 : Random Matrices and Neural Networks 29/38 Axis 3 : Random Matrices and Neural Networks
Problems and Objectives
- Performance evaluation is essentially qualitative
- Difficulty linked to the randomness in W and ε_t.
We need here to :
- study asymptotic performance as n, T, T̂ → ∞
- understand the effects of the hyper-parameters defining W
- generalize the study to more advanced models.
Results and Perspectives
- Asymptotic deterministic equivalents for the training and testing MSE
- Multiple new consequences and intuitions
- Proposition of improved structures for W
- Generalization to the non-linear setting
- Introduction of external memory, back-propagation
- Analogous study of deep networks, extreme learning machines, auto-encoders, etc.
29 / 38

86 Research Project : Learning in Large Dimensions/Axis 3 : Random Matrices and Neural Networks 30/38 Axis 3 : Random Matrices and Neural Networks
Theorem ([Couillet,Wainrib 16] Training MSE for fixed W)
As n, T → ∞ with n/T → c < 1,
    MSE = \frac{1}{T} r^T \left( I_T + R + \frac{1}{\eta^2} U^T \left\{ m^T (W^i)^T \tilde{R}^{-1} W^j m \right\}_{i,j=0}^{T-1} U \right)^{-1} r + o(1)
where U_{ij} = u_{i-j} and R, \tilde{R} are the solutions of
    R = \left\{ \frac{c}{n} \mathrm{tr}\left( S_{i-j} \tilde{R}^{-1} \right) \right\}_{i,j=1}^{T}, \qquad \tilde{R} = \sum_q \frac{1}{T} \mathrm{tr}\left( J^q (I_T + R)^{-1} \right) S_q
with [J^q]_{ij} = \delta_{i+q,j} and S_q \equiv \sum_{k \geq 0} W^{k+(-q)^+} \left( W^{k+q^+} \right)^T.
Corollaries :
- for c = 0 (so that \tilde{R} = S_0 = \sum_{k \geq 0} W^k (W^k)^T),
    MSE = \frac{1}{T} r^T \left( I_T + \frac{1}{\eta^2} U^T \left\{ m^T (W^i)^T S_0^{-1} W^j m \right\}_{i,j=0}^{T-1} U \right)^{-1} r + o(1)
- for W = σZ with Z Haar and m independent of W, ‖m‖ = 1,
    MSE = (1-c) \frac{1}{T} r^T \left( I_T + \frac{1}{\eta^2} U^T \mathrm{diag}\left\{ (1-\sigma^2) \sigma^{2(i-1)} \right\}_{i=1}^{T} U \right)^{-1} r + o(1).
30 / 38

89 Research Project : Learning in Large Dimensions/Axis 3 : Random Matrices and Neural Networks 31/38 Axis 3 : Random Matrices and Neural Networks
Figure : Train and test NMSE as a function of η² for prediction of the Mackey-Glass model, W = σZ, σ = .9, Z Haar; n = 200, T = T̂ = 400 (left) and n = 400 (right). Curves : Monte Carlo, theory (fixed W), theory (limit).
31 / 38

90 Research Project : Learning in Large Dimensions/Axis 3 : Random Matrices and Neural Networks 32/38 Axis 3 : Random Matrices and Neural Networks
Consequences : the analysis suggests the choice W = diag(W_1, ..., W_k), W_j = σ_j Z_j, Z_j ∈ R^{n_j×n_j} Haar, leading to the change
    (1-\sigma^2)\sigma^{2\tau} \;\longrightarrow\; MC(\tau) = \frac{\sum_{j=1}^k c_j \sigma_j^{2\tau}}{\sum_{j=1}^k \frac{c_j}{1-\sigma_j^2}}.
Figure : Memory curve MC(τ; W) for W = diag(W_1, W_2, W_3), W_j = σ_j Z_j, Z_j ∈ R^{n_j×n_j} Haar, σ_1 = .99, n_1/n = .01, σ_2 = .9, n_2/n = .1, σ_3 = .5, n_3/n = .89, compared with MC(τ; W_i^+) for the matrices W_i^+ = σ_i Z_i^+, Z_i^+ ∈ R^{n×n} Haar.
32 / 38
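The memory-curve expression above is straightforward to evaluate; below is a small sketch comparing the multi-scale reservoir of the figure with single-scale alternatives, where c_j is taken to be the proportion n_j/n (an assumption consistent with the figure's parameters) and MC reduces to (1−σ²)σ^{2τ} when k = 1.

import numpy as np

def memory_curve(tau, sigmas, cs):
    # MC(tau) = sum_j c_j sigma_j^(2 tau) / sum_j c_j / (1 - sigma_j^2).
    sigmas, cs = np.asarray(sigmas, float), np.asarray(cs, float)
    return np.sum(cs * sigmas ** (2 * tau)) / np.sum(cs / (1 - sigmas ** 2))

taus = np.arange(50)
mc_mix = [memory_curve(t, [0.99, 0.9, 0.5], [0.01, 0.1, 0.89]) for t in taus]  # W of the figure
mc_high = [memory_curve(t, [0.99], [1.0]) for t in taus]                       # W_1^+ = .99 Z_1^+
mc_low = [memory_curve(t, [0.5], [1.0]) for t in taus]                         # W_3^+ = .5 Z_3^+
print("MC(0), MC(10), MC(40), multi-scale :", mc_mix[0], mc_mix[10], mc_mix[40])
print("MC(0), MC(10), MC(40), sigma = .99 :", mc_high[0], mc_high[10], mc_high[40])
print("MC(0), MC(10), MC(40), sigma = .5  :", mc_low[0], mc_low[10], mc_low[40])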

91 Research Project : Learning in Large Dimensions/Axis 4 : Graphs 33/38 Outline Curriculum Vitae Research Project : Learning in Large Dimensions Axis 1 : Robust Estimation in Large Dimensions Axis 2 : Classification in Large Dimensions Axis 3 : Random Matrices and Neural Networks Axis 4 : Graphs 33 / 38

92 Research Project : Learning in Large Dimensions/Axis 4 : Graphs 34/38 Axis 4 : Graphs
Baseline Scenario : analysis of inference methods for large graphs
- community detection on realistic graphs (spectral methods)
- analysis of signal processing on graphs methods.
Tools :
- methods based on the adjacency matrix A, the Laplacian L, or the modularity matrix M :
    L = D − A,    M = A − E[A]    (e.g., if A_ij ~ Bern(q_i q_j), then M = A − q q^T)
- the spectrum and eigenvectors of A, L are fundamental to inference methods
- optimization, regression, PCA, etc., based on spectral properties.
34 / 38
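A minimal sketch of modularity-based spectral community detection as outlined above, on a toy graph with homogeneous connection probabilities q_i; the degree-based estimate of q and the two-class contrast are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(2)
n = 1000
labels_true = np.repeat([0, 1], n // 2)
q = np.full(n, 0.5)                               # homogeneous intrinsic probabilities

# Adjacency A_ij ~ Bern(q_i q_j C_ab), with a mild intra/inter class contrast.
C = np.array([[1.2, 0.8], [0.8, 1.2]])
P = np.outer(q, q) * C[np.ix_(labels_true, labels_true)]
A = np.triu((rng.random((n, n)) < P).astype(float), 1)
A = A + A.T                                       # symmetric adjacency, no self-loops

d = A.sum(axis=1)
q_hat = d / np.sqrt(d.sum())                      # degree-based estimate of q (illustrative)
M = A - np.outer(q_hat, q_hat)                    # modularity matrix, estimate of A - E[A]
v = np.linalg.eigh(M)[1][:, -1]                   # dominant eigenvector of M
labels_hat = (v > 0).astype(int)
print("recovery accuracy:",
      max(np.mean(labels_hat == labels_true), np.mean(labels_hat != labels_true)))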

99 Research Project : Learning in Large Dimensions/Axis 4 : Graphs 35/38 Axis 4 : Graphs
Problems and Objectives
- Community detection relies on methods designed for homogeneous graphs
- Signal processing on graphs has been studied in a purely deterministic manner.
We need here to :
- develop and analyze community detection algorithms for realistic graphs
- analyze the performance of signal processing on graphs methods.
Results and Perspectives
- New algorithms (and their analysis) for community detection with heterogeneous nodes
- Exportation to signal processing on graphs problems.
35 / 38

106 Research Project : Learning in Large Dimensions/Axis 4 : Graphs 36/38 Axis 4 : Graphs
Model : graph G with n nodes and k classes, with, for i ∈ C_a, j ∈ C_b,
    A_{ij} \sim \mathrm{Bern}(q_i q_j C_{ab}),  where  C_{ab} = 1 + n^{-1/2} \Gamma_{ab},  \Gamma_{ab} = O(1).
Limitations of classical approaches :
Normalized modularity (with \hat{q} = \frac{1}{n} A 1_n) :
    L = \mathrm{diag}(\hat{q})^{-1} \left[ A - \frac{\hat{q} \hat{q}^T}{\frac{1}{n} 1_n^T \hat{q}} \right] \mathrm{diag}(\hat{q})^{-1}
Figure : 2nd eigenvector of A (top : degree bias) and 1st eigenvector of L (bottom : corrected bias), for bimodal q_i, 2 classes. Classes C_1 and C_2 in colors.
36 / 38
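A small simulation sketch of this model, comparing the dominant eigenvector of a raw modularity A − q̂q̂^T/((1/n)1_n^T q̂) with that of the degree-normalized version diag(q̂)^{-1}[·]diag(q̂)^{-1} reconstructed above; the bimodal q_i, the affinity matrix Γ and the median-split labelling rule are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(3)
n = 2000
labels = np.repeat([0, 1], n // 2)
q = np.where(rng.random(n) < 0.5, 0.3, 0.8)           # bimodal intrinsic probabilities q_i

Gamma = np.array([[4.0, -4.0], [-4.0, 4.0]])          # class affinities, C_ab = 1 + Gamma_ab / sqrt(n)
P = np.clip(np.outer(q, q) * (1 + Gamma[np.ix_(labels, labels)] / np.sqrt(n)), 0, 1)
A = np.triu((rng.random((n, n)) < P).astype(float), 1)
A = A + A.T                                           # symmetric adjacency, no self-loops

q_hat = A.sum(axis=1) / n                             # hat q = (1/n) A 1_n
M = A - np.outer(q_hat, q_hat) / q_hat.mean()         # A - hat q hat q^T / ((1/n) 1_n^T hat q)
L = M / np.outer(q_hat, q_hat)                        # diag(hat q)^{-1} [.] diag(hat q)^{-1}

for name, B in [("modularity", M), ("normalized modularity", L)]:
    v = np.linalg.eigh(B)[1][:, -1]                   # dominant eigenvector
    lab = (v > np.median(v)).astype(int)
    acc = max(np.mean(lab == labels), np.mean(lab != labels))
    print(name, ": recovery accuracy", round(acc, 3))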

108 Research Project : Learning in Large Dimensions/Axis 4 : Graphs 37/38 Axis 4 : Graphs
Theorem ([Tiomoko Ali,Couillet 16] Limiting Deterministic Equivalent)
As n → ∞, ‖L − L̄‖ → 0 almost surely, with
    \bar{L} = \frac{1}{m_q^2} \left[ \frac{1}{\sqrt{n}} D^{-1} X D^{-1} + U \Lambda U^T \right]
where D = diag({q_i}), m_q = \lim_n \frac{1}{n} \sum_i q_i, and
    U = \left[ \frac{J}{\sqrt{n}},\; D^{-1} \frac{X 1_n}{\sqrt{n}\, m_q} \right], \qquad
    \Lambda = \begin{bmatrix} (I_k - 1_k c^T) \Gamma (I_k - c 1_k^T) & 1_k \\ 1_k^T & 0 \end{bmatrix},
with J = [j_1, ..., j_k], j_a = [0, ..., 0, 1, ..., 1, 0, ..., 0]^T ∈ R^n the canonical vector of class C_a.
Consequences :
- detection based on the eigenvalues of Γ
- alignment of the eigenvectors with the vectors j_a.
37 / 38

110 Research Project : Learning in Large Dimensions/Axis 4 : Graphs 38/38 End Thank you. 38 / 38


High Dimensional Minimum Risk Portfolio Optimization High Dimensional Minimum Risk Portfolio Optimization Liusha Yang Department of Electronic and Computer Engineering Hong Kong University of Science and Technology Centrale-Supélec June 26, 2015 Yang (HKUST)

More information

Inference For High Dimensional M-estimates. Fixed Design Results

Inference For High Dimensional M-estimates. Fixed Design Results : Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and

More information

Least squares regularized or constrained by L0: relationship between their global minimizers. Mila Nikolova

Least squares regularized or constrained by L0: relationship between their global minimizers. Mila Nikolova Least squares regularized or constrained by L0: relationship between their global minimizers Mila Nikolova CMLA, CNRS, ENS Cachan, Université Paris-Saclay, France nikolova@cmla.ens-cachan.fr SIAM Minisymposium

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

Convex Optimization in Classification Problems

Convex Optimization in Classification Problems New Trends in Optimization and Computational Algorithms December 9 13, 2001 Convex Optimization in Classification Problems Laurent El Ghaoui Department of EECS, UC Berkeley elghaoui@eecs.berkeley.edu 1

More information

Empirical properties of large covariance matrices in finance

Empirical properties of large covariance matrices in finance Empirical properties of large covariance matrices in finance Ex: RiskMetrics Group, Geneva Since 2010: Swissquote, Gland December 2009 Covariance and large random matrices Many problems in finance require

More information

The Kernel Trick, Gram Matrices, and Feature Extraction. CS6787 Lecture 4 Fall 2017

The Kernel Trick, Gram Matrices, and Feature Extraction. CS6787 Lecture 4 Fall 2017 The Kernel Trick, Gram Matrices, and Feature Extraction CS6787 Lecture 4 Fall 2017 Momentum for Principle Component Analysis CS6787 Lecture 3.1 Fall 2017 Principle Component Analysis Setting: find the

More information

Reservoir Computing and Echo State Networks

Reservoir Computing and Echo State Networks An Introduction to: Reservoir Computing and Echo State Networks Claudio Gallicchio gallicch@di.unipi.it Outline Focus: Supervised learning in domain of sequences Recurrent Neural networks for supervised

More information

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations

More information

Gaussian Process Optimization with Mutual Information

Gaussian Process Optimization with Mutual Information Gaussian Process Optimization with Mutual Information Emile Contal 1 Vianney Perchet 2 Nicolas Vayatis 1 1 CMLA Ecole Normale Suprieure de Cachan & CNRS, France 2 LPMA Université Paris Diderot & CNRS,

More information

STAT 518 Intro Student Presentation

STAT 518 Intro Student Presentation STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible

More information

On corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR

On corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR Introduction a two sample problem Marčenko-Pastur distributions and one-sample problems Random Fisher matrices and two-sample problems Testing cova On corrections of classical multivariate tests for high-dimensional

More information

Estimation of large dimensional sparse covariance matrices

Estimation of large dimensional sparse covariance matrices Estimation of large dimensional sparse covariance matrices Department of Statistics UC, Berkeley May 5, 2009 Sample covariance matrix and its eigenvalues Data: n p matrix X n (independent identically distributed)

More information

Detection of Anomalies in Texture Images using Multi-Resolution Features

Detection of Anomalies in Texture Images using Multi-Resolution Features Detection of Anomalies in Texture Images using Multi-Resolution Features Electrical Engineering Department Supervisor: Prof. Israel Cohen Outline Introduction 1 Introduction Anomaly Detection Texture Segmentation

More information

ESANN'2001 proceedings - European Symposium on Artificial Neural Networks Bruges (Belgium), April 2001, D-Facto public., ISBN ,

ESANN'2001 proceedings - European Symposium on Artificial Neural Networks Bruges (Belgium), April 2001, D-Facto public., ISBN , Sparse Kernel Canonical Correlation Analysis Lili Tan and Colin Fyfe 2, Λ. Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong. 2. School of Information and Communication

More information

Research Overview. Kristjan Greenewald. February 2, University of Michigan - Ann Arbor

Research Overview. Kristjan Greenewald. February 2, University of Michigan - Ann Arbor Research Overview Kristjan Greenewald University of Michigan - Ann Arbor February 2, 2016 2/17 Background and Motivation Want efficient statistical modeling of high-dimensional spatio-temporal data with

More information

Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms

Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms François Caron Department of Statistics, Oxford STATLEARN 2014, Paris April 7, 2014 Joint work with Adrien Todeschini,

More information

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin 1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)

More information

A spectral clustering algorithm based on Gram operators

A spectral clustering algorithm based on Gram operators A spectral clustering algorithm based on Gram operators Ilaria Giulini De partement de Mathe matiques et Applications ENS, Paris Joint work with Olivier Catoni 1 july 2015 Clustering task of grouping

More information

On corrections of classical multivariate tests for high-dimensional data

On corrections of classical multivariate tests for high-dimensional data On corrections of classical multivariate tests for high-dimensional data Jian-feng Yao with Zhidong Bai, Dandan Jiang, Shurong Zheng Overview Introduction High-dimensional data and new challenge in statistics

More information

A Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement

A Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement A Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement Simon Leglaive 1 Laurent Girin 1,2 Radu Horaud 1 1: Inria Grenoble Rhône-Alpes 2: Univ. Grenoble Alpes, Grenoble INP,

More information

A SIRV-CFAR Adaptive Detector Exploiting Persymmetric Clutter Covariance Structure

A SIRV-CFAR Adaptive Detector Exploiting Persymmetric Clutter Covariance Structure A SIRV-CFAR Adaptive Detector Exploiting Persymmetric Clutter Covariance Structure Guilhem Pailloux 2 Jean-Philippe Ovarlez Frédéric Pascal 3 and Philippe Forster 2 ONERA - DEMR/TSI Chemin de la Hunière

More information

8.1 Concentration inequality for Gaussian random matrix (cont d)

8.1 Concentration inequality for Gaussian random matrix (cont d) MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration

More information

CS534 Machine Learning - Spring Final Exam

CS534 Machine Learning - Spring Final Exam CS534 Machine Learning - Spring 2013 Final Exam Name: You have 110 minutes. There are 6 questions (8 pages including cover page). If you get stuck on one question, move on to others and come back to the

More information

Machine Learning 2nd Edition

Machine Learning 2nd Edition INTRODUCTION TO Lecture Slides for Machine Learning 2nd Edition ETHEM ALPAYDIN, modified by Leonardo Bobadilla and some parts from http://www.cs.tau.ac.il/~apartzin/machinelearning/ The MIT Press, 2010

More information

Data Analysis and Manifold Learning Lecture 6: Probabilistic PCA and Factor Analysis

Data Analysis and Manifold Learning Lecture 6: Probabilistic PCA and Factor Analysis Data Analysis and Manifold Learning Lecture 6: Probabilistic PCA and Factor Analysis Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline of Lecture

More information

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 http://intelligentoptimization.org/lionbook Roberto Battiti

More information

Graphs in Machine Learning

Graphs in Machine Learning Graphs in Machine Learning Michal Valko INRIA Lille - Nord Europe, France Partially based on material by: Ulrike von Luxburg, Gary Miller, Doyle & Schnell, Daniel Spielman January 27, 2015 MVA 2014/2015

More information

Unsupervised Learning

Unsupervised Learning 2018 EE448, Big Data Mining, Lecture 7 Unsupervised Learning Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html ML Problem Setting First build and

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Brown University CSCI 295-P, Spring 213 Prof. Erik Sudderth Lecture 11: Inference & Learning Overview, Gaussian Graphical Models Some figures courtesy Michael Jordan s draft

More information

Novel spectrum sensing schemes for Cognitive Radio Networks

Novel spectrum sensing schemes for Cognitive Radio Networks Novel spectrum sensing schemes for Cognitive Radio Networks Cantabria University Santander, May, 2015 Supélec, SCEE Rennes, France 1 The Advanced Signal Processing Group http://gtas.unican.es The Advanced

More information

ECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction

ECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction ECE 521 Lecture 11 (not on midterm material) 13 February 2017 K-means clustering, Dimensionality reduction With thanks to Ruslan Salakhutdinov for an earlier version of the slides Overview K-means clustering

More information

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted

More information

Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs

Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline of Lecture

More information

Linear Regression and Its Applications

Linear Regression and Its Applications Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start

More information

Lecture 1: From Data to Graphs, Weighted Graphs and Graph Laplacian

Lecture 1: From Data to Graphs, Weighted Graphs and Graph Laplacian Lecture 1: From Data to Graphs, Weighted Graphs and Graph Laplacian Radu Balan February 5, 2018 Datasets diversity: Social Networks: Set of individuals ( agents, actors ) interacting with each other (e.g.,

More information

Bagging and Other Ensemble Methods

Bagging and Other Ensemble Methods Bagging and Other Ensemble Methods Sargur N. Srihari srihari@buffalo.edu 1 Regularization Strategies 1. Parameter Norm Penalties 2. Norm Penalties as Constrained Optimization 3. Regularization and Underconstrained

More information

Data Mining Techniques

Data Mining Techniques Data Mining Techniques CS 622 - Section 2 - Spring 27 Pre-final Review Jan-Willem van de Meent Feedback Feedback https://goo.gl/er7eo8 (also posted on Piazza) Also, please fill out your TRACE evaluations!

More information

Outline lecture 2 2(30)

Outline lecture 2 2(30) Outline lecture 2 2(3), Lecture 2 Linear Regression it is our firm belief that an understanding of linear models is essential for understanding nonlinear ones Thomas Schön Division of Automatic Control

More information

Kernel-Based Contrast Functions for Sufficient Dimension Reduction

Kernel-Based Contrast Functions for Sufficient Dimension Reduction Kernel-Based Contrast Functions for Sufficient Dimension Reduction Michael I. Jordan Departments of Statistics and EECS University of California, Berkeley Joint work with Kenji Fukumizu and Francis Bach

More information

Manifold Learning for Signal and Visual Processing Lecture 9: Probabilistic PCA (PPCA), Factor Analysis, Mixtures of PPCA

Manifold Learning for Signal and Visual Processing Lecture 9: Probabilistic PCA (PPCA), Factor Analysis, Mixtures of PPCA Manifold Learning for Signal and Visual Processing Lecture 9: Probabilistic PCA (PPCA), Factor Analysis, Mixtures of PPCA Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inria.fr http://perception.inrialpes.fr/

More information

Unsupervised Anomaly Detection for High Dimensional Data

Unsupervised Anomaly Detection for High Dimensional Data Unsupervised Anomaly Detection for High Dimensional Data Department of Mathematics, Rowan University. July 19th, 2013 International Workshop in Sequential Methodologies (IWSM-2013) Outline of Talk Motivation

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

Back to the future: Radial Basis Function networks revisited

Back to the future: Radial Basis Function networks revisited Back to the future: Radial Basis Function networks revisited Qichao Que, Mikhail Belkin Department of Computer Science and Engineering Ohio State University Columbus, OH 4310 que, mbelkin@cse.ohio-state.edu

More information

Course 495: Advanced Statistical Machine Learning/Pattern Recognition

Course 495: Advanced Statistical Machine Learning/Pattern Recognition Course 495: Advanced Statistical Machine Learning/Pattern Recognition Deterministic Component Analysis Goal (Lecture): To present standard and modern Component Analysis (CA) techniques such as Principal

More information

Probabilistic & Unsupervised Learning

Probabilistic & Unsupervised Learning Probabilistic & Unsupervised Learning Week 2: Latent Variable Models Maneesh Sahani maneesh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc ML/CSML, Dept Computer Science University College

More information

(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann

(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann (Feed-Forward) Neural Networks 2016-12-06 Dr. Hajira Jabeen, Prof. Jens Lehmann Outline In the previous lectures we have learned about tensors and factorization methods. RESCAL is a bilinear model for

More information

Classification. The goal: map from input X to a label Y. Y has a discrete set of possible values. We focused on binary Y (values 0 or 1).

Classification. The goal: map from input X to a label Y. Y has a discrete set of possible values. We focused on binary Y (values 0 or 1). Regression and PCA Classification The goal: map from input X to a label Y. Y has a discrete set of possible values We focused on binary Y (values 0 or 1). But we also discussed larger number of classes

More information

Tropical Graph Signal Processing

Tropical Graph Signal Processing Tropical Graph Signal Processing Vincent Gripon To cite this version: Vincent Gripon. Tropical Graph Signal Processing. 2017. HAL Id: hal-01527695 https://hal.archives-ouvertes.fr/hal-01527695v2

More information

Probabilistic Graphical Models

Probabilistic Graphical Models 10-708 Probabilistic Graphical Models Homework 3 (v1.1.0) Due Apr 14, 7:00 PM Rules: 1. Homework is due on the due date at 7:00 PM. The homework should be submitted via Gradescope. Solution to each problem

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression

More information

Model Selection and Geometry

Model Selection and Geometry Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model

More information

Patch similarity under non Gaussian noise

Patch similarity under non Gaussian noise The 18th IEEE International Conference on Image Processing Brussels, Belgium, September 11 14, 011 Patch similarity under non Gaussian noise Charles Deledalle 1, Florence Tupin 1, Loïc Denis 1 Institut

More information

WHY ARE DEEP NETS REVERSIBLE: A SIMPLE THEORY,

WHY ARE DEEP NETS REVERSIBLE: A SIMPLE THEORY, WHY ARE DEEP NETS REVERSIBLE: A SIMPLE THEORY, WITH IMPLICATIONS FOR TRAINING Sanjeev Arora, Yingyu Liang & Tengyu Ma Department of Computer Science Princeton University Princeton, NJ 08540, USA {arora,yingyul,tengyu}@cs.princeton.edu

More information

Robust Principal Component Analysis

Robust Principal Component Analysis ELE 538B: Mathematics of High-Dimensional Data Robust Principal Component Analysis Yuxin Chen Princeton University, Fall 2018 Disentangling sparse and low-rank matrices Suppose we are given a matrix M

More information

Predicting Graph Labels using Perceptron. Shuang Song

Predicting Graph Labels using Perceptron. Shuang Song Predicting Graph Labels using Perceptron Shuang Song shs037@eng.ucsd.edu Online learning over graphs M. Herbster, M. Pontil, and L. Wainer, Proc. 22nd Int. Conf. Machine Learning (ICML'05), 2005 Prediction

More information

A GLRT FOR RADAR DETECTION IN THE PRESENCE OF COMPOUND-GAUSSIAN CLUTTER AND ADDITIVE WHITE GAUSSIAN NOISE. James H. Michels. Bin Liu, Biao Chen

A GLRT FOR RADAR DETECTION IN THE PRESENCE OF COMPOUND-GAUSSIAN CLUTTER AND ADDITIVE WHITE GAUSSIAN NOISE. James H. Michels. Bin Liu, Biao Chen A GLRT FOR RADAR DETECTION IN THE PRESENCE OF COMPOUND-GAUSSIAN CLUTTER AND ADDITIVE WHITE GAUSSIAN NOISE Bin Liu, Biao Chen Syracuse University Dept of EECS, Syracuse, NY 3244 email : biliu{bichen}@ecs.syr.edu

More information

Correlation Preserving Unsupervised Discretization. Outline

Correlation Preserving Unsupervised Discretization. Outline Correlation Preserving Unsupervised Discretization Jee Vang Outline Paper References What is discretization? Motivation Principal Component Analysis (PCA) Association Mining Correlation Preserving Discretization

More information

Exploiting Sparse Non-Linear Structure in Astronomical Data

Exploiting Sparse Non-Linear Structure in Astronomical Data Exploiting Sparse Non-Linear Structure in Astronomical Data Ann B. Lee Department of Statistics and Department of Machine Learning, Carnegie Mellon University Joint work with P. Freeman, C. Schafer, and

More information

Scalable robust hypothesis tests using graphical models

Scalable robust hypothesis tests using graphical models Scalable robust hypothesis tests using graphical models Umamahesh Srinivas ipal Group Meeting October 22, 2010 Binary hypothesis testing problem Random vector x = (x 1,...,x n ) R n generated from either

More information

A central limit theorem for an omnibus embedding of random dot product graphs

A central limit theorem for an omnibus embedding of random dot product graphs A central limit theorem for an omnibus embedding of random dot product graphs Keith Levin 1 with Avanti Athreya 2, Minh Tang 2, Vince Lyzinski 3 and Carey E. Priebe 2 1 University of Michigan, 2 Johns

More information

Probabilistic Graphical Models for Image Analysis - Lecture 1

Probabilistic Graphical Models for Image Analysis - Lecture 1 Probabilistic Graphical Models for Image Analysis - Lecture 1 Alexey Gronskiy, Stefan Bauer 21 September 2018 Max Planck ETH Center for Learning Systems Overview 1. Motivation - Why Graphical Models 2.

More information

Deep Learning Basics Lecture 7: Factor Analysis. Princeton University COS 495 Instructor: Yingyu Liang

Deep Learning Basics Lecture 7: Factor Analysis. Princeton University COS 495 Instructor: Yingyu Liang Deep Learning Basics Lecture 7: Factor Analysis Princeton University COS 495 Instructor: Yingyu Liang Supervised v.s. Unsupervised Math formulation for supervised learning Given training data x i, y i

More information

Linear Models for Classification

Linear Models for Classification Linear Models for Classification Oliver Schulte - CMPT 726 Bishop PRML Ch. 4 Classification: Hand-written Digit Recognition CHINE INTELLIGENCE, VOL. 24, NO. 24, APRIL 2002 x i = t i = (0, 0, 0, 1, 0, 0,

More information

Introduction PCA classic Generative models Beyond and summary. PCA, ICA and beyond

Introduction PCA classic Generative models Beyond and summary. PCA, ICA and beyond PCA, ICA and beyond Summer School on Manifold Learning in Image and Signal Analysis, August 17-21, 2009, Hven Technical University of Denmark (DTU) & University of Copenhagen (KU) August 18, 2009 Motivation

More information

STA 414/2104: Lecture 8

STA 414/2104: Lecture 8 STA 414/2104: Lecture 8 6-7 March 2017: Continuous Latent Variable Models, Neural networks With thanks to Russ Salakhutdinov, Jimmy Ba and others Outline Continuous latent variable models Background PCA

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

CS 340 Lec. 18: Multivariate Gaussian Distributions and Linear Discriminant Analysis

CS 340 Lec. 18: Multivariate Gaussian Distributions and Linear Discriminant Analysis CS 3 Lec. 18: Multivariate Gaussian Distributions and Linear Discriminant Analysis AD March 11 AD ( March 11 1 / 17 Multivariate Gaussian Consider data { x i } N i=1 where xi R D and we assume they are

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information