Machine learning for Dynamic Social Network Analysis Manuel Gomez Rodriguez Max Planck Ins7tute for So;ware Systems UC3M, MAY 2017
Interconnected World SOCIAL NETWORKS TRANSPORTATION NETWORKS WORLD WIDE WEB PROTEIN INTERACTIONS INFORMATION NETWORKS INTERNET OF THINGS 2
Many discrete events in con7nuous 7me Qmee, 2013 3
Variety of processes behind these events Events are (noisy) observa7ons of a variety of complex dynamic processes FAST News spread in TwiXer Video becomes viral in Youtube Product reviews and sales in Amazon Ar7cle crea7on in Wikipedia A user gains recogni7on in Quora SLOW in a wide range of temporal scales. 4
Example I: Idea adop7on/viral marke7ng S D means D follows S 3.25pm Bob Chris7ne 3.00pm Beth 3.27pm Joe David 4.15pm t Friggeri et al., 2014 They can have an impact in the off- line world 5
Example II: Informa7on crea7on & cura7on Addi7on Refuta7on t Ques7on Answer Upvote t t
Example III: Learning trajectories 1st year computer science student Introduc9on to programming Discrete math Project presenta9on For/do- while loops Define Set theory func9ons Powerpoint Graph Theory Class vs. Keynote inheritance Export Geometry pptx to pdf t If else How to write Logic switch Private func9ons PP templates Class destructor Plot library 7
Detailed event traces DETAILED TRACES OF ACTIVITY The availability of event traces boosts a new generation of data- driven models and algorithms t 8
Previously: discrete- 7me models & algorithms Epoch 1 Epoch 2 Epoch 3 Epoch 4 Discrete- 7me models ar7ficially introduce epochs: 1. How long is each epoch? Data is very heterogeneous. 2. How to aggregate events within an epoch? 3. What if no event within an epoch? 4. Time is treated as index or condi7oning variable, not easy to deal with 7me- related queries. 9
Outline of the Seminar REPRESENTATION: TEMPORAL POINT PROCESSES 1. Intensity func7on 2. Basic building blocks 3. Superposi7on 4. Marks and SDEs with jumps This lecture APPLICATIONS: MODELS 1. Informa7on propaga7on 2. Opinion dynamics 3. Informa7on reliability 4. Knowledge acquisi7on APPLICATIONS: CONTROL 1. Influence maximiza7on 2. Ac7vity shaping 3. When- to- post Slides/references: learning.mpi-sws.org/uc3m-seminar 10
Representa7on: Temporal Point Processes 1. Intensity func7on 2. Basic building blocks 3. Superposi7on 4. Marks and SDEs with jumps 11
Temporal point processes Temporal point process: A random process whose realiza7on consists of discrete events localized in 7me Discrete events History, Dirac delta func7on Formally: 12
Model 7me as a random variable Prob. between [t, t+dt) density History, Prob. not before t Likelihood of a 7meline: 13
Problems of density parametriza7on (I) It is difficult for model design and interpretability: 1. Densi7es need to integrate to 1 (i.e., par77on func7on) 2. Difficult to combine 7melines 14
Problems of density parametriza7on (II) Difficult to combine 7melines: + = Sum of random processes 15
Intensity func7on density Prob. between [t, t+dt) History, Prob. not before t Intensity: Probability between [t, t+dt) but not before t Observa7on: It is a rate = # of events / unit of 7me 16
Advantages of intensity parametriza7on (I) Suitable for model design and interpretable: 1. Intensi7es only need to be nonnega7ve 2. Easy to combine 7melines 17
Advantages of intensity parametriza7on (II) Easy to combine 7meline: + = Sum of random processes 18
Rela7on between f*, F*, S*, λ* Central quan7ty we will use! 19
Representa7on: Temporal Point Processes 1. Intensity func7on 2. Basic building blocks 3. Superposi7on 4. Marks and SDEs with jumps 20
Poisson process Intensity of a Poisson process Observa7ons: 1. Intensity independent of history 2. Uniformly random occurrence 3. Time interval follows exponen7al distribu7on 21
Fisng a Poisson from (historical) 7meline Maximum likelihood 22
Sampling from a Poisson process We would like to sample: We sample using inversion sampling: 23
Inhomogeneous Poisson process Intensity of an inhomogeneous Poisson process Observa7ons: 1. Intensity independent of history 24
Fisng an inhomogeneous Poisson Maximum likelihood Design such that max. likelihood is convex (and use CVX)
Nonparametric inhomogeneous Poisson process Posi7ve combina7on of (Gaussian) RFB kernels: 26
Sampling from an inhomogeneous Poisson Thinning procedure (similar to rejec7on sampling): 1. Sample from Poisson process with intensity Inversion sampling 2. Generate 3. Keep the sample if Keep sample with prob.
Termina7ng (or survival) process Intensity of a termina7ng (or survival) process Observa7ons: 1. Limited number of occurrences 28
Self- exci7ng (or Hawkes) process History, Intensity of self- exci7ng (or Hawkes) process: Triggering kernel Observa7ons: 1. Clustered (or bursty) occurrence of events 2. Intensity is stochas7c and history dependent 29
Fisng a Hawkes process from a recorded 7meline Maximum likelihood The max. likelihood is jointly convex in and
Sampling from a Hawkes process Thinning procedure (similar to rejec7on sampling): 1. Sample from Poisson process with intensity Inversion sampling 2. Generate 3. Keep the sample if Keep sample with prob. 31
Summary Building blocks to represent different dynamic processes: Poisson processes: Inhomogeneous Poisson processes: We know how to fit them and how to sample from them Termina9ng point processes: Self- exci9ng point processes: 32
Representa7on: Temporal Point Processes 1. Intensity func7on 2. Basic building blocks 3. Superposi7on 4. Marks and SDEs with jumps 33
Superposi7on of processes Sample each intensity + take minimum = Addi7ve intensity 34
Mutually exci7ng process Bob History Chris7ne History Clustered occurrence affected by neighbors 35
Mutually exci7ng termina7ng process Bob Chris7ne History Clustered occurrence affected by neighbors 36
Representa7on: Temporal Point Processes 1. Intensity func7on 2. Basic building blocks 3. Superposi7on 4. Marks and SDEs with jumps 37
Marked temporal point processes Marked temporal point process: A random process whose realiza7on consists of discrete marked events localized in 7me History, 38
Independent iden7cally distributed marks Distribu7on for the marks: Observa7ons: 1. Marks independent of the temporal dynamics 2. Independent iden7cally distributed (I.I.D.) 39
Dependent marks: SDEs with jumps History, Marks given by stochas7c differen7al equa7on with jumps: Observa7ons: Dri; Event influence 1. Marks dependent of the temporal dynamics 2. Defined for all values of t 40
Dependent marks: distribu7on + SDE with jumps History, Distribu7on for the marks: Observa7ons: Dri; Event influence 1. Marks dependent on the temporal dynamics 2. Distribu7on represents addi7onal source of uncertainty 41
Mutually exci7ng + marks Bob Chris7ne Marks affected by neighbors Dri; Neighbor influence 42
REPRESENTATION: TEMPORAL POINT PROCESSES 1. Intensity func7on 2. Basic building blocks 3. Superposi7on 4. Marks and SDEs with jumps APPLICATIONS: MODELS 1. Informa7on propaga7on 2. Opinion dynamics 3. Informa7on reliability 4. Knowledge acquisi7on This lecture Next lecture APPLICATIONS: CONTROL 1. Influence maximiza7on 2. Ac7vity shaping 3. When- to- post Slides/references: learning.mpi-sws.org/sydney-seminar 43