Learning with Temporal Point Processes

Similar documents
Machine learning for Dynamic Social Network Analysis

Epidemics and information spreading

Inferring the origin of an epidemic with a dynamic message-passing algorithm

Time varying networks and the weakness of strong ties

Diffusion of information and social contagion

Pricing of Cyber Insurance Contracts in a Network Model

Automatic Differentiation Equipped Variable Elimination for Sensitivity Analysis on Probabilistic Inference Queries

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

The decoupling assumption in large stochastic system analysis Talk at ECLT

Gaussian Process Vine Copulas for Multivariate Dependence

Kurume University Faculty of Economics Monograph Collection 18. Theoretical Advances and Applications in. Operations Research. Kyushu University Press

Brief Glimpse of Agent-Based Modeling

Inductive Principles for Restricted Boltzmann Machine Learning

6.867 Machine learning, lecture 23 (Jaakkola)

Bayesian Methods in Artificial Intelligence

Graphical Models and Kernel Methods

Wiki Definition. Reputation Systems I. Outline. Introduction to Reputations. Yury Lifshits. HITS, PageRank, SALSA, ebay, EigenTrust, VKontakte

Simulation and Calibration of a Fully Bayesian Marked Multidimensional Hawkes Process with Dissimilar Decays

Log Gaussian Cox Processes. Chi Group Meeting February 23, 2016

Intelligent Systems (AI-2)

Learning from Data. Amos Storkey, School of Informatics. Semester 1. amos/lfd/

Designing and Evaluating Generic Ontologies

Stochastic Processes

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

Interpretable Latent Variable Models

Jointly Clustering Rows and Columns of Binary Matrices: Algorithms and Trade-offs

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Random Walk Based Algorithms for Complex Network Analysis

Discovering Topical Interactions in Text-based Cascades using Hidden Markov Hawkes Processes

OPPA European Social Fund Prague & EU: We invest in your future.

Chapter 9. Non-Parametric Density Function Estimation

Theory and Methods for the Analysis of Social Networks

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014

Identification and Estimation of Causal Effects from Dependent Data

CS 188: Artificial Intelligence Spring Announcements

Northwestern University Department of Electrical Engineering and Computer Science

Web Structure Mining Nodes, Links and Influence

Robotics 2 Data Association. Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Wolfram Burgard

L09. PARTICLE FILTERING. NA568 Mobile Robotics: Methods & Algorithms

Sampling Algorithms for Probabilistic Graphical models

Stat Lecture 20. Last class we introduced the covariance and correlation between two jointly distributed random variables.

Bayesian Inference for Contact Networks Given Epidemic Data

CS224W: Analysis of Networks Jure Leskovec, Stanford University

MODELING, PREDICTING, AND GUIDING USERS TEMPORAL BEHAVIORS. A Dissertation Presented to The Academic Faculty. Yichen Wang

EE562 ARTIFICIAL INTELLIGENCE FOR ENGINEERS

Cost and Preference in Recommender Systems Junhua Chen LESS IS MORE

Introduction to Machine Learning CMU-10701

Approximating the Partition Function by Deleting and then Correcting for Model Edges (Extended Abstract)

Bayesian network modeling. 1

Temporal Networks aka time-varying networks, time-stamped graphs, dynamical networks...

A LRT Framework for Fast Spatial Anomaly Detection*

Integrating Induction and Deduction for Verification and Synthesis

Deep Poisson Factorization Machines: a factor analysis model for mapping behaviors in journalist ecosystem

Sean Escola. Center for Theoretical Neuroscience

Switched systems: stability

Graduate Econometrics I: What is econometrics?

KINETICS OF SOCIAL CONTAGION. János Kertész Central European University. SNU, June

Chapter 9. Non-Parametric Density Function Estimation

Probability Theory for Machine Learning. Chris Cremer September 2015

Convergence of Time Decay for Event Weights

Machine Learning for Computational Advertising

New Algorithms for Contextual Bandits

Probabilistic Guarded Commands Mechanized in HOL

Extreme Value Analysis and Spatial Extremes

Introduction to Artificial Intelligence. Unit # 11

A Tunable Mechanism for Identifying Trusted Nodes in Large Scale Distributed Networks

Machine Learning and Modeling for Social Networks

Temporal point processes: the conditional intensity function

Does Better Inference mean Better Learning?

Lecture 15: MCMC Sanjeev Arora Elad Hazan. COS 402 Machine Learning and Artificial Intelligence Fall 2016

Introduction to Machine Learning CMU-10701

Probability and Information Theory. Sargur N. Srihari

1/30/2012. Reasoning with uncertainty III

Chapter 2 Random Variables

Bayesian Nonparametric Learning of Complex Dynamical Phenomena

Multivariate Hawkes Processes and Their Simulations

12.1 A Polynomial Bound on the Sample Size m for PAC Learning

Reasoning Under Uncertainty: More on BNets structure and construction

CHAPTER 4 PROBABILITY AND COUNTING RULES UC DENVER

Artificial Intelligence Bayesian Networks

Recoverabilty Conditions for Rankings Under Partial Information

Bayesian Learning in Undirected Graphical Models

Introduction to Stochastic SIR Model

Mathematical statistics

Lecture Start

MTTTS16 Learning from Multiple Sources

DOWNLOAD OR READ : CONCEPTS IN PROBABILITY THEORY PDF EBOOK EPUB MOBI

Introduction to Algorithmic Trading Strategies Lecture 3

Lecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu

MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti

Information Recovery from Pairwise Measurements

Testing Monotonicity of Pricing Kernels

Stochastic Spectral Approaches to Bayesian Inference

Sum-Product Networks. STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017

COEVOLVE: A Joint Point Process Model for Information Diffusion and Network Evolution

Greedy Maximization Framework for Graph-based Influence Functions

Modeling Data Correlations in Private Data Mining with Markov Model and Markov Networks. Yang Cao Emory University

Convex polytopes, interacting particles, spin glasses, and finance

Announcements. CS 188: Artificial Intelligence Fall VPI Example. VPI Properties. Reasoning over Time. Markov Models. Lecture 19: HMMs 11/4/2008

Q-Learning and Stochastic Approximation

Transcription:

Learning with Temporal Point Processes t Manuel Gomez Rodriguez MPI for Software Systems Isabel Valera MPI for Intelligent Systems Slides/references: http://learning.mpi-sws.org/tpp-icml18 ICML TUTORIAL, JULY 2018

Many discrete events in continuous Disease dynamics Q m ee, 2013 Online actions Financial trading Mobility dynamics 2

Variety of processes behind these events Events are (noisy) observations of a variety of complex dynamic processes FAST Stock trading News spread in Twitter Flu spreading Ride-sharing requests Article creation in Wikipedia Reviews and sales in Amazon A user s reputation in Quora SLOW in a wide range of temporal scales. 3

Example I: Information propagation S D means D follows S 3.25pm Bob Christine 3.00pm Beth 3.27pm Joe David 4.15pm t Friggeri et al., 2014 They can have an impact in the off-line world 4

Example II: Knowledge creation Addition Refutation t Question Answer Upvote t t

Example III: Human learning 1st year computer science student Introduction to programming Discrete math Project presentation For/do-while loops Define Set theory functions Graph Theory Powerpoint vs. Keynote Class inheritance Export Geometry pptx to pdf t If else How to write Logic switch Private functions PP templates Class destructor Plot library 6

Aren t these event traces just series? t t t t Discrete and continuous s series What about aggregating events in epochs? Discrete events in continuous The framework of temporal point processes provides a native How representation long is each epoch? How to aggregate events per epoch? What if no event in one epoch? What about -related queries? Epoch 1 Epoch 2 Epoch 3 t 7

Outline of the Seminar TEMPORAL POINT PROCESSES (TPPS): INTRO 1. Intensity function 2. Basic building blocks 3. Superposition 4. Marks and SDEs with jumps Next MODELS & INFERENCE 1. Modeling event sequences 2. Clustering event sequences 3. Capturing complex dynamics 4. Causal reasoning on event sequences RL & CONTROL 1. Marked TPPs: a new setting 2. Stochastic optimal control 3. Reinforcement learning Slides/references: learning.mpi-sws.org/tpp-icml18 8

Temporal Point Processes (TPPs): Introduction 1. Intensity function 2. Basic building blocks 3. Superposition 4. Marks and SDEs with jumps 9

Temporal point processes Temporal point process: A random process whose realization consists of discrete events localized in Discrete events History, Dirac delta function Formally: 10

Model as a random variable Prob. between [t, t+dt) density History, Prob. not before t Likelihood of a line: 11

Problems of density parametrization (I) It is difficult for model design and interpretability: 1. Densities need to integrate to 1 (i.e., partition function) 2. Difficult to combine lines 12

Intensity function density Prob. between [t, t+dt) History, Prob. not before t Intensity: Probability between [t, t+dt) but not before t Observation: It is a rate = # of events / unit of 13

Advantages of intensity parametrization (I) Suitable for model design and interpretable: 1. Intensities only need to be nonnegative 2. Easy to combine lines 14

Relation between f*, F*, S*, λ* Central quantity we will use! 15

Representation: Temporal Point Processes 1. Intensity function 2. Basic building blocks 3. Superposition 4. Marks and SDEs with jumps 16

Poisson process Intensity of a Poisson process Observations: 1. Intensity independent of history 2. Uniformly random occurrence 3. Time interval follows exponential distribution 17

Fitting & sampling from a Poisson Fitting by maximum likelihood: Sampling using inversion sampling: 18

Inhomogeneous Poisson process Intensity of an inhomogeneous Poisson process Example: (Independent of history) 19

Fitting & sampling from inhomogeneous Poisson Fitting by maximum likelihood: Sampling using thinning (reject. sampling) + inverse sampling: 1. Sample from Poisson process with intensity using inverse sampling 2. Generate Keep sample with 3. Keep the sample if prob. 20

Terminating (or survival) process Intensity of a terminating (or survival) process Observations: 1. Limited number of occurrences Try sampling and fitting! 21

Self-exciting (or Hawkes) process History, Intensity of self-exciting (or Hawkes) process: Triggering kernel Observations: 1. Clustered (or bursty) occurrence of events 2. Intensity is stochastic and history dependent 22

Fitting a Hawkes process from a recorded line Fitting by maximum likelihood: The max. likelihood is jointly convex in and Sampling using thinning (reject. sampling) + inverse sampling: Key idea: the maximum of the intensity over changes

Summary Building blocks to represent different dynamic processes: Poisson processes: Inhomogeneous Poisson processes: We know how to fit them and how to sample from them Terminating point processes: Self-exciting point processes: 24

Representation: Temporal Point Processes 1. Intensity function 2. Basic building blocks 3. Superposition 4. Marks and SDEs with jumps 25

Mutually exciting process Bob History Christine History Clustered occurrence affected by neighbors 26

Mutually exciting terminating process Bob Christine History Clustered occurrence affected by neighbors 27

Representation: Temporal Point Processes 1. Intensity function 2. Basic building blocks 3. Superposition 4. Marks and SDEs with jumps 28

Marked temporal point processes Marked temporal point process: A random process whose realization consists of discrete marked events localized in History, 29

Independent identically distributed marks Distribution for the marks: Observations: 1. Marks independent of the temporal dynamics 2. Independent identically distributed (I.I.D.) 30

Dependent marks: SDEs with jumps History, Marks given by stochastic differential equation with jumps: Observations: Drift Event influence 1. Marks dependent of the temporal dynamics 2. Defined for all values of t 31

Dependent marks: distribution + SDE with jumps History, Distribution for the marks: Observations: Drift Event influence 1. Marks dependent on the temporal dynamics 2. Distribution represents additional source of uncertainty 32

Mutually exciting + marks Bob Christine Marks affected by neighbors Drift Neighbor influence 33

Marked TPPs as stochastic dynamical systems Example: Susceptible-Infected-Susceptible (SIS) SDE with jumps Susceptible Infected Susceptible Infection rate Node is susceptible It gets infected It recovers If friends are infected, higher infection rate SDE with jumps Recovery rate Self-recovery rate when node gets infected If node recovers, rate to zero Rate increases 34 if node gets treated

Outline of the Seminar TEMPORAL POINT PROCESSES (TPPS): INTRO 1. Intensity function 2. Basic building blocks 3. Superposition 4. Marks and SDEs with jumps MODELS & INFERENCE 1. Modeling event sequences 2. Clustering event sequences 3. Capturing complex dynamics 4. Causal reasoning on event sequences Next RL & CONTROL 1. Marked TPPs: a new setting 2. Stochastic optimal control 3. Reinforcement learning Slides/references: learning.mpi-sws.org/tpp-icml18 35