Exploratory Causal Analysis in Bivariate Time Series Data

Abstract

Many scientific disciplines rely on observational data of systems for which it is difficult (or impossible) to implement controlled experiments, and data analysis techniques are required for identifying causal information and relationships directly from observational data. This need has led to the development of many different time series causality approaches and tools, including transfer entropy, convergent cross-mapping (CCM), and Granger causality statistics. A practicing analyst can explore the literature to find many proposals for identifying drivers and causal connections in time series data sets, but little research exists on how these tools compare to each other in practice. This work introduces and defines exploratory causal analysis (ECA) to address this issue. The motivation is to provide a framework for exploring potential causal structures in time series data sets.

J. M. McCracken. Defense talk for PhD in Physics, Department of Physics and Astronomy. 10:00 AM, November 20, 2015; Exploratory Hall, 3301. Advisor: Dr. Robert Weigel; Committee: Dr. Paul So, Dr. Tim Sauer.

Exploratory Causal Analysis in Bivariate Time Series Data

J. M. McCracken
Department of Physics and Astronomy
George Mason University, Fairfax, VA
November 20, 2015

Outline

1. Motivation
2. Causality studies
3. Data causality
4. Exploratory causal analysis
5. Making an ECA summary
   - Transfer entropy difference
   - Granger causality statistic
   - Pairwise asymmetric inference
   - Weighted mean observed leaning
   - Lagged cross-correlation difference
6. Computational tools for the ECA summary
7. Empirical examples
   - Cooling/Heating System Data
   - Snowfall Data
8. Time series causality as data analysis

Motivation

Data: Consider two sets of time series measurements, X and Y.

Question: Is there evidence that X drives Y?

We were looking for a data analysis approach, i.e., we were looking for analysis tools that
- worked with time series data,
- had straightforward, preferably well-established, interpretations,
- were reliable, and
- did not require studying the (vast) philosophical causality literature.

Essentially, we were looking for a plug-and-play analysis tool. This work stems from our search for such a tool.

Causality studies

The study of causality is as old as science itself. Modern historians credit Aristotle with both the first theory of causality ("four causes") and an early version of the scientific method. The modern study of causality is broadly interdisciplinary; far too broad to review in a short talk. Illari and Russo's textbook [1] provides an overview of causality studies.

[1] Illari, P., & Russo, F. (2014). Causality: Philosophical theory meets scientific practice. Oxford University Press.

Towards a taxonomy of causal studies

Paul Holland identified four types of causal questions [2]:
- the ultimate meaningfulness of the notion of causality
- the details of causal mechanisms
- the causes of a given effect
- the effects of a given cause

Foundational causality: "Is a cause required to precede an effect?" or "How are causes and effects related in space-time?"

Data causality: "Does smoking cause lung cancer?" or "Are traffic accidents caused by rain storms?"

[2] Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396).

Data causality

Data causality is data analysis to draw causal inferences.

Approaches to data causality studies include
- design of experiments (e.g., Fisher randomization)
- potential outcomes (Rubin's counterfactuals)
- directed acyclic graphs (DAGs) with structural equation models (SEMs); popularized by Pearl as structural causal models (SCMs)
- time series causality

There is no consensus on the best approach to data causality; many authors consider their favored approach to be the exclusive correct approach.

Time series causality

Time series causality is data causality with time series data.

Approaches to time series causality can be roughly divided into five categories:
- Granger (model-based approaches)
- Information-theoretic
- State space reconstruction (SSR)
- Correlation
- Penchant

Exploratory causal analysis: Language

Exploring causal structures in data sets is distinct from confirming causal structures in data sets. Causal language used in ECA should not be conflated with other typical uses; i.e., "cause", "effect", "drive", etc. are used as technical terms with definitions unrelated to their common, everyday definitions.

The arrows → and ← will be used as shorthand for causal statements; e.g., "A drives B" will be written as A → B.

Exploratory causal analysis: Assumptions

A cause always precedes an effect. This assumption is required for the operational definitions of causality.

A driver may be present in the data being analyzed. This assumption may lead to issues of confounding.

Exploratory causal analysis: ECA summary vector approach

We will not favor a specific operational definition of causality, so we do not favor any particular tool. Consider a time series pair (X, Y).

ECA summary vector: Define a vector g where each element g_i is defined as either 0 if X → Y, 1 if Y → X, or 2 if no causal inference can be made. The value of each g_i comes from a specific time series causality tool.

ECA summary: The ECA summary is either X → Y, Y → X, or undefined, with X → Y if g_i = 0 for all g_i ∈ g, and Y → X if g_i = 1 for all g_i ∈ g.
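As a minimal sketch, the ECA summary logic can be written directly from this definition (the function name eca_summary is illustrative, not from any package discussed later):

```python
def eca_summary(g):
    """Combine per-tool inferences (0: X -> Y, 1: Y -> X, 2: none)
    into the overall ECA summary."""
    if all(gi == 0 for gi in g):
        return "X -> Y"
    if all(gi == 1 for gi in g):
        return "Y -> X"
    return "undefined"

print(eca_summary([0, 0, 0, 0, 0]))   # "X -> Y"
print(eca_summary([0, 1, 0, 2, 0]))   # "undefined"
```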

Making an ECA summary

Our focus is on time series, so each causal inference g_i ∈ g will be drawn from a tool in each of the five time series causality categories:

g_1: transfer entropy difference (information-theoretic)
g_2: Granger log-likelihood statistics (Granger)
g_3: pairwise asymmetric inference, PAI (SSR)
g_4: weighted mean observed leaning (penchant)
g_5: lagged cross-correlation difference (correlation)

Transfer entropy (g_1): Shannon entropy

The uncertainty that a random variable X takes some specific value X_n is given by the Shannon (or information) entropy,

H_X = -\sum_{n=1}^{N_X} P(X = X_n) \log_2 P(X = X_n)

- P(X = X_n) is the probability that X takes the specific value X_n
- The sum is over all possible values of X_n; n = 1, 2, ..., N_X
- The base of the logarithm sets the entropy units, which is bits here

Transfer entropy (g_1): Shannon entropy example

Binary example (to help with intuition): Consider a coin C that takes the value H with probability p_H and T with probability p_T. The Shannon entropy is

H_C = -(p_H \log_2 p_H + p_T \log_2 p_T)

- Fair coin (completely uncertain of outcome): p_H = p_T = 0.5 ⟹ H_C = 1
- Always heads (or tails) (completely certain of outcome): p_{H(T)} = 0, p_{T(H)} = 1 ⟹ H_C = 0

(Entropy calculations almost always assume 0 log_2 0 := 0.)
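A minimal Python sketch of this calculation, assuming a simple histogram (frequency-count) estimate of P(X = X_n); the bin count is a free parameter of the sketch:

```python
import numpy as np

def shannon_entropy(samples, bins=16):
    """Estimate H_X in bits from a 1-D sample, using a histogram
    (frequency-count) estimate of P(X = X_n)."""
    counts, _ = np.histogram(samples, bins=bins)
    p = counts / counts.sum()        # empirical probabilities
    p = p[p > 0]                     # convention: 0 log2 0 := 0
    return -np.sum(p * np.log2(p))

# Fair-coin check: two equally likely outcomes give H = 1 bit
print(shannon_entropy(np.array([0, 1] * 500), bins=2))   # 1.0
```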

Transfer entropy (g_1): Mutual information

A pair of random variables (X, Y) have some mutual information given by

I_{X;Y} = H_X + H_Y - H_{X,Y} = \sum_{n=1}^{N_X} \sum_{m=1}^{N_Y} P(X = X_n, Y = Y_m) \log_2 \frac{P(X = X_n, Y = Y_m)}{P(X = X_n) P(Y = Y_m)}

- P(X = X_n, Y = Y_m) is the probability that X takes the specific value X_n and Y takes the specific value Y_m
- If X and Y are independent, then P(X = X_n, Y = Y_m) = P(X = X_n) P(Y = Y_m) ⟹ I_{X;Y} = 0
- The mutual information is symmetric; i.e., I_{X;Y} = I_{Y;X}
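The same histogram approach extends to the mutual information; a sketch, again with the bin count as an assumed free parameter:

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of I(X;Y) in bits for a sampled pair (X, Y)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()               # P(X = X_n, Y = Y_m)
    p_x = p_xy.sum(axis=1, keepdims=True)    # marginal P(X = X_n)
    p_y = p_xy.sum(axis=0, keepdims=True)    # marginal P(Y = Y_m)
    mask = p_xy > 0
    return np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x * p_y)[mask]))
```

For independent samples this returns a value near zero (slightly above, due to finite-sample bias), and mutual_information(x, y) equals mutual_information(y, x), matching the symmetry noted above.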

Schreiber proposed an extension of the mutual information to measure information flow by making it conditional and including assumptions about the temporal behavior of X and Y.

Transfer entropy (g_1): Information flow

Suppose X and Y are both Markov processes. The directed flow of information from Y to X is given by the transfer entropy,

T_{Y \to X} = \sum_{n=1}^{N_X} \sum_{m=1}^{N_Y} p_{n+1,n,m} \log_2 \frac{p_{n+1|n,m}}{p_{n+1|n}}

with

p_{n+1,n,m} = P(X(t+1) = X_{n+1}, X(t) = X_n, Y(\tau) = Y_m)
p_{n+1|n,m} = P(X(t+1) = X_{n+1} \mid X(t) = X_n, Y(\tau) = Y_m)
p_{n+1|n} = P(X(t+1) = X_{n+1} \mid X(t) = X_n)

There is no directed information flow from Y to X if X is conditionally independent of Y; i.e., p_{n+1|n,m} = p_{n+1|n} ⟹ T_{Y → X} = 0.

Operational causality (information-theoretic): X causes Y if the directed information flow from X to Y is higher than the directed information flow from Y to X; i.e.,

T_{X → Y} - T_{Y → X} > 0 ⟹ X → Y
T_{X → Y} - T_{Y → X} < 0 ⟹ Y → X
T_{X → Y} - T_{Y → X} = 0 ⟹ no causal inference
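A rough Python sketch of the transfer entropy difference, assuming single-step histories and histogram probability estimates; practical estimators (e.g., the JIDT toolkit mentioned later) are considerably more careful:

```python
import numpy as np

def transfer_entropy(source, target, bins=8):
    """Histogram estimate (in bits) of the transfer entropy from
    `source` to `target`, using one time step of history for each."""
    x_next, x_now, y_now = target[1:], target[:-1], source[:-1]
    joint, _ = np.histogramdd(np.column_stack([x_next, x_now, y_now]),
                              bins=bins)
    p_nnm = joint / joint.sum()                    # p_{n+1,n,m}
    p_nm = p_nnm.sum(axis=0, keepdims=True)        # P(X(t), Y)
    p_nn = p_nnm.sum(axis=2, keepdims=True)        # P(X(t+1), X(t))
    p_n = p_nnm.sum(axis=(0, 2), keepdims=True)    # P(X(t))
    # p_{n+1|n,m} / p_{n+1|n} = (p_nnm / p_nm) / (p_nn / p_n)
    num, den = p_nnm * p_n, p_nn * p_nm
    mask = p_nnm > 0
    return np.sum(p_nnm[mask] * np.log2(num[mask] / den[mask]))

def te_difference(x, y, bins=8):
    """g_1: T_{X->Y} - T_{Y->X}; positive suggests X -> Y."""
    return transfer_entropy(x, y, bins) - transfer_entropy(y, x, bins)
```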

Granger causality (g_2): Granger's axioms

Consider a discrete universe with two time series X = {X_t | t = 1, ..., n} and Y = {Y_t | t = 1, ..., n}, where t = n is considered the present time. All knowledge available in the universe at all times t ≤ n is denoted as Ω_n.

Axiom 1: The past and present may cause the future, but the future cannot cause the past.

Axiom 2: Ω_n contains no redundant information, so that if some variable Z is functionally related to one or more other variables, in a deterministic fashion, then Z should be excluded from Ω_n.

Granger's definition of causality: Given some set A, Y causes X if

P(X_{n+1} ∈ A | Ω_n) ≠ P(X_{n+1} ∈ A | Ω_n − Y)

Granger's original goal was to make this notion of causality operational.

Granger causality (g_2): VAR models

Consider a time series pair (X, Y). Suppose there is a vector autoregressive (VAR) model that describes the pair,

\begin{pmatrix} X_t \\ Y_t \end{pmatrix} = \sum_{i=1}^{n} \begin{pmatrix} A_{11}^i & A_{12}^i \\ A_{21}^i & A_{22}^i \end{pmatrix} \begin{pmatrix} X_{t-i} \\ Y_{t-i} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{pmatrix}

The current time step t of X and Y is modeled as a sum of n past steps of X and Y, plus uncorrelated noise terms.

Granger causality (g_2): Comparison of VAR models

Consider two different VAR models for the pair (X, Y),

\begin{pmatrix} X_t \\ Y_t \end{pmatrix} = \sum_{i=1}^{n} \begin{pmatrix} A_{xx,i} & A_{xy,i} \\ A_{yx,i} & A_{yy,i} \end{pmatrix} \begin{pmatrix} X_{t-i} \\ Y_{t-i} \end{pmatrix} + \begin{pmatrix} \varepsilon_{x,t} \\ \varepsilon_{y,t} \end{pmatrix}

\begin{pmatrix} X_t \\ Y_t \end{pmatrix} = \sum_{i=1}^{n} \begin{pmatrix} A'_{xx,i} & 0 \\ 0 & A'_{yy,i} \end{pmatrix} \begin{pmatrix} X_{t-i} \\ Y_{t-i} \end{pmatrix} + \begin{pmatrix} \varepsilon'_{x,t} \\ \varepsilon'_{y,t} \end{pmatrix}

The G-causality log-likelihood statistic is defined as

F_{Y \to X} = \ln \frac{|\Sigma'_{xx}|}{|\Sigma_{xx}|}

where Σ'_xx is the covariance of the X model residuals given no dependence on Y, and Σ_xx is the covariance of the X model residuals given a possible dependence on Y.

If both VAR models fit (or "forecast") the data equally well, then there is no G-causality; i.e., Σ'_xx = Σ_xx ⟹ F_{Y → X} = 0.

Operational causality (Granger): X causes Y if the X-dependent forecast of Y decreases the Y model residual covariance (as compared to the X-independent forecast) more than the Y-dependent forecast of X decreases the X model residual covariance (as compared to the Y-independent forecast); i.e.,

F_{X → Y} - F_{Y → X} > 0 ⟹ X → Y
F_{X → Y} - F_{Y → X} < 0 ⟹ Y → X
F_{X → Y} - F_{Y → X} = 0 ⟹ no causal inference
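A sketch of the statistic using ordinary least squares, with scalar residual variances standing in for the residual covariance determinants of the bivariate case; the lag order n_lags is an assumption of the sketch:

```python
import numpy as np

def residual_variance(target, predictors, n_lags):
    """Residual variance of an OLS regression of `target` on
    `n_lags` lagged copies of each series in `predictors`."""
    rows = [[s[t - i] for s in predictors for i in range(1, n_lags + 1)] + [1.0]
            for t in range(n_lags, len(target))]
    A, b = np.array(rows), target[n_lags:]
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.var(b - A @ coef)

def granger_f(x, y, n_lags=2):
    """F_{Y->X}: compare forecasts of X with and without Y's past."""
    restricted = residual_variance(x, [x], n_lags)   # no dependence on Y
    full = residual_variance(x, [x, y], n_lags)      # possible dependence on Y
    return np.log(restricted / full)

# g_2 sign: granger_f(y, x) - granger_f(x, y) > 0 suggests X -> Y
```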

Pairwise asymmetric inference (g_3): State space reconstruction

Consider an embedding of the time series X = {x_t | t = 0, 1, ..., L-1, L} constructed from delayed time steps as \tilde{X} = \{\tilde{x}_t \mid t = 1 + (E-1)\tau, \ldots, L\} with

\tilde{x}_t = (x_t, x_{t-\tau}, x_{t-2\tau}, \ldots, x_{t-(E-1)\tau})

- τ is the delay time step
- E is the embedding dimension

Pairwise asymmetric inference (g_3): Cross-mapping

Consider a time series pair (X, Y). The shadow manifold of X (labeled \tilde{X}) is constructed from the points

\tilde{x}_t = (x_t, x_{t-\tau}, x_{t-2\tau}, \ldots, x_{t-(E-1)\tau}, y_t)

1. Find the n nearest neighbors to \tilde{x}_t (in \tilde{X}), where "nearest" means smallest Euclidean distance d; i.e., d_1 < d_2 < ... < d_n.
2. Create weights, w, from the nearest neighbors as

w_i = \frac{e^{-d_i/d_1}}{\sum_{j=1}^{n} e^{-d_j/d_1}}

3. Construct the cross-mapped estimate of Y using the weights as

\hat{Y}|\tilde{X} = \left\{ \hat{Y}_t|\tilde{X} = \sum_{i=1}^{n} w_i Y_{\hat{t}_i} \;\middle|\; t = 1 + (E-1)\tau, \ldots, L \right\}

where \hat{t}_i is the time index of the i-th nearest neighbor. Each cross-mapped point in the estimate of Y depends on comparisons of the pasts of X and the presents of X and Y.

Pairwise asymmetric inference (g_3): Cross-mapped correlation

A good cross-mapped estimate is defined as one that is strongly correlated with the original time series. The cross-mapped correlation is

C_{YX} = \left[ \rho(Y, \hat{Y}|\tilde{X}) \right]^2

where ρ(·) is Pearson's correlation coefficient.

Cross-mapping interpretation: If similar histories of X (i.e., nearest neighbors in the shadow manifold) capably estimate Y (i.e., lead to C_YX ≈ 1, or at least C_YX ≫ 0), then the presence (or action) of Y in the system has been recorded in X.

Pairwise asymmetric inference (g_3): Cross-mapping interpretation of causality

A time series pair (X, Y) will have two cross-mapped correlations, C_YX and C_XY.

Operational causality (SSR): X causes Y if similar histories of Y estimate X better than similar histories of X estimate Y, where the similar histories of one time series are used to estimate another time series through shadow manifold nearest neighbor weighting (cross-mapping); i.e.,

C_YX - C_XY < 0 ⟹ X → Y
C_YX - C_XY > 0 ⟹ Y → X
C_YX - C_XY = 0 ⟹ no causal inference
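A compact Python sketch of the whole PAI pipeline, assuming numpy arrays, Euclidean distances, and n = E + 2 nearest neighbors (the neighbor count is an assumption of the sketch); this illustrates the construction above, not the C++ implementation used for the results:

```python
import numpy as np

def shadow_manifold(x, y, E=3, tau=1):
    """PAI shadow manifold of X: delay vectors of X with the
    simultaneous value of Y appended as the last coordinate."""
    start = (E - 1) * tau
    pts = [[x[t - i * tau] for i in range(E)] + [y[t]]
           for t in range(start, len(x))]
    return np.array(pts), start

def cross_map_correlation(x, y, E=3, tau=1):
    """C_YX: squared Pearson correlation between Y and its
    cross-mapped estimate built from the shadow manifold of X."""
    M, start = shadow_manifold(x, y, E, tau)
    n = E + 2                                  # number of nearest neighbors
    est = np.empty(len(M))
    for t, pt in enumerate(M):
        d = np.linalg.norm(M - pt, axis=1)
        d[t] = np.inf                          # exclude the point itself
        nn = np.argsort(d)[:n]                 # n nearest neighbors
        w = np.exp(-d[nn] / max(d[nn[0]], 1e-12))  # guard zero distance
        w /= w.sum()
        est[t] = w @ y[start + nn]             # weighted neighbor average
    return np.corrcoef(y[start:], est)[0, 1] ** 2

# g_3 sign: cross_map_correlation(x, y) - cross_map_correlation(y, x) < 0
# suggests X -> Y (the C_YX - C_XY convention above)
```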

Weighted mean observed leaning (g_4): Causal penchant

The causal penchant ρ_EC ∈ [-1, 1] is

\rho_{EC} = P(E \mid C) - P(E \mid \bar{C})

- P(E | C) is the probability of some effect E given some cause C
- P(E | C̄) is the probability of some effect E given no cause C
- So, the penchant is the probability of an effect E given a cause C minus the probability of that effect without the cause
- In the psychology/medical literature, the causal penchant is known as the Eells measure of causal strength or probability contrast
- If C drives E, then it is expected that ρ_EC > 0

The second term, P(E | C̄), can be eliminated from the penchant formula using Bayes' theorem.

Weighted mean observed leaning (g_4): Causal penchant

Using Bayes' theorem, the causal penchant ρ_EC ∈ [-1, 1] can be written as

\rho_{EC} = P(E \mid C) \left( 1 + \frac{P(C)}{1 - P(C)} \right) - \frac{P(E)}{1 - P(C)}

If E and C are independent, then P(E | C) = P(E), which implies

\rho_{EC} = \frac{P(E)}{1 - P(C)} - \frac{P(E)}{1 - P(C)} = 0

Example (to help with intuition): Consider C and E to be two fair coins, c_1 and c_2, being "heads"; i.e., P(c_1 = heads) = 0.5 and P(c_2 = heads) = 0.5. If the coins are independent, then P(c_2 = heads | c_1 = heads) = P(c_2 = heads) = 0.5 ⟹ ρ_EC = 0. If they are completely dependent, then P(c_2 = heads | c_1 = heads) = 1 or 0 ⟹ ρ_EC = 1 or -1.

This formula has the additional benefit of only needing to estimate one conditional probability from the data.
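A quick numeric check of the Bayes-theorem form on the coin example:

```python
p_c = p_e = 0.5                       # two fair coins

# Independent coins: P(E|C) = P(E) = 0.5
p_e_c = 0.5
print(p_e_c * (1 + p_c / (1 - p_c)) - p_e / (1 - p_c))   # 0.0

# Completely dependent coins: P(E|C) = 1
p_e_c = 1.0
print(p_e_c * (1 + p_c / (1 - p_c)) - p_e / (1 - p_c))   # 1.0
```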

Weighted mean observed leaning (g_4): Causal leaning

A difference of penchants can be used to compare different cause-effect assignments (i.e., different assumptions of what should be considered a cause and what should be considered an effect). The leaning is

\lambda_{EC} = \rho_{EC} - \rho_{CE}

Leaning interpretation: If λ_EC > 0, then C drives E more than E drives C.

Weighted mean observed leaning (g_4): Usefulness of the leaning

The usefulness of the leaning depends on two things:
1. Operational definitions of C and E (called the cause-effect assignment)
2. Estimations of P(C), P(E), P(C | E), and P(E | C) from the data

The primary cause-effect assignment will be the l-standard assignment.

l-standard assignment: Consider a time series pair (X, Y). The l-standard assignment initially assumes the cause is the l-lagged time step of X and the effect is the current time step of Y; i.e., {C, E} = {x_{t-l}, y_t}.

Probabilities will be estimated using data frequency counts.

Weighted mean observed leaning (g_4): Leaning from the data

The cause-effect assignment must be specific if the probabilities are to be estimated with frequency counts, and it needs to include tolerance domains to account for noise in the measurements.

Weighted mean observed leaning (g_4): Leaning from the data

Consider the time series pair (X, Y). The penchant calculation depends on the conditional P(y_t = a | x_{t-l} = b), where a ∈ Y and b ∈ X. This conditional will be estimated as

P(y_t \in [a - \delta_y^L, a + \delta_y^R] \mid x_{t-l} \in [b - \delta_x^L, b + \delta_x^R]) = \frac{n_{a|b}}{n_b}

- n_{a|b} is the number of times y_t ∈ [a - δ_y^L, a + δ_y^R] and x_{t-l} ∈ [b - δ_x^L, b + δ_x^R] in (X, Y)
- n_b is the number of times x_{t-l} ∈ [b - δ_x^L, b + δ_x^R] in X
- The tolerance domains are usually considered symmetric; i.e., δ_x^L = δ_x^R and δ_y^L = δ_y^R

The causal inference implied by the leaning calculations depends on both the cause-effect assignment and the tolerance domains.

Weighted mean observed leaning (g_4): Weighted mean

Any time series pair (X, Y) will have many leanings; e.g., an l-standard assignment of {C, E} = {x_{t-l} = b ± δ_x, y_t = a ± δ_y} will have a different leaning calculation for each x_{t-l} ∈ [b - δ_x, b + δ_x] and y_t ∈ [a - δ_y, a + δ_y].

Weighted mean observed leaning (g_4): Weighted mean

Consider a time series pair (X, Y) and some cause-effect assignment {C, E} for which reasonable tolerance domains have been defined. Any penchant calculation for which the (estimated) conditional P(E | C) ≠ 0 (or P(C | E) ≠ 0) is called an observed penchant.

The weighted mean observed penchant, ⟨ρ_EC⟩_w, is the weighted algebraic mean of the observed penchants. The weighted mean observed leaning, ⟨λ_EC⟩_w, is the difference of the weighted mean observed penchants; i.e.,

\langle \lambda_{EC} \rangle_w = \langle \rho_{EC} \rangle_w - \langle \rho_{CE} \rangle_w

Weighted mean observed leaning (g_4): Causal inference

Operational causality (penchant): X causes Y if the weighted mean observed leaning is positive given a cause-effect assignment (and reasonable tolerance domains) in which the assumed cause X precedes the assumed effect Y; i.e.,

⟨λ_EC⟩_w > 0 ⟹ X → Y
⟨λ_EC⟩_w < 0 ⟹ Y → X
⟨λ_EC⟩_w = 0 ⟹ no causal inference

given C ∈ X, E ∈ Y, and C precedes E.
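A rough sketch of the weighted mean observed leaning under the l-standard assignment, using frequency counts with a symmetric tolerance of half-width δ; weighting each observed penchant by its observation count is an assumption of this sketch, not a detail fixed by the slides:

```python
import numpy as np

def mean_observed_penchant(cause, effect, delta):
    """Weighted mean of the observed penchants for one cause-effect
    assignment, with probabilities from frequency counts and a
    symmetric tolerance domain of half-width `delta`."""
    penchants, weights = [], []
    for b, a in zip(cause, effect):
        near_b = np.abs(cause - b) <= delta
        near_ab = near_b & (np.abs(effect - a) <= delta)
        n_b, n_ab = near_b.sum(), near_ab.sum()
        if n_ab == 0 or n_b == len(cause):
            continue                            # unobserved, or P(C) = 1
        p_e_c = n_ab / n_b                      # P(E|C)
        p_c = n_b / len(cause)                  # P(C)
        p_e = np.mean(np.abs(effect - a) <= delta)   # P(E)
        penchants.append(p_e_c * (1 + p_c / (1 - p_c)) - p_e / (1 - p_c))
        weights.append(n_ab)
    return np.average(penchants, weights=weights)

def leaning(x, y, lag=1, delta=0.1):
    """<lambda_EC>_w under the l-standard assignment {C, E} = {x_{t-l}, y_t};
    positive values suggest X -> Y."""
    c, e = x[:-lag], y[lag:]
    return (mean_observed_penchant(c, e, delta)
            - mean_observed_penchant(e, c, delta))
```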

Lagged cross-correlation difference (g_5): Cross-correlation

The cross-correlation between two time series X and Y is

\rho_{xy} = \frac{E[(x_t - \mu_X)(y_t - \mu_Y)]}{\sqrt{\sigma_X^2 \sigma_Y^2}}

- Every point in X is compared to the mean of X
- Every point in Y is compared to the mean of Y
- The product of the individual variances of X and Y is used as a normalization

Example (to help with intuition): If X = Y, then

\rho_{xy} = \frac{E[(x_t - \mu_X)^2]}{\sigma_X^2} = \frac{\sigma_X^2}{\sigma_X^2} = 1

Lagged cross-correlation difference (g_5): Lagged cross-correlation

Consider a time series pair (X, Y). The past of Y may be compared to the present of X by introducing a lag l into the cross-correlation calculation,

\rho_{xy}^l = \frac{E[(x_t - \mu_X)(y_{t-l} - \mu_Y)]}{\sqrt{\sigma_X^2 \sigma_Y^2}}

Operational causality (correlation): X causes Y (at lag l) if the past of X (i.e., X lagged by l time steps) is more strongly correlated with the present of Y than the past of Y (i.e., Y lagged by l time steps) is with the present of X; i.e.,

ρ_xy^l - ρ_yx^l < 0 ⟹ X → Y
ρ_xy^l - ρ_yx^l > 0 ⟹ Y → X
ρ_xy^l - ρ_yx^l = 0 ⟹ no causal inference
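A minimal sketch of the lagged cross-correlation difference (np.corrcoef subtracts the means and normalizes by the standard deviations, matching the formula above):

```python
import numpy as np

def lagged_xcorr(x, y, lag):
    """rho_xy^l: present of X correlated with Y lagged by l >= 1 steps."""
    return np.corrcoef(x[lag:], y[:-lag])[0, 1]

def xcorr_difference(x, y, lag=1):
    """g_5: rho_xy^l - rho_yx^l; negative suggests X -> Y."""
    return lagged_xcorr(x, y, lag) - lagged_xcorr(y, x, lag)
```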

Computational tools for the ECA summary

Open source packages are available for some of the mentioned time series causality tools, and others required code to be developed from scratch:

g_1: Java Information Dynamics Toolkit (JIDT)
g_2: Multivariate Granger Causality (MVGC) MATLAB toolbox
g_3: code developed from scratch (C++)
g_4: code developed from scratch (MATLAB)
g_5: code developed from scratch (MATLAB)

All the code is available at

Cooling/Heating System Data: Time series data

Consider a time series pair (X, Y) where X contains indoor temperature measurements (in degrees Celsius) in a house with experimental environmental controls and Y is the temperature outside of that house, measured at the same time intervals (168 measurements in each series). [3] [Figure: plots of x_t and y_t versus t for the series X and Y.]

The intuitive causal inference is Y → X.

[3] This data was originally presented at a time series conference. The abstract is available here, and the data is also available as part of the UCI Machine Learning Repository.


More information

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS

ANALYTIC COMPARISON. Pearl and Rubin CAUSAL FRAMEWORKS ANALYTIC COMPARISON of Pearl and Rubin CAUSAL FRAMEWORKS Content Page Part I. General Considerations Chapter 1. What is the question? 16 Introduction 16 1. Randomization 17 1.1 An Example of Randomization

More information

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007 MIT OpenCourseWare http://ocw.mit.edu HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing Spring 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Technical Track Session I: Causal Inference

Technical Track Session I: Causal Inference Impact Evaluation Technical Track Session I: Causal Inference Human Development Human Network Development Network Middle East and North Africa Region World Bank Institute Spanish Impact Evaluation Fund

More information

Quantitative Biology Lecture 3

Quantitative Biology Lecture 3 23 nd Sep 2015 Quantitative Biology Lecture 3 Gurinder Singh Mickey Atwal Center for Quantitative Biology Summary Covariance, Correlation Confounding variables (Batch Effects) Information Theory Covariance

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

Review of Statistics

Review of Statistics Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and

More information

Probability and Information Theory. Sargur N. Srihari

Probability and Information Theory. Sargur N. Srihari Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal

More information

ECE531: Principles of Detection and Estimation Course Introduction

ECE531: Principles of Detection and Estimation Course Introduction ECE531: Principles of Detection and Estimation Course Introduction D. Richard Brown III WPI 22-January-2009 WPI D. Richard Brown III 22-January-2009 1 / 37 Lecture 1 Major Topics 1. Web page. 2. Syllabus

More information

9 Graphical modelling of dynamic relationships in multivariate time series

9 Graphical modelling of dynamic relationships in multivariate time series 9 Graphical modelling of dynamic relationships in multivariate time series Michael Eichler Institut für Angewandte Mathematik Universität Heidelberg Germany SUMMARY The identification and analysis of interactions

More information

Machine Learning Recitation 8 Oct 21, Oznur Tastan

Machine Learning Recitation 8 Oct 21, Oznur Tastan Machine Learning 10601 Recitation 8 Oct 21, 2009 Oznur Tastan Outline Tree representation Brief information theory Learning decision trees Bagging Random forests Decision trees Non linear classifier Easy

More information

Lecture 1: Bayesian Framework Basics

Lecture 1: Bayesian Framework Basics Lecture 1: Bayesian Framework Basics Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de April 21, 2014 What is this course about? Building Bayesian machine learning models Performing the inference of

More information

Recitation 2: Probability

Recitation 2: Probability Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions

More information

Basic Probability. Introduction

Basic Probability. Introduction Basic Probability Introduction The world is an uncertain place. Making predictions about something as seemingly mundane as tomorrow s weather, for example, is actually quite a difficult task. Even with

More information

Review (Probability & Linear Algebra)

Review (Probability & Linear Algebra) Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint

More information

Joint Gaussian Graphical Model Review Series I

Joint Gaussian Graphical Model Review Series I Joint Gaussian Graphical Model Review Series I Probability Foundations Beilun Wang Advisor: Yanjun Qi 1 Department of Computer Science, University of Virginia http://jointggm.org/ June 23rd, 2017 Beilun

More information

Introduction to Artificial Intelligence. Unit # 11

Introduction to Artificial Intelligence. Unit # 11 Introduction to Artificial Intelligence Unit # 11 1 Course Outline Overview of Artificial Intelligence State Space Representation Search Techniques Machine Learning Logic Probabilistic Reasoning/Bayesian

More information

Lecture 9. Time series prediction

Lecture 9. Time series prediction Lecture 9 Time series prediction Prediction is about function fitting To predict we need to model There are a bewildering number of models for data we look at some of the major approaches in this lecture

More information

Modern Navigation. Thomas Herring

Modern Navigation. Thomas Herring 12.215 Modern Navigation Thomas Herring Estimation methods Review of last class Restrict to basically linear estimation problems (also non-linear problems that are nearly linear) Restrict to parametric,

More information

Title. Description. var intro Introduction to vector autoregressive models

Title. Description. var intro Introduction to vector autoregressive models Title var intro Introduction to vector autoregressive models Description Stata has a suite of commands for fitting, forecasting, interpreting, and performing inference on vector autoregressive (VAR) models

More information

Elementary Discrete Probability

Elementary Discrete Probability Elementary Discrete Probability MATH 472 Financial Mathematics J Robert Buchanan 2018 Objectives In this lesson we will learn: the terminology of elementary probability, elementary rules of probability,

More information

INTRODUCTION TO INFORMATION THEORY

INTRODUCTION TO INFORMATION THEORY INTRODUCTION TO INFORMATION THEORY KRISTOFFER P. NIMARK These notes introduce the machinery of information theory which is a eld within applied mathematics. The material can be found in most textbooks

More information

Review of probabilities

Review of probabilities CS 1675 Introduction to Machine Learning Lecture 5 Density estimation Milos Hauskrecht milos@pitt.edu 5329 Sennott Square Review of probabilities 1 robability theory Studies and describes random processes

More information

Some Concepts of Probability (Review) Volker Tresp Summer 2018

Some Concepts of Probability (Review) Volker Tresp Summer 2018 Some Concepts of Probability (Review) Volker Tresp Summer 2018 1 Definition There are different way to define what a probability stands for Mathematically, the most rigorous definition is based on Kolmogorov

More information

1 Random walks and data

1 Random walks and data Inference, Models and Simulation for Complex Systems CSCI 7-1 Lecture 7 15 September 11 Prof. Aaron Clauset 1 Random walks and data Supposeyou have some time-series data x 1,x,x 3,...,x T and you want

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna

More information

Lecture 5 - Information theory

Lecture 5 - Information theory Lecture 5 - Information theory Jan Bouda FI MU May 18, 2012 Jan Bouda (FI MU) Lecture 5 - Information theory May 18, 2012 1 / 42 Part I Uncertainty and entropy Jan Bouda (FI MU) Lecture 5 - Information

More information

Directed and Undirected Graphical Models

Directed and Undirected Graphical Models Directed and Undirected Davide Bacciu Dipartimento di Informatica Università di Pisa bacciu@di.unipi.it Machine Learning: Neural Networks and Advanced Models (AA2) Last Lecture Refresher Lecture Plan Directed

More information

Expectations and Variance

Expectations and Variance 4. Model parameters and their estimates 4.1 Expected Value and Conditional Expected Value 4. The Variance 4.3 Population vs Sample Quantities 4.4 Mean and Variance of a Linear Combination 4.5 The Covariance

More information

Machine learning - HT Maximum Likelihood

Machine learning - HT Maximum Likelihood Machine learning - HT 2016 3. Maximum Likelihood Varun Kanade University of Oxford January 27, 2016 Outline Probabilistic Framework Formulate linear regression in the language of probability Introduce

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS. Judea Pearl University of California Los Angeles (www.cs.ucla.

OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS. Judea Pearl University of California Los Angeles (www.cs.ucla. OUTLINE CAUSAL INFERENCE: LOGICAL FOUNDATION AND NEW RESULTS Judea Pearl University of California Los Angeles (www.cs.ucla.edu/~judea/) Statistical vs. Causal vs. Counterfactual inference: syntax and semantics

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint

More information

Review of probability

Review of probability Review of probability Computer Sciences 760 Spring 2014 http://pages.cs.wisc.edu/~dpage/cs760/ Goals for the lecture you should understand the following concepts definition of probability random variables

More information

Probability Rules. MATH 130, Elements of Statistics I. J. Robert Buchanan. Fall Department of Mathematics

Probability Rules. MATH 130, Elements of Statistics I. J. Robert Buchanan. Fall Department of Mathematics Probability Rules MATH 130, Elements of Statistics I J. Robert Buchanan Department of Mathematics Fall 2018 Introduction Probability is a measure of the likelihood of the occurrence of a certain behavior

More information

Expectations and Variance

Expectations and Variance 4. Model parameters and their estimates 4.1 Expected Value and Conditional Expected Value 4. The Variance 4.3 Population vs Sample Quantities 4.4 Mean and Variance of a Linear Combination 4.5 The Covariance

More information

Classification and Regression Trees

Classification and Regression Trees Classification and Regression Trees Ryan P Adams So far, we have primarily examined linear classifiers and regressors, and considered several different ways to train them When we ve found the linearity

More information

PROBABILISTIC REASONING SYSTEMS

PROBABILISTIC REASONING SYSTEMS PROBABILISTIC REASONING SYSTEMS In which we explain how to build reasoning systems that use network models to reason with uncertainty according to the laws of probability theory. Outline Knowledge in uncertain

More information

Econ 325: Introduction to Empirical Economics

Econ 325: Introduction to Empirical Economics Econ 325: Introduction to Empirical Economics Lecture 2 Probability Copyright 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 3-1 3.1 Definition Random Experiment a process leading to an uncertain

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

Probabilistic Reasoning

Probabilistic Reasoning Course 16 :198 :520 : Introduction To Artificial Intelligence Lecture 7 Probabilistic Reasoning Abdeslam Boularias Monday, September 28, 2015 1 / 17 Outline We show how to reason and act under uncertainty.

More information

Machine Learning, Fall 2009: Midterm

Machine Learning, Fall 2009: Midterm 10-601 Machine Learning, Fall 009: Midterm Monday, November nd hours 1. Personal info: Name: Andrew account: E-mail address:. You are permitted two pages of notes and a calculator. Please turn off all

More information

CME 106: Review Probability theory

CME 106: Review Probability theory : Probability theory Sven Schmit April 3, 2015 1 Overview In the first half of the course, we covered topics from probability theory. The difference between statistics and probability theory is the following:

More information

conditional cdf, conditional pdf, total probability theorem?

conditional cdf, conditional pdf, total probability theorem? 6 Multiple Random Variables 6.0 INTRODUCTION scalar vs. random variable cdf, pdf transformation of a random variable conditional cdf, conditional pdf, total probability theorem expectation of a random

More information

CS 361: Probability & Statistics

CS 361: Probability & Statistics February 12, 2018 CS 361: Probability & Statistics Random Variables Monty hall problem Recall the setup, there are 3 doors, behind two of them are indistinguishable goats, behind one is a car. You pick

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

Data Analysis and Uncertainty Part 1: Random Variables

Data Analysis and Uncertainty Part 1: Random Variables Data Analysis and Uncertainty Part 1: Random Variables Instructor: Sargur N. University at Buffalo The State University of New York srihari@cedar.buffalo.edu 1 Topics 1. Why uncertainty exists? 2. Dealing

More information

Data assimilation with and without a model

Data assimilation with and without a model Data assimilation with and without a model Tim Sauer George Mason University Parameter estimation and UQ U. Pittsburgh Mar. 5, 2017 Partially supported by NSF Most of this work is due to: Tyrus Berry,

More information

Quantifying Uncertainty & Probabilistic Reasoning. Abdulla AlKhenji Khaled AlEmadi Mohammed AlAnsari

Quantifying Uncertainty & Probabilistic Reasoning. Abdulla AlKhenji Khaled AlEmadi Mohammed AlAnsari Quantifying Uncertainty & Probabilistic Reasoning Abdulla AlKhenji Khaled AlEmadi Mohammed AlAnsari Outline Previous Implementations What is Uncertainty? Acting Under Uncertainty Rational Decisions Basic

More information

9 Forward-backward algorithm, sum-product on factor graphs

9 Forward-backward algorithm, sum-product on factor graphs Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 9 Forward-backward algorithm, sum-product on factor graphs The previous

More information

A Brief Introduction to Graphical Models. Presenter: Yijuan Lu November 12,2004

A Brief Introduction to Graphical Models. Presenter: Yijuan Lu November 12,2004 A Brief Introduction to Graphical Models Presenter: Yijuan Lu November 12,2004 References Introduction to Graphical Models, Kevin Murphy, Technical Report, May 2001 Learning in Graphical Models, Michael

More information

Graphical models and causality: Directed acyclic graphs (DAGs) and conditional (in)dependence

Graphical models and causality: Directed acyclic graphs (DAGs) and conditional (in)dependence Graphical models and causality: Directed acyclic graphs (DAGs) and conditional (in)dependence General overview Introduction Directed acyclic graphs (DAGs) and conditional independence DAGs and causal effects

More information

Chapter I: Fundamental Information Theory

Chapter I: Fundamental Information Theory ECE-S622/T62 Notes Chapter I: Fundamental Information Theory Ruifeng Zhang Dept. of Electrical & Computer Eng. Drexel University. Information Source Information is the outcome of some physical processes.

More information

Computational Genomics

Computational Genomics Computational Genomics http://www.cs.cmu.edu/~02710 Introduction to probability, statistics and algorithms (brief) intro to probability Basic notations Random variable - referring to an element / event

More information

Machine Learning Lecture Notes

Machine Learning Lecture Notes Machine Learning Lecture Notes Predrag Radivojac January 3, 25 Random Variables Until now we operated on relatively simple sample spaces and produced measure functions over sets of outcomes. In many situations,

More information

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 CS 70 Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 Introduction One of the key properties of coin flips is independence: if you flip a fair coin ten times and get ten

More information

Statistical Inference

Statistical Inference Statistical Inference Lecture 1: Probability Theory MING GAO DASE @ ECNU (for course related communications) mgao@dase.ecnu.edu.cn Sep. 11, 2018 Outline Introduction Set Theory Basics of Probability Theory

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

Bounding the Probability of Causation in Mediation Analysis

Bounding the Probability of Causation in Mediation Analysis arxiv:1411.2636v1 [math.st] 10 Nov 2014 Bounding the Probability of Causation in Mediation Analysis A. P. Dawid R. Murtas M. Musio February 16, 2018 Abstract Given empirical evidence for the dependence

More information

Exercises with solutions (Set D)

Exercises with solutions (Set D) Exercises with solutions Set D. A fair die is rolled at the same time as a fair coin is tossed. Let A be the number on the upper surface of the die and let B describe the outcome of the coin toss, where

More information

An introduction to biostatistics: part 1

An introduction to biostatistics: part 1 An introduction to biostatistics: part 1 Cavan Reilly September 6, 2017 Table of contents Introduction to data analysis Uncertainty Probability Conditional probability Random variables Discrete random

More information

Vector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I.

Vector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I. Vector Autoregressive Model Vector Autoregressions II Empirical Macroeconomics - Lect 2 Dr. Ana Beatriz Galvao Queen Mary University of London January 2012 A VAR(p) model of the m 1 vector of time series

More information

CAUSAL INFERENCE IN TIME SERIES ANALYSIS. Michael Eichler

CAUSAL INFERENCE IN TIME SERIES ANALYSIS. Michael Eichler CAUSAL INFERENCE IN TIME SERIES ANALYSIS Michael Eichler Department of Quantitative Economics, Maastricht University P.O. Box 616, 6200 MD Maastricht, The Netherlands November 11, 2011 1. ÁÒØÖÓ ÙØ ÓÒ The

More information

STA Module 4 Probability Concepts. Rev.F08 1

STA Module 4 Probability Concepts. Rev.F08 1 STA 2023 Module 4 Probability Concepts Rev.F08 1 Learning Objectives Upon completing this module, you should be able to: 1. Compute probabilities for experiments having equally likely outcomes. 2. Interpret

More information

Simultaneous Equation Models Learning Objectives Introduction Introduction (2) Introduction (3) Solving the Model structural equations

Simultaneous Equation Models Learning Objectives Introduction Introduction (2) Introduction (3) Solving the Model structural equations Simultaneous Equation Models. Introduction: basic definitions 2. Consequences of ignoring simultaneity 3. The identification problem 4. Estimation of simultaneous equation models 5. Example: IS LM model

More information

CISC 876: Kolmogorov Complexity

CISC 876: Kolmogorov Complexity March 27, 2007 Outline 1 Introduction 2 Definition Incompressibility and Randomness 3 Prefix Complexity Resource-Bounded K-Complexity 4 Incompressibility Method Gödel s Incompleteness Theorem 5 Outline

More information

Stats Probability Theory

Stats Probability Theory Stats 241.3 Probability Theory Instructor: Office: W.H.Laverty 235 McLean Hall Phone: 966-6096 Lectures: Evaluation: M T W Th F 1:30pm - 2:50pm Thorv 105 Lab: T W Th 3:00-3:50 Thorv 105 Assignments, Labs,

More information

Lecture 16 : Independence, Covariance and Correlation of Discrete Random Variables

Lecture 16 : Independence, Covariance and Correlation of Discrete Random Variables Lecture 6 : Independence, Covariance and Correlation of Discrete Random Variables 0/ 3 Definition Two discrete random variables X and Y defined on the same sample space are said to be independent if for

More information

1. what conditional independencies are implied by the graph. 2. whether these independecies correspond to the probability distribution

1. what conditional independencies are implied by the graph. 2. whether these independecies correspond to the probability distribution NETWORK ANALYSIS Lourens Waldorp PROBABILITY AND GRAPHS The objective is to obtain a correspondence between the intuitive pictures (graphs) of variables of interest and the probability distributions of

More information