A Nonparametric Approach for Multiple Change Point Analysis of Multivariate Data


A Nonparametric Approach for Multiple Change Point Analysis of Multivariate Data

David S. Matteson (matteson@cornell.edu)
Department of Statistical Science, Cornell University
Joint work with: Nicholas A. James, ORIE, Cornell University
Sponsorship: National Science Foundation
October 2014

Introduction: Change Point Analysis

Change point analysis is the process of detecting distributional changes within time-ordered data.
Framework: retrospective, offline analysis of multivariate observations.
Estimation: the number of change points and their positions, via hierarchical algorithms.
Applications: genetics, finance, emergency medical services.

Introduction: Change Point Analysis

Given independent, time-ordered observations X_1, X_2, ..., X_n in R^d, partition them into k homogeneous, temporally contiguous subsets. Both k and the size of each subset are unknown.


Cluster Analysis

Change point analysis is similar to cluster analysis: in both, we wish to partition the observations into homogeneous subsets. In cluster analysis, however, the subsets need not be contiguous in time unless constraints are imposed.



Hierarchical Estimation

Apply methods from clustering to find change points. An exhaustive search is not practical, O(n^k) in general, although dynamic programming may be considered. We instead use a hierarchical, or sequential, approach with cost O(kn^2):
Divisive: clusters are divided until each observation is its own cluster.
Agglomerative: clusters are merged until all observations belong to a single cluster.

Hierarchical Estimation: Divisive Progression (figures)


Hierarchical Estimation: Agglomerative Progression (figures)


Measuring Multivariate Homogeneity

Suppose X, Y in R^d with X ~ F_x and Y ~ F_y. Let
    φ_x(t) = E(e^{i⟨t,X⟩}) and φ_y(t) = E(e^{i⟨t,Y⟩})
denote their characteristic functions. Define a divergence between F_x and F_y as
    E(X, Y; w) = ∫_{R^d} |φ_x(t) − φ_y(t)|^2 w(t) dt,
where w(t) denotes an arbitrary positive weight function for which E exists.

A Weight Function

A convenient choice for w(t) > 0 (Székely and Rizzo, 2005):
    w(t; α) = [ (2 π^{d/2} Γ(1 − α/2)) / (α 2^α Γ((d + α)/2)) · |t|^{d+α} ]^{−1},
in which Γ(·) is the gamma function. Note: for any fixed (d, α), w(t; α) ∝ |t|^{−(d+α)}.

Equivalent Divergence Measures

Let X and Y be independent, and let (X′, Y′) be an iid copy of (X, Y).

Theorem. Suppose E(|X|^α + |Y|^α) < ∞ for some α ∈ (0, 2]. Then
    E(X, Y; α) = ∫_{R^d} |φ_x(t) − φ_y(t)|^2 [ (2π^{d/2} Γ(1 − α/2)) / (α 2^α Γ((d + α)/2)) · |t|^{d+α} ]^{−1} dt
               = 2 E|X − Y|^α − E|X − X′|^α − E|Y − Y′|^α < ∞.
If 0 < α < 2, then E(X, Y; α) = 0 if and only if X and Y are identically distributed. If α = 2, then E(X, Y; α) = 0 if and only if EX = EY.

An Empirical Measure (U-statistics)

Let X_n = {X_i : i = 1, ..., n} and Y_m = {Y_j : j = 1, ..., m} be independent iid samples from the distributions of X and Y in R^d, respectively, such that E|X|^α, E|Y|^α < ∞ for some α ∈ (0, 2). Define
    Ê(X_n, Y_m; α) = (2 / mn) Σ_{i=1}^{n} Σ_{j=1}^{m} |X_i − Y_j|^α
                     − (n choose 2)^{−1} Σ_{1≤i<k≤n} |X_i − X_k|^α
                     − (m choose 2)^{−1} Σ_{1≤j<k≤m} |Y_j − Y_k|^α
and
    Q(X_n, Y_m; α) = (mn / (m + n)) Ê(X_n, Y_m; α).
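The empirical statistics above translate directly into code from pairwise distances. A minimal NumPy sketch (function names are mine, not part of the ecp package):

```python
import numpy as np

def energy_divergence(x, y, alpha=1.0):
    """Empirical divergence E-hat(X_n, Y_m; alpha): twice the mean
    between-sample distance^alpha minus the two within-sample U-statistics."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    n, m = len(x), len(y)
    # pairwise Euclidean distances raised to alpha
    d_xy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1) ** alpha
    d_xx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1) ** alpha
    d_yy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1) ** alpha
    e = 2.0 * d_xy.sum() / (m * n)
    e -= d_xx[np.triu_indices(n, 1)].sum() / (n * (n - 1) / 2)  # (n choose 2)^-1 sum over i<k
    e -= d_yy[np.triu_indices(m, 1)].sum() / (m * (m - 1) / 2)  # (m choose 2)^-1 sum over j<k
    return e

def q_stat(x, y, alpha=1.0):
    """Scaled statistic Q(X_n, Y_m; alpha) = mn/(m+n) * E-hat."""
    n, m = len(x), len(y)
    return (m * n / (m + n)) * energy_divergence(x, y, alpha)
```

For example, for the one-dimensional samples {0, 1, 2, 3} and {10, 11, 12, 13} with α = 1, the between-sample term is 2 · 10 and each within-sample term is 10/6, giving Ê = 20 − 10/3.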


Known Location: Two-Sample Homogeneity Test

By the strong law of large numbers for U-statistics (Hoeffding, 1961),
    Ê(X_n, Y_m; α) → E(X, Y; α) almost surely, as min(m, n) → ∞.
Under the null hypothesis of equal distributions, i.e. E(X, Y; α) = 0,
    Q(X_n, Y_m; α) → Q(X, Y; α) = Σ_{i=1}^{∞} λ_i Q_i in distribution, as min(m, n) → ∞.
Here the λ_i > 0 are constants that depend on α and the distributions of X and Y, and the Q_i are iid χ²_1 random variables; see Rizzo and Székely (2010).
Under the alternative hypothesis of unequal distributions, i.e. E(X, Y; α) > 0,
    Q(X_n, Y_m; α) → ∞ almost surely, as min(m, n) → ∞.


Single Change Point: Unknown Location

Let Z_1, ..., Z_T ∈ R^d be an independent sequence: a heterogeneous sample with observations from two distributions. Let γ ∈ (0, 1) denote the division of the observations, such that
    Z_1, ..., Z_{γT} ~ F_x and Z_{γT+1}, ..., Z_T ~ F_y
for every sample of size T. Define X_τ = {Z_1, Z_2, ..., Z_τ} and Y_τ = {Z_{τ+1}, Z_{τ+2}, ..., Z_T}. A change point location τ̂_T is then estimated as
    τ̂_T = argmax_τ Q_T(X_τ, Y_τ; α).

Theorem. If E(X, Y; α) < ∞ and γ ∈ (0, 1), then τ̂_T / T → γ almost surely, as T → ∞.
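A direct implementation of the estimator τ̂_T scans every admissible split and keeps the one maximizing Q. A sketch under the definitions above (all names are mine):

```python
import numpy as np

def _e_div(x, y, alpha=1.0):
    # Empirical divergence E-hat between two samples (rows = observations).
    n, m = len(x), len(y)
    d_xy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1) ** alpha
    d_xx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1) ** alpha
    d_yy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1) ** alpha
    return (2.0 * d_xy.sum() / (m * n)
            - d_xx[np.triu_indices(n, 1)].sum() / (n * (n - 1) / 2)
            - d_yy[np.triu_indices(m, 1)].sum() / (m * (m - 1) / 2))

def single_change_point(z, alpha=1.0, min_size=2):
    """Return (tau_hat, max Q): the split maximizing Q(X_tau, Y_tau; alpha).
    min_size >= 2 so each side supports the within-sample U-statistic."""
    z = np.asarray(z, float)
    if z.ndim == 1:
        z = z[:, None]
    T = len(z)
    best_tau, best_q = None, -np.inf
    for tau in range(min_size, T - min_size + 1):
        q = (tau * (T - tau) / T) * _e_div(z[:tau], z[tau:], alpha)
        if q > best_q:
            best_tau, best_q = tau, q
    return best_tau, best_q
```

On a series with a clean mean shift, for example twenty 0s followed by twenty 5s, the scan recovers the split at τ = 20, where Ê = 2 · 5 and Q = (20 · 20 / 40) · 10 = 100.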

Multiple Change Points: Unknown Locations

A generalized bisection approach for sequential estimation. For 1 ≤ τ < κ ≤ T, define
    X_τ = {Z_1, Z_2, ..., Z_τ} and Y_τ(κ) = {Z_{τ+1}, Z_{τ+2}, ..., Z_κ}.
A change point location τ̂ is then estimated as
    (τ̂, κ̂) = argmax_{(τ, κ)} Q(X_τ, Y_τ(κ); α).


Sequentially Estimating Multiple Change Points

Suppose k − 1 change points have been estimated: τ̂_1 < ... < τ̂_{k−1}. This partitions the observations into k clusters Ĉ_1, Ĉ_2, ..., Ĉ_k. Given these clusters, we then apply the single change point procedure within each of the k clusters. For the ith cluster Ĉ_i, denote the proposed change point location by τ̂(i) and the associated constant by κ̂(i). Now let
    i* = argmax_{i ∈ {1, ..., k}} Q̂[X_{τ̂(i)}, Y_{τ̂(i)}(κ̂(i)); α],
in which X_{τ̂(i)} and Y_{τ̂(i)}(κ̂(i)) are defined with respect to Ĉ_i. Denote the test statistic by q̂_k = Q̂(X_{τ̂_k}, Y_{τ̂_k}(κ̂_k); α); then τ̂_k = τ̂(i*) is the kth estimated change point, located within cluster Ĉ_{i*}.

The E-Divisive Algorithm: Estimating Location

Let A_τ = {Z_1, Z_2, ..., Z_τ} and B_τ(κ) = {Z_{τ+1}, Z_{τ+2}, ..., Z_κ}. Recall, a change point location τ̂ is estimated as
    (τ̂, κ̂) = argmax_{(τ, κ)} Q(A_τ, B_τ(κ); α).
Thus, we maximize (mn / (n + m)) Ê(A, B; α) over all such subsets A and B.


The E-Divisive Algorithm: Inference via Permutation Test

The distribution of the test statistic q̂ = Q(A_τ, B_τ(κ); α)|_{τ=τ̂} is unknown, so the significance of a proposed change point is measured via a permutation test: randomly permute the series, maximize (mn / (n + m)) Ê(A, B; α), record the maximum, and repeat.
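The permutation test can be sketched as: compute the maximized statistic on the observed ordering, then on many random reorderings, and report the share of permuted maxima at least as large. A minimal sketch assuming a simple single-split scan (all names are mine):

```python
import numpy as np

def _max_q(z, alpha=1.0):
    # Maximum of Q over all admissible single splits, from one distance matrix.
    z = np.asarray(z, float)
    if z.ndim == 1:
        z = z[:, None]
    T = len(z)
    D = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1) ** alpha
    best = -np.inf
    for tau in range(2, T - 1):  # keep at least 2 observations on each side
        n, m = tau, T - tau
        e = (2.0 * D[:tau, tau:].sum() / (n * m)
             - D[:tau, :tau][np.triu_indices(n, 1)].sum() / (n * (n - 1) / 2)
             - D[tau:, tau:][np.triu_indices(m, 1)].sum() / (m * (m - 1) / 2))
        best = max(best, n * m / (n + m) * e)
    return best

def permutation_pvalue(z, alpha=1.0, n_perm=99, seed=0):
    """Approximate p-value: share of permuted series whose maximized
    statistic matches or exceeds the observed one (add-one estimate)."""
    z = np.asarray(z, float)
    rng = np.random.default_rng(seed)
    observed = _max_q(z, alpha)
    exceed = sum(_max_q(rng.permutation(z), alpha) >= observed
                 for _ in range(n_perm))
    return (exceed + 1) / (n_perm + 1)
```

With a pronounced change (say fifteen 0s followed by fifteen 8s) the observed maximum is attained only by the perfectly separated orderings, so the p-value lands near the resolution floor of 1/(R + 1).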


The E-Divisive Algorithm: Multiple Change Points

If q̂ = Q(A_τ, B_τ(κ); α)|_{τ=τ̂} is insignificant: STOP. If significant, condition on the estimated location and repeat the procedure within each resulting cluster.


The E-Divisive Algorithm: Multiple Change Points (continued)

Once again, perform a permutation test; however, only permute observations within each cluster.


The ecp R package (CRAN)

Signature: e.divisive(X, sig.lvl=0.05, R=199, k=NULL, min.size=30, alpha=1)

Arguments:
X - A T × d matrix representation of a length-T time series with d-dimensional observations.
sig.lvl - The significance level used for the permutation test.
R - The maximum number of permutations to perform in each permutation test.
k - The number of change points to return. If NULL, only the statistically significant estimated change points are returned.
min.size - The minimum number of observations between change points.
alpha - The index α for the test statistic.

The ecp R package (CRAN): Returned Value

Complexity is O(kT²). The returned list contains:
k.hat - Number of clusters created by the estimated change points.
order.found - The order in which the change points were estimated.
estimates - Locations of the statistically significant change points.
considered.last - Location of the last change point that was not found to be statistically significant at the given significance level.
permutations - The number of permutations performed by each of the sequential permutation tests.
cluster - The estimated cluster membership vector.
p.values - Approximate p-values estimated from each permutation test.


Simulation Study: Rand Index

Compare E-Divisive with a generalized Wilcoxon/Mann-Whitney approach: the MultiRank procedure of Lung-Yut-Fong et al. (2011). For two partitions U and V, the Rand Index considers all pairs of observations. Define
    {A}: pairs in the same cluster under U and in the same cluster under V;
    {B}: pairs in different clusters under U and in different clusters under V.
Then
    Rand Index = (#A + #B) / (T choose 2).
An equivalent definition of the Rand Index can be found in Hubert and Arabie (1985). The adjusted version is
    Adjusted Rand = (Index − Expected Index) / (Max Index − Expected Index)
                  = (Rand − Expected Rand) / (1 − Expected Rand).
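Both indices can be computed from pair counts or, for the adjusted version, from the contingency table of the two partitions (Hubert and Arabie, 1985). A sketch (names are mine):

```python
from collections import Counter
from math import comb

def rand_index(u, v):
    """Fraction of observation pairs on which partitions u and v agree."""
    T = len(u)
    agree = 0
    for i in range(T):
        for j in range(i + 1, T):
            same_u, same_v = u[i] == u[j], v[i] == v[j]
            # counts both same-same pairs ({A}) and different-different pairs ({B})
            agree += (same_u and same_v) or (not same_u and not same_v)
    return agree / comb(T, 2)

def adjusted_rand(u, v):
    """Hubert-Arabie adjusted Rand index via the contingency table."""
    T = len(u)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(u, v)).values())
    sum_u = sum(comb(c, 2) for c in Counter(u).values())
    sum_v = sum(comb(c, 2) for c in Counter(v).values())
    expected = sum_u * sum_v / comb(T, 2)
    max_index = (sum_u + sum_v) / 2
    if max_index == expected:  # degenerate partitions (e.g. all singletons)
        return 1.0
    return (sum_ij - expected) / (max_index - expected)
```

For example, u = [0, 0, 1, 1] against v = [0, 1, 0, 1]: no pair agrees on "same cluster" and two pairs agree on "different clusters", so the Rand index is 2/6 and the adjusted index is −0.5.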

A change in variance for univariate normal data

Method       Correct k   Average Adjusted Rand
MultiRank    22/
E-Divisive   95/

A change in correlation for bivariate normal data

Method       Correct k   Average Adjusted Rand
MultiRank    72/
E-Divisive   92/

1,000 simulations, 2 change points: N(0,1), N(µ,1), N(0,1)
(Table: average Rand and average adjusted Rand for MultiRank and E-Divisive, by T and µ.)

1,000 simulations, 2 change points: N(0,1), N(0, σ²), N(0,1)
(Table: average Rand and average adjusted Rand for MultiRank and E-Divisive, by T and σ².)

1,000 simulations, 2 change points: N(0,1), t_ν(0, 1), N(0,1)
(Table: average Rand and average adjusted Rand for MultiRank and E-Divisive, by T and ν.)

1,000 simulations, 2 change points: N_2(0, I), N_2(µ, I), N_2(0, I)
(Table: average Rand and average adjusted Rand for MultiRank and E-Divisive, by T and µ.)

1,000 simulations, 2 change points: N_2(0, Σ), N_2(0, I), N_2(0, Σ), with
    Σ = ( 1  ρ
          ρ  1 ).
(Table: average Rand and average adjusted Rand for MultiRank and E-Divisive, by T and ρ.)

1,000 simulations, 2 change points: N_d(0, Σ), N_d(0, I), N_d(0, Σ).
Without noise, Σ has unit variances and all off-diagonal entries equal to ρ; with noise, the correlation ρ is confined to a leading block of coordinates, and the remaining coordinates are independent noise.
(Table: average Rand and average adjusted Rand, with and without noise, by T and d.)

Genetics Data

We applied E-Divisive to the aCGH micro-array dataset of 43 individuals with a bladder tumor (Bleakley and Vert, 2011), using the relative hybridization intensity profile for one individual. For comparison:
MultiRank (Lung-Yut-Fong et al., 2011): k̂ = 17;
KCPA (Arlot et al., 2012): k̂ = 41;
PELT (Killick et al., 2012): k̂ = 47;
with the adjusted Rand index used to compare the resulting segmentations.
(Figure: the estimated segmentations of the signal, by index, for MultiRank, KCPA, PELT, and E-Divisive.)

Financial Data: Cisco Systems

The E-Divisive procedure was applied to the monthly log returns of the Dow 30. A marginal analysis of Cisco Systems Inc. from April 1990 to January found change points at April 2000 and October.


Financial Data: S&P 500 Index

(Figure: S&P 500 log returns by date, May 20, 1999 to April 25, 2011.)

An Agglomerative Algorithm

Given a partition of k clusters C = {C_1, C_2, ..., C_k}, where clusters may or may not be single observations, consider combining a pair of adjacent clusters. The partition that maximizes a goodness-of-fit statistic determines the change point locations.

An Agglomerative Algorithm: Goodness-of-Fit

The goodness-of-fit statistic S(k) sums the E-distances between adjacent clusters. Given clusters C = {C_1, C_2, ..., C_k} with n_i = #C_i, define
    S(k) = Σ_{i=1}^{k−1} ( n_i n_{i+1} / (n_i + n_{i+1}) ) Ê_{n_i, n_{i+1}}(C_i, C_{i+1}; α).
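S(k) is a direct sum over adjacent cluster pairs, reusing the empirical divergence Ê. A sketch under the definitions above (helper and function names are mine):

```python
import numpy as np

def _e_div(x, y, alpha=1.0):
    # Empirical divergence E-hat between two samples (rows = observations).
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, m = len(x), len(y)
    d_xy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1) ** alpha
    d_xx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1) ** alpha
    d_yy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1) ** alpha
    return (2.0 * d_xy.sum() / (m * n)
            - d_xx[np.triu_indices(n, 1)].sum() / (n * (n - 1) / 2)
            - d_yy[np.triu_indices(m, 1)].sum() / (m * (m - 1) / 2))

def goodness_of_fit(clusters, alpha=1.0):
    """S(k): weighted E-distances summed over adjacent clusters.
    Each cluster is a sequence of d-dimensional observations."""
    s = 0.0
    for c1, c2 in zip(clusters, clusters[1:]):
        n1, n2 = len(c1), len(c2)
        s += (n1 * n2 / (n1 + n2)) * _e_div(c1, c2, alpha)
    return s
```

With three two-point clusters at levels 0, 5, 0 and α = 1, each adjacent pair contributes a weighted E-distance of 10, so S(3) = 20.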

An Agglomerative Algorithm (continued)

The partition that maximizes S(k) is then used to estimate the change point locations.
Figure: progression of the goodness-of-fit statistic, and where it is maximized.

Application: EMS

EMS Priority One response for Toronto, 2007 (figures).


Bibliography

Bleakley, K., and Vert, J.-P. (2011), "The Group Fused Lasso for Multiple Change-Point Detection," Technical Report HAL, Bioinformatics Center (CBIO).
Hoeffding, W. (1961), "The Strong Law of Large Numbers for U-Statistics," Technical Report 302, North Carolina State University, Dept. of Statistics.
Hubert, L., and Arabie, P. (1985), "Comparing Partitions," Journal of Classification, 2(1).
James, N. A., and Matteson, D. S. (2013), "ecp: An R Package for Nonparametric Multiple Change Point Analysis of Multivariate Data," arXiv:1309.3295.
Lung-Yut-Fong, A., Lévy-Leduc, C., and Cappé, O. (2011), "Homogeneity and Change-Point Detection Tests for Multivariate Data Using Rank Statistics."
Matteson, D. S., and James, N. A. (2013), "A Nonparametric Approach for Multiple Change Point Analysis of Multivariate Data," Journal of the American Statistical Association, to appear.
Rizzo, M. L., and Székely, G. J. (2010), "DISCO Analysis: A Nonparametric Extension of Analysis of Variance," The Annals of Applied Statistics, 4(2).
Székely, G. J., and Rizzo, M. L. (2005), "Hierarchical Clustering via Joint Between-Within Distances: Extending Ward's Minimum Variance Method," Journal of Classification, 22(2).


More information

Complexity of two and multi-stage stochastic programming problems

Complexity of two and multi-stage stochastic programming problems Complexity of two and multi-stage stochastic programming problems A. Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA The concept

More information