
1 Copula Based Independent Component Analysis Kobi Abayomi SAMSI, April 2008

2 Introduction Outline Introduction Copula Specification Heuristic Example Simple Example The Copula perspective Mutual Information as Copula dependent function MI KL Independence PCA as a special case Implementation on ESI Inputs CICA outputs Results Next

3 Introduction Copula Specification A copula is a distribution function... Take X_1 ~ F_{X_1}, X_2 ~ F_{X_2}.

4 Introduction Copula Specification...on the marginal distributions of random variables. Set U = F_{X_1}(X_1) and V = F_{X_2}(X_2). The pair (U, V) are the grades of (X_1, X_2), i.e. the mapping of (X_1, X_2) into F_{X_1}, F_{X_2} space.
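As a sketch of this step (function and variable names here are illustrative, not from the talk), the grades can be estimated from data by replacing each marginal F with its empirical CDF:

```python
def grades(sample):
    """Map a sample to its grades: the empirical marginal CDF at each point.

    Uses rank/(n + 1) so every grade lies strictly inside (0, 1).
    """
    n = len(sample)
    order = sorted(range(n), key=lambda i: sample[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u

x1 = [2.3, -0.5, 1.1, 7.0, 0.2]
u = grades(x1)  # one grade in (0, 1) per observation; largest x1 gets the largest grade
```

The rank/(n + 1) convention is one common choice; it keeps the grades off the boundary of the unit interval.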

6 Introduction Copula Specification A copula is a function that takes the grades as arguments and returns a joint distribution function, with marginals F_{X_1}, F_{X_2}: C(U, V) = F_{X_1, X_2}.

7 Introduction Copula Specification Copula generation Any multivariate distribution function can yield a copula function: F_{X_1, X_2}(F_{X_1}^{-1}(U), F_{X_2}^{-1}(V)) = C(U, V).

8 Introduction Copula Specification Heuristically speaking The correspondence which assigns the value of the joint distribution function to each ordered pair of values (F_{X_1}, F_{X_2}) for each X_1, X_2 is a distribution function called a Copula.

9 Introduction Heuristic Example GPA - a copula based function

10 Introduction Simple Example Simple Example: Bivariate Distribution H_θ(x, y) = 1 - e^{-x} - e^{-y} + e^{-(x + y + θxy)} for x, y ∈ R_+; H = 0 otherwise.

11 Introduction Simple Example Simple Example: Bivariate Distribution H_X^{(-1)}(u) = -ln(1 - u); H_Y^{(-1)}(v) = -ln(1 - v)

12 Introduction Simple Example Simple Example: Bivariate Distribution C_θ(u, v) = H(-ln(1 - u), -ln(1 - v)) = (u + v - 1) + (1 - u)(1 - v) e^{-θ ln(1 - u) ln(1 - v)}. Notice if θ = 0, C_θ(u, v) = uv... the independence copula.
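This example can be checked numerically; a minimal sketch (function names are mine) that evaluates the copula both by plugging the inverse marginals into H and via the closed form, and confirms θ = 0 gives the independence copula:

```python
import math

def H(x, y, theta):
    # H_theta(x, y) = 1 - e^{-x} - e^{-y} + e^{-(x + y + theta*x*y)}, x, y >= 0
    return 1 - math.exp(-x) - math.exp(-y) + math.exp(-(x + y + theta * x * y))

def C(u, v, theta):
    # Copula: plug the inverse marginals -ln(1-u), -ln(1-v) into H
    return H(-math.log(1 - u), -math.log(1 - v), theta)

def C_closed(u, v, theta):
    # Closed form from the slide: (u + v - 1) + (1-u)(1-v) e^{-theta ln(1-u) ln(1-v)}
    return (u + v - 1) + (1 - u) * (1 - v) * math.exp(
        -theta * math.log(1 - u) * math.log(1 - v))
```

At θ = 0 both reduce to uv, and the two expressions agree for any θ.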

13 Introduction Simple Example Bivariate Copula θ = 0, 1, 5

14 Introduction The Copula perspective Why use a Copula? Copulas are useful when modelling: multivariate settings where a different family is needed for each marginal distribution; parametric estimates/versions of measures of association.

16 Introduction The Copula perspective Copula-dependent measures of association Many measures of association can be expressed as solely copula dependent: Kendall's Tau and, for example, Spearman's Rho. With U = F_X(X), V = F_Y(Y): ρ_{(X,Y)} = 12 ∫∫ C(u, v) du dv - 3; ρ_{(U,V)} = 12 E[C(u, v)] - 3.
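The Spearman's Rho formula can be checked for known copulas; a sketch using a simple midpoint-rule approximation of the double integral (all names are illustrative):

```python
def spearman_rho(C, m=400):
    """Spearman's rho from a copula: rho = 12 * integral of C(u,v) over [0,1]^2 - 3.

    Midpoint-rule approximation on an m x m grid.
    """
    total = 0.0
    for i in range(m):
        u = (i + 0.5) / m
        for j in range(m):
            v = (j + 0.5) / m
            total += C(u, v)
    return 12.0 * total / (m * m) - 3.0

indep = lambda u, v: u * v      # independence copula: rho = 0
upper = lambda u, v: min(u, v)  # comonotonicity copula M(u, v): rho = 1
```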

17 Mutual Information as Copula dependent function Outline Introduction Copula Specification Heuristic Example Simple Example The Copula perspective Mutual Information as Copula dependent function MI KL Independence PCA as a special case Implementation on ESI Inputs CICA outputs Results Next

18 Mutual Information as Copula dependent function Copula Density With u = (F_{X_1}, ..., F_{X_k}), the copula density dC(u) is dC(u) = dF_X(x) / ∏_i dF_{X_i}(x_i)

19 Mutual Information as Copula dependent function Mutual Information It turns out the mutual information MI(x) = ∫_{R^k} dF_X log(dF_X / ∏_i dF_{X_i}) is solely copula dependent...

20 Mutual Information as Copula dependent function Mutual Information is Copula based Since, with u = (F_{X_1}, ..., F_{X_k}), MI(x) = ∫_{I^k} dC(u) log(dC(u)). Thus the MI is a copula based measure of association.
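As a hedged illustration of MI as the divergence between a joint distribution and the product of its marginals (here for a discrete joint rather than the copula density on I^k; names are mine):

```python
import math

def mutual_information(joint):
    """MI of a discrete joint distribution given as a dict {(i, j): p}.

    MI = sum over (i, j) of p(i, j) * log( p(i, j) / (p_i * p_j) ),
    the K-L divergence between the joint and the product of its marginals.
    """
    px, py = {}, {}
    for (i, j), p in joint.items():
        px[i] = px.get(i, 0.0) + p
        py[j] = py.get(j, 0.0) + p
    return sum(p * math.log(p / (px[i] * py[j]))
               for (i, j), p in joint.items() if p > 0)

# Independent joint: MI = 0.  Perfectly dependent binary joint: MI = log 2.
indep = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
dep = {(0, 0): 0.5, (1, 1): 0.5}
```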

21 Mutual Information as Copula dependent function MI KL Independence Mutual Information MI is often used as a proxy for independence in general (i.e. non-Gaussian) settings. MI is the Kullback-Leibler divergence ("distance") between dependence and independence. MI = 0 implies independence.

24 Outline Introduction Copula Specification Heuristic Example Simple Example The Copula perspective Mutual Information as Copula dependent function MI KL Independence PCA as a special case Implementation on ESI Inputs CICA outputs Results Next

25 PCA as a special case The PCA model Given multivariate data x ∈ R^k with scatter matrix Σ, the PCA program: Find B such that y = Bx yields uncorrelated y_i and y_j.

27 PCA as a special case The PCA model Well known result: the eigendecomposition, via Singular Value Decomposition (SVD), Σ = e^t Λ e, yielding y_i = e_i^t x with Cov(y_i, y_j) = 0, i ≠ j.
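A minimal 2-D sketch of this result, using the closed-form eigen-rotation of a 2x2 covariance rather than a general SVD routine (all names are illustrative):

```python
import math

def pca_rotation(data):
    """Angle of the principal axes of 2-D data.

    For a sample covariance [[a, b], [b, c]], the principal axes lie at
    t = 0.5 * atan2(2b, a - c).
    """
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    a = sum((x - mx) ** 2 for x, _ in data) / n
    c = sum((y - my) ** 2 for _, y in data) / n
    b = sum((x - mx) * (y - my) for x, y in data) / n
    return 0.5 * math.atan2(2 * b, a - c)

def rotate(data, t):
    # y = Bx with B the rotation onto the principal axes
    return [(math.cos(t) * x + math.sin(t) * y,
             -math.sin(t) * x + math.cos(t) * y) for x, y in data]

data = [(float(i), 2.0 * i + (-1) ** i) for i in range(10)]
y = rotate(data, pca_rotation(data))  # the components of y are uncorrelated
```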

28 PCA as a special case The PCA model - mixed data

29 PCA as a special case The PCA model - rotated data

30 Independent Component Analysis (ICA) ICA is the PCA program under the more general assumption of statistical independence. Given x, the ICA program: Find y = Bx such that y_i = b_i x are independent of y_j = b_j x.

32 Independent Component Analysis (ICA) In ICA: x = As. The observed data x are the mixed outputs, via A, of independent sources s. The y are the estimates (ŝ) of these independent components, or signals. B is an estimate of A^{-1}: B = Â^{-1}.
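A sanity check of the generative model only: in practice A is unknown and B must be estimated, but with A known, B = A^{-1} recovers the sources exactly (toy numbers, names mine):

```python
def mat2_inv(A):
    # Closed-form inverse of a 2x2 matrix A = [[a, b], [c, d]]
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mix(A, s):
    # x = A s applied to each 2-D sample: the observed data are mixtures
    return [(A[0][0] * s1 + A[0][1] * s2, A[1][0] * s1 + A[1][1] * s2)
            for s1, s2 in s]

sources = [(1.0, -2.0), (0.5, 0.3), (-1.2, 0.8)]  # independent source samples
A = [[2.0, 1.0], [1.0, 3.0]]                      # mixing matrix
x = mix(A, sources)                               # observed mixtures
recovered = mix(mat2_inv(A), x)                   # y = Bx with B = A^{-1}
```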

35 The ICA model - illustration

36 The ICA model

37 Comments on ICA In both PCA and ICA the objective is the recovery of the mixing A of the independent signals s. The difference is the characterization of statistical independence. True statistical independence requires factorization of probability densities. ICA procedures often use higher-order moments, or empirical mutual information, as independence proxies. Most are non-parametric approaches to estimating statistical independence.

41 The ICA model

42 Copula Based Independent Component Analysis (CICA) General Approach Replace non-parametric measures of dependence-independence with parametric copula families. Appeal to the information theoretic distance, the K-L divergence. Exploit the role of the copula.

45 Bivariate Copula θ = 0, 1, 5

46 K-L divergence The Kullback-Leibler divergence between two probability density functions f(t) and g(t) is K(f, g) = ∫_t f(t) log(f(t)/g(t)) (1). I can (ab)use the notation K(w, z) for the divergence between the distributions of two random vectors w and z. K ≥ 0, with equality if and only if w and z have the same distribution.
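A sketch of (1) for discrete densities (names are illustrative), showing the two properties used here: K(f, f) = 0, and K > 0 for distinct distributions (note K is not symmetric, hence "distance" in quotes):

```python
import math

def kl(f, g):
    """K(f, g) = sum over t of f(t) * log(f(t) / g(t)) for discrete densities."""
    return sum(p * math.log(p / q) for p, q in zip(f, g) if p > 0)

f = [0.5, 0.3, 0.2]
g = [1 / 3, 1 / 3, 1 / 3]
```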

47 Divergence decomposition A classic property (Kullback [1968], others) of (1) is K(y, s) = K(y, y') + K(y', s) (2), with y' a random vector with independent entries and margins distributed as y; s is an independent vector.

48 Divergence decomposition The LHS K(y, s) is the divergence of the ICA outputs (estimates) from the hypothesized inputs (sources). K(y, y') is the independence-dependence of the outputs. K(y', s) is the mismatch of the margins of the estimates from the margins of the sources.

51 L maximization implies KL minimization Model x as generated from As = x. The log likelihood for n independent samples of x, under our estimate q̂ of the true source distribution, is: L_A = n^{-1} Σ log q̂(A^{-1} x) - log|det A|. This converges to ∫ q(·) log q̂(·) + cst, which can be rewritten: ∫ q(·) log q(·) - ∫ q(·) log(q(·)/q̂(·)) = -H(y) - K(q(·), q̂(·)).

54 ML maximization implies KL minimization = -H(y) - K(y, s) = -H(x) - K(y, s). The RHS first term is the entropy of the inputs; the RHS second term is the distance between (true) sources and their estimates. ML maximization is equivalent to minimization of the KL distance between the outputs and the sources.

55 Copula Based Independent Component Analysis (CICA) The procedure is to recast (2) via the copula: K(y, y') = MI(y) = -H(dC(u)), if the u are the marginal distribution functions u_i = F_i(y_i); and K(y', s) = Σ_i K(y'_i, s_i), since both y' and s have independent entries.

57 Copula Based (CICA) Under fixed assumptions about the distribution of the sources, I have to minimize two terms: the true objective, the mutual information, expressed via the copula; the mismatch of the marginal distributions to the assumed distributions.

58 Copula Based Component Analysis - Summary To cast the K-L partitioning in terms of the copula, I write the independence term as min_B MI(y; B) = min_B E_q[log(dC_Θ(u))], and the marginal fit term as min_Θ [C_Θ(u) - ∏_{i=1}^k u_i]. That is, I minimize the mutual information via the copula, via rotation B = Â^{-1}, after minimizing the distance between the parametric copula and independent marginals.

59 Copula Based Component Analysis - Overview Setting u' = G(y'), where y' is still a random, mutually independent vector with margins distributed equivalently with y. Thus u' is independent with margins distributed as y. K(û, u) = K(û, u') + K(u', u), with û the estimate of the true sources output from a copula based procedure and u the true distribution of the sources. The K-L distance between the outputs and the sources is then: (1) the fit of the outputs to independence, K(û, u'); and (2) the fit of the marginals of the outputs to the true source distributions, K(u', u).

60 Copula Based Component Analysis - Full Model An approach is to minimize the mutual information of the outputs, with the distributional assumption either fixed or parameterized, exploiting the copula representation: min_B MI(y; B) = min_B E_q̂[log(dC_Θ(u))]

61 Copula Based Component Analysis - Full Model This is equivalent to maximizing the score ∂L/∂B = -∂K(q(·), q̂(·, B))/∂B, via the marginal distributions ∂L/∂B = -∂K(û, u)/∂B, using the copula model.

62 Copula Based Component Analysis - Full Model The mixing matrix estimate B can be recovered via gradient ascent/descent, for example. Versions where the joint dependence is captured in a single parameter (multivariate or scalar) may not be applicable for the ICA problem. If Θ_0 is the copula parameter at independence then lim_{Θ→Θ_0} C_Θ(u) = ∏_{i=1}^k u_i and the mixing matrix at Θ_0 is unidentifiable.

63 Copula Based Component Analysis - Partite Model Another approach is to set y = RWx, with W a whitening matrix (the product of PCA) and R the ICA rotation. This allows orthogonalization of a mutual information matrix via well known procedures like Singular Value Decomposition. Construct the scatter/kernel matrix Γ_{C_Θ} = ((C_{θ_ij}(F̂^n_{w_i^T x}(w_i^T x), F̂^n_{w_j^T x}(w_j^T x))))_{i,j=1..k}. Find the orthogonalization of Γ_{C_Θ}: λ_1, ..., λ_k. Yield y_k = b_k x = r_k W x with y_i ⊥ y_j via C_Θ.
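The whitening step W (the product of PCA) can be sketched for 2-D data, assuming the closed-form eigendecomposition of a 2x2 covariance (all names are illustrative): rotate to the principal axes, then rescale each axis by 1/sqrt(eigenvalue) so the sample covariance becomes the identity.

```python
import math

def whiten(data):
    """Whitening from PCA for 2-D data: rotate to the principal axes,
    then rescale each axis so the sample covariance becomes I."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    a = sum((x - mx) ** 2 for x, _ in data) / n
    c = sum((y - my) ** 2 for _, y in data) / n
    b = sum((x - mx) * (y - my) for x, y in data) / n
    t = 0.5 * math.atan2(2 * b, a - c)  # angle of the principal axes
    # Eigenvalues of [[a, b], [b, c]]: variances along the principal axes
    h = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    l1, l2 = (a + c) / 2 + h, (a + c) / 2 - h
    out = []
    for x, y in data:
        p1 = math.cos(t) * (x - mx) + math.sin(t) * (y - my)
        p2 = -math.sin(t) * (x - mx) + math.cos(t) * (y - my)
        out.append((p1 / math.sqrt(l1), p2 / math.sqrt(l2)))
    return out

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0), (3.0, 5.0), (4.0, 4.0)]
z = whiten(data)  # whitened data: unit variances, zero cross-covariance
```

The ICA rotation R is then applied to this whitened output; the whitening leaves any further rotation free to pursue independence rather than mere uncorrelatedness.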

66 Copula Based Component Analysis - Partite Model Choose exchangeable families at each bivariate pair. Treat the univariate distributions u_i = F_{X_i}(x_i) as observed. The bivariate Mutual Information(s), E[log(dC(u_i, v_i))], are the elements of the scatter matrix.

69 Copula Based Component Analysis - Partite Model For well fit Ĉ_{θ_ij}, MI(Ĉ_{θ_ij}) ≥ 0 for all i, j. Thus R = ((MI(Ĉ_{θ_ij}))) is positive semi-definite, by exchangeability. Singular Value Decomposition of R yields an orthogonal basis (w.r.t. MI) [Tipping and Bishop 1999].

72 Copula Based Component Analysis - Partite Model Some ICA algorithms employ arbitrary non-linear transforms (e.g. logistic [Bell and Sejnowski 1995], log(cosh) [Teschendorf 2004]). Other algorithms use nonparametric estimates of MI [Stogbauer 2004] or cumulant moments [Comon 1994].

74 Copula Based Component Analysis - following 3, Comparisons The copula approach allows flexible choice of non-linear u. Parametric estimators of MI (O(n^{-1/2})) are superior to the nonparametric version (O(n^{-4/5})). The parametric approach may be more stable on smaller datasets and under moment perturbation [McCullagh 1994, Everson and Roberts 2000], and on extreme value distributions.

77 Example Results 1 - Extreme value distribution.

78 Example Results 1 - Rotated extreme value distribution.

79 Example Results 1; Gumbel-Hougaard (GH) type copula - y_i = B_i x_i

80 Example Results 2 - GH type dependency gradient.

81 Example Results 3 - CICA vs. fastica

82 Sample Size Comparison, log(mise) - CICA(GH) vs. fastica

83 Partite Example Finding closed form MLE estimators is difficult for many models. There is a balance between tractable MLE estimation and model flexibility.

85 Two Parameter Archimedean Families A copula family is called Archimedean if the copula can be written C_δ(u, v) = φ_δ(φ_δ^{-1}(u) + φ_δ^{-1}(v)), with φ_δ a generating function parameterized by δ.

86 Two Parameter Archimedean Families Then C_{θ,δ}(u, v) = η_{θ,δ}(η_{θ,δ}^{-1}(u) + η_{θ,δ}^{-1}(v)), with η_{θ,δ}(s) = ψ_θ(-log φ_δ(s)), is a natural two-parameter extension.
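As a hedged one-parameter illustration of the Archimedean form, using the Gumbel-Hougaard (GH) family that appears in the examples (the two-parameter extension composes generators as above; this sketch and its names are mine):

```python
import math

def gumbel_copula(u, v, delta):
    """Gumbel-Hougaard copula, an Archimedean family (delta >= 1):

    C(u, v) = exp(-((-ln u)^delta + (-ln v)^delta)^(1/delta)).
    """
    s = (-math.log(u)) ** delta + (-math.log(v)) ** delta
    return math.exp(-s ** (1.0 / delta))
```

At delta = 1 this reduces to the independence copula uv; as delta grows the copula moves toward the comonotonicity bound min(u, v).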

87 Application: CICA on ESI CICA on the 2002 Environmental Sustainability Index (ESI): a scaled linear combination of sixty-eight metrics of environmental concern. Traditional quantities (such as NO_x and SO_2 concentrations) are included with more expansive measures of environmental sustainability (such as civil liberty and corruption). A measure of overall progress towards environmental sustainability, designed to permit systematic and quantitative comparison between nations.

90 ESI Grouping
Environmental Systems: natural stocks such as air, soil, and water.
Environmental Stresses: stress on ecosystems such as pollution and deforestation.
Vulnerability: basic needs such as health, nutrition, and mortality.
Capacity: social and economic variables such as corruption and liberty, energy consumption, and schooling rate.
Stewardship: global cooperation such as treaty participation and compliance.

95 The complete data ESI

96 Implementation on ESI Partite Approach The approach: Exploit the empirical distribution, setting u_i = F̂^n_i(x_i) and treating the univariate marginals as observed or unparameterized. Fit copulas pairwise and minimize MI(u) by diagonalization of a mutual information matrix.

98 Implementation on ESI Implementation Fit bivariate copula. The candidate copulae are two-parameter extensions of Archimedean type copulae.

100 Implementation on ESI Implementation Additionally, I fit rotations of the bivariate copula. The copula are rotated by setting the arguments equal to the values in the table, with u' = 1 - u, v' = 1 - v:
Rotation Arguments
0 (u, v)
90 (v, u')
180 (u', v')
270 (v', u)

101 Inputs Bivariate Scatterplot - PCA 1 vs. PCA 2

102 Inputs Bivariate Scatterplot - PCA 2 vs. PCA 3

103 Inputs Bivariate Scatterplot - PCA 1 vs. PCA 3

104 Inputs Trivariate Scatterplot - PCA 1 vs. PCA 2 vs PCA 3

105 CICA outputs Contour/Scatter plot: Copula on CICA 1 vs. CICA 2

106 CICA outputs Contour/Scatter plot: Copula on CICA 2 vs. CICA 3

107 CICA outputs Contour/Scatter plot: Copula on CICA 1 vs. CICA 3

108 Results Image plots: Covariance vs. MI

109 Results Scree plot: PCA and CICA

110 Results CICA loadings: a table of loadings on Components 1-3 for the variables SO2, NO2, TSP, ISO14, WATCAP, IUCN, and CO2GDP.

111 Results CICA loadings vs. PCA loadings CICA: SO2, NO2, TSP, ISO14, WATCAP, IUCN, CO2GDP. PCA: NUKE, BODWAT, TFR, FSHCAT, PESTHA, WATSUP, GRAFT.

112 Results ESI Grouping
Environmental Systems: natural stocks such as air, soil, and water.
Environmental Stresses: stress on ecosystems such as pollution and deforestation.
Vulnerability: basic needs such as health, nutrition, and mortality.
Capacity: social and economic variables such as corruption and liberty, energy consumption, and schooling rate.
Stewardship: global cooperation such as treaty participation and compliance.

117 Next Outline Introduction Copula Specification Heuristic Example Simple Example The Copula perspective Mutual Information as Copula dependent function MI KL Independence PCA as a special case Implementation on ESI Inputs CICA outputs Results Next

118 Next Comments The copula approach unifies PCA/ICA as a likelihood based approach. It is a natural first step toward generalized (linear) copula based models. It exploits the rich family of lower dimension copulae.

121 Next Comments, Next Steps


More information

Bivariate Rainfall and Runoff Analysis Using Entropy and Copula Theories

Bivariate Rainfall and Runoff Analysis Using Entropy and Copula Theories Entropy 2012, 14, 1784-1812; doi:10.3390/e14091784 Article OPEN ACCESS entropy ISSN 1099-4300 www.mdpi.com/journal/entropy Bivariate Rainfall and Runoff Analysis Using Entropy and Copula Theories Lan Zhang

More information

Construction and estimation of high dimensional copulas

Construction and estimation of high dimensional copulas Construction and estimation of high dimensional copulas Gildas Mazo PhD work supervised by S. Girard and F. Forbes Mistis, Inria and laboratoire Jean Kuntzmann, Grenoble, France Séminaire Statistiques,

More information

A Brief Introduction to Copulas

A Brief Introduction to Copulas A Brief Introduction to Copulas Speaker: Hua, Lei February 24, 2009 Department of Statistics University of British Columbia Outline Introduction Definition Properties Archimedean Copulas Constructing Copulas

More information

Probability Distributions and Estimation of Ali-Mikhail-Haq Copula

Probability Distributions and Estimation of Ali-Mikhail-Haq Copula Applied Mathematical Sciences, Vol. 4, 2010, no. 14, 657-666 Probability Distributions and Estimation of Ali-Mikhail-Haq Copula Pranesh Kumar Mathematics Department University of Northern British Columbia

More information

Gatsby Theoretical Neuroscience Lectures: Non-Gaussian statistics and natural images Parts I-II

Gatsby Theoretical Neuroscience Lectures: Non-Gaussian statistics and natural images Parts I-II Gatsby Theoretical Neuroscience Lectures: Non-Gaussian statistics and natural images Parts I-II Gatsby Unit University College London 27 Feb 2017 Outline Part I: Theory of ICA Definition and difference

More information

A copula goodness-of-t approach. conditional probability integral transform. Daniel Berg 1 Henrik Bakken 2

A copula goodness-of-t approach. conditional probability integral transform. Daniel Berg 1 Henrik Bakken 2 based on the conditional probability integral transform Daniel Berg 1 Henrik Bakken 2 1 Norwegian Computing Center (NR) & University of Oslo (UiO) 2 Norwegian University of Science and Technology (NTNU)

More information

CIFAR Lectures: Non-Gaussian statistics and natural images

CIFAR Lectures: Non-Gaussian statistics and natural images CIFAR Lectures: Non-Gaussian statistics and natural images Dept of Computer Science University of Helsinki, Finland Outline Part I: Theory of ICA Definition and difference to PCA Importance of non-gaussianity

More information

Lecture 4: Principal Component Analysis and Linear Dimension Reduction

Lecture 4: Principal Component Analysis and Linear Dimension Reduction Lecture 4: Principal Component Analysis and Linear Dimension Reduction Advanced Applied Multivariate Analysis STAT 2221, Fall 2013 Sungkyu Jung Department of Statistics University of Pittsburgh E-mail:

More information

A New Generalized Gumbel Copula for Multivariate Distributions

A New Generalized Gumbel Copula for Multivariate Distributions A New Generalized Gumbel Copula for Multivariate Distributions Chandra R. Bhat* The University of Texas at Austin Department of Civil, Architectural & Environmental Engineering University Station, C76,

More information

Modelling Dependent Credit Risks

Modelling Dependent Credit Risks Modelling Dependent Credit Risks Filip Lindskog, RiskLab, ETH Zürich 30 November 2000 Home page:http://www.math.ethz.ch/ lindskog E-mail:lindskog@math.ethz.ch RiskLab:http://www.risklab.ch Modelling Dependent

More information

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Institute of Statistics

More information

Independent Component Analysis and Its Applications. By Qing Xue, 10/15/2004

Independent Component Analysis and Its Applications. By Qing Xue, 10/15/2004 Independent Component Analysis and Its Applications By Qing Xue, 10/15/2004 Outline Motivation of ICA Applications of ICA Principles of ICA estimation Algorithms for ICA Extensions of basic ICA framework

More information

On the Choice of Parametric Families of Copulas

On the Choice of Parametric Families of Copulas On the Choice of Parametric Families of Copulas Radu Craiu Department of Statistics University of Toronto Collaborators: Mariana Craiu, University Politehnica, Bucharest Vienna, July 2008 Outline 1 Brief

More information

The Instability of Correlations: Measurement and the Implications for Market Risk

The Instability of Correlations: Measurement and the Implications for Market Risk The Instability of Correlations: Measurement and the Implications for Market Risk Prof. Massimo Guidolin 20254 Advanced Quantitative Methods for Asset Pricing and Structuring Winter/Spring 2018 Threshold

More information

Variational Principal Components

Variational Principal Components Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings

More information

Dependence, correlation and Gaussianity in independent component analysis

Dependence, correlation and Gaussianity in independent component analysis Dependence, correlation and Gaussianity in independent component analysis Jean-François Cardoso ENST/TSI 46, rue Barrault, 75634, Paris, France cardoso@tsi.enst.fr Editor: Te-Won Lee Abstract Independent

More information

From independent component analysis to score matching

From independent component analysis to score matching From independent component analysis to score matching Aapo Hyvärinen Dept of Computer Science & HIIT Dept of Mathematics and Statistics University of Helsinki Finland 1 Abstract First, short introduction

More information

Multivariate Distributions

Multivariate Distributions IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate

More information

FuncICA for time series pattern discovery

FuncICA for time series pattern discovery FuncICA for time series pattern discovery Nishant Mehta and Alexander Gray Georgia Institute of Technology The problem Given a set of inherently continuous time series (e.g. EEG) Find a set of patterns

More information

Gaussian Process Vine Copulas for Multivariate Dependence

Gaussian Process Vine Copulas for Multivariate Dependence Gaussian Process Vine Copulas for Multivariate Dependence José Miguel Hernández-Lobato 1,2 joint work with David López-Paz 2,3 and Zoubin Ghahramani 1 1 Department of Engineering, Cambridge University,

More information

STATS 306B: Unsupervised Learning Spring Lecture 12 May 7

STATS 306B: Unsupervised Learning Spring Lecture 12 May 7 STATS 306B: Unsupervised Learning Spring 2014 Lecture 12 May 7 Lecturer: Lester Mackey Scribe: Lan Huong, Snigdha Panigrahi 12.1 Beyond Linear State Space Modeling Last lecture we completed our discussion

More information

Machine learning - HT Maximum Likelihood

Machine learning - HT Maximum Likelihood Machine learning - HT 2016 3. Maximum Likelihood Varun Kanade University of Oxford January 27, 2016 Outline Probabilistic Framework Formulate linear regression in the language of probability Introduce

More information

Chapter 4: Factor Analysis

Chapter 4: Factor Analysis Chapter 4: Factor Analysis In many studies, we may not be able to measure directly the variables of interest. We can merely collect data on other variables which may be related to the variables of interest.

More information

A Goodness-of-fit Test for Copulas

A Goodness-of-fit Test for Copulas A Goodness-of-fit Test for Copulas Artem Prokhorov August 2008 Abstract A new goodness-of-fit test for copulas is proposed. It is based on restrictions on certain elements of the information matrix and

More information

2234 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 52, NO. 8, AUGUST Nonparametric Hypothesis Tests for Statistical Dependency

2234 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 52, NO. 8, AUGUST Nonparametric Hypothesis Tests for Statistical Dependency 2234 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 52, NO. 8, AUGUST 2004 Nonparametric Hypothesis Tests for Statistical Dependency Alexander T. Ihler, Student Member, IEEE, John W. Fisher, Member, IEEE,

More information

An Improved Cumulant Based Method for Independent Component Analysis

An Improved Cumulant Based Method for Independent Component Analysis An Improved Cumulant Based Method for Independent Component Analysis Tobias Blaschke and Laurenz Wiskott Institute for Theoretical Biology Humboldt University Berlin Invalidenstraße 43 D - 0 5 Berlin Germany

More information

Machine Learning Techniques for Computer Vision

Machine Learning Techniques for Computer Vision Machine Learning Techniques for Computer Vision Part 2: Unsupervised Learning Microsoft Research Cambridge x 3 1 0.5 0.2 0 0.5 0.3 0 0.5 1 ECCV 2004, Prague x 2 x 1 Overview of Part 2 Mixture models EM

More information

Trivariate copulas for characterisation of droughts

Trivariate copulas for characterisation of droughts ANZIAM J. 49 (EMAC2007) pp.c306 C323, 2008 C306 Trivariate copulas for characterisation of droughts G. Wong 1 M. F. Lambert 2 A. V. Metcalfe 3 (Received 3 August 2007; revised 4 January 2008) Abstract

More information

Time Varying Hierarchical Archimedean Copulae (HALOC)

Time Varying Hierarchical Archimedean Copulae (HALOC) Time Varying Hierarchical Archimedean Copulae () Wolfgang Härdle Ostap Okhrin Yarema Okhrin Ladislaus von Bortkiewicz Chair of Statistics C.A.S.E. Center for Applied Statistics and Economics Humboldt-Universität

More information

Adaptive estimation of the copula correlation matrix for semiparametric elliptical copulas

Adaptive estimation of the copula correlation matrix for semiparametric elliptical copulas Adaptive estimation of the copula correlation matrix for semiparametric elliptical copulas Department of Mathematics Department of Statistical Science Cornell University London, January 7, 2016 Joint work

More information

FINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3

FINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3 FINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3 The required files for all problems can be found in: http://www.stat.uchicago.edu/~lekheng/courses/331/hw3/ The file name indicates which problem

More information

Data Analysis and Manifold Learning Lecture 6: Probabilistic PCA and Factor Analysis

Data Analysis and Manifold Learning Lecture 6: Probabilistic PCA and Factor Analysis Data Analysis and Manifold Learning Lecture 6: Probabilistic PCA and Factor Analysis Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline of Lecture

More information

First steps of multivariate data analysis

First steps of multivariate data analysis First steps of multivariate data analysis November 28, 2016 Let s Have Some Coffee We reproduce the coffee example from Carmona, page 60 ff. This vignette is the first excursion away from univariate data.

More information

Correlation & Dependency Structures

Correlation & Dependency Structures Correlation & Dependency Structures GIRO - October 1999 Andrzej Czernuszewicz Dimitris Papachristou Why are we interested in correlation/dependency? Risk management Portfolio management Reinsurance purchase

More information

Tail negative dependence and its applications for aggregate loss modeling

Tail negative dependence and its applications for aggregate loss modeling Tail negative dependence and its applications for aggregate loss modeling Lei Hua Division of Statistics Oct 20, 2014, ISU L. Hua (NIU) 1/35 1 Motivation 2 Tail order Elliptical copula Extreme value copula

More information

MACHINE LEARNING ADVANCED MACHINE LEARNING

MACHINE LEARNING ADVANCED MACHINE LEARNING MACHINE LEARNING ADVANCED MACHINE LEARNING Recap of Important Notions on Estimation of Probability Density Functions 22 MACHINE LEARNING Discrete Probabilities Consider two variables and y taking discrete

More information

Bayesian Multivariate Extreme Value Thresholding for Environmental Hazards

Bayesian Multivariate Extreme Value Thresholding for Environmental Hazards Bayesian Multivariate Extreme Value Thresholding for Environmental Hazards D. Lupton K. Abayomi M. Lacer School of Industrial and Systems Engineering Georgia Institute of Technology Institute for Operations

More information

Notes on Random Vectors and Multivariate Normal

Notes on Random Vectors and Multivariate Normal MATH 590 Spring 06 Notes on Random Vectors and Multivariate Normal Properties of Random Vectors If X,, X n are random variables, then X = X,, X n ) is a random vector, with the cumulative distribution

More information

ICA and ISA Using Schweizer-Wolff Measure of Dependence

ICA and ISA Using Schweizer-Wolff Measure of Dependence Keywords: independent component analysis, independent subspace analysis, copula, non-parametric estimation of dependence Abstract We propose a new algorithm for independent component and independent subspace

More information

Statistical Data Mining and Machine Learning Hilary Term 2016

Statistical Data Mining and Machine Learning Hilary Term 2016 Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes

More information

Fitting Archimedean copulas to bivariate geodetic data

Fitting Archimedean copulas to bivariate geodetic data Fitting Archimedean copulas to bivariate geodetic data Tomáš Bacigál 1 and Magda Komorníková 2 1 Faculty of Civil Engineering, STU Bratislava bacigal@math.sk 2 Faculty of Civil Engineering, STU Bratislava

More information

Latent Variable Models and EM Algorithm

Latent Variable Models and EM Algorithm SC4/SM8 Advanced Topics in Statistical Machine Learning Latent Variable Models and EM Algorithm Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/atsml/

More information

Semiparametric Gaussian Copula Models: Progress and Problems

Semiparametric Gaussian Copula Models: Progress and Problems Semiparametric Gaussian Copula Models: Progress and Problems Jon A. Wellner University of Washington, Seattle 2015 IMS China, Kunming July 1-4, 2015 2015 IMS China Meeting, Kunming Based on joint work

More information

Independent Component Analysis

Independent Component Analysis A Short Introduction to Independent Component Analysis Aapo Hyvärinen Helsinki Institute for Information Technology and Depts of Computer Science and Psychology University of Helsinki Problem of blind

More information

PROPERTIES OF THE EMPIRICAL CHARACTERISTIC FUNCTION AND ITS APPLICATION TO TESTING FOR INDEPENDENCE. Noboru Murata

PROPERTIES OF THE EMPIRICAL CHARACTERISTIC FUNCTION AND ITS APPLICATION TO TESTING FOR INDEPENDENCE. Noboru Murata ' / PROPERTIES OF THE EMPIRICAL CHARACTERISTIC FUNCTION AND ITS APPLICATION TO TESTING FOR INDEPENDENCE Noboru Murata Waseda University Department of Electrical Electronics and Computer Engineering 3--

More information

f-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models

f-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models IEEE Transactions on Information Theory, vol.58, no.2, pp.708 720, 2012. 1 f-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models Takafumi Kanamori Nagoya University,

More information

Estimation of multivariate critical layers: Applications to rainfall data

Estimation of multivariate critical layers: Applications to rainfall data Elena Di Bernardino, ICRA 6 / RISK 2015 () Estimation of Multivariate critical layers Barcelona, May 26-29, 2015 Estimation of multivariate critical layers: Applications to rainfall data Elena Di Bernardino,

More information

1 Joint and marginal distributions

1 Joint and marginal distributions DECEMBER 7, 204 LECTURE 2 JOINT (BIVARIATE) DISTRIBUTIONS, MARGINAL DISTRIBUTIONS, INDEPENDENCE So far we have considered one random variable at a time. However, in economics we are typically interested

More information

Application of Copulas as a New Geostatistical Tool

Application of Copulas as a New Geostatistical Tool Application of Copulas as a New Geostatistical Tool Presented by Jing Li Supervisors András Bardossy, Sjoerd Van der zee, Insa Neuweiler Universität Stuttgart Institut für Wasserbau Lehrstuhl für Hydrologie

More information

Semiparametric Gaussian Copula Models: Progress and Problems

Semiparametric Gaussian Copula Models: Progress and Problems Semiparametric Gaussian Copula Models: Progress and Problems Jon A. Wellner University of Washington, Seattle European Meeting of Statisticians, Amsterdam July 6-10, 2015 EMS Meeting, Amsterdam Based on

More information

Independent Component (IC) Models: New Extensions of the Multinormal Model

Independent Component (IC) Models: New Extensions of the Multinormal Model Independent Component (IC) Models: New Extensions of the Multinormal Model Davy Paindaveine (joint with Klaus Nordhausen, Hannu Oja, and Sara Taskinen) School of Public Health, ULB, April 2008 My research

More information

Series 7, May 22, 2018 (EM Convergence)

Series 7, May 22, 2018 (EM Convergence) Exercises Introduction to Machine Learning SS 2018 Series 7, May 22, 2018 (EM Convergence) Institute for Machine Learning Dept. of Computer Science, ETH Zürich Prof. Dr. Andreas Krause Web: https://las.inf.ethz.ch/teaching/introml-s18

More information

Songklanakarin Journal of Science and Technology SJST R1 Sukparungsee

Songklanakarin Journal of Science and Technology SJST R1 Sukparungsee Songklanakarin Journal of Science and Technology SJST-0-0.R Sukparungsee Bivariate copulas on the exponentially weighted moving average control chart Journal: Songklanakarin Journal of Science and Technology

More information

EE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm

EE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm EE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm 1. Feedback does not increase the capacity. Consider a channel with feedback. We assume that all the recieved outputs are sent back immediately

More information

Calibration Estimation for Semiparametric Copula Models under Missing Data

Calibration Estimation for Semiparametric Copula Models under Missing Data Calibration Estimation for Semiparametric Copula Models under Missing Data Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Economics and Economic Growth Centre

More information

Estimation of Copula Models with Discrete Margins (via Bayesian Data Augmentation) Michael S. Smith

Estimation of Copula Models with Discrete Margins (via Bayesian Data Augmentation) Michael S. Smith Estimation of Copula Models with Discrete Margins (via Bayesian Data Augmentation) Michael S. Smith Melbourne Business School, University of Melbourne (Joint with Mohamad Khaled, University of Queensland)

More information

Machine Learning (BSMC-GA 4439) Wenke Liu

Machine Learning (BSMC-GA 4439) Wenke Liu Machine Learning (BSMC-GA 4439) Wenke Liu 02-01-2018 Biomedical data are usually high-dimensional Number of samples (n) is relatively small whereas number of features (p) can be large Sometimes p>>n Problems

More information

Multidimensional scaling (MDS)

Multidimensional scaling (MDS) Multidimensional scaling (MDS) Just like SOM and principal curves or surfaces, MDS aims to map data points in R p to a lower-dimensional coordinate system. However, MSD approaches the problem somewhat

More information

Variational Inference with Copula Augmentation

Variational Inference with Copula Augmentation Variational Inference with Copula Augmentation Dustin Tran 1 David M. Blei 2 Edoardo M. Airoldi 1 1 Department of Statistics, Harvard University 2 Department of Statistics & Computer Science, Columbia

More information

CS281 Section 4: Factor Analysis and PCA

CS281 Section 4: Factor Analysis and PCA CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we

More information

Unsupervised learning: beyond simple clustering and PCA

Unsupervised learning: beyond simple clustering and PCA Unsupervised learning: beyond simple clustering and PCA Liza Rebrova Self organizing maps (SOM) Goal: approximate data points in R p by a low-dimensional manifold Unlike PCA, the manifold does not have

More information

LECTURE NOTE #10 PROF. ALAN YUILLE

LECTURE NOTE #10 PROF. ALAN YUILLE LECTURE NOTE #10 PROF. ALAN YUILLE 1. Principle Component Analysis (PCA) One way to deal with the curse of dimensionality is to project data down onto a space of low dimensions, see figure (1). Figure

More information

Independent Component Analysis. Contents

Independent Component Analysis. Contents Contents Preface xvii 1 Introduction 1 1.1 Linear representation of multivariate data 1 1.1.1 The general statistical setting 1 1.1.2 Dimension reduction methods 2 1.1.3 Independence as a guiding principle

More information

Independent Component Analysis and Unsupervised Learning

Independent Component Analysis and Unsupervised Learning Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien National Cheng Kung University TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent

More information

Natural Gradient Learning for Over- and Under-Complete Bases in ICA

Natural Gradient Learning for Over- and Under-Complete Bases in ICA NOTE Communicated by Jean-François Cardoso Natural Gradient Learning for Over- and Under-Complete Bases in ICA Shun-ichi Amari RIKEN Brain Science Institute, Wako-shi, Hirosawa, Saitama 351-01, Japan Independent

More information

Introduction to Maximum Likelihood Estimation

Introduction to Maximum Likelihood Estimation Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:

More information

EE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015

EE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015 EE613 Machine Learning for Engineers Kernel methods Support Vector Machines jean-marc odobez 2015 overview Kernel methods introductions and main elements defining kernels Kernelization of k-nn, K-Means,

More information

Independent Component Analysis (ICA) Bhaskar D Rao University of California, San Diego

Independent Component Analysis (ICA) Bhaskar D Rao University of California, San Diego Independent Component Analysis (ICA) Bhaskar D Rao University of California, San Diego Email: brao@ucsdedu References 1 Hyvarinen, A, Karhunen, J, & Oja, E (2004) Independent component analysis (Vol 46)

More information

Goodness-of-Fit Testing with Empirical Copulas

Goodness-of-Fit Testing with Empirical Copulas with Empirical Copulas Sami Umut Can John Einmahl Roger Laeven Department of Econometrics and Operations Research Tilburg University January 19, 2011 with Empirical Copulas Overview of Copulas with Empirical

More information

Random Matrix Eigenvalue Problems in Probabilistic Structural Mechanics

Random Matrix Eigenvalue Problems in Probabilistic Structural Mechanics Random Matrix Eigenvalue Problems in Probabilistic Structural Mechanics S Adhikari Department of Aerospace Engineering, University of Bristol, Bristol, U.K. URL: http://www.aer.bris.ac.uk/contact/academic/adhikari/home.html

More information

Principal Component Analysis. Applied Multivariate Statistics Spring 2012

Principal Component Analysis. Applied Multivariate Statistics Spring 2012 Principal Component Analysis Applied Multivariate Statistics Spring 2012 Overview Intuition Four definitions Practical examples Mathematical example Case study 2 PCA: Goals Goal 1: Dimension reduction

More information

An Introduction to Spectral Learning

An Introduction to Spectral Learning An Introduction to Spectral Learning Hanxiao Liu November 8, 2013 Outline 1 Method of Moments 2 Learning topic models using spectral properties 3 Anchor words Preliminaries X 1,, X n p (x; θ), θ = (θ 1,

More information

Principal Component Analysis

Principal Component Analysis Principal Component Analysis Introduction Consider a zero mean random vector R n with autocorrelation matri R = E( T ). R has eigenvectors q(1),,q(n) and associated eigenvalues λ(1) λ(n). Let Q = [ q(1)

More information

Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model.

Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model. Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model By Michael Levine Purdue University Technical Report #14-03 Department of

More information

Expectation Propagation Algorithm

Expectation Propagation Algorithm Expectation Propagation Algorithm 1 Shuang Wang School of Electrical and Computer Engineering University of Oklahoma, Tulsa, OK, 74135 Email: {shuangwang}@ou.edu This note contains three parts. First,

More information

Copula modeling for discrete data

Copula modeling for discrete data Copula modeling for discrete data Christian Genest & Johanna G. Nešlehová in collaboration with Bruno Rémillard McGill University and HEC Montréal ROBUST, September 11, 2016 Main question Suppose (X 1,

More information

Independent Component Analysis

Independent Component Analysis Independent Component Analysis Philippe B. Laval KSU Fall 2017 Philippe B. Laval (KSU) ICA Fall 2017 1 / 18 Introduction Independent Component Analysis (ICA) falls under the broader topic of Blind Source

More information

Probability Distribution And Density For Functional Random Variables

Probability Distribution And Density For Functional Random Variables Probability Distribution And Density For Functional Random Variables E. Cuvelier 1 M. Noirhomme-Fraiture 1 1 Institut d Informatique Facultés Universitaires Notre-Dame de la paix Namur CIL Research Contact

More information

Independent Components Analysis

Independent Components Analysis CS229 Lecture notes Andrew Ng Part XII Independent Components Analysis Our next topic is Independent Components Analysis (ICA). Similar to PCA, this will find a new basis in which to represent our data.

More information

June 21, Peking University. Dual Connections. Zhengchao Wan. Overview. Duality of connections. Divergence: general contrast functions

June 21, Peking University. Dual Connections. Zhengchao Wan. Overview. Duality of connections. Divergence: general contrast functions Dual Peking University June 21, 2016 Divergences: Riemannian connection Let M be a manifold on which there is given a Riemannian metric g =,. A connection satisfying Z X, Y = Z X, Y + X, Z Y (1) for all

More information

Conditional Least Squares and Copulae in Claims Reserving for a Single Line of Business

Conditional Least Squares and Copulae in Claims Reserving for a Single Line of Business Conditional Least Squares and Copulae in Claims Reserving for a Single Line of Business Michal Pešta Charles University in Prague Faculty of Mathematics and Physics Ostap Okhrin Dresden University of Technology

More information

Copulas and Measures of Dependence

Copulas and Measures of Dependence 1 Copulas and Measures of Dependence Uttara Naik-Nimbalkar December 28, 2014 Measures for determining the relationship between two variables: the Pearson s correlation coefficient, Kendalls tau and Spearmans

More information

Natural Image Statistics

Natural Image Statistics Natural Image Statistics A probabilistic approach to modelling early visual processing in the cortex Dept of Computer Science Early visual processing LGN V1 retina From the eye to the primary visual cortex

More information

Nonparametric Estimation of the Dependence Function for a Multivariate Extreme Value Distribution

Nonparametric Estimation of the Dependence Function for a Multivariate Extreme Value Distribution Nonparametric Estimation of the Dependence Function for a Multivariate Extreme Value Distribution p. /2 Nonparametric Estimation of the Dependence Function for a Multivariate Extreme Value Distribution

More information

Stochastic orders: a brief introduction and Bruno s contributions. Franco Pellerey

Stochastic orders: a brief introduction and Bruno s contributions. Franco Pellerey Stochastic orders: a brief introduction and Bruno s contributions. Franco Pellerey Stochastic orders (comparisons) Among his main interests in research activity A field where his contributions are still

More information