# Independent Component (IC) Models: New Extensions of the Multinormal Model

1 Independent Component (IC) Models: New Extensions of the Multinormal Model Davy Paindaveine (joint with Klaus Nordhausen, Hannu Oja, and Sara Taskinen) School of Public Health, ULB, April 2008

2 My research is in multivariate statistics, where several ($p$, say) measurements are recorded on each of $n$ individuals. We want to come up with models that are potentially useful for a broad range of setups ($p \ll n$, though).

In those models, we develop procedures that are
- robust to some possible model misspecification,
- robust to possible outlying observations (crucial in the multivariate case!),
- yet efficient...

8 Outline
1. Introduction
   - A (too?) simple multivariate problem
   - Normal and elliptic models
2. ICA
   - What is it?
   - How does it work?
   - ICA vs PCA
3. IC models
   - Definition
   - Inference

10 [Figure: scatter plot with axes "cigarette sales in packs per capita" and "per capita disposable income".]

11 The observations are

$$X_i = \begin{pmatrix} X_{i1} \\ X_{i2} \end{pmatrix} = \begin{pmatrix} \text{sales (after} - \text{before) for state } i \\ \text{income (after} - \text{before) for state } i \end{pmatrix}, \qquad i = 1, \ldots, n.$$

12 Assume one wants to find out, on the basis of the sample $X_1, X_2, \ldots, X_n$, whether the tax reform had an effect (or not) on any of the variables. Typically, in statistical terms, this would translate into testing

$$\begin{cases} H_0: \mu_j = 0 \text{ for all } j \\ H_1: \mu_j \neq 0 \text{ for at least one } j, \end{cases}$$

at some fixed level $\alpha$ (5%, say).

13 Assume one wants to find out, on the basis of the sample $X_1, X_2, \ldots, X_n$, whether the tax reform had some fixed specified effect on each variable mean. Typically, in statistical terms, this would translate into testing

$$\begin{cases} H_0: \mu_j = c_j \text{ for all } j \\ H_1: \mu_j \neq c_j \text{ for at least one } j, \end{cases}$$

at some fixed level $\alpha$ (5%, say).

14 The most basic idea is to go univariate, i.e.,
- for each $j = 1, 2$, to test on the basis of $X_{1j}, \ldots, X_{nj}$ whether $H_0^{(j)}: \mu_j = c_j$ holds or not (at level 5%),
- to reject $H_0$ as soon as one $H_0^{(j)}$ has been rejected.

This is a bad multivariate testing procedure, since it is easy to show that $P[\text{reject } H_0] > 5\%$ under $H_0$. You cannot properly control the level if you act marginally...
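The level inflation can be checked numerically. The sketch below is only an illustration under assumptions that are not in the talk: two independent standard normal components, known unit variances (so each marginal test is a two-sided z-test with critical value 1.96), and a hypothetical helper name `marginal_rejection_rate`. In that special case the exact level of the "reject as soon as one marginal test rejects" rule is $1 - 0.95^2 = 0.0975$.

```python
import numpy as np

def marginal_rejection_rate(n=50, p=2, n_rep=20000, z_crit=1.959964, seed=0):
    """Monte Carlo estimate of P[reject H0] under H0 when H0 is rejected
    as soon as one marginal two-sided z-test rejects at level 5%."""
    rng = np.random.default_rng(seed)
    # n_rep samples of size n from N(0, I_p): H0 (mu = 0) is true
    x = rng.standard_normal((n_rep, n, p))
    # marginal z-statistics sqrt(n) * mean (variance assumed known and equal to 1)
    t = np.sqrt(n) * x.mean(axis=1)
    # the global rule rejects if any component rejects
    reject = (np.abs(t) > z_crit).any(axis=1)
    return reject.mean()

rate = marginal_rejection_rate()
exact = 1 - 0.95 ** 2  # independent components: 1 - (1 - alpha)^p
print(f"simulated level: {rate:.3f}, exact level: {exact:.4f}")
```

With dependent components the exact level differs, but it still exceeds 5% unless the two marginal tests always reject together.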

17 Confidence zones also cannot be built marginally...

19 Hence there is a need for multivariate modelling. The most classical model, the multivariate normal model, specifies that the common density of the $X_i$'s is of the form

$$f_X(x) \propto \exp\big(-(x - \mu)'\Sigma^{-1}(x - \mu)/2\big).$$

A necessary condition for this to hold is that each of the $p$ variables is normally distributed. Hence, even for $p = 2, 3$, it is extremely unlikely that the underlying distribution is multivariate normal... (you need to win at Euromillions $p$ times in a row!)

21 Not quite the same model!

23 The marginals are far from Gaussian...

24 Does it hurt? Oh yes, it does...
- For $H_0: \mu = \mu_0$, the Gaussian LR test (i) is efficient at the multinormal only, and (ii) is valid only if variances exist (what about financial series?).
- For $H_0: \Sigma = \Sigma_0$, the Gaussian LR test is valid only at the multivariate normal distribution!

Remarks:
- Even for $n =$ [value lost in the transcription].
- Incidentally, those tests are not robust w.r.t. possible outliers.

28 An equivalent definition of the multivariate normal distribution specifies that $X = A(RU) + \mu$, where
- $U$ is uniformly distributed on the unit sphere in $\mathbb{R}^p$,
- $R^2 \sim \chi^2_p$ is independent of $U$,
- $A$ is a constant $p \times p$ matrix,
- $\mu$ is a constant $p$-vector.

Elliptical distributions allow for an arbitrary distribution of $R$.
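The representation $X = A(RU) + \mu$ translates directly into a sampler. The following NumPy sketch is an illustration only: the helper `sample_elliptical`, the matrix `A`, the vector `mu` and the choice of a second radius law are assumptions made here. Taking $R^2 \sim \chi^2_p$ gives the multinormal $\mathcal{N}(\mu, AA')$, while $R^2/p \sim F(p, 3)$ gives a heavy-tailed elliptical law (a multivariate $t_3$).

```python
import numpy as np

def sample_elliptical(n, A, mu, radius_sampler, rng):
    """Draw n observations X = A (R U) + mu, one per row."""
    p = A.shape[0]
    # U: uniform on the unit sphere, obtained by normalising Gaussian vectors
    g = rng.standard_normal((n, p))
    u = g / np.linalg.norm(g, axis=1, keepdims=True)
    # R: nonnegative radius, drawn independently of U
    r = radius_sampler(n, rng)
    return (r[:, None] * u) @ A.T + mu

rng = np.random.default_rng(1)
A = np.array([[2.0, 0.0], [1.0, 1.0]])   # hypothetical constant p x p matrix
mu = np.array([1.0, -1.0])               # hypothetical location vector
p = 2

# R^2 ~ chi-square(p): multinormal with mean mu and covariance A A'
x_normal = sample_elliptical(
    100_000, A, mu, lambda n, rng: np.sqrt(rng.chisquare(p, n)), rng)

# R^2 / p ~ F(p, 3): same elliptical shape, much heavier tails
x_t3 = sample_elliptical(
    100_000, A, mu, lambda n, rng: np.sqrt(p * rng.f(p, 3, n)), rng)

print(np.round(np.cov(x_normal, rowvar=False), 2))  # close to A @ A.T
```

Only the law of the radius changes between the two samples; the direction $U$, the matrix $A$ and the location $\mu$ play exactly the same role.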

33 Elliptical distributions add some flexibility (in particular, they allow for heavy tails), but still give rise to
- marginals with a common (type of) distribution,
- symmetric marginals,
- a deep multivariate symmetry structure...

These stylized facts are often sufficient to rule out the assumption of ellipticity... (no need for a test of ellipticity!)

I am burning my old records here!

36 "And now for something completely different..." (Monty Python's Flying Circus, 1970)

38 ICA stands for Independent Component Analysis. It is a technique used in Blind Source Separation problems, such as the "cocktail-party problem":
- 3 conversations: $Z_{it}$ ($i = 1, 2, 3$, $t = 1, \ldots, n$),
- 3 microphones: $X_{it}$.

The goal is to recover the original conversations... under the only assumption that the latter are independent.

39 [Figure: panels $Z_{1t}$, $Z_{2t}$, $Z_{3t}$.]

41 The basic model is

$$\begin{aligned} X_{1t} &= a_{11}Z_{1t} + a_{12}Z_{2t} + a_{13}Z_{3t} \\ X_{2t} &= a_{21}Z_{1t} + a_{22}Z_{2t} + a_{23}Z_{3t} \\ X_{3t} &= a_{31}Z_{1t} + a_{32}Z_{2t} + a_{33}Z_{3t}, \end{aligned}$$

that is, $X_t = AZ_t$, where one assumes all $Z_{it}$'s are mutually independent:
- "Conversations" are independent.
- No serial dependence.

The mixing matrix $A$ does not depend on $t$.
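To make the mixing model concrete, here is a small simulation sketch. The three source distributions and the entries of the mixing matrix are hypothetical choices made for illustration; only the structure $X_t = AZ_t$ with independent sources and a constant $A$ comes from the model above.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Three independent, non-Gaussian sources (hypothetical laws), one column each
z = np.column_stack([
    rng.uniform(-1.0, 1.0, n),
    rng.laplace(0.0, 1.0, n),
    rng.exponential(1.0, n) - 1.0,
])

# Hypothetical mixing matrix A, the same for every t
a = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.6, 0.1, 1.0]])

# X_t = A Z_t for every t (row t of x is X_t')
x = z @ a.T

corr_z = np.corrcoef(z, rowvar=False)  # sources: close to the identity
corr_x = np.corrcoef(x, rowvar=False)  # mixtures: clearly non-diagonal
print(np.round(corr_z, 2))
print(np.round(corr_x, 2))
```

The microphones record strongly correlated signals even though the conversations themselves are independent, which is what the unmixing step has to undo.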

43 For BW images, $Z_{ij} \in \{0, 1, \ldots, 255\}$ represents the grey intensity of the $i$th image for the $j$th pixel (in vectorized form).

Here, $n =$ [value lost in the transcription]. And $Z_{i1} = 61$, $Z_{i2} = 61$, ...
- Minimal value = 45 (dark grey)
- Maximal value = 255 (white)

48 Can you guess the $Z_1$, $Z_2$, $Z_3$ which generated this mixture $X_1$?

[Figure: mixed image $X_1$.]

49 Would you guess who they are?

[Figure: mixed images $X_1$, $X_2$.]

50 [Figure: mixed images $X_1$, $X_2$, $X_3$.]

51 magic

52 [Figure: recovered images $\hat Z_1$, $\hat Z_2$, $\hat Z_3$.]

53 [Figure: original images $Z_1$, $Z_2$, $Z_3$.]

54 Engineers typically estimate $A$ (hence recover the sources $\hat Z_t = \hat A^{-1}X_t$) by choosing the matrix $A$ that makes the marginals of $A^{-1}X_t$
- as independent as possible, or
- as non-Gaussian as possible.

Drawbacks:
- Arbitrary objective functions.
- Computationally intensive procedures.
- Lack of robustness.

56 We have our own way to do that:

A $p \times p$ scatter matrix $S = S(X_1, \ldots, X_n)$ is a statistic such that $S(AX_1, \ldots, AX_n) = A\,S(X_1, \ldots, X_n)\,A'$ for all $p \times p$ matrices $A$.

Assume $\lim_{n \to \infty} S(X_1, \ldots, X_n)$ is diagonal as soon as the common distribution of the $X_i$'s has independent marginals. Then we say $S$ has the independence property.

Examples:

$$S_1 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)(X_i - \bar X)', \qquad S_2 = \frac{1}{n-1}\sum_{i=1}^n \big[(X_i - \bar X)'S_1^{-1}(X_i - \bar X)\big](X_i - \bar X)(X_i - \bar X)'.$$

Theorem. Let $S_1$, $S_2$ be scatter matrices with the independence property. Then the $p \times p$ matrix $B_n$, whose columns are the eigenvectors of $S_2^{-1}(X_1, \ldots, X_n)\,S_1(X_1, \ldots, X_n)$, is consistent for $(A')^{-1}$.

60 Proof. By using the definition of a scatter matrix and the independence property, we obtain

$$\begin{cases} S_1 = S_1(X_i) = S_1(AZ_i) = A\,S_1(Z_i)\,A' = AD_1A' \\ S_2 = S_2(X_i) = S_2(AZ_i) = A\,S_2(Z_i)\,A' = AD_2A', \end{cases}$$

for some diagonal matrices $D_1$, $D_2$. Hence,

$$(S_2^{-1}S_1)(A')^{-1} = (AD_2A')^{-1}(AD_1A')(A')^{-1} = (A')^{-1}(D_2^{-1}D_1),$$

so that the columns of $(A')^{-1}$ are eigenvectors of $S_2^{-1}S_1$, with eigenvalues given by the diagonal entries of $D_2^{-1}D_1$.

63 We have our own way to do that:

Theorem. Let $S_1$, $S_2$ be scatter matrices with the independence property. Then the $p \times p$ matrix $B_n$, whose columns are the eigenvectors of $S_2^{-1}S_1$, is consistent for $(A')^{-1}$.

Of course, if we choose robust $S_1$ and $S_2$, the resulting $\hat A$ will be robust as well, which guarantees a robust reconstruction of the independent sources...
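A minimal NumPy sketch of the estimator in the theorem, assuming the two moment-based (hence non-robust) scatter examples: the covariance matrix as $S_1$ and the fourth-moment scatter as $S_2$. The function names `cov4` and `two_scatter_unmixing`, the source laws and the mixing matrix are illustrative assumptions, not the authors' implementation. The sources are recovered only up to order, sign and scale, so the check below looks at absolute correlations between recovered and true components.

```python
import numpy as np

def cov4(x):
    """Fourth-moment scatter: 1/(n-1) * sum [d_i' S1^{-1} d_i] d_i d_i', d_i = x_i - mean."""
    d = x - x.mean(axis=0)
    s1 = d.T @ d / (len(x) - 1)
    # squared Mahalanobis distances d_i' S1^{-1} d_i
    r2 = np.einsum("ij,jk,ik->i", d, np.linalg.inv(s1), d)
    return (d * r2[:, None]).T @ d / (len(x) - 1)

def two_scatter_unmixing(x):
    """Return B whose columns are eigenvectors of S2^{-1} S1 (estimates (A')^{-1}
    up to order, sign and scale of the columns)."""
    s1 = np.cov(x, rowvar=False)
    s2 = cov4(x)
    _, b = np.linalg.eig(np.linalg.solve(s2, s1))
    return np.real(b)

rng = np.random.default_rng(0)
n = 20000
# Independent sources with distinct kurtosis values (needed to tell them apart)
z = np.column_stack([rng.uniform(-1, 1, n), rng.laplace(size=n), rng.standard_normal(n)])
a = np.array([[1.0, 0.5, 0.2], [0.3, 1.0, 0.4], [0.6, 0.1, 1.0]])  # hypothetical A
x = z @ a.T                      # rows are X_i' = (A Z_i)'

b = two_scatter_unmixing(x)
z_hat = x @ b                    # rows are (B' X_i)'

# |correlation| between each recovered component (rows) and each true source (columns)
corr = np.abs(np.corrcoef(z_hat, z, rowvar=False)[:3, 3:])
print(np.round(corr, 2))
```

With these two scatters the eigenvalues of $S_2^{-1}S_1$ depend on the kurtosis of each source, so the method cannot separate two sources that share the same kurtosis; robust choices of $S_1$ and $S_2$ would replace `np.cov` and `cov4` in this sketch.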

64 With robust $S_1$, $S_2$...

[Figure: recovered images $\hat Z_1$, $\hat Z_2$, $\hat Z_3$.]

65 With non-robust $S_1$, $S_2$... (the ones given above)

[Figure: recovered images $\hat Z_1$, $\hat Z_2$, $\hat Z_3$.]

66 PCA makes marginals uncorrelated... ICA makes marginals independent...

Actually, ICA is going one step further than PCA: ICA = PCA + a rotation...

This explains why PCA is often used as a preliminary step to perform ICA.
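For the two-scatter estimator with $S_1$ the covariance matrix, the "PCA + a rotation" statement can be made explicit. The derivation below is a sketch under the assumptions that $S_2$ is a scatter matrix in the sense defined earlier and that the eigenvalues involved are distinct; the symbols $W_i$, $U$ and $\Lambda$ are notation introduced here.

```latex
% Step 1 (PCA-type whitening): W_i = S_1^{-1/2} X_i has uncorrelated, unit-variance marginals.
% Step 2 (rotation): by equivariance, S_2 computed on the W_i's equals
%   S_1^{-1/2} S_2 S_1^{-1/2} = U \Lambda U', with U orthogonal and \Lambda diagonal.
\begin{aligned}
S_1^{-1/2} S_2 S_1^{-1/2}\, U &= U \Lambda \\
\Longrightarrow\quad S_2\, S_1^{-1/2} U &= S_1^{1/2}\, U \Lambda \\
\Longrightarrow\quad S_2^{-1} S_1 \,\bigl(S_1^{-1/2} U\bigr) &= \bigl(S_1^{-1/2} U\bigr)\, \Lambda^{-1}
\end{aligned}
% Hence B = S_1^{-1/2} U collects eigenvectors of S_2^{-1} S_1, and B' X_i = U' W_i.
```

In words, the unmixed observations $B'X_i = U'W_i$ are obtained by whitening (the PCA step) followed by the rotation $U'$.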

68 [Figure: scatterplot matrix of the raw data.]

69 [Figure: scatterplot matrix of the principal components.]

70 [Figure: scatterplot matrix of the independent components.]

72 We reject the elliptical model, which states that $X_i = AZ_i + \mu$, where $Z_i = (Z_{i1}, \ldots, Z_{ip})'$ is spherically symmetric (about $0 \in \mathbb{R}^p$), in favor of the following:

Definition. The independent component (IC) model states that $X_i = AZ_i + \mu$, where $Z_i = (Z_{i1}, \ldots, Z_{ip})'$ has independent marginals (with median 0 and MAD 1).

73 IC models provide an extension of the multinormal model, which is obtained when all ICs are Gaussian.
- Both extensions (elliptic and IC) are disjoint.
- This IC extension is bigger than that of elliptic models.
  - In IC models: $\mu$, $A$, and $p$ densities $g_1, \ldots, g_p$.
  - In elliptic models: $\mu$, $A$, and a single density $g$ (that of $\|Z\|$).

76 As a summary...

77 The $g_j$'s allow for much flexibility. In particular,
- we can play with $p$ different kurtosis values...
- the $X_i$ may very well be asymmetric...

79 Inference problem: test $H_0: \mu = 0$ for $n$ i.i.d. observations from the IC model $X_i = AZ_i + \mu$, where $Z_i = (Z_{i1}, \ldots, Z_{ip})'$ has independent marginals.

The parameters:
- location vector $\mu$,
- scatter matrix $A$,
- $p$ densities $(g_1, \ldots, g_p)$.

Of course, we can hardly assume the $g_j$'s to be known, and it is expected that this nuisance will be an important issue.

80 Quite nicely, our estimators $\hat A$ (based on a couple of scatter matrices $S_1$, $S_2$) do not require estimating $\mu$ nor $g_1, \ldots, g_p$. We then may

(a) write

$$Y_i := \hat A^{-1}X_i = \hat A^{-1}AZ_i + \hat A^{-1}\mu \approx Z_i + \hat A^{-1}\mu \qquad (1)$$

and

(b) go univariate to test componentwise whether the location is 0 (reject $H_0^{(j)}$ for large values of $|T_j|$, with $T_j$ approximately $\mathcal{N}(0, 1)$ under $H_0^{(j)}$).

Crucial point: we will be able to aggregate those univariate tests easily because the components are independent (reject $H_0$ for large values of $\sum_{j=1}^p T_j^2$, which is asymptotically $\chi^2_p$ under $H_0$).

83 Which $T_j$ should we choose?

Student:

$$T_j = \frac{\sqrt{n}\,\bar Y_{.j}}{s_{.j}} = \frac{1}{\sqrt n}\sum_{i=1}^n \frac{Y_{ij}}{s_{.j}} = \frac{1}{\sqrt n}\sum_{i=1}^n \mathrm{Sign}(Y_{ij})\,\frac{|Y_{ij}|}{s_{.j}}.$$

This yields a multivariate Student test ($\phi_{\mathcal N}$, say), which unfortunately suffers the same drawbacks as classical Gaussian tests:
- It cannot deal with heavy tails.
- It is poorly robust.
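The Student version of the aggregated test can be sketched in a few lines. This is an illustration under assumptions: the input `y` is taken to hold the already unmixed observations $Y_i$ (one per row), $s_{.j}$ is taken to be the usual sample standard deviation, the critical value is supplied by the caller, and the function name `student_ic_test` is hypothetical.

```python
import numpy as np

def student_ic_test(y, crit):
    """Aggregate componentwise Student statistics over the components of y.

    y    : (n, p) array of unmixed observations Y_i (assumed given)
    crit : upper quantile of the chi-square law with p degrees of freedom
    """
    n = y.shape[0]
    # T_j = sqrt(n) * mean_j / s_j, with s_j the sample standard deviation (assumed choice)
    t = np.sqrt(n) * y.mean(axis=0) / y.std(axis=0, ddof=1)
    # Q = sum_j T_j^2, compared with a chi-square(p) critical value
    q = float(np.sum(t ** 2))
    return q, q > crit, t

rng = np.random.default_rng(0)
# Two independent Gaussian components, the first one shifted away from 0
y = rng.standard_normal((200, 2)) + np.array([0.5, 0.0])
q, reject, t = student_ic_test(y, crit=5.991)  # 5.991: 95% quantile of chi-square(2)
print(np.round(t, 2), round(q, 2), reject)
```

Because each $T_j$ relies on means and standard deviations, a single gross outlier in one component can change the decision, which is the lack of robustness mentioned above.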

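87 Which $T_j$ should we choose?

Student:

$$T_j = \frac{\sqrt{n}\,\bar Y_{.j}}{s_{.j}} = \frac{1}{\sqrt n}\sum_{i=1}^n \frac{Y_{ij}}{s_{.j}} = \frac{1}{\sqrt n}\sum_{i=1}^n \mathrm{Sign}(Y_{ij})\,\frac{|Y_{ij}|}{s_{.j}},$$

which, at the multinormal, is approximated by the signed-rank statistic

$$\tilde T_j := \frac{1}{\sqrt n}\sum_{i=1}^n \mathrm{Sign}(Y_{ij})\,\Phi_+^{-1}\Big(\frac{R_{ij}}{n+1}\Big),$$

where $R_{ij}$ denotes the rank of $|Y_{ij}|$ among $|Y_{1j}|, \ldots, |Y_{nj}|$, and $\Phi_+(z) = P[\,|\mathcal{N}(0, 1)| \le z\,]$.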

88 How good is the resulting rank test ($\tilde\phi_{\mathcal N}$, say), which rejects $H_0$ for large values of $\sum_{j=1}^p \tilde T_j^2$, where $\tilde T_j$ denotes the rank-based counterpart of the Student statistic $T_j$?
- It is fairly robust to outliers.
- It can deal with heavy tails.
- And... it is, at the multinormal, as powerful as the Student test $\phi_{\mathcal N}$! (since $\tilde T_j = T_j + o_P(1)$ at the multinormal)

A natural question: how does it compare with $\phi_{\mathcal N}$ (in terms of power) away from the multinormal?

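91 The answer is in favor of our rank test:

Theorem. The asymptotic relative efficiency (ARE) of the rank test $\tilde\phi_{\mathcal N}$ with respect to the Student test $\phi_{\mathcal N}$ under $\mu = n^{-1/2}\tau$, $A$, and $(g_1, \ldots, g_p)$ is of the form

$$\mathrm{ARE} = \frac{\sum_{j=1}^p w_j(A, \tau)\,c(g_j)}{\sum_{j=1}^p w_j(A, \tau)}, \qquad w_j(A, \tau) \ge 0.$$

Table: Various values of $c(g_j)$, for $g_j \in \{t_3, t_6, t_{12}, \mathcal{N}, e_2, e_3, e_5\}$ (numerical entries missing from the transcription).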

92 Actually, $c(g_j) \ge 1$ for all $g_j$, which implies that the rank test $\tilde\phi_{\mathcal N}$ is always (asymptotically) at least as powerful as the Student test $\phi_{\mathcal N}$!

Our tests therefore dominate the Student ones both in terms of robustness and efficiency!

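94 Remark: rather than "Gaussian scores" as in

$$\tilde T_j = \frac{1}{\sqrt n}\sum_{i=1}^n \mathrm{Sign}(Y_{ij})\,\Phi_+^{-1}\Big(\frac{R_{ij}}{n+1}\Big),$$

one can use (more robust) Wilcoxon scores

$$\tilde T_j := \frac{\sqrt 3}{\sqrt n}\sum_{i=1}^n \mathrm{Sign}(Y_{ij})\,\frac{R_{ij}}{n+1}$$

or (even more robust) sign scores

$$\tilde T_j := \frac{1}{\sqrt n}\sum_{i=1}^n \mathrm{Sign}(Y_{ij}).$$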
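The three score choices differ only in the function applied to the normalised ranks, which the following sketch makes explicit. It assumes that `y` already holds the unmixed observations $Y_i$, that $R_{ij}$ is the rank of $|Y_{ij}|$ within component $j$ (no ties), and it uses $\Phi_+^{-1}(u) = \Phi^{-1}((1+u)/2)$; the function names and the simulated data are illustrative assumptions.

```python
import numpy as np
from statistics import NormalDist

def signed_rank_statistics(y, scores="gaussian"):
    """Componentwise signed-rank statistics for H0: location 0, from an (n, p) array."""
    n, _ = y.shape
    # R_ij: rank of |Y_ij| among |Y_1j|, ..., |Y_nj| (ties are not handled)
    ranks = np.abs(y).argsort(axis=0).argsort(axis=0) + 1
    u = ranks / (n + 1)
    if scores == "gaussian":
        # Phi_+(z) = 2 Phi(z) - 1, hence Phi_+^{-1}(u) = Phi^{-1}((1 + u) / 2)
        k = np.vectorize(NormalDist().inv_cdf)((1 + u) / 2)
    elif scores == "wilcoxon":
        k = np.sqrt(3) * u
    elif scores == "sign":
        k = np.ones_like(u)
    else:
        raise ValueError(f"unknown scores: {scores}")
    return (np.sign(y) * k).sum(axis=0) / np.sqrt(n)

def ic_rank_test(y, scores="gaussian"):
    """Return Q = sum_j T_j^2 (to be compared with a chi-square(p) quantile) and the T_j's."""
    t = signed_rank_statistics(y, scores)
    return float(np.sum(t ** 2)), t

rng = np.random.default_rng(0)
y0 = rng.standard_t(3, size=(400, 2))  # heavy-tailed components, location 0
y1 = y0 + 0.3                          # same components, shifted location
for name in ("gaussian", "wilcoxon", "sign"):
    q0, _ = ic_rank_test(y0, name)
    q1, _ = ic_rank_test(y1, name)
    print(f"{name:9s} Q under H0: {q0:6.2f}   Q under shift: {q1:6.2f}")
```

For $p = 2$ the 95% quantile of the $\chi^2_2$ distribution is about 5.991, so values of $Q$ above that threshold lead to rejection at the 5% level.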

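95 Efficiency is then not as good, as a price for the better robustness...

Table: Various values of $c(g_j)$ for our Gaussian ($\tilde\phi_{\mathcal N}$), Wilcoxon ($\tilde\phi_W$), and sign ($\tilde\phi_S$) tests, for $g_j \in \{t_3, t_6, t_{12}, \mathcal{N}, e_2, e_3, e_5\}$ (numerical entries missing from the transcription).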

96 [Figure: original data.]

97 [Figure: 95% confidence zone, Gaussian method.]

98 [Figure: 95% confidence zone, our $\tilde\phi_{\mathcal N}$ IC method.]

99 [Figure: 95% confidence zone, our $\tilde\phi_W$ IC method.]

100 [Figure: 95% confidence zone, our $\tilde\phi_S$ IC method.]

101 [Figure: original data.]

102 [Figure: contaminated data.]

103 [Figure: 95% confidence zone, Gaussian method.]

104 [Figure: 95% confidence zone, our $\tilde\phi_{\mathcal N}$ IC method.]

105 [Figure: 95% confidence zone, our $\tilde\phi_W$ IC method.]

106 [Figure: 95% confidence zone, our $\tilde\phi_S$ IC method.]

107 Conclusion
- IC models provide quite flexible semiparametric models for multivariate statistics.
- Rank methods are efficient and robust alternatives to Gaussian methods.

108 References
- Oja, H., Sirkiä, S., & Eriksson, J. (2006). Scatter matrices and independent component analysis. Austrian Journal of Statistics 35, 175-189.
- Oja, H., Nordhausen, K., & Paindaveine, D. (2007). Signed-rank tests for location in the symmetric independent component model. ECORE DP 2007/123. Submitted.
- Oja, H., Paindaveine, D., & Taskinen, S. (2008). Parametric and nonparametric tests for multivariate independence in IC models. Manuscript in preparation.


Structure in Data A major objective in data analysis is to identify interesting features or structure in the data. The graphical methods are very useful in discovering structure. There are basically two

### Elliptically Contoured Distributions

Elliptically Contoured Distributions Recall: if X N p µ, Σ), then { 1 f X x) = exp 1 } det πσ x µ) Σ 1 x µ) So f X x) depends on x only through x µ) Σ 1 x µ), and is therefore constant on the ellipsoidal

### Variational Principal Components

Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings

### Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

### Donghoh Kim & Se-Kang Kim

Behav Res (202) 44:239 243 DOI 0.3758/s3428-02-093- Comparing patterns of component loadings: Principal Analysis (PCA) versus Independent Analysis (ICA) in analyzing multivariate non-normal data Donghoh

The Adequate Bootstrap arxiv:1608.05913v1 [stat.me] 21 Aug 2016 Toby Kenney Department of Mathematics and Statistics, Dalhousie University and Hong Gu Department of Mathematics and Statistics, Dalhousie

### Master s Written Examination - Solution

Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

### Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

### Long-Run Covariability

Long-Run Covariability Ulrich K. Müller and Mark W. Watson Princeton University October 2016 Motivation Study the long-run covariability/relationship between economic variables great ratios, long-run Phillips

### Introduction to Bayesian methods in inverse problems

Introduction to Bayesian methods in inverse problems Ville Kolehmainen 1 1 Department of Applied Physics, University of Eastern Finland, Kuopio, Finland March 4 2013 Manchester, UK. Contents Introduction

### Lecture 5 Singular value decomposition

Lecture 5 Singular value decomposition Weinan E 1,2 and Tiejun Li 2 1 Department of Mathematics, Princeton University, weinan@princeton.edu 2 School of Mathematical Sciences, Peking University, tieli@pku.edu.cn

### CHICAGO: A Fast and Accurate Method for Portfolio Risk Calculation

CHICAGO: A Fast and Accurate Method for Portfolio Risk Calculation University of Zürich April 28 Motivation Aim: Forecast the Value at Risk of a portfolio of d assets, i.e., the quantiles of R t = b r

### Lecture 11: Regression Methods I (Linear Regression)

Lecture 11: Regression Methods I (Linear Regression) Fall, 2017 1 / 40 Outline Linear Model Introduction 1 Regression: Supervised Learning with Continuous Responses 2 Linear Models and Multiple Linear

### Application of Variance Homogeneity Tests Under Violation of Normality Assumption

Application of Variance Homogeneity Tests Under Violation of Normality Assumption Alisa A. Gorbunova, Boris Yu. Lemeshko Novosibirsk State Technical University Novosibirsk, Russia e-mail: gorbunova.alisa@gmail.com

### MATH 829: Introduction to Data Mining and Analysis Principal component analysis

1/11 MATH 829: Introduction to Data Mining and Analysis Principal component analysis Dominique Guillot Departments of Mathematical Sciences University of Delaware April 4, 2016 Motivation 2/11 High-dimensional

### Chapter 5. The multivariate normal distribution. Probability Theory. Linear transformations. The mean vector and the covariance matrix

Probability Theory Linear transformations A transformation is said to be linear if every single function in the transformation is a linear combination. Chapter 5 The multivariate normal distribution When

### Review (Probability & Linear Algebra)

Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint

### Eigenvalues, Eigenvectors, and an Intro to PCA

Eigenvalues, Eigenvectors, and an Intro to PCA Eigenvalues, Eigenvectors, and an Intro to PCA Changing Basis We ve talked so far about re-writing our data using a new set of variables, or a new basis.

### p L yi z n m x N n xi

y i z n x n N x i Overview Directed and undirected graphs Conditional independence Exact inference Latent variables and EM Variational inference Books statistical perspective Graphical Models, S. Lauritzen

### PCA, Kernel PCA, ICA

PCA, Kernel PCA, ICA Learning Representations. Dimensionality Reduction. Maria-Florina Balcan 04/08/2015 Big & High-Dimensional Data High-Dimensions = Lot of Features Document classification Features per

### Karhunen-Loeve Expansion and Optimal Low-Rank Model for Spatial Processes

TTU, October 26, 2012 p. 1/3 Karhunen-Loeve Expansion and Optimal Low-Rank Model for Spatial Processes Hao Zhang Department of Statistics Department of Forestry and Natural Resources Purdue University

### Nonlinear Time Series Modeling

Nonlinear Time Series Modeling Part II: Time Series Models in Finance Richard A. Davis Colorado State University (http://www.stat.colostate.edu/~rdavis/lectures) MaPhySto Workshop Copenhagen September

### Package BSSasymp. R topics documented: September 12, Type Package

Type Package Package BSSasymp September 12, 2017 Title Asymptotic Covariance Matrices of Some BSS Mixing and Unmixing Matrix Estimates Version 1.2-1 Date 2017-09-11 Author Jari Miettinen, Klaus Nordhausen,

### A Process over all Stationary Covariance Kernels

A Process over all Stationary Covariance Kernels Andrew Gordon Wilson June 9, 0 Abstract I define a process over all stationary covariance kernels. I show how one might be able to perform inference that

### Independent Component Analysis

A Short Introduction to Independent Component Analysis Aapo Hyvärinen Helsinki Institute for Information Technology and Depts of Computer Science and Psychology University of Helsinki Problem of blind

### AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY

Econometrics Working Paper EWP0401 ISSN 1485-6441 Department of Economics AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Lauren Bin Dong & David E. A. Giles Department of Economics, University of Victoria
