Independent Component Analysis and Unsupervised Learning
1 Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien National Cheng Kung University
2 TABLE OF CONTENTS
1. Independent Component Analysis
2. Case Study I: Speech Recognition
   Independent voices
   Nonparametric likelihood ratio ICA
3. Case Study II: Blind Source Separation
   Convex divergence ICA
   Nonstationary Bayesian ICA
   Online Gaussian process ICA
4. Summary
3 Introduction
Independent component analysis (ICA) is essential for blind source separation. ICA is applied to separate mixed signals and find the independent components. The demixed components can be grouped into clusters in which intra-cluster elements are dependent and inter-cluster elements are independent. ICA provides an unsupervised learning approach to acoustic modeling, signal separation and many other tasks.
APSIPA DL: Independent Component Analysis and Unsupervised Learning
4 Blind Source Separation
Cocktail-party problem. Unknown: the mixing matrix A and the sources s. Goal: reconstruct the source signals via a demixing matrix W. The mixing matrix A is assumed to be fixed.
5 Independent Component Analysis
Three assumptions: the sources are statistically independent; each independent component has a non-Gaussian distribution; the mixing matrix is square.
$$X_{m \times T} = A_{m \times m}\, S_{m \times T}, \qquad x = As$$
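A minimal sketch of the mixing model x = As under the three assumptions above. The source distributions, the sizes and the entries of `A` are illustrative assumptions, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
m, T = 2, 1000                      # illustrative sizes (assumed)

# Two independent, non-Gaussian sources: one uniform, one Laplacian.
S = np.vstack([rng.uniform(-1.0, 1.0, T),
               rng.laplace(0.0, 1.0, T)])       # shape (m, T)

# Hypothetical square mixing matrix (unknown in a real blind setting).
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = A @ S                                        # observed mixtures, shape (m, T)

# If A were known, the ideal demixing matrix would be its inverse.
# ICA has to estimate W from X alone.
W_ideal = np.linalg.inv(A)
Y = W_ideal @ X
```

In practice ICA can recover the sources only up to permutation and scaling, so an estimated W is compared with `W_ideal` only up to those ambiguities.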
6 ICA Objective Function
Objective functions used for independent component analysis: maximum likelihood, mutual information, kurtosis and negentropy.
7 ICA Learning Rule
The ICA demixing matrix can be estimated by optimizing an objective function with a gradient descent algorithm or a natural gradient algorithm.
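As one concrete instance of a natural gradient algorithm, the sketch below uses the classical maximum-likelihood update W ← W + η (I − E[φ(y) yᵀ]) W with φ = tanh, which suits super-Gaussian sources. The choice of φ, the step size, the iteration count and the test signals are assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

def natural_gradient_ica(X, lr=0.1, n_iter=1000):
    """Batch natural-gradient ICA with a tanh score (assumes super-Gaussian sources)."""
    m, T = X.shape
    W = np.eye(m)
    for _ in range(n_iter):
        Y = W @ X
        # Natural-gradient direction: (I - E[phi(y) y^T]) W, with phi = tanh.
        G = np.eye(m) - np.tanh(Y) @ Y.T / T
        W = W + lr * (G @ W)
    return W

rng = np.random.default_rng(0)
S = rng.laplace(size=(2, 5000))          # two super-Gaussian sources
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])               # hypothetical mixing matrix
W = natural_gradient_ica(A @ S)

# For a good solution, W @ A is close to a scaled permutation matrix.
P = np.abs(W @ A)
```

Each row of `P` should have one dominant entry, and the dominant entries should fall in different columns.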
9 ICA for Speech Recognition
Mismatch between training and test data always exists, so adaptation of HMM parameters is important. Eigenvoice (PCA) versus independent voice (ICA): PCA performs a linear decorrelation process, whereas ICA extracts the higher-order statistics.
PCA (uncorrelation): $$E[e_1 e_2 \cdots e_M] = E[e_1]\,E[e_2] \cdots E[e_M]$$
ICA (higher-order correlations are zero): $$E[s_1^r s_2^r \cdots s_M^r] = E[s_1^r]\,E[s_2^r] \cdots E[s_M^r]$$
10 Sparseness & Information Redundancy
The degree of sparseness in the distribution of the transformed signals is proportional to the amount of information conveyed by the transformation. Sparseness is measured by the fourth-order statistic (kurtosis), which also indicates non-Gaussianity:
$$\mathrm{kurt}(s) = \frac{E[s^4]}{E^2[s^2]} - 3$$
Information redundancy reduction using ICA is higher than that using PCA.
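The kurtosis measure above can be sketched numerically as follows. The helper name `kurtosis` and the test distributions are illustrative assumptions. The signal is centered first because the formula presumes a zero-mean signal.

```python
import numpy as np

def kurtosis(s):
    """Normalized kurtosis E[s^4] / E[s^2]^2 - 3 of a signal, after centering."""
    s = np.asarray(s, dtype=float)
    s = s - s.mean()
    return np.mean(s**4) / np.mean(s**2) ** 2 - 3.0

rng = np.random.default_rng(0)
n = 200_000
k_gauss = kurtosis(rng.standard_normal(n))      # near 0 for a Gaussian
k_uniform = kurtosis(rng.uniform(-1.0, 1.0, n)) # negative: sub-Gaussian
k_laplace = kurtosis(rng.laplace(size=n))       # positive: super-Gaussian (sparse)
```

A larger positive value indicates a sparser, more peaked distribution.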
11 Eigenvoices versus Independent Voices
[Figure: reference models 1 to 3, eigenvoice and independent voice directions, adaptation data, and the resulting eigenvoice adapted and independent voice adapted models.]
12 Evaluation of Kurtosis
[Figure: kurtosis versus voice index for independent voices and eigenvoices.]
13 Word Error Rates on Aurora2
[Figure: word error rates (%) for no adaptation, eigenvoice (L = 5, 10, 15) and independent voice (L = 5, 10 and a third setting whose value is missing), with K = 10 and K = 15. K: number of components. L: number of adaptation sentences.]
15 Test of Independence
Given the demixed signals y, a null hypothesis $H_0$ and an alternative hypothesis $H_1$ are defined for testing independence. If y is Gaussian distributed, we are testing whether the correlation between two of its components is equal to zero.
16 Likelihood Ratio
The likelihood ratio (LR) serves as the test statistic, which measures the confidence for $H_0$ against $H_1$. LR is a measure of independence for the demixed signals and can act as an objective function for finding the ICA demixing matrix. However, Gaussianity cannot be assumed in the ICA problem.
17 Nonparametric Approach
Let each sample be transformed by the demixing matrix W. Instead of assuming Gaussianity, we apply kernel density estimation using a Gaussian kernel with a specified kernel centroid.
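A minimal one-dimensional sketch of kernel density estimation with a Gaussian kernel. The slide's own centroid definition is not available here, so this sketch assumes the common choice of placing one kernel on each sample; the bandwidth `h` and the function name `gaussian_kde` are also assumptions.

```python
import numpy as np

def gaussian_kde(y_eval, centers, h):
    """Average of Gaussian kernels of width h centered on `centers`, evaluated at `y_eval`."""
    y_eval = np.asarray(y_eval, dtype=float)
    centers = np.asarray(centers, dtype=float)
    z = (y_eval[:, None] - centers[None, :]) / h
    kernels = np.exp(-0.5 * z**2) / (h * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

rng = np.random.default_rng(0)
samples = rng.laplace(size=500)            # stand-in for one demixed component
grid = np.linspace(-15.0, 15.0, 3001)
density = gaussian_kde(grid, samples, h=0.3)

# The estimate is a valid density: non-negative and integrating to about 1.
dx = grid[1] - grid[0]
total_mass = density.sum() * dx
```

No Gaussianity is imposed on the data itself; only the smoothing kernel is Gaussian.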
18 Nonparametric Likelihood Ratio
The NLR objective function is built with a multivariate Gaussian kernel.
19 ICA Learning Procedure
Procedure: parameter initialization, centering, whitening, then NLR-ICA learning until the stopping criterion is met, and finally output of W. The log likelihood ratio for the null and alternative hypotheses is maximized with respect to the model parameters.
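The centering and whitening steps of the procedure can be sketched as below. This uses symmetric (ZCA-style) whitening from an eigendecomposition of the sample covariance, which is one common choice and an assumption here; the function name and test data are illustrative.

```python
import numpy as np

def center_and_whiten(X):
    """Center each row of X (m x T) and whiten so the sample covariance is the identity."""
    Xc = X - X.mean(axis=1, keepdims=True)           # centering
    cov = Xc @ Xc.T / Xc.shape[1]                    # sample covariance (m x m)
    eigval, eigvec = np.linalg.eigh(cov)
    V = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T  # whitening matrix cov^{-1/2}
    return V @ Xc, V

rng = np.random.default_rng(0)
S = rng.laplace(size=(2, 2000))
X = np.array([[1.0, 0.6], [0.4, 1.0]]) @ S + 3.0     # mixed signals with a non-zero mean
Z, V = center_and_whiten(X)
cov_Z = Z @ Z.T / Z.shape[1]                          # should be the 2 x 2 identity
```

After whitening, the remaining demixing transform is a rotation, which simplifies the subsequent ICA learning.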
20 [Figure: system flow. Training data pass through Viterbi alignment and segment-based supervector collection for a subword unit (X), then NLR-ICA (Y) and K-means clustering into clusters 1 to M, followed by hidden Markov model training that yields HMM 1 to HMM M with parameters θ1 to θM. Test data are decoded by the speech recognizer, using the lexicon and the HMMs, to produce the recognition result.]
21 Segment-Based Supervector
[Figure: frames of each aligned utterance are collected segment by segment to form the segment-based supervector matrix X.]
22 Syllable Error Rates
Task: continuous Mandarin speech recognition with 7080 training utterances (40 males and 40 females) and 1000 test utterances (10 males and 10 females), using context-dependent subsyllable HMM modeling. Each HMM cluster had at most four clusters.
Syllable error rates (SER, %) are compared for four systems: without clustering, clustering with no ICA, clustering with MMI-ICA, and clustering with NLR-ICA.
24 ICA Objective Function
Objective functions for independent component analysis: maximum likelihood, divergence measure, kurtosis and negentropy. The divergence measures include the α-divergence, Kullback-Leibler (KL) divergence, Euclidean divergence, Cauchy-Schwarz divergence and convex divergence.
25 Mutual Information & KL Divergence
Mutual information between two variables is defined by using the Shannon entropy. It can be formulated as the KL divergence, or relative entropy, between the joint distribution and the product of the marginal distributions.
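For reference, the standard textbook form of this relationship is written out below. The variable names $y_1$ and $y_2$ are an assumed notation, since the slide's own symbols are not available.

$$I(y_1, y_2) = H(y_1) + H(y_2) - H(y_1, y_2) = \iint p(y_1, y_2) \log \frac{p(y_1, y_2)}{p(y_1)\,p(y_2)} \, dy_1 \, dy_2 = D_{\mathrm{KL}}\big(p(y_1, y_2) \,\|\, p(y_1)\,p(y_2)\big)$$

with $H(y) = -\int p(y) \log p(y)\, dy$. The quantity is non-negative and equals zero exactly when $y_1$ and $y_2$ are independent, which is why it can serve as an ICA objective.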
26 Divergence Measures
Euclidean divergence, Cauchy-Schwarz divergence and α-divergence.
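To make these dependence measures concrete, the sketch below evaluates the KL, Euclidean and Cauchy-Schwarz divergences between a discrete joint distribution and the product of its marginals. The discrete setting, the function name and the example tables are assumptions for illustration, and the commonly used definitions are applied: squared difference for the Euclidean divergence and a log ratio of inner products for the Cauchy-Schwarz divergence.

```python
import numpy as np

def dependence_divergences(P):
    """KL, Euclidean and Cauchy-Schwarz divergences between a joint table P and its marginal product."""
    P = np.asarray(P, dtype=float)
    Q = P.sum(axis=1, keepdims=True) * P.sum(axis=0, keepdims=True)  # product of marginals
    mask = P > 0
    kl = np.sum(P[mask] * np.log(P[mask] / Q[mask]))
    euclid = np.sum((P - Q) ** 2)
    cauchy_schwarz = np.log(np.sum(P**2) * np.sum(Q**2) / np.sum(P * Q) ** 2)
    return kl, euclid, cauchy_schwarz

# Independent joint distribution: all three measures vanish.
P_indep = np.outer([0.3, 0.7], [0.4, 0.6])
d_indep = dependence_divergences(P_indep)

# Dependent joint distribution: all three measures are positive.
P_dep = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
d_dep = dependence_divergences(P_dep)
```

All three measures share the property that makes them usable as ICA objectives: they are zero under independence and positive otherwise.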
27 Divergence Measures
f-divergence and Jensen-Shannon divergence. Entropy is a concave function.
28 Convex Function
A convex function $f(\cdot)$ must satisfy Jensen's inequality:
$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y), \qquad 0 \le \lambda \le 1$$
A general convex function is then defined.
29 Convex Divergence
Assuming equal weights yields the convex divergence (C-DIV). A special case of C-DIV is derived with a particular convex function.
30 Different Divergence Measures
31 Different Divergence Measures
32 Convex Divergence ICA
C-ICA learning algorithm. Nonparametric C-ICA is established by using the Parzen window density function.
33 Simulated Experiments
A parametric demixing matrix:
$$W = \begin{bmatrix} \cos\theta_1 & \sin\theta_1 \\ \cos\theta_2 & \sin\theta_2 \end{bmatrix}$$
Two sources, one with a sub-Gaussian and one with a super-Gaussian distribution:
$$p(s_1) = \begin{cases} \dfrac{1}{2\tau_1}, & s_1 \in [-\tau_1, \tau_1] \\ 0, & \text{otherwise} \end{cases} \qquad p(s_2) = \frac{1}{2\tau_2} \exp\left(-\frac{|s_2|}{\tau_2}\right)$$
Kurtosis: source 1: -1.13; source 2: [value missing].
34 [Figure panels: KL-DIV; C-DIV, alpha = 1; C-DIV, alpha = -1.]
35 Learning Curves
36 Experiments on Blind Source Separation
One music signal and two speech signals from two male speakers were sampled from the ICA'99 BSS Test Sets and mixed with a mixing matrix A. The evaluation metric is the signal-to-interference ratio (SIR):
$$\mathrm{SIR(dB)} = 10 \log_{10} \frac{\sum_{t=1}^{T} \|s_t\|^2}{\sum_{t=1}^{T} \|y_t - s_t\|^2}$$
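The SIR metric can be sketched in a few lines. The function name `sir_db` and the toy signals are illustrative assumptions. The sketch also assumes the demixed signal has already been matched to its source in order and scale, since ICA leaves permutation and scaling ambiguous.

```python
import numpy as np

def sir_db(s, y):
    """Signal-to-interference ratio in dB between a source s and its demixed estimate y."""
    s = np.asarray(s, dtype=float)
    y = np.asarray(y, dtype=float)
    return 10.0 * np.log10(np.sum(s**2) / np.sum((y - s) ** 2))

# Toy check: a constant error of 0.1 on a unit-amplitude source gives a
# power ratio of about 100, i.e. roughly 20 dB.
s = np.ones(100)
y = s + 0.1
value = sir_db(s, y)
```

Higher values mean less residual interference in the demixed signal.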
37 Comparison of Different Methods
[Figure: comparison of different methods, including PC-ICA and NC-ICA.]
39 Why Nonstationary Source Separation?
Real-world blind source separation: the number of sources is unknown; BSS is a dynamic, time-varying system; the mixing process is nonstationary.
Why nonstationary? A Bayesian method using automatic relevance determination (ARD) can determine the changing number of sources. Recursive Bayesian learning supports online tracking of nonstationary conditions. A Gaussian process provides a nonparametric solution for representing the temporal structure of a time-varying mixing system.
40 Nonstationary Mixing Systems
The mixing matrix is time-varying, and source signals may abruptly appear or disappear.
[Figure: sources S1, S2 and S3 with mixing coefficients such as a_22 and a_23 changing over times t, t+1 and t+2.]
41 Nonstationary Bayesian (NB) Learning
Maximum a posteriori estimation of the NB-ICA parameters and updating of the compensation parameters.
[Figure: prior updating across learning epochs t and t+1, for the parameters θ^(t-1), θ^(t), θ^(t+1) and for η^(t-1), η^(t), η^(t+1).]
42 Model Construction
Noisy ICA model; likelihood function of an observation; distributions of the model parameters: source, mixing matrix and noise.
43 Prior & Marginal Distributions
Prior distributions: precision of noise and precision of mixing matrix. Marginal likelihood of the NB-ICA model.
44 Automatic Relevance Determination
Detection of source signals: the number of sources can be determined.
45 Compensation for Nonstationary ICA
Prior density of the compensation parameter: a conjugate prior (Wishart distribution).
46 Graphical Model for NB-ICA
[Figure: graphical model relating the observations x_t^(l), sources s_t^(l), noise ε_t^(l) and mixing matrix A^(l) at epoch l to the hyperparameters carried over from epoch l-1.]
47 Experiments
Nonstationary blind source separation (ICA'99). Scenarios: the state of each source signal is active or inactive; source signals or sensors are moving, which gives a nonstationary mixing matrix.
48 Source Signals and ARD Curves
[Figure: source signals and ARD curves (alpha) over time in seconds. Blue: first source signal. Red: second source signal.]
50 Online Gaussian Process (OLGP)
Basic ideas: incrementally detect the status of source signals and estimate the corresponding distributions from online observation data; the temporal structure of the time-varying mixing coefficients is characterized by a Gaussian process. A Gaussian process is a nonparametric model which defines the prior distribution over functions for Bayesian inference.
51 Model Construction
Noisy ICA model; likelihood function; distributions of the model parameters: source and noise.
52 Gaussian Process
The mixing matrix is generated by a latent function. A Gaussian process (GP) is adopted to describe the distribution of this latent function, and the kernel function is governed by its hyperparameters.
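A minimal sketch of how a GP prior can describe a smoothly time-varying mixing coefficient. The slide does not preserve the kernel or its hyperparameters, so the squared-exponential kernel, the values of `variance` and `length_scale`, and the names used here are all assumptions.

```python
import numpy as np

def rbf_kernel(t1, t2, variance=1.0, length_scale=1.0):
    """Squared-exponential kernel matrix between two sets of time points."""
    d = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 50)                      # time indices of the mixing system

# Kernel matrix with a small jitter on the diagonal for numerical stability.
K = rbf_kernel(t, t, variance=1.0, length_scale=2.0) + 1e-6 * np.eye(t.size)

# One draw from the zero-mean GP prior: a smooth trajectory a(t) for one mixing coefficient.
L = np.linalg.cholesky(K)
a_t = L @ rng.standard_normal(t.size)
```

A longer `length_scale` yields slower variation of the coefficient over time, while `variance` sets its typical magnitude.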
53 Graphical Model for OLGP-ICA
[Figure: graphical model relating the observations x_t^(l), sources s_t^(l), noise ε_t^(l) and time-varying mixing matrix A_t^(l) at epoch l to the hyperparameters carried over from epoch l-1.]
54 Experimental Setup
Nonstationary source separation was evaluated under nonstationary scenarios: the status of each source signal is active or inactive; source signals or sensors are moving, which gives a nonstationary mixing matrix.
55 [Figure: signals labeled Male, Music and Female.]
56 Comparison of Different Methods
Signal-to-interference ratios (SIRs, in dB) of the demixed signals are compared for VB-ICA, BICA-HMM, Switching-ICA, Online VB-ICA and OLGP-ICA.
57 Summary
We presented a speaker adaptation method based on independent voices from an ICA perspective. A nonparametric likelihood ratio ICA was proposed according to hypothesis test theory. A convex divergence was developed as an optimization metric for the ICA algorithm. A nonstationary Bayesian ICA was proposed to deal with nonstationary mixing systems. An online Gaussian process ICA was presented for nonstationary and temporally correlated source separation. ICA methods could be extended to solve nonnegative matrix factorization and single-channel separation.
59 Thanks to H.-L. Hsieh, K. Shinoda and S. Furui.
Independent Component Analysis and Unsupervised Learning. Jen-Tzung Chien
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent voices Nonparametric likelihood
More informationA Convex Cauchy-Schwarz Divergence Measure for Blind Source Separation
INTERNATIONAL JOURNAL OF CIRCUITS, SYSTEMS AND SIGNAL PROCESSING Volume, 8 A Convex Cauchy-Schwarz Divergence Measure for Blind Source Separation Zaid Albataineh and Fathi M. Salem Abstract We propose
More informationIndependent Component Analysis. Contents
Contents Preface xvii 1 Introduction 1 1.1 Linear representation of multivariate data 1 1.1.1 The general statistical setting 1 1.1.2 Dimension reduction methods 2 1.1.3 Independence as a guiding principle
More informationMonaural speech separation using source-adapted models
Monaural speech separation using source-adapted models Ron Weiss, Dan Ellis {ronw,dpwe}@ee.columbia.edu LabROSA Department of Electrical Enginering Columbia University 007 IEEE Workshop on Applications
More informationShort-Time ICA for Blind Separation of Noisy Speech
Short-Time ICA for Blind Separation of Noisy Speech Jing Zhang, P.C. Ching Department of Electronic Engineering The Chinese University of Hong Kong, Hong Kong jzhang@ee.cuhk.edu.hk, pcching@ee.cuhk.edu.hk
More informationSINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS. Emad M. Grais and Hakan Erdogan
SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS Emad M. Grais and Hakan Erdogan Faculty of Engineering and Natural Sciences, Sabanci University, Orhanli
More informationMachine Learning Techniques for Computer Vision
Machine Learning Techniques for Computer Vision Part 2: Unsupervised Learning Microsoft Research Cambridge x 3 1 0.5 0.2 0 0.5 0.3 0 0.5 1 ECCV 2004, Prague x 2 x 1 Overview of Part 2 Mixture models EM
More informationA Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement
A Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement Simon Leglaive 1 Laurent Girin 1,2 Radu Horaud 1 1: Inria Grenoble Rhône-Alpes 2: Univ. Grenoble Alpes, Grenoble INP,
More informationUnsupervised Learning
Unsupervised Learning Bayesian Model Comparison Zoubin Ghahramani zoubin@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc in Intelligent Systems, Dept Computer Science University College
More informationFundamentals of Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Independent Vector Analysis (IVA)
Fundamentals of Principal Component Analysis (PCA),, and Independent Vector Analysis (IVA) Dr Mohsen Naqvi Lecturer in Signal and Information Processing, School of Electrical and Electronic Engineering,
More informationIntroduction to Independent Component Analysis. Jingmei Lu and Xixi Lu. Abstract
Final Project 2//25 Introduction to Independent Component Analysis Abstract Independent Component Analysis (ICA) can be used to solve blind signal separation problem. In this article, we introduce definition
More informationRecent Advances in Bayesian Inference Techniques
Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian
More informationLinear Dynamical Systems
Linear Dynamical Systems Sargur N. srihari@cedar.buffalo.edu Machine Learning Course: http://www.cedar.buffalo.edu/~srihari/cse574/index.html Two Models Described by Same Graph Latent variables Observations
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationJoint Factor Analysis for Speaker Verification
Joint Factor Analysis for Speaker Verification Mengke HU ASPITRG Group, ECE Department Drexel University mengke.hu@gmail.com October 12, 2012 1/37 Outline 1 Speaker Verification Baseline System Session
More informationNon-Negative Matrix Factorization And Its Application to Audio. Tuomas Virtanen Tampere University of Technology
Non-Negative Matrix Factorization And Its Application to Audio Tuomas Virtanen Tampere University of Technology tuomas.virtanen@tut.fi 2 Contents Introduction to audio signals Spectrogram representation
More informationIndependent Component Analysis
Independent Component Analysis Seungjin Choi Department of Computer Science Pohang University of Science and Technology, Korea seungjin@postech.ac.kr March 4, 2009 1 / 78 Outline Theory and Preliminaries
More informationHeeyoul (Henry) Choi. Dept. of Computer Science Texas A&M University
Heeyoul (Henry) Choi Dept. of Computer Science Texas A&M University hchoi@cs.tamu.edu Introduction Speaker Adaptation Eigenvoice Comparison with others MAP, MLLR, EMAP, RMP, CAT, RSW Experiments Future
More informationTRINICON: A Versatile Framework for Multichannel Blind Signal Processing
TRINICON: A Versatile Framework for Multichannel Blind Signal Processing Herbert Buchner, Robert Aichner, Walter Kellermann {buchner,aichner,wk}@lnt.de Telecommunications Laboratory University of Erlangen-Nuremberg
More informationIndependent Component Analysis and Its Applications. By Qing Xue, 10/15/2004
Independent Component Analysis and Its Applications By Qing Xue, 10/15/2004 Outline Motivation of ICA Applications of ICA Principles of ICA estimation Algorithms for ICA Extensions of basic ICA framework
More informationPILCO: A Model-Based and Data-Efficient Approach to Policy Search
PILCO: A Model-Based and Data-Efficient Approach to Policy Search (M.P. Deisenroth and C.E. Rasmussen) CSC2541 November 4, 2016 PILCO Graphical Model PILCO Probabilistic Inference for Learning COntrol
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationPATTERN CLASSIFICATION
PATTERN CLASSIFICATION Second Edition Richard O. Duda Peter E. Hart David G. Stork A Wiley-lnterscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane Singapore Toronto CONTENTS
More informationSequence labeling. Taking collective a set of interrelated instances x 1,, x T and jointly labeling them
HMM, MEMM and CRF 40-957 Special opics in Artificial Intelligence: Probabilistic Graphical Models Sharif University of echnology Soleymani Spring 2014 Sequence labeling aking collective a set of interrelated
More informationCIFAR Lectures: Non-Gaussian statistics and natural images
CIFAR Lectures: Non-Gaussian statistics and natural images Dept of Computer Science University of Helsinki, Finland Outline Part I: Theory of ICA Definition and difference to PCA Importance of non-gaussianity
More informationReformulating the HMM as a trajectory model by imposing explicit relationship between static and dynamic features
Reformulating the HMM as a trajectory model by imposing explicit relationship between static and dynamic features Heiga ZEN (Byung Ha CHUN) Nagoya Inst. of Tech., Japan Overview. Research backgrounds 2.
More informationIndependent Component Analysis (ICA)
Independent Component Analysis (ICA) Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationNon-negative Matrix Factorization: Algorithms, Extensions and Applications
Non-negative Matrix Factorization: Algorithms, Extensions and Applications Emmanouil Benetos www.soi.city.ac.uk/ sbbj660/ March 2013 Emmanouil Benetos Non-negative Matrix Factorization March 2013 1 / 25
More informationPCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani
PCA & ICA CE-717: Machine Learning Sharif University of Technology Spring 2015 Soleymani Dimensionality Reduction: Feature Selection vs. Feature Extraction Feature selection Select a subset of a given
More informationHST.582J/6.555J/16.456J
Blind Source Separation: PCA & ICA HST.582J/6.555J/16.456J Gari D. Clifford gari [at] mit. edu http://www.mit.edu/~gari G. D. Clifford 2005-2009 What is BSS? Assume an observation (signal) is a linear
More informationGatsby Theoretical Neuroscience Lectures: Non-Gaussian statistics and natural images Parts I-II
Gatsby Theoretical Neuroscience Lectures: Non-Gaussian statistics and natural images Parts I-II Gatsby Unit University College London 27 Feb 2017 Outline Part I: Theory of ICA Definition and difference
More informationEUSIPCO
EUSIPCO 213 1569744273 GAMMA HIDDEN MARKOV MODEL AS A PROBABILISTIC NONNEGATIVE MATRIX FACTORIZATION Nasser Mohammadiha, W. Bastiaan Kleijn, Arne Leijon KTH Royal Institute of Technology, Department of
More informationLatent Tree Approximation in Linear Model
Latent Tree Approximation in Linear Model Navid Tafaghodi Khajavi Dept. of Electrical Engineering, University of Hawaii, Honolulu, HI 96822 Email: navidt@hawaii.edu ariv:1710.01838v1 [cs.it] 5 Oct 2017
More informationHidden Markov Models and Gaussian Mixture Models
Hidden Markov Models and Gaussian Mixture Models Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 4&5 23&27 January 2014 ASR Lectures 4&5 Hidden Markov Models and Gaussian
More informationBasic math for biology
Basic math for biology Lei Li Florida State University, Feb 6, 2002 The EM algorithm: setup Parametric models: {P θ }. Data: full data (Y, X); partial data Y. Missing data: X. Likelihood and maximum likelihood
More informationSeparation of Different Voices in Speech using Fast Ica Algorithm
Volume-6, Issue-6, November-December 2016 International Journal of Engineering and Management Research Page Number: 364-368 Separation of Different Voices in Speech using Fast Ica Algorithm Dr. T.V.P Sundararajan
More informationSTA414/2104. Lecture 11: Gaussian Processes. Department of Statistics
STA414/2104 Lecture 11: Gaussian Processes Department of Statistics www.utstat.utoronto.ca Delivered by Mark Ebden with thanks to Russ Salakhutdinov Outline Gaussian Processes Exam review Course evaluations
More informationUpper Bound Kullback-Leibler Divergence for Hidden Markov Models with Application as Discrimination Measure for Speech Recognition
Upper Bound Kullback-Leibler Divergence for Hidden Markov Models with Application as Discrimination Measure for Speech Recognition Jorge Silva and Shrikanth Narayanan Speech Analysis and Interpretation
More informationExpectation Maximization (EM)
Expectation Maximization (EM) The EM algorithm is used to train models involving latent variables using training data in which the latent variables are not observed (unlabeled data). This is to be contrasted
More informationAdvanced Introduction to Machine Learning CMU-10715
Advanced Introduction to Machine Learning CMU-10715 Independent Component Analysis Barnabás Póczos Independent Component Analysis 2 Independent Component Analysis Model original signals Observations (Mixtures)
More informationCurve Fitting Re-visited, Bishop1.2.5
Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the
More informationNovel spectrum sensing schemes for Cognitive Radio Networks
Novel spectrum sensing schemes for Cognitive Radio Networks Cantabria University Santander, May, 2015 Supélec, SCEE Rennes, France 1 The Advanced Signal Processing Group http://gtas.unican.es The Advanced
More informationp L yi z n m x N n xi
y i z n x n N x i Overview Directed and undirected graphs Conditional independence Exact inference Latent variables and EM Variational inference Books statistical perspective Graphical Models, S. Lauritzen
More informationICA. Independent Component Analysis. Zakariás Mátyás
ICA Independent Component Analysis Zakariás Mátyás Contents Definitions Introduction History Algorithms Code Uses of ICA Definitions ICA Miture Separation Signals typical signals Multivariate statistics
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 218 Outlines Overview Introduction Linear Algebra Probability Linear Regression 1
More informationGraphical Models for Collaborative Filtering
Graphical Models for Collaborative Filtering Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Sequence modeling HMM, Kalman Filter, etc.: Similarity: the same graphical model topology,
More informationSupport Vector Machines using GMM Supervectors for Speaker Verification
1 Support Vector Machines using GMM Supervectors for Speaker Verification W. M. Campbell, D. E. Sturim, D. A. Reynolds MIT Lincoln Laboratory 244 Wood Street Lexington, MA 02420 Corresponding author e-mail:
More informationON-LINE MINIMUM MUTUAL INFORMATION METHOD FOR TIME-VARYING BLIND SOURCE SEPARATION
O-IE MIIMUM MUTUA IFORMATIO METHOD FOR TIME-VARYIG BID SOURCE SEPARATIO Kenneth E. Hild II, Deniz Erdogmus, and Jose C. Principe Computational euroengineering aboratory (www.cnel.ufl.edu) The University
More informationDiscriminative training of GMM-HMM acoustic model by RPCL type Bayesian Ying-Yang harmony learning
Discriminative training of GMM-HMM acoustic model by RPCL type Bayesian Ying-Yang harmony learning Zaihu Pang 1, Xihong Wu 1, and Lei Xu 1,2 1 Speech and Hearing Research Center, Key Laboratory of Machine
More informationUNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013
UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 Exam policy: This exam allows two one-page, two-sided cheat sheets; No other materials. Time: 2 hours. Be sure to write your name and
More informationComparison of Fast ICA and Gradient Algorithms of Independent Component Analysis for Separation of Speech Signals
K. Mohanaprasad et.al / International Journal of Engineering and echnolog (IJE) Comparison of Fast ICA and Gradient Algorithms of Independent Component Analsis for Separation of Speech Signals K. Mohanaprasad
More informationMAP adaptation with SphinxTrain
MAP adaptation with SphinxTrain David Huggins-Daines dhuggins@cs.cmu.edu Language Technologies Institute Carnegie Mellon University MAP adaptation with SphinxTrain p.1/12 Theory of MAP adaptation Standard
More informationEstimating Correlation Coefficient Between Two Complex Signals Without Phase Observation
Estimating Correlation Coefficient Between Two Complex Signals Without Phase Observation Shigeki Miyabe 1B, Notubaka Ono 2, and Shoji Makino 1 1 University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki
More informationCSC2535: Computation in Neural Networks Lecture 7: Variational Bayesian Learning & Model Selection
CSC2535: Computation in Neural Networks Lecture 7: Variational Bayesian Learning & Model Selection (non-examinable material) Matthew J. Beal February 27, 2004 www.variational-bayes.org Bayesian Model Selection
More informationPattern Classification
Pattern Classification Introduction Parametric classifiers Semi-parametric classifiers Dimensionality reduction Significance testing 6345 Automatic Speech Recognition Semi-Parametric Classifiers 1 Semi-Parametric
More informationComparing linear and non-linear transformation of speech
Comparing linear and non-linear transformation of speech Larbi Mesbahi, Vincent Barreaud and Olivier Boeffard IRISA / ENSSAT - University of Rennes 1 6, rue de Kerampont, Lannion, France {lmesbahi, vincent.barreaud,
FuncICA for time series pattern discovery
FuncICA for time series pattern discovery Nishant Mehta and Alexander Gray Georgia Institute of Technology The problem Given a set of inherently continuous time series (e.g. EEG) Find a set of patterns
Blind Machine Separation Te-Won Lee
Blind Machine Separation Te-Won Lee University of California, San Diego Institute for Neural Computation Blind Machine Separation Problem we want to solve: Single microphone blind source separation & deconvolution
Fully Bayesian Deep Gaussian Processes for Uncertainty Quantification
Fully Bayesian Deep Gaussian Processes for Uncertainty Quantification N. Zabaras 1 S. Atkinson 1 Center for Informatics and Computational Science Department of Aerospace and Mechanical Engineering University
Bayesian Hidden Markov Models and Extensions
Bayesian Hidden Markov Models and Extensions Zoubin Ghahramani Department of Engineering University of Cambridge joint work with Matt Beal, Jurgen van Gael, Yunus Saatci, Tom Stepleton, Yee Whye Teh Modeling
Bayesian ensemble learning of generative models
Chapter Bayesian ensemble learning of generative models Harri Valpola, Antti Honkela, Juha Karhunen, Tapani Raiko, Xavier Giannakopoulos, Alexander Ilin, Erkki Oja 65 66 Bayesian ensemble learning of generative
Hidden Markov Models in Language Processing
Hidden Markov Models in Language Processing Dustin Hillard Lecture notes courtesy of Prof. Mari Ostendorf Outline Review of Markov models What is an HMM? Examples General idea of hidden variables: implications
Massachusetts Institute of Technology
Massachusetts Institute of Technology 6.867 Machine Learning, Fall 2006 Problem Set 5 Due Date: Thursday, Nov 30, 12:00 noon You may submit your solutions in class or in the box. 1. Wilhelm and Klaus are
Variable selection and feature construction using methods related to information theory
Outline Variable selection and feature construction using methods related to information theory Kari 1 1 Intelligent Systems Lab, Motorola, Tempe, AZ IJCNN 2007 Outline Outline 1 Information Theory and
Independent Component Analysis. PhD Seminar Jörgen Ungh
Independent Component Analysis PhD Seminar Jörgen Ungh Agenda Background a motivater Independence ICA vs. PCA Gaussian data ICA theory Examples Background & motivation The cocktail party problem Bla bla
Expectation Maximization
Expectation Maximization Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr 1 /
HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing Spring 2007
MIT OpenCourseWare http://ocw.mit.edu HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing Spring 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Brief Introduction of Machine Learning Techniques for Content Analysis
1 Brief Introduction of Machine Learning Techniques for Content Analysis Wei-Ta Chu 2008/11/20 Outline 2 Overview Gaussian Mixture Model (GMM) Hidden Markov Model (HMM) Support Vector Machine (SVM) Overview
Master 2 Informatique Probabilistic Learning and Data Analysis
Master 2 Informatique Probabilistic Learning and Data Analysis Faicel Chamroukhi Maître de Conférences USTV, LSIS UMR CNRS 7296 email: chamroukhi@univ-tln.fr web: chamroukhi.univ-tln.fr 2013/2014 Faicel
Information Theory in Computer Vision and Pattern Recognition
Francisco Escolano Pablo Suau Boyan Bonev Information Theory in Computer Vision and Pattern Recognition Foreword by Alan Yuille. Springer. Contents 1 Introduction
13: Variational inference II
10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational
Eigenvoice Speaker Adaptation via Composite Kernel PCA
Eigenvoice Speaker Adaptation via Composite Kernel PCA James T. Kwok, Brian Mak and Simon Ho Department of Computer Science Hong Kong University of Science and Technology Clear Water Bay, Hong Kong [jamesk,mak,csho]@cs.ust.hk
An Introduction to Independent Components Analysis (ICA)
An Introduction to Independent Components Analysis (ICA) Anish R. Shah, CFA Northfield Information Services Anish@northinfo.com Newport Jun 6, 2008 1 Overview of Talk Review principal components Introduce
Temporal Modeling and Basic Speech Recognition
UNIVERSITY OF ILLINOIS @ URBANA-CHAMPAIGN CS 498PS Audio Computing Lab Temporal Modeling and Basic Speech Recognition Paris Smaragdis paris@illinois.edu paris.cs.illinois.edu Today's lecture Recognizing
Machine Learning. CUNY Graduate Center, Spring 2013. Lectures 11-12: Unsupervised Learning 1. Professor Liang Huang.
Machine Learning CUNY Graduate Center, Spring 2013 Lectures 11-12: Unsupervised Learning 1 (Clustering: k-means, EM, mixture models) Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning
Boundary Contraction Training for Acoustic Models based on Discrete Deep Neural Networks
INTERSPEECH 2014 Boundary Contraction Training for Acoustic Models based on Discrete Deep Neural Networks Ryu Takeda, Naoyuki Kanda, and Nobuo Nukaga Central Research Laboratory, Hitachi Ltd., 1-280, Kokubunji-shi,
VECTOR-QUANTIZATION BY DENSITY MATCHING IN THE MINIMUM KULLBACK-LEIBLER DIVERGENCE SENSE
VECTOR-QUANTIZATION BY DENSITY MATCHING IN THE MINIMUM KULLBACK-LEIBLER DIVERGENCE SENSE Anant Hegde, Deniz Erdogmus, Tue Lehn-Schioler 2, Yadunandana N. Rao, Jose C. Principe CNEL, Electrical & Computer Engineering
Overview of Statistical Tools. Statistical Inference. Bayesian Framework. Modeling. Very simple case. Things are usually more complicated
Fall 3 Computer Vision Overview of Statistical Tools Statistical Inference Haibin Ling Observation inference Decision Prior knowledge http://www.dabi.temple.edu/~hbling/teaching/3f_5543/index.html Bayesian
Variational Principal Components
Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings
An Evolutionary Programming Based Algorithm for HMM training
An Evolutionary Programming Based Algorithm for HMM training Ewa Figielska, Wlodzimierz Kasprzak Institute of Control and Computation Engineering, Warsaw University of Technology ul. Nowowiejska 15/19,
Experiments with a Gaussian Merging-Splitting Algorithm for HMM Training for Speech Recognition
Experiments with a Gaussian Merging-Splitting Algorithm for HMM Training for Speech Recognition ABSTRACT It is well known that the expectation-maximization (EM) algorithm, commonly used to estimate hidden
Independent Component Analysis
1 Introduction Indepent
Speaker recognition by means of Deep Belief Networks
Speaker recognition by means of Deep Belief Networks Vasileios Vasilakakis, Sandro Cumani, Pietro Laface, Politecnico di Torino, Italy {first.lastname}@polito.it 1. Abstract Most state of the art speaker
ALGONQUIN - Learning dynamic noise models from noisy speech for robust speech recognition
ALGONQUIN - Learning dynamic noise models from noisy speech for robust speech recognition Brendan J. Frey 1, Trausti T. Kristjansson 1, Li Deng 2, Alex Acero 2 1 Probabilistic and Statistical Inference Group,
Session Variability Compensation in Automatic Speaker Recognition
Session Variability Compensation in Automatic Speaker Recognition Javier González Domínguez VII Jornadas MAVIR Universidad Autónoma de Madrid November 2012 Outline 1. The Inter-session Variability Problem
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics rsalakhu@utstat.toronto.edu http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate
Computer Vision Group Prof. Daniel Cremers. 6. Mixture Models and Expectation-Maximization
Prof. Daniel Cremers 6. Mixture Models and Expectation-Maximization Motivation Often the introduction of latent (unobserved) random variables into a model can help to express complex (marginal) distributions
Statistical Pattern Recognition
Statistical Pattern Recognition Expectation Maximization (EM) and Mixture Models Hamid R. Rabiee Jafar Muhammadi, Mohammad J. Hosseini Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2 Agenda Expectation-maximization
A Convex Cauchy-Schwarz Divergence Measure for Blind Source Separation
A Convex Cauchy-Schwarz Divergence Measure for Blind Source Separation Zaid Albataineh and Fathi M. Salem Abstract We propose a new class of divergence measures for Independent Component Analysis (ICA for
STA 414/2104: Machine Learning
STA 414/2104: Machine Learning Russ Salakhutdinov Department of Computer Science, Department of Statistics rsalakhu@cs.toronto.edu http://www.cs.toronto.edu/~rsalakhu/ Lecture 9 Sequential Data So far
Probabilistic Reasoning in Deep Learning
Probabilistic Reasoning in Deep Learning Dr Konstantina Palla, PhD palla@stats.ox.ac.uk September 2017 Deep Learning Indaba, Johannesburg Konstantina Palla 1 / 39 OVERVIEW OF THE TALK Basics of Bayesian
Single-channel source separation using non-negative matrix factorization
Single-channel source separation using non-negative matrix factorization Mikkel N. Schmidt Technical University of Denmark mns@imm.dtu.dk www.mikkelschmidt.dk DTU Informatics Department of Informatics
Independent Component Analysis
A Short Introduction to Independent Component Analysis Aapo Hyvärinen Helsinki Institute for Information Technology and Depts of Computer Science and Psychology University of Helsinki Problem of blind
Jorge Silva and Shrikanth Narayanan, Senior Member, IEEE. 1 is the probability measure induced by the probability density function
890 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 3, MAY 2006 Average Divergence Distance as a Statistical Discrimination Measure for Hidden Markov Models Jorge Silva and Shrikanth
Bayesian Analysis of Speaker Diarization with Eigenvoice Priors
Bayesian Analysis of Speaker Diarization with Eigenvoice Priors Patrick Kenny Centre de recherche informatique de Montréal Patrick.Kenny@crim.ca A year in the lab can save you a day in the library. Panu
Gaussian with mean (µ) and standard deviation (σ)
Slide from Pieter Abbeel Gaussian with mean (µ) and standard deviation (σ) 10/6/16 CSE-571: Robotics X ~ N(µ, σ²), Y = aX + b ⇒ Y ~ N(aµ + b, a²σ²)
Predictive information in Gaussian processes with application to music analysis
Predictive information in Gaussian processes with application to music analysis Samer Abdallah 1 and Mark Plumbley 2 1 University College London 2 Queen Mary University of London Abstract. We describe
Unsupervised learning: beyond simple clustering and PCA
Unsupervised learning: beyond simple clustering and PCA Liza Rebrova Self organizing maps (SOM) Goal: approximate data points in R p by a low-dimensional manifold Unlike PCA, the manifold does not have
Lecture 3: Pattern Classification
EE E6820: Speech & Audio Processing & Recognition Lecture 3: Pattern Classification 1 2 3 4 5 The problem of classification Linear and nonlinear classifiers Probabilistic classification Gaussians, mixtures