GMM Parameter Estimation. Xiaoye Lu, CMPS290c Final Project

GMM Introduction

Gaussian Mixture Model: a combination of several Gaussian components.

Notation: for each Gaussian component $i$, $\mu_i$ is the mean and $\Sigma_i$ the covariance matrix, with density

$$p_i(x) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_i|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x-\mu_i)^T \Sigma_i^{-1} (x-\mu_i)\right).$$

A GMM with $m$ mixtures (components) has parameters $\theta = (\omega_1, \ldots, \omega_m,\ \mu_1, \ldots, \mu_m,\ \Sigma_1, \ldots, \Sigma_m)$ and density

$$p(x \mid \theta) = \sum_{i=1}^{m} \omega_i\, p_i(x), \qquad \sum_{i=1}^{m} \omega_i = 1.$$
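
As a concrete reading of this notation, here is a minimal sketch of evaluating the component and mixture densities. Python/NumPy is assumed, and all names are illustrative, not taken from the project code:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    # Density of a d-dimensional Gaussian N(mu, sigma) at point x.
    d = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(sigma, diff)) / norm)

def gmm_pdf(x, weights, means, covs):
    # Mixture density p(x | theta) = sum_i w_i * p_i(x).
    return sum(w * gaussian_pdf(x, mu, s)
               for w, mu, s in zip(weights, means, covs))
```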

EM Algorithm

EM is often used in GMM parameter estimation.

[Figure: example mixture densities estimated by EM.]

EM Algorithm (Bad Initializations)

EM Algorithm

EM is an algorithm that maximizes the likelihood by fixing the other parameters in each iteration, like K-Means.
It is efficient when given good initial settings, but it only finds a local optimum.
There is no good online version for GMM estimation.

EM Updating Equations

Let $p(i \mid x_t)$ be the posterior probability of component $i$ for sample $x_t$, and $N$ the number of samples.

Updating weights: $\displaystyle \omega_i = \frac{1}{N} \sum_{t=1}^{N} p(i \mid x_t)$

Updating means: $\displaystyle \mu_i = \frac{\sum_{t=1}^{N} p(i \mid x_t)\, x_t}{\sum_{t=1}^{N} p(i \mid x_t)}$

Updating covariances: $\displaystyle \Sigma_i = \frac{\sum_{t=1}^{N} p(i \mid x_t)\,(x_t - \mu_i)(x_t - \mu_i)^T}{\sum_{t=1}^{N} p(i \mid x_t)}$
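
A minimal sketch of one batch EM iteration for the 1-D case used in the experiments below (Python/NumPy assumed; variable names are illustrative):

```python
import numpy as np

def em_step(x, weights, means, variances):
    # E-step: responsibilities p(i | x_t) for each sample, shape (N, m).
    dens = np.exp(-0.5 * (x[:, None] - means) ** 2 / variances) \
           / np.sqrt(2 * np.pi * variances)
    resp = weights * dens
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: closed-form re-estimates from the responsibilities.
    n_i = resp.sum(axis=0)                      # effective counts per component
    new_weights = n_i / len(x)
    new_means = (resp * x[:, None]).sum(axis=0) / n_i
    new_vars = (resp * (x[:, None] - new_means) ** 2).sum(axis=0) / n_i
    return new_weights, new_means, new_vars
```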

KM Framework

A framework for online learning algorithms: find the update that trades off the divergence in the parameter domain against the divergence in the labelling domain (a divergence term, a loss term, and a trade-off factor $\eta$).

To minimize:

$$U(\theta_{t+1}) = d(\theta_{t+1}, \theta_t) + \eta\, \mathrm{Loss}(\theta_{t+1})$$

Implicit vs. Explicit

It is too hard to solve the implicit updating equation:

$$\nabla U(\theta_{t+1}) = \nabla d(\theta_{t+1}, \theta_t) + \eta\, \nabla \mathrm{Loss}(\theta_{t+1}) = 0$$

Explicit equation (loss gradient evaluated at the current parameters):

$$\nabla U(\theta_{t+1}) = \nabla d(\theta_{t+1}, \theta_t) + \eta\, \nabla \mathrm{Loss}(\theta_t) = 0$$
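
To illustrate the difference, a sketch with the squared Euclidean distance standing in for the divergence (an illustrative choice; the slides use the joint entropy instead): the explicit update is plain gradient descent, while the implicit one must be solved for the new parameters, e.g. by fixed-point iteration.

```python
import numpy as np

def explicit_update(theta, grad_loss, eta):
    # Loss gradient evaluated at the OLD parameters:
    # theta' = theta - eta * grad_loss(theta).
    return theta - eta * grad_loss(theta)

def implicit_update(theta, grad_loss, eta, n_iter=50):
    # Loss gradient evaluated at the NEW parameters:
    # theta' = theta - eta * grad_loss(theta'), solved here by
    # naive fixed-point iteration.
    theta_new = theta.copy()
    for _ in range(n_iter):
        theta_new = theta - eta * grad_loss(theta_new)
    return theta_new
```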

GMM Joint Entropy Updating

Using the joint entropy (the relative entropy between the joint distributions over component and data) as the divergence:

$$d(\hat\theta, \theta) = \sum_i \int \hat p(i, x)\, \ln \frac{\hat p(i, x)}{p(i, x)}\, dx$$

Using the negative log likelihood as the loss function:

$$\mathrm{Loss}(\theta) = -\ln p(x_t \mid \theta) = -\ln\!\left( \sum_i \omega_i\, p_i(x_t) \right)$$
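
The loss and its gradient with respect to the weights have simple closed forms; a 1-D sketch (Python/NumPy assumed, names illustrative):

```python
import numpy as np

def component_dens(x, means, variances):
    # p_i(x) for every component of a 1-D mixture.
    return np.exp(-0.5 * (x - means) ** 2 / variances) \
           / np.sqrt(2 * np.pi * variances)

def nll_loss(x, weights, means, variances):
    # Loss(theta) = -ln sum_i w_i * p_i(x).
    return -np.log(np.dot(weights, component_dens(x, means, variances)))

def nll_grad_weights(x, weights, means, variances):
    # d Loss / d w_i = -p_i(x) / p(x).
    dens = component_dens(x, means, variances)
    return -dens / np.dot(weights, dens)
```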

GMM JE Updating

Lagrange method to get the minimum with the constraint $\sum_i \omega_i = 1$:

$$V = d(\hat\theta, \theta) + \eta\, \mathrm{Loss} + \lambda \left( \sum_i \omega_i - 1 \right)$$

Set the derivatives to zero:

$$\frac{\partial V}{\partial \omega_i} = 0, \qquad \frac{\partial V}{\partial \mu_i} = 0, \qquad \frac{\partial V}{\partial \Sigma_i} = 0.$$

GMM JE Updating

To avoid future derivatives in the loss function, JE uses a Taylor expansion:

$$\mathrm{Loss}(\theta_{t+1}) \approx \mathrm{Loss}(\theta_t) + (\theta_{t+1} - \theta_t)^T\, \nabla \mathrm{Loss}(\theta_t)$$

GMM JE Updating

Derivatives of the divergence: with the joint entropy divergence, the closed form

$$d(\hat\theta, \theta) = \sum_i \hat\omega_i \left[ \ln\frac{\hat\omega_i}{\omega_i} + \frac{1}{2}\left( \ln\frac{|\Sigma_i|}{|\hat\Sigma_i|} + \mathrm{tr}\!\left(\Sigma_i^{-1}\hat\Sigma_i\right) + (\hat\mu_i - \mu_i)^T \Sigma_i^{-1} (\hat\mu_i - \mu_i) - d \right) \right]$$

is differentiated term by term.

Derivatives of the loss function (after the Taylor expansion), where

$$\alpha_i = \frac{p_i(x_t)}{\sum_j \omega_j\, p_j(x_t)}, \qquad \beta_i = \eta\, \alpha_i, \qquad \frac{\partial\, \mathrm{Loss}}{\partial \omega_i} = -\alpha_i.$$

Updating Equations

On-line version:

$$\omega_i^{t+1} = \frac{\omega_i\, e^{\beta_i}}{Z}, \qquad Z = \sum_j \omega_j\, e^{\beta_j},$$

$$\mu_i^{t+1} = \mu_i + \beta_i\, (x_t - \mu_i), \qquad \Sigma_i^{t+1} = \Sigma_i + \beta_i \left( (x_t - \mu_i)(x_t - \mu_i)^T - \Sigma_i \right).$$

Batch version: the same multiplicative and additive forms, with the per-sample factors accumulated over the whole data set before the update.
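
The weight update is the normalized exponentiated (multiplicative) update; a sketch following the forms above, with the per-component factors beta passed in since they come from the derivation (Python/NumPy assumed, names illustrative):

```python
import numpy as np

def online_weight_update(weights, beta):
    # w_i <- w_i * exp(beta_i) / Z, with Z = sum_j w_j * exp(beta_j).
    w = weights * np.exp(beta)
    return w / w.sum()

def online_mean_update(means, beta, x):
    # mu_i <- mu_i + beta_i * (x - mu_i): a small step toward the sample.
    return means + beta * (x - means)
```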

Implicit Updating

Using future derivatives for the loss function: the new parameters appear on both sides, e.g.

$$\omega_i^{t+1} = \frac{\omega_i\, e^{\beta_i(\theta_{t+1})}}{Z},$$

so the updating equations must be solved rather than simply evaluated.

Weights Update

Let $\xi_i$ denote the implicit update factor. The weights update is

$$\omega_i^{t+1} = \frac{\omega_i\, e^{\xi_i}}{Z}.$$

Search for the $\xi$ minimizing the $U$ function. The $U$ function is convex (the log is concave), so binary search can be used, which is fast.
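
Since $U$ is convex in the search variable, its derivative is nondecreasing, so the minimizer can be bracketed and found by binary search on the sign of the derivative. A generic sketch (the bracketing interval is assumed given):

```python
def minimize_convex(d_u, lo, hi, tol=1e-10):
    # Binary search for the root of the nondecreasing derivative d_u
    # of a convex function on [lo, hi]; that root is the minimizer.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_u(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

For example, minimize_convex(lambda z: 2 * z - 1.0, 0.0, 1.0) returns 0.5, the minimizer of z^2 - z on [0, 1].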

Mean and Covariance

For a quadratic equation in the update variable,

$$A\,\xi^2 + B\,\xi + Q = 0,$$

use the root(s)

$$\xi = \frac{-B \pm \sqrt{B^T B - 4AQ}}{2A},$$

and check the validity of each root (for example, that the resulting covariance stays positive definite).
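
A sketch of the scalar case, returning only real roots so the caller can do the validity check (e.g. reject a root that makes a variance negative):

```python
import math

def quadratic_roots(a, b, q):
    # Real roots of a*xi^2 + b*xi + q = 0; empty list if none exist.
    disc = b * b - 4 * a * q
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return [(-b + r) / (2 * a), (-b - r) / (2 * a)]
```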

Learning Rate

For online: the learning rate should get closer to 0 as the number of data points increases. Here $\eta(t)$ is held at $\eta(0)$ for the first 100 points ($t < 100$) and then decayed toward 0 ($t > 100$).

For batch: the learning rate is fixed.

For EM, we don't have a learning rate to adjust.
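
A sketch of such a schedule: the constant-then-decay shape follows the slide, while the particular 1/t decay after the first 100 points is an assumption for illustration, not taken from the project.

```python
def learning_rate(t, eta0=0.05):
    # Hold eta0 for the first 100 samples, then decay toward 0.
    # The 1/t decay rate is assumed, not taken from the project.
    return eta0 if t < 100 else eta0 * 100.0 / t
```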

Online Version Results I

Experiments I

Toy data: 2 mixtures (1-dimensional).
Weights: [0.5, 0.5]; Means: [-1, 1]; Covariances: [0.5, 0.5].
Initial values: W_Init = [0.2; 0.8]; Mu_Init = [-0.5; 0.5]; C_Init = [1/3; 1/3]; eta = 0.05.

EM Localization

[Figure: final approximation vs. histogram of 5000 data points, with the true distribution and the densities obtained from each initial-value setting.]

Setting 1: W_Init = [0.1; 0.9]; Mu_Init = [-0.1; 0.1]; C_Init = [0.1; 0.1].
Setting 2: W_Init = [0.2; 0.8]; Mu_Init = [-0.5; 0.5]; C_Init = [0.4; 0.4].
Setting 3: W_Init = [0.4; 0.6]; Mu_Init = [-0.9; 0.9]; C_Init = [0.4; 0.4].

Batch MyTrial

[Figure: Batch MyTrial, objective value (x 10^4) over 20 iterations for eta = 1.0, 1.05, 1.1, 1.5, 2, and 3.]

Batch JE

[Figure: Batch JE, objective value over 10 iterations for eta = 1.0, 1.05, 1.1, 1.5, 2, and 3.]

Failed 3 times, because the covariance became negative sometimes.

Batch JE vs. MyTrial

[Figures: density after 50 iterations and density after 3 iterations, showing the histogram of 5000 data points, the true density, and the densities obtained by batch JE and batch MyTrial.]

On-Line JE vs. MyTrial

[Figures: JE and MyTrial densities after 100 and after 300 data points, the initial setting, and the final approximation against the 5000-data histogram and true density.]

What I Learned

MyTrial is much more stable than JE, since JE will sometimes generate non-positive-definite covariance matrices.
JE and MyTrial depend less on the initial setting than EM does.