Math 128b Project. Jude Yuen


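1. Introduction

Let $\{Z_t\}$ be a sequence of observed independent vector variables. If the elements of $Z_t$ have a joint normal distribution, then $\{Z_t\}$ has a mean vector $\mu_Z$ and a variance-covariance matrix $\Sigma_Z$. Geometrically, this means that the $Z_t$'s cluster around the coordinate $\mu_Z$, and the cluster is denser closer to $\mu_Z$. How the density changes around $\mu_Z$ is governed by the variance-covariance matrix $\Sigma_Z$. To estimate $\mu_Z$ and $\Sigma_Z$, a common practice in Statistics/Econometrics is simply to maximize the likelihood function

$$\max_{\theta} F(S;\theta) = \prod_{t=1}^{T} f(Z_t;\mu_Z,\Sigma_Z) \qquad (1)$$

where

$$f(Z_t;\mu_Z,\Sigma_Z) = (2\pi)^{-n/2}\,|\Sigma_Z|^{-1/2}\exp[-0.5\,(Z_t-\mu_Z)'\,\Sigma_Z^{-1}\,(Z_t-\mu_Z)],$$

$S = \{Z_1, Z_2, \ldots, Z_T\}$, $\theta = (\mu_Z, \Sigma_Z)$, and $n$ is the dimension of $Z_t$.

Now suppose $\{y_t\}$ is a mixture of two sequences $\{Z_{1t}\}$ and $\{Z_{2t}\}$, each with a different mean vector ($\mu_1$, $\mu_2$) and possibly a different variance-covariance matrix ($\Sigma_1$, $\Sigma_2$) respectively. That is, some of the $y_t$'s cluster around $\mu_1$ and the rest around $\mu_2$, and the density may change differently around $\mu_1$ and $\mu_2$ because $\Sigma_1$ and $\Sigma_2$ may not be the same. If we can separate $\{y_t\}$ into $\{Z_{1t}\}$ and $\{Z_{2t}\}$, then $(\mu_1, \Sigma_1)$ and $(\mu_2, \Sigma_2)$ can be estimated separately as in (1). The challenge comes when we try to estimate $(\mu_1, \Sigma_1)$ and $(\mu_2, \Sigma_2)$ but a priori do not know which $y_t$ belongs to which cluster. Then we need to find some way to separate $\{Z_{1t}\}$ and $\{Z_{2t}\}$.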
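As a small illustration of (1), the maximizer for a single cluster has the familiar closed form: the sample mean and the covariance matrix normalized by $T$. The sketch below assumes 2-dimensional data so that the matrix inverse can be written out by hand; the function names and the simulated sample are illustrative and do not come from the paper.

```python
import math
import random

def mle_mean_cov(points):
    """Closed-form maximizer of (1): sample mean and covariance divided by T."""
    T = len(points)
    n = len(points[0])
    mean = [sum(p[j] for p in points) / T for j in range(n)]
    cov = [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in points) / T
            for b in range(n)] for a in range(n)]
    return mean, cov

def log_likelihood_2d(points, mean, cov):
    """log F(S; theta) for 2-dimensional data (assumption: n = 2)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    total = 0.0
    for x, y in points:
        dx, dy = x - mean[0], y - mean[1]
        # Quadratic form (z - mu)' Sigma^{-1} (z - mu) with the explicit 2x2 inverse.
        quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
        total += -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad
    return total

# Hypothetical simulated sample: one cluster centred at (1, -2).
random.seed(0)
sample = [(random.gauss(1.0, 1.0), random.gauss(-2.0, 0.5)) for _ in range(500)]
mean_hat, cov_hat = mle_mean_cov(sample)
best = log_likelihood_2d(sample, mean_hat, cov_hat)
shifted = log_likelihood_2d(sample, [mean_hat[0] + 0.5, mean_hat[1]], cov_hat)
```

Moving the mean away from `mean_hat` can only lower the log-likelihood, which is what the comparison between `best` and `shifted` checks.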

Studying clustered data is an important field in many disciplines. The mean vector and variance-covariance matrix of a given observed data set convey a lot of information about the observed data, and this information can be invaluable to business and economic study. For example, if $Z_t$ is a 2-vector whose elements are the annual income and the annual consumption of an individual, and if $\{Z_t\}$ clusters in different places, then by knowing the mean vector and the variance-covariance matrix of each cluster, economists may be able to better predict what will happen and explain what is happening in an economy. Business executives may likewise be able to better target the individuals in each cluster with their products.

The most common method used in estimating the mean vectors and the variance-covariance matrices of normally distributed clustered data is the mixture likelihood model method. Prof. W. Kahan also suggests a way (the separation method) to separate the clusters and evaluate the mean vector and variance-covariance matrix of each cluster. This paper proposes a simple algorithm to estimate the mean vectors and the variance-covariance matrices of a data set with normally distributed clusters. In comparing these methods, the proposed method seems to be more reliable, and its speed is comparable to that of the separation method. The mixture likelihood method is the slowest but is fairly accurate.

This paper is organized as follows. The next section summarizes the existing methods for estimating the mean vectors and the variance-covariance matrices of normally distributed data consisting of clusters. The proposed method is outlined in Section 3. Section 4 compares the estimates from the separation method, the proposed method and the mixture likelihood model method for some simulated data. Section 5 concludes.

2. The Existing Methods

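The common method used in estimating $(\mu_1, \Sigma_1)$ and $(\mu_2, \Sigma_2)$ in Statistics/Econometrics when $\{Z_{1t}\}$ and $\{Z_{2t}\}$ are both normally distributed is the mixture likelihood model approach (McLachlan and Basford 1988). The idea behind the mixture likelihood model approach is that a fixed percentage ($w_i$) of the $y_t$'s belongs to cluster $i$. The mixture likelihood model approach maximizes the likelihood function

$$\max_{\theta} \prod_{t=1}^{T}\sum_{i=1}^{2} p_{it}\, f(y_t;\mu_i,\Sigma_i) \qquad (2)$$

where

$$f(y_t;\mu_i,\Sigma_i) = (2\pi)^{-n/2}\,|\Sigma_i|^{-1/2}\exp[-0.5\,(y_t-\mu_i)'\,\Sigma_i^{-1}\,(y_t-\mu_i)],$$

$\theta = (w_i, \mu_i, \Sigma_i)$,

$$p_{it} = \frac{w_i\,|\Sigma_i|^{-1/2}\exp[-0.5\,(y_t-\mu_i)'\,\Sigma_i^{-1}\,(y_t-\mu_i)]}{\sum_{j=1}^{2} w_j\,|\Sigma_j|^{-1/2}\exp[-0.5\,(y_t-\mu_j)'\,\Sigma_j^{-1}\,(y_t-\mu_j)]} \qquad (3)$$

is the posterior Bayesian probability that $y_t$ belongs to cluster $i$,

$$w_i = \frac{1}{T}\sum_{t=1}^{T} p_{it} \qquad (4)$$

is the percentage of data belonging to cluster $i$, $n$ is the dimension of $Z_t$, $i = 1, 2$, and $t = 1, 2, \ldots, T$.

The algorithm to carry out the maximization of (2):

I. Pick the initial values for $w_i$, $\mu_i$ and $\Sigma_i$ for $i = 1, 2$.
II. Substitute the initial values into (3) to get the $p_{it}$'s.
III. Substitute the $p_{it}$'s into (2) to get the new estimates of $\mu_i$ and $\Sigma_i$, and into (4) to get the new estimates of $w_i$, for $i = 1, 2$.
IV. Stop if $p_{it}$, $\mu_i$ and $\Sigma_i$ converge; otherwise repeat step (II).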
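Steps I to IV above can be sketched as follows. This is a minimal illustration rather than the code used in the experiments: it assumes 2-dimensional data (so the inverse is explicit), uses the standard weighted mean and covariance updates for step III, and stops when the means stop moving, which is a simplification of step IV. All function names and the simulated data are hypothetical.

```python
import math
import random

def density_2d(p, mean, cov):
    """Bivariate normal density f(y; mu, Sigma) with an explicit 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def weighted_mean_cov(points, weights):
    """Mean and covariance with each point weighted by its posterior p_it."""
    total = sum(weights)
    mean = [sum(w * p[j] for w, p in zip(weights, points)) / total
            for j in range(2)]
    cov = [[sum(w * (p[a] - mean[a]) * (p[b] - mean[b])
                for w, p in zip(weights, points)) / total
            for b in range(2)] for a in range(2)]
    return mean, cov

def em_two_clusters(points, means, covs, w=(0.5, 0.5), max_iter=200, tol=1e-8):
    w = list(w)
    for iteration in range(1, max_iter + 1):
        # Step II (expectation): posterior probabilities p_it as in (3).
        post = []
        for p in points:
            num = [w[i] * density_2d(p, means[i], covs[i]) for i in range(2)]
            s = num[0] + num[1]
            post.append([num[0] / s, num[1] / s])
        # Step III (maximization): new mu_i, Sigma_i, and w_i as in (4).
        fits = [weighted_mean_cov(points, [q[i] for q in post]) for i in range(2)]
        w = [sum(q[i] for q in post) / len(points) for i in range(2)]
        shift = max(abs(fits[i][0][j] - means[i][j])
                    for i in range(2) for j in range(2))
        means = [fits[0][0], fits[1][0]]
        covs = [fits[0][1], fits[1][1]]
        # Step IV (simplified): stop once the means no longer move.
        if shift < tol:
            break
    return w, means, covs, iteration

# Hypothetical data: two well-separated clusters of 200 points each.
random.seed(1)
data = ([(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)] +
        [(random.gauss(6, 1), random.gauss(6, 1)) for _ in range(200)])
identity = [[1.0, 0.0], [0.0, 1.0]]
w_hat, mu_hat, sigma_hat, n_iter = em_two_clusters(
    data, [[1.0, 1.0], [5.0, 5.0]], [identity, identity])
```

Every point contributes to both clusters in proportion to its posterior weight, which is the main difference from a hard assignment of each point to one cluster.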

This algorithm is tantamount to weighing each $y_t$ to see how much of $y_t$ belongs to each cluster. If $y_t$ has more weight toward cluster $i$, then $y_t$ contributes more to the calculation of $\mu_i$ and $\Sigma_i$ in the iteration. This is also called the EM (Expectation & Maximization) algorithm.

Expectation: $p_{it}^{k+1}$, the posterior Bayesian probability that $y_t$ belongs to cluster $i$ given $w_i^{k}$, $\mu_i^{k}$ and $\Sigma_i^{k}$.
Maximization: maximizing (2) given $p_{it}^{k+1}$.

Another closely related method, often used in time series Econometrics to estimate the statistics of interest [$(\mu_1, \Sigma_1)$ and $(\mu_2, \Sigma_2)$ in our case], is the Markov Regime Switching model method (Hamilton 1989). The idea behind the Markov Regime Switching model method is that there are $n$ ($n = 2$ in our case) states of the world. When a particular state of the world $i$ is realized in a given period, there is a fixed probability $q_{ij}$ that state of the world $j$ will be realized in the next period, for $i, j = 1, \ldots, n$, with $q_{ij} \ge 0$ and $\sum_{j=1}^{n} q_{ij} = 1$. The Markov Regime Switching model method is basically the same as the mixture likelihood model method, but it also imposes the assumption that a first order Markov process governs the probability that a particular vector in the sequence belongs to a particular cluster, that is, to a state of the world. The numerical method for the Markov Regime Switching model is costly and complicated, and convergence is not guaranteed even for nicely distributed, well-separated cluster data. Since the method is quite complicated, we will not get into the details of the Markov Regime Switching model here.

Another drawback of the Markov Regime Switching model is that it assumes that changes in the state of the world follow a Markov process. If the realization of a particular state of the world does not follow a Markov process, then imposing the Markov process will constrain the data and the other statistical estimates to fit it, which in turn may bias all the statistical estimates. The numerical method for the mixture likelihood model is less complicated and less costly, but convergence is also not guaranteed.

The separation method is quite straightforward. Given two clusters of data in a Euclidean space, the separation method finds a hyperplane that separates the data into two clusters; the hyperplane passes through the point (region) where the density is lowest in between the clusters. To locate the point (region) with the lowest density, the normal vector of the hyperplane is first found by minimizing and then maximizing the following:

$$\max_{n \in R^n}\ \min_{c}\ \frac{\| n'(Y - cu) \|^2}{n'n}$$

where $n$ is the normal of the hyperplane, $Y := [y_1\ y_2\ \ldots\ y_T]$ and $u := [1\ 1\ \ldots\ 1]$. The minimizer $c$ turns out to be the mean vector of the whole data set, the center of gravity of all the points in the data set.

After finding the normal vector ($n$) of the hyperplane, place the hyperplane at ($c$), the center of gravity of all the points in the data set, to separate the data into two sets temporarily. Then calculate the mean vector of each set. Finally, the point (region) with the lowest density can be found along the normal in between the two mean vectors just found, and the hyperplane can be placed there to separate the two clusters. Obviously, convergence is not an issue here because no iteration is needed. But if the data overlap too much, there may not be any least dense point along the normal vector ($n$) in between the two mean vectors just described.

3. The Proposed Method

Just like the approach used to estimate the mean vector and variance-covariance matrix in the single-cluster vector sequence $\{Z_t\}$, we estimate $(\mu_1, \Sigma_1)$ and $(\mu_2, \Sigma_2)$ by maximizing the likelihood function, but now with respect to all four parameters:

$$\max_{\theta} F(S;\theta) = \prod_{t=1}^{T_1} f(y_{1t};\mu_1,\Sigma_1)\ \prod_{t=1}^{T_2} f(y_{2t};\mu_2,\Sigma_2) \qquad (5)$$

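where

$$f(y_{it};\mu_i,\Sigma_i) = (2\pi)^{-n/2}\,|\Sigma_i|^{-1/2}\exp[-0.5\,(y_{it}-\mu_i)'\,\Sigma_i^{-1}\,(y_{it}-\mu_i)],$$

$y_{it}$ denotes the $t$-th observation belonging to cluster $i$, $S = \{y_1, y_2, \ldots, y_T\}$, $\theta = (\mu_1, \Sigma_1, \mu_2, \Sigma_2)$, and $n$ is the dimension of $y_t$.

Since $\{Z_{1t}\}$ and $\{Z_{2t}\}$ are independent, maximizing (5) is equivalent to maximizing $\prod_{t=1}^{T_1} f(y_{1t};\mu_1,\Sigma_1)$ and $\prod_{t=1}^{T_2} f(y_{2t};\mu_2,\Sigma_2)$ with respect to $(\mu_1, \Sigma_1)$ and $(\mu_2, \Sigma_2)$ separately, as in (1). The idea of the method proposed here is the same as in (1); the trick is to separate $\{y_t\}$ into $\{Z_{1t}\}$ and $\{Z_{2t}\}$.

The first order conditions, written for the logarithm of $F$ (which has the same maximizer), are, for $i = 1, 2$:

$$\frac{\partial \ln F(S;\theta)}{\partial \mu_i} = \sum_{t=1}^{T_i} \Sigma_i^{-1}(y_{it}-\mu_i) = 0 \ \Rightarrow\ \sum_{t=1}^{T_i} (y_{it}-\mu_i) = 0 \ \Rightarrow\ \sum_{t=1}^{T_i} y_{it} = T_i\,\mu_i \ \Rightarrow\ \mu_i = \frac{1}{T_i}\sum_{t=1}^{T_i} y_{it}$$

$$\frac{\partial \ln F(S;\theta)}{\partial \Sigma_i} = -0.5\,T_i\,\Sigma_i^{-1} + 0.5\,\Sigma_i^{-1}\Big[\sum_{t=1}^{T_i}(y_{it}-\mu_i)(y_{it}-\mu_i)'\Big]\Sigma_i^{-1} = 0 \ \Rightarrow\ \Sigma_i = \frac{1}{T_i}\sum_{t=1}^{T_i}(y_{it}-\mu_i)(y_{it}-\mu_i)'$$

and $T = T_1 + T_2$.

Proposition: If $\mu_1^*$, $\Sigma_1^*$, $\mu_2^*$ and $\Sigma_2^*$ satisfy the above first order conditions, then

$$F(S;\theta^*) = \prod_{t=1}^{T_1} f(y_{1t};\mu_1^*,\Sigma_1^*)\ \prod_{t=1}^{T_2} f(y_{2t};\mu_2^*,\Sigma_2^*)$$

is the global maximum.

Proof: We know that if a set of parameters $\theta^*$ satisfies the first order conditions in a maximization problem with a pseudo-concave objective function over a convex region, then the objective function attains its global maximum at $\theta^*$. To show that the first order conditions are sufficient in the maximization problem above, we therefore only need to show that 1) the objective function is pseudo-concave, and 2) the feasible parameter set(s) are convex.

We first show that the objective function is pseudo-concave. Since

$$f(y_{it};\mu_i,\Sigma_i) = (2\pi)^{-n/2}\,|\Sigma_i|^{-1/2}\exp[-0.5\,(y_{it}-\mu_i)'\,\Sigma_i^{-1}\,(y_{it}-\mu_i)]$$

is the bell-shaped normal density curve, it is pseudo-concave. The objective function is the product of these pseudo-concave functions, so it is pseudo-concave.

Now we show that the feasible parameter set(s) are convex. There is no restriction on the parameters $\mu_i$ and $\Sigma_i$ except that $\mu_i$ is an $n$ by 1 vector and $\Sigma_i$ is an $n$ by $n$ symmetric positive definite matrix. Since $\mu_i \in R^n$ and $R^n$ is convex, and $\Sigma_i$ belongs to the set of all $n$ by $n$ symmetric positive definite matrices, which is also convex, the feasible region is convex. This completes the proof.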

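These first order conditions are quite conventional, and they have just been shown to be the necessary and sufficient conditions for maximizing the likelihood function $F(S;\theta) = \prod_{t=1}^{T_1} f(y_{1t};\mu_1,\Sigma_1)\ \prod_{t=1}^{T_2} f(y_{2t};\mu_2,\Sigma_2)$. Now the problem is to decide which $y_t$ belongs to which cluster. The following algorithm should do the trick of separating the $y_t$'s.

The algorithm

I. Pick the initial values $\mu_1^{0}$, $\Sigma_1^{0}$, $\mu_2^{0}$ and $\Sigma_2^{0}$.
II. Substitute $\mu_i^{k}$ and $\Sigma_i^{k}$ into the following equation to get two sequences of $f_i(y_t) = f(y_t;\mu_i^{k},\Sigma_i^{k})$ values ($i = 1, 2$; and $t = 1, 2, \ldots, T$):
$$f_i(y_t) = f(y_t;\mu_i^{k},\Sigma_i^{k}) = (2\pi)^{-n/2}\,|\Sigma_i^{k}|^{-1/2}\exp[-0.5\,(y_t-\mu_i^{k})'\,(\Sigma_i^{k})^{-1}\,(y_t-\mu_i^{k})]$$
III. Compare the $f_i(y_t)$ values for each $t$. If $f_1(y_t)$ is greater than $f_2(y_t)$, assign $y_t$ to set $S_1$. If $f_2(y_t)$ is greater than $f_1(y_t)$, assign $y_t$ to set $S_2$. If $f_1(y_t)$ equals $f_2(y_t)$, assign $y_t$ to set $S_3$.
IV. Calculate the ratio $r = \mathrm{cardinality}(S_1) / \sum_{i=1}^{2} \mathrm{cardinality}(S_i)$, and randomly assign $\mathrm{ceiling}[\,r \cdot \mathrm{cardinality}(S_3)\,]$ of the $y_t$ in set $S_3$ to set $S_1$ and the rest to set $S_2$.
V. Calculate the mean vector and variance-covariance matrix of set $S_i$ as $\mu_i^{k+1}$ and $\Sigma_i^{k+1}$.
VI. If $\mu_i$ and $\Sigma_i$ converge then stop. Otherwise repeat step (II).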
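The six steps above can be sketched in a few lines. This is an illustrative reading of the algorithm, not the MatLab program used for the experiments. It assumes 2-dimensional data, takes the ratio in step IV to be the share of untied points currently in $S_1$ (falling back to one half when every point is tied, which is an added assumption), and raises an error when one set becomes too small to estimate a covariance matrix. The names `fit_two_clusters`, `mean_cov` and `density_2d` are hypothetical.

```python
import math
import random

def density_2d(p, mean, cov):
    """Bivariate normal density f(y; mu, Sigma) with an explicit 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def mean_cov(points):
    """First order conditions: sample mean and covariance divided by T_i."""
    T = len(points)
    mean = [sum(p[j] for p in points) / T for j in range(2)]
    cov = [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in points) / T
            for b in range(2)] for a in range(2)]
    return mean, cov

def fit_two_clusters(points, means, covs, max_iter=100, tol=1e-10, seed=0):
    rng = random.Random(seed)
    for iteration in range(1, max_iter + 1):
        # Steps II and III: evaluate both densities and assign each point.
        s1, s2, s3 = [], [], []
        for p in points:
            f1 = density_2d(p, means[0], covs[0])
            f2 = density_2d(p, means[1], covs[1])
            if f1 > f2:
                s1.append(p)
            elif f2 > f1:
                s2.append(p)
            else:
                s3.append(p)
        # Step IV: split the tied points at random in proportion r.
        if s3:
            untied = len(s1) + len(s2)
            r = len(s1) / untied if untied else 0.5  # 0.5 is an assumption
            k = math.ceil(r * len(s3))
            rng.shuffle(s3)
            s1.extend(s3[:k])
            s2.extend(s3[k:])
        if len(s1) < 3 or len(s2) < 3:
            raise ValueError("troubled initial values: one set is nearly empty")
        # Step V: re-estimate the parameters of each set.
        fits = [mean_cov(s1), mean_cov(s2)]
        shift = max(
            max(abs(fits[i][0][j] - means[i][j]) for i in range(2) for j in range(2)),
            max(abs(fits[i][1][a][b] - covs[i][a][b])
                for i in range(2) for a in range(2) for b in range(2)))
        means = [fits[0][0], fits[1][0]]
        covs = [fits[0][1], fits[1][1]]
        # Step VI: stop when the means and covariances no longer change.
        if shift < tol:
            break
    return means, covs, (len(s1), len(s2)), iteration

# Hypothetical data: two well-separated clusters of 200 points each.
random.seed(1)
data = ([(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)] +
        [(random.gauss(6, 1), random.gauss(6, 1)) for _ in range(200)])
identity = [[1.0, 0.0], [0.0, 1.0]]
mu_hat, sigma_hat, sizes, n_iter = fit_two_clusters(
    data, [[1.0, 1.0], [5.0, 5.0]], [identity, identity])
```

Because each point is assigned wholly to one set, the estimates stop changing as soon as the assignment repeats, so the loop typically ends after a handful of passes on well-separated data.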

Troubled Initial Values

If some extreme initial values are picked such that the values from $f(y_t;\mu_1,\Sigma_1)$ completely dominate the values from $f(y_t;\mu_2,\Sigma_2)$, or vice versa, then obviously the algorithm cannot go on. This problem can be dealt with easily by choosing the initial mean vectors and variance-covariance matrices near the mean vector and variance-covariance matrix of the whole data set. Still, if $\mu_i$ and $\Sigma_i$ are picked such that

$$\Sigma_i = \alpha\,\frac{(y_t-\mu_i)(y_t-\mu_i)'}{(y_t-\mu_i)'(y_t-\mu_i)} + \varepsilon\left(I - \frac{(y_t-\mu_i)(y_t-\mu_i)'}{(y_t-\mu_i)'(y_t-\mu_i)}\right) \qquad (8)$$

with $\alpha$ very large and $\varepsilon$ tending to zero for any given $y_t$, then $f(y_t;\mu_i,\Sigma_i)$ will tend to infinity and the algorithm will not work. Theoretically this could happen. Since $f(y;\mu_i,\Sigma_i)$ is a bell-shaped Gaussian curve and $\int_{R^n} f(y;\mu_i,\Sigma_i)\,dy = 1$, if the base of the bell is made very narrow, the peak of the bell can go to infinity. So if one or a few of the $f(y_t;\mu_i,\Sigma_i)$ values approach infinity and the rest are not zero, the likelihood function approaches infinity. Certainly an initial point such as the one in (8) can be picked and yield a very high or an infinite likelihood value. If the $y_t$'s are not nicely distributed (that is, one or a few $y_t$'s are far away from the rest), the estimates generated by the algorithm could also converge to such problem spots.

Another possibility is that $\mu_i$ and $\Sigma_i$ are picked such that $f(y_t;\mu_1,\Sigma_1)$ attains higher values with exactly half of $\{Z_{1t}\}$ and half of $\{Z_{2t}\}$ in $\{y_t\}$, and $f(y_t;\mu_2,\Sigma_2)$ with the other halves; then the algorithm may be stuck at the same point. We think such a point is not stable and the algorithm cannot converge to it, but this conjecture still needs to be proven.
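The blow-up described around (8) is easy to reproduce numerically. The sketch below assumes 2-dimensional data and uses made-up values for $y_t$, $\mu_i$, $\alpha$ and $\varepsilon$; it builds $\Sigma_i$ as in (8) and evaluates the density at $y_t$ for a shrinking $\varepsilon$.

```python
import math

def density_2d(p, mean, cov):
    """Bivariate normal density f(y; mu, Sigma) with an explicit 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def troubled_cov(y, mu, alpha, eps):
    """Sigma from (8): variance alpha along (y - mu), eps in the other direction."""
    dx, dy = y[0] - mu[0], y[1] - mu[1]
    norm_sq = dx * dx + dy * dy
    # Projection matrix onto the direction of (y - mu).
    proj = [[dx * dx / norm_sq, dx * dy / norm_sq],
            [dx * dy / norm_sq, dy * dy / norm_sq]]
    eye = [[1.0, 0.0], [0.0, 1.0]]
    return [[alpha * proj[a][b] + eps * (eye[a][b] - proj[a][b])
             for b in range(2)] for a in range(2)]

# Hypothetical values chosen only for illustration.
y_t, mu = (3.0, 4.0), (0.0, 0.0)
alpha = 100.0
densities = [density_2d(y_t, mu, troubled_cov(y_t, mu, alpha, eps))
             for eps in (1e-2, 1e-4, 1e-6)]
```

The determinant of this $\Sigma_i$ is $\alpha\varepsilon$ in two dimensions, while the quadratic form at $y_t$ stays fixed at $\|y_t - \mu_i\|^2/\alpha$, so the density at $y_t$ grows like $\varepsilon^{-1/2}$ as $\varepsilon$ shrinks.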

Proposition: If $\{y_t\}$ consists of nice normally distributed clusters (that is, no outliers are located far away from both clusters), the two clusters are well-separated, and the troubled initial values mentioned above are avoided, then the $\mu_i^{k}$ and $\Sigma_i^{k}$ generated by the above algorithm will converge to $\mu_i$ and $\Sigma_i$.

Proof. From the first order conditions, it is clear that to maximize the likelihood function (5) we simply have to pick the parameters $(\mu_i, \Sigma_i)$ that generate as many high $f(y_t;\mu_i,\Sigma_i)$ values as possible with $\{y_t\}$, $t = 1, \ldots, T$. Given a set of initial values $(\mu_i^{0}, \Sigma_i^{0})$, $i = 1, 2$, assume without loss of generality that $f(y_t;\mu_1^{0},\Sigma_1^{0})$ attains a higher value with more of the $y_t$'s clustering around $\mu_1$ (for convenience call this cluster A and the other cluster B). Then $\mu_1^{1}$ is the mean of more $y_t$'s in cluster A than in cluster B, and $\Sigma_1^{1}$ is the average outer product of the $y_t$'s, mostly near cluster A, minus $\mu_1^{1}$. Since there are only two clusters in the data, if the $y_t$'s are not used in generating $(\mu_1^{1}, \Sigma_1^{1})$, they are used in generating $(\mu_2^{1}, \Sigma_2^{1})$; what happens in cluster A is just the mirror image of what happens in cluster B. Since $\mu_1^{1}$ and $\Sigma_1^{1}$ are generated by the $y_t$'s mostly in cluster A, $(\mu_1^{1}, \Sigma_1^{1})$ are closer to $(\mu_1, \Sigma_1)$ than $(\mu_1^{0}, \Sigma_1^{0})$ respectively. Similarly, $(\mu_2^{1}, \Sigma_2^{1})$ are closer to $(\mu_2, \Sigma_2)$ than $(\mu_2^{0}, \Sigma_2^{0})$ respectively.

Since $f(y_t;\mu^*,\Sigma^*)$ is a bell-shaped Gaussian curve and $\int_{R^n} f(y;\mu^*,\Sigma^*)\,dy = 1$, $f(y_t;\mu^*,\Sigma^*)$ is small if $y_t$ is far away from $\mu^*$. That is, for a given $(\mu^*, \Sigma^*)$, $f(y_t;\mu^*,\Sigma^*)$ attains a higher value if $y_t$ is closer to $\mu^*$. Now $\mu_1^{1}$ is closer to most of the $y_t$'s in cluster A than to those from cluster B, and the same is true for $\mu_2^{1}$ with respect to cluster B. So more $y_t$'s in cluster A will be included in calculating $(\mu_1^{2}, \Sigma_1^{2})$. Successively, $(\mu_1^{k}, \Sigma_1^{k})$ will get closer and closer to $(\mu_1, \Sigma_1)$, and $(\mu_2^{k}, \Sigma_2^{k})$ to $(\mu_2, \Sigma_2)$. As $(\mu_1^{k}, \Sigma_1^{k})$ get close enough to $(\mu_1, \Sigma_1)$, $f(y_t;\mu_1^{k},\Sigma_1^{k})$ will attain a higher value than $f(y_t;\mu_2^{k},\Sigma_2^{k})$ for every $y_t$ in cluster A. The same is true for $f(y_t;\mu_2^{k},\Sigma_2^{k})$ for every $y_t$ in cluster B. Consequently, $\mu_i^{k}$ and $\Sigma_i^{k}$ will converge to $\mu_i$ and $\Sigma_i$ respectively.

4. Methods Comparison

In this section we compare the separation method, the proposed method and the mixture likelihood method in terms of the number of iterations and the computing time needed for the estimates to converge, and the sum of the norms of the differences between the true mean vectors and variance-covariance matrices and the respective estimated mean vectors and variance-covariance matrices. Since no iteration is needed for the separation method, only its computing time (as in MatLab) is compared to the other two methods.

The comparison figures are generated as follows. Two vector sequences with different mean vectors and variance-covariance matrices are generated. The mean vector and variance-covariance matrix of each sequence are calculated and taken as the true mean vector and variance-covariance matrix. Then the two sequences are mixed, and the separation method, the proposed method and the mixture likelihood method are employed to estimate the mean vector and the variance-covariance matrix of each sequence. The amount of time and the number of iterations needed for the estimates to converge are recorded for each method. The norms of the differences between the true mean vectors and variance-covariance matrices and those estimated by each method are also calculated.

Since the data generated or observed are random, if the clusters overlap too much there is no way to tell, without knowing beforehand, which data point belongs to which cluster in the overlapping area. So we make an arbitrary decision: if the clusters overlap too much, roughly more than ten percent, we advise the user of the proposed algorithm to look at the estimates with caution. The percentage of overlap is estimated first by calculating the distance between the estimated mean vectors of the two clusters A and B. Then draw a hyperplane ($P_B$), normal to the vector that runs from the mean vector of cluster A to that of cluster B, passing through the mean vector of cluster B. Then calculate the average distance ($D_B$) of all the points in cluster B that lie on the side of the hyperplane ($P_B$) away from cluster A. Follow the same procedure to calculate $D_A$. If $y_t$ is 3-dimensional, when the distance between the estimated mean vectors is less than a fixed multiple of $(D_A + D_B)$, the clusters should overlap each other by more than ten percent. We then send out a warning message urging the user of this method to look at the estimates with caution. Under this circumstance, we estimate the variance-covariance matrix of cluster B based on the data that lie on the side of $P_B$ that is away from cluster A in each iteration. The same method is used to estimate the variance-covariance matrix of cluster A.

Now we move on to the experiment results. We first look at how each method does with well-separated clusters, as in Figure 1 in the Appendix. Data sets with well-separated clusters were generated at two different total numbers of data points. One thousand experiments were run for each scenario, but only the results of the last ten experiments and the summary figures are reported in Table 1 and Table 2 in the Appendix. In each table, under the heading of each method, the first column, Time, is the amount of time needed for the estimates generated by the method to converge; for the separation method it is the time needed to generate the estimates. Iterations is the number of iterations needed for the estimates generated by the method to converge. Norms is the sum of the norms as described above. N.C. takes the value of one when the estimates generated by the method do not converge in that experiment (after the iteration limit). Since Iterations and N.C. do not apply to the separation method, we just put N/A in these columns for the separation method. The first ten rows of each table are the results of the last ten of the one thousand experiments. The next to last row is the sum of each column, and the last row is the average of each column.
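The overlap check described earlier in this section can be sketched as below. Two points are assumptions rather than statements from the paper: the "average distance" $D_A$ and $D_B$ is read as the average distance to the hyperplane through the cluster's own mean, and the multiplier applied to $(D_A + D_B)$ is left as a required argument `factor`, because its value is not recoverable from the text. The clusters and the factor values in the usage lines are made up.

```python
import math

def mean_vector(points):
    """Component-wise mean of a list of equal-length points."""
    T = len(points)
    return [sum(p[j] for p in points) / T for j in range(len(points[0]))]

def far_side_average(points, own_mean, direction):
    """Average distance to the hyperplane through own_mean (normal: direction),
    taken over the points on the side the unit vector `direction` points to."""
    signed = [sum((p[j] - own_mean[j]) * direction[j] for j in range(len(direction)))
              for p in points]
    far = [d for d in signed if d > 0]
    return sum(far) / len(far) if far else 0.0

def overlap_warning(cluster_a, cluster_b, factor):
    """Return (warn, gap, d_a, d_b); warn is True when gap < factor * (d_a + d_b).

    `factor` is a hypothetical parameter standing in for the paper's multiplier.
    """
    mean_a, mean_b = mean_vector(cluster_a), mean_vector(cluster_b)
    diff = [b - a for a, b in zip(mean_a, mean_b)]
    gap = math.sqrt(sum(d * d for d in diff))
    unit_ab = [d / gap for d in diff]   # points from A toward B
    unit_ba = [-d for d in unit_ab]     # points from B toward A
    d_b = far_side_average(cluster_b, mean_b, unit_ab)  # side of P_B away from A
    d_a = far_side_average(cluster_a, mean_a, unit_ba)  # side of P_A away from B
    return gap < factor * (d_a + d_b), gap, d_a, d_b

# Made-up clusters centred at (0, 0) and (10, 0).
cluster_a = [(-1, 0), (1, 0), (0, 1), (0, -1)]
cluster_b = [(9, 0), (11, 0), (10, 1), (10, -1)]
warn_loose, gap, d_a, d_b = overlap_warning(cluster_a, cluster_b, factor=3.0)
warn_strict = overlap_warning(cluster_a, cluster_b, factor=6.0)[0]
```

The intuition is that $D_A$ and $D_B$ measure each cluster's spread using only the points facing away from the other cluster, which are the ones least contaminated by overlap.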

As we can see from Table 1 and Table 2 in the Appendix, the message is consistent across both tables. The proposed method is the fastest and has the smallest norms. The separation method comes in second. The mixture likelihood method is the slowest and has the biggest norms. The numbers of iterations for the proposed method and the mixture likelihood method are consistent with the computing time: the proposed method is about three times faster than the mixture likelihood method. On one occasion under each scenario the estimates from the mixture likelihood method did not converge.

For clusters that are close but may or may not overlap, as in Figure 2 in the Appendix, the experiments for the two data-set sizes are reported in Table 3 and Table 4. Table 3 is very much like Table 1 and Table 2. From Table 4, when the clusters are close, the separation method becomes the fastest, but its norms are also the biggest, more than twice as big as those of the proposed method. The mixture likelihood method is still about three times as costly as the proposed method in terms of computing time or number of iterations. Also, about two percent of the time the estimates from the mixture likelihood method did not converge.

Lastly, we look at the case where the clusters overlap, as in Figure 3 in the Appendix. The results for the two data-set sizes are reported in Table 5 and Table 6. The results in Table 5 and Table 6 are similar to those in Table 4. The separation method is the fastest, but the proposed method has the smallest norms. In the Table 5 case, about one percent of the time the estimates from the proposed method did not converge, and about two and a half percent of the time the estimates from the mixture likelihood method did not converge. In the Table 6 case, about five percent of the time the estimates from the mixture likelihood method did not converge. This figure seems a bit too high, but since the data were random, most likely the mixture likelihood method was just unlucky to get such a high figure.

5. Conclusion

In this paper a simple algorithm is proposed to estimate the mean vectors and the variance-covariance matrices of two clusters of data that do not overlap too much. The results produced by the proposed method are also compared to those produced by the separation method and the mixture likelihood method for some simulated data under different scenarios. Clearly, when the clusters are well-separated, the proposed method stands out in terms of speed and accuracy. When the clusters are close or overlap, the separation method is the fastest but the proposed method is the most accurate. The mixture likelihood method is fairly accurate but is relatively slow.

References

Hamilton, James. 1990. Analysis of Time Series Subject to Changes in Regime. Journal of Econometrics 45, 39-70.

McLachlan, Geoffrey J., and Kaye E. Basford. 1988. Mixture Models: Inference and Applications to Clustering. New York: Marcel Dekker.

Appendix

Figure 1

Figure 2

Figure 3

Table 1. With Well-Separated Data Points

Columns, repeated for each of Separation Method, Proposed Method and Mixture Likelihood: Time, Iterations, Norms, N.C.

...34 N/A. 3..34..4..69....35 N/A...35..8..57....37 N/A. 3..37..3 7..38....36 N/A...36..3 9..47....35 N/A. 3..35..3 9..46....35 N/A. 3..35..3 7..36....37 N/A. 3..37..3 7..37....38 N/A...38..3 7..4....36 N/A. 3..36..3 9..45....36 N/A...36..4..6. 8.5..8985 N/A.833 556. 7.634. 3.699 8547. 9.57..8..9 N/A.8.556.76..37 8.547.9.

Note: Time is the amount of time needed for convergence in MatLab. Iterations is the number of iterations needed for convergence. Norms is the sum of the norms of the differences between the true statistics and the respective estimated statistics. N.C. equals one if the program does not converge after the iteration limit. N/A: not applicable. The first ten rows are the last ten results of the one thousand experiments. The next to last row is the sum of each column; the last row is the average.

Table 2. With Well-Separated Data Points

Columns, repeated for each of Separation Method, Proposed Method and Mixture Likelihood: Time, Iterations, Norms, N.C.

.5 N/A.3 N/A.3 3..3..37..6..5 N/A.3 N/A.3 3..3..4..3..5 N/A.3 N/A.3 3..3..3 8..9..5 N/A.57 N/A.3 3..3..37..4..5 N/A.3 N/A.3 3..3..3 8....5 N/A.3 N/A.4 3..3..4..6..5 N/A.3 N/A.9 4..38..4..73..5 N/A.58 N/A.4 3..3..4..6..5 N/A.3 N/A.3 3..3..38..3..5 N/A.9 N/A.3 3..9..4..74. 5.9 N/A 6.664 N/A 37.958 35. 3.986. 36.693 964. 4.96..5 N/A.6 N/A.38 3.5.39..367 9.64.5.

Note: Time is the amount of time needed for convergence in MatLab. Iterations is the number of iterations needed for convergence. Norms is the sum of the norms of the differences between the true statistics and the respective estimated statistics. N.C. equals one if the program does not converge after the iteration limit. N/A: not applicable. The first ten rows are the last ten results of the one thousand experiments. The next to last row is the sum of each column; the last row is the average.

Table 3. With Close Data Points

Columns, repeated for each of Separation Method, Proposed Method and Mixture Likelihood: Time, Iterations, Norms, N.C.

...358 N/A. 4..358..8..63....759 N/A. 3..343..9 5..593....9335 N/A. 4..389........96 N/A. 3..37.. 3..85....35 N/A. 3..35..8 3..64..3..39 N/A. 3..879.. 6..49....9 N/A. 4..9..9 7..4....399 N/A.3 4..875.. 8..348....336 N/A. 4..936.. 7..778....644 N/A. 3..644..8..94. 8.648. 77.844 N/A 7.35 359. 5.73. 85.596 335..857..86..77 N/A.7 3.59.5..856 3.35.9.

Note: Time is the amount of time needed for convergence in MatLab. Iterations is the number of iterations needed for convergence. Norms is the sum of the norms of the differences between the true statistics and the respective estimated statistics. N.C. equals one if the program does not converge after the iteration limit. N/A: not applicable. The first ten rows are the last ten results of the one thousand experiments. The next to last row is the sum of each column; the last row is the average.

Table 4. With Close Data Points

Columns, repeated for each of Separation Method, Proposed Method and Mixture Likelihood: Time, Iterations, Norms, N.C.

.5 N/A.39 N/A.9 4..373..84 3..96..5 N/A.49 N/A.4 5..456..8..83..5 N/A.789 N/A.3 5..485..8..89..5 N/A.58 N/A.4 5..658..89 4..5..5 N/A.377 N/A.9 4..555..8....5 N/A.5 N/A.4 5..933..93 5..37..5 N/A.39 N/A.9 4..339..93 5..5..5 N/A.88 N/A.9 4..395..8..833..5 N/A.988 N/A.9 4..873..9 4..48..5 N/A.477 N/A.4 5..76..8..84. 5.38 N/A.377 N/A.383 4594. 5.86. 835.85 396. 9.56 4..5 N/A.4 N/A.4 4.594.58..8359.396.9.4

Note: Time is the amount of time needed for convergence in MatLab. Iterations is the number of iterations needed for convergence. Norms is the sum of the norms of the differences between the true statistics and the respective estimated statistics. N.C. equals one if the program does not converge after the iteration limit. N/A: not applicable. The first ten rows are the last ten results of the one thousand experiments. The next to last row is the sum of each column; the last row is the average.

Table 5. With Overlapping Data Points

Columns, repeated for each of Separation Method, Proposed Method and Mixture Likelihood: Time, Iterations, Norms, N.C.

...377 N/A. 5..69.. 6..554....4654 N/A.3 6..666..7 75..697....795 N/A.3 7..436..7 75..47....8858 N/A. 5..596..9 8..53....77 N/A.3 5..4865..4 66..579....6 N/A. 5..47..8 5..87....556 N/A. 6..53..5 68..598....55 N/A.3 7..566..8 5..45....75 N/A. 5..43..3 84..58....47 N/A.3 5..497..3 65..4645. 8.75. 673.666 N/A 7.35 579. 48.3384. 37.97 6697. 5.69 4..88..6736 N/A.73 5.79.483..38 66.97.57.4

Note: Time is the amount of time needed for convergence in MatLab. Iterations is the number of iterations needed for convergence. Norms is the sum of the norms of the differences between the true statistics and the respective estimated statistics. N.C. equals one if the program does not converge after the iteration limit. N/A: not applicable. The first ten rows are the last ten results of the one thousand experiments. The next to last row is the sum of each column; the last row is the average.

Table 6. With Overlapping Data Points

Columns, repeated for each of Separation Method, Proposed Method and Mixture Likelihood: Time, Iterations, Norms, N.C.

.5 N/A.77 N/A.39 8..487..34 63..458..6 N/A.486 N/A.34 7..345..4 6..477..5 N/A.55 N/A.35 7..39..434 65..475..6 N/A.474 N/A.35 7..745..493 67..4338..5 N/A.568 N/A.35 7..789..43 65..43..6 N/A.63 N/A.35 7..33..34 6..4576..5 N/A.437 N/A.4 8..3438..54 67..4543..6 N/A.783 N/A.9 6..875..343 63..4335..5 N/A.784 N/A.35 7..36..33 6..468..5 N/A.463 N/A.4 8..34..353 63..4534. 5.47 N/A 545.596 N/A 386.897 777. 33.646. 38.6 639. 445.488 5..55 N/A.5456 N/A.3869 7.77.336..386 6.39.445.5

Note: Time is the amount of time needed for convergence in MatLab. Iterations is the number of iterations needed for convergence. Norms is the sum of the norms of the differences between the true statistics and the respective estimated statistics. N.C. equals one if the program does not converge after the iteration limit. N/A: not applicable. The first ten rows are the last ten results of the one thousand experiments. The next to last row is the sum of each column; the last row is the average.