The Expectation-Maximization (EM) Algorithm


Reading Assignments

- T. Mitchell, Machine Learning, McGraw-Hill, 1997 (section 6.2, hard copy).
- S. Gong et al., Dynamic Vision: From Images to Face Recognition, Imperial College Press, 2001 (Appendix C, hard copy).
- A. Webb, Statistical Pattern Recognition, Arnold, 1999 (section 2.3, hard copy).

Case Studies

- B. Moghaddam and A. Pentland, "Probabilistic visual learning for object representation", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, 1997 (on-line).
- S. McKenna, Y. Raja, and S. Gong, "Tracking color objects using adaptive mixture models", Image and Vision Computing, vol. 17, 1999 (on-line).
- C. Stauffer and E. Grimson, "Adaptive background mixture models for real-time tracking", IEEE Computer Vision and Pattern Recognition Conference, vol. 2, 1998 (on-line).

Overview

- It is an iterative algorithm that starts with an initial estimate for θ and iteratively modifies θ to increase the likelihood of the observed data.
- It works best in situations where the data are incomplete or can be thought of as being incomplete.
- EM is typically used with mixture models (e.g., mixtures of Gaussians).

The case of incomplete data

- Many times it is impossible to apply ML estimation because we cannot measure all the features or certain feature values are missing.
- The EM algorithm is ideal (i.e., it produces ML estimates) for problems with unobserved (missing) data.

  Actual data: x = (x1, x2, x3); observed data: y = (x1, x2)
  Complete pdf: p(x/θ); incomplete pdf: p(y/θ)

- The incomplete pdf can be derived from the complete pdf by integrating out the missing features:

  p(y/θ) = ∫ p(x/θ) dx_missing

An example

- Assume the following two classes in a pattern-recognition problem:
  (1) A class of dark objects:
      (1.1) round black objects
      (1.2) square black objects
  (2) A class of light objects

- Complete data and pdf: x = (x1, x2, x3), where x1 is the number of round dark objects, x2 the number of square dark objects, and x3 the number of light objects (n = x1 + x2 + x3):

  p(x1, x2, x3 /θ) = (n! / (x1! x2! x3!)) (1/4)^x1 (1/4 + θ/4)^x2 (1/2 − θ/4)^x3

- Observed (incomplete) data and pdf: y = (y1, y2), where y1 = x1 + x2 is the number of dark objects and y2 = x3 is the number of light objects (a many-to-one mapping!)
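To make the data-generation process concrete, here is a small numpy sketch (the variable names and the value of θ are illustrative, not from the notes) that draws the complete counts x = (x1, x2, x3) from the trinomial pdf above and then collapses them to the observed y = (y1, y2):

import numpy as np

rng = np.random.default_rng(0)
theta_true = 0.6                # hypothetical true parameter
n = 1000                        # total number of objects

# Complete data: counts of (round dark, square dark, light) objects.
probs = [1/4, 1/4 + theta_true/4, 1/2 - theta_true/4]
x1, x2, x3 = rng.multinomial(n, probs)

# Observed (incomplete) data: dark vs. light counts only (many-to-one mapping).
y1, y2 = x1 + x2, x3
print(f"complete x = ({x1}, {x2}, {x3}), observed y = ({y1}, {y2})")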

EM: main idea and steps

- If x were available, then we could use ML to estimate θ, i.e., compute arg max_θ ln p(D_x /θ).
- Idea: maximize the expectation of ln p(D_x /θ) given the data y and our current estimate of θ.

1. Initialization step: initialize the algorithm with a guess θ^0.
2. Expectation step: take the expectation with respect to the unknown variables, using the current estimate of the parameters and conditioned upon the observations:

   Q(θ; θ^t) = E_{x_unobserved}[ln p(D_x /θ) / D_y, θ^t]

   * The expectation is over the values of the unobserved variables, since the observed data are fixed.
   * When ln p(D_x /θ) is a linear function of the unobserved variables, this step is equivalent to finding E(x_unobserved / D_y, θ^t).
3. Maximization step: provides a new estimate of the parameters:

   θ^(t+1) = arg max_θ Q(θ; θ^t)

4. Convergence step: if ||θ^(t+1) − θ^t|| < ε, stop; otherwise, go to step 2.
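As a schematic, the four steps map onto a simple loop. This is only a sketch of the control flow (the e_and_m_step callable, its name, and the tolerance are ours), with all model-specific work hidden inside the function that computes arg max_θ Q(θ; θ^t):

def em(e_and_m_step, theta0, eps=1e-8, max_iter=1000):
    """Generic EM loop.

    e_and_m_step(theta_t) performs the Expectation step under theta_t
    and returns theta_{t+1} = argmax_theta Q(theta; theta_t).
    """
    theta = theta0                         # 1. initialization step
    for _ in range(max_iter):
        theta_new = e_and_m_step(theta)    # 2.-3. expectation + maximization
        if abs(theta_new - theta) < eps:   # 4. convergence test (use a norm
            return theta_new               #    for vector-valued theta)
        theta = theta_new
    return theta

The examples on the following pages implement this same loop inline for specific models.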

An example (cont'd)

- Recall the dark/light-objects setup above: the complete data x = (x1, x2, x3) follow the trinomial pdf p(x1, x2, x3 /θ) = (n!/(x1! x2! x3!)) (1/4)^x1 (1/4 + θ/4)^x2 (1/2 − θ/4)^x3, while only y = (y1, y2) = (x1 + x2, x3) is observed.

- Expectation step: compute E[ln p(D_x /θ) / D_y, θ^t].

- Since p(D_x /θ) = Π_i p(x_i /θ), we have

  ln p(D_x /θ) = Σ_i ln p(x_i /θ)
               = Σ_i [ ln(n!/(x_i1! x_i2! x_i3!)) + x_i1 ln(1/4) + x_i2 ln(1/4 + θ/4) + x_i3 ln(1/2 − θ/4) ]

  Taking the conditional expectation (note that x_i3 = y_i2 is observed):

  E[ln p(D_x /θ) / D_y, θ^t] = Σ_i [ E[ln(n!/(x_i1! x_i2! x_i3!)) / D_y, θ^t] + E[x_i1 / D_y, θ^t] ln(1/4) + E[x_i2 / D_y, θ^t] ln(1/4 + θ/4) + x_i3 ln(1/2 − θ/4) ]

- Maximization step: compute θ^(t+1) by maximizing E[ln p(D_x /θ) / D_y, θ^t]:

  d/dθ E[ln p(D_x /θ) / D_y, θ^t] = 0  ==>  θ^(t+1) = (2 Σ_i E[x_i2 / D_y, θ^t] − Σ_i x_i3) / (Σ_i E[x_i2 / D_y, θ^t] + Σ_i x_i3)

- Expectation step (cont'd): estimating E[x_i2 / D_y, θ^t]. Given y_i1 = x_i1 + x_i2, the count x_i2 is binomial:

  P(x_i2 / y_i1, y_i2) = P(x_i2 / y_i1) = C(y_i1, x_i2) (1/4)^(y_i1 − x_i2) (1/4 + θ/4)^x_i2 / (1/2 + θ/4)^y_i1

  so that

  E[x_i2 / D_y, θ^t] = y_i1 (1/4 + θ^t/4) / (1/2 + θ^t/4)
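Putting the two steps together gives a complete estimator for this example. A minimal sketch, assuming a single multinomial observation of n objects (so the sums over i have one term) and an illustrative true θ:

import numpy as np

rng = np.random.default_rng(0)
theta_true, n = 0.6, 5000
x1, x2, x3 = rng.multinomial(n, [1/4, 1/4 + theta_true/4, 1/2 - theta_true/4])
y1 = x1 + x2                    # observed: number of dark objects

theta = 0.1                     # initial guess theta^0
for _ in range(100):
    # E-step: E[x2 / y1, theta^t] for a Binomial(y1, (1/4+theta/4)/(1/2+theta/4))
    e_x2 = y1 * (1/4 + theta/4) / (1/2 + theta/4)
    # M-step: theta^{t+1} = (2 E[x2] - x3) / (E[x2] + x3)
    theta_new = (2*e_x2 - x3) / (e_x2 + x3)
    if abs(theta_new - theta) < 1e-10:
        break
    theta = theta_new
print(f"estimated theta = {theta:.4f} (true value {theta_true})")

The fixed point of this iteration is θ = 2(y1 − x3)/n, which is exactly the ML estimate obtained directly from the observed-data likelihood, since y1 ~ Binomial(n, 1/2 + θ/4).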

Convergence properties of the EM algorithm

- At each iteration, a value of θ is computed so that the likelihood function does not decrease.
- It can be shown that by increasing Q(θ; θ^t) = E_{x_unobserved}[ln p(D_x /θ) / D_y, θ^t] with the EM algorithm, we are also increasing the observed-data likelihood ln p(D_y /θ).
- This does not guarantee that the algorithm will reach the ML estimate (global maximum); in practice, it may get stuck in a local optimum.
- The solution depends on the initial estimate θ^0.
- The algorithm is guaranteed to be stable and to converge to a (possibly local) maximum of the likelihood; i.e., there is no chance of "overshooting" or diverging from the maximum.

Maximum Likelihood of mixtures via EM

Mixture model

- In a mixture model there are many "sub-models", each of which has its own probability distribution describing how it generates data when it is active.
- There is also a "mixer" or "gate" which controls how often each sub-model is active.
- Formally, a mixture is defined as a weighted sum of K components, where each component is a parametric density function p(x/θ_k):

  p(x/θ) = Σ_{k=1..K} p(x/θ_k) π_k

Mixture parameters

- The parameters θ to estimate are:
  * the values of π_k
  * the parameters θ_k of each p(x/θ_k)
- The component densities p(x/θ_k) may be of different parametric forms and are specified using knowledge of the data-generation process, if available.
- The weights π_k are the mixing parameters, and they sum to unity: Σ_{k=1..K} π_k = 1.
- Fitting a mixture model to a set of observations D_x consists of estimating the set of mixture parameters that best describe this data.

- Two fundamental issues arise in mixture fitting:
  (1) Estimation of the mixture parameters.
  (2) Estimation of the number of mixture components.

Mixtures of Gaussians

- In the mixtures-of-Gaussians model, p(x/θ_k) is the multivariate Gaussian distribution.
- In this case, the parameters θ_k are (µ_k, Σ_k).

Mixture parameter estimation using ML

- As we have seen, given a set of data D = (x_1, x_2, ..., x_n), ML seeks the value of θ that maximizes the following probability:

  p(D/θ) = Π_i p(x_i /θ)

- Since p(x_i /θ) is modeled as a mixture (i.e., p(x_i /θ) = Σ_k p(x_i /θ_k) π_k), the above expression can be written as:

  p(D/θ) = Π_i Σ_k p(x_i /θ_k) π_k

- In general, it is not possible to solve ∂p(D/θ)/∂θ = 0 explicitly for the parameters, and iterative schemes must be employed.
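Although no closed-form maximizer exists, the likelihood itself is straightforward to evaluate numerically, which is what makes an iterative scheme practical. A sketch for a one-dimensional mixture (the data and parameter values are illustrative only):

import numpy as np

def gauss(x, mu, sigma):
    # Univariate Gaussian density N(x; mu, sigma^2).
    return np.exp(-(x - mu)**2 / (2*sigma**2)) / (np.sqrt(2*np.pi) * sigma)

def mixture_log_likelihood(data, pis, mus, sigmas):
    # ln p(D/theta) = sum_i ln( sum_k pi_k p(x_i/theta_k) )
    comp = np.array([pi * gauss(data, mu, s)
                     for pi, mu, s in zip(pis, mus, sigmas)])
    return np.log(comp.sum(axis=0)).sum()

data = np.array([-1.2, -0.8, 0.1, 2.9, 3.4])
print(mixture_log_likelihood(data, [0.5, 0.5], [-1.0, 3.0], [1.0, 1.0]))

In an EM implementation this quantity is typically monitored across iterations: by the convergence properties above, it should never decrease.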

Estimate the means of Gaussians using EM (special case)

Data generation process using mixtures

- Assume the data D are generated by a probability distribution that is a mixture of K Gaussians (e.g., K = 2).
- Each instance is generated using a two-step process:
  (1) One of the K Gaussians is selected at random, with probabilities π_1, π_2, ..., π_K.
  (2) A single random instance x_i is generated according to the selected distribution.
- This process is repeated to generate the set of data points D.

Assumptions (this example)

(1) π_1 = π_2 = ... = π_K (uniform distribution)
(2) Each Gaussian has the same variance σ², which is known.

- The problem is to estimate the means of the Gaussians, θ = (µ_1, µ_2, ..., µ_K).
- Note: if we knew which Gaussian generated each data point, then it would be easy to find the parameters of each Gaussian using ML.

Involving hidden or unobserved variables

- We can think of the full description of each instance x_i as y_i = (x_i, z_i) = (x_i, z_i1, z_i2, ..., z_iK), where z_i is a class-indicator vector (hidden variable):

  z_ij = 1 if x_i was generated by the j-th component, 0 otherwise

- In this case, the x_i are observable and the z_i non-observable.

Main steps using EM

- The EM algorithm searches for an ML hypothesis through the following iterative scheme:
  (1) Initialize the hypothesis: θ^0 = (µ_1^0, µ_2^0, ..., µ_K^0).
  (2) Estimate the expected values of the hidden variables z_ij using the current hypothesis θ^t = (µ_1^t, µ_2^t, ..., µ_K^t).
  (3) Update the hypothesis to θ^(t+1) = (µ_1^(t+1), µ_2^(t+1), ..., µ_K^(t+1)) using the expected values of the hidden variables from step 2.
- Repeat steps (2)-(3) until convergence.

Derivation of the Expectation step

- We must derive an expression for Q(θ; θ^t) = E_{z_i}[ln p(D_y /θ) / D_x, θ^t].

(1) Derive the form of ln p(D_y /θ):

  p(D_y /θ) = Π_i p(y_i /θ)

- We can write p(y_i /θ) as follows:

  p(y_i /θ) = p(x_i, z_i /θ) = p(x_i /z_i, θ) p(z_i /θ) = p(x_i /θ_j) π_j   (assuming z_ij = 1 and z_ik = 0 for k ≠ j)

- We can rewrite p(x_i /θ_j) π_j as follows:

  p(y_i /θ) = Π_k [p(x_i /θ_k) π_k]^(z_ik)

- Thus, p(D_y /θ) can be written as follows (the constant mixing factor drops out since the π_k are all equal):

  p(D_y /θ) = Π_i Π_k [p(x_i /θ_k)]^(z_ik)

- We have assumed the form of p(x_i /θ_k) to be Gaussian:

  p(x_i /θ_k) = (1/(√(2π) σ)) exp[−(x_i − µ_k)² / 2σ²]

  thus

  Π_k [p(x_i /θ_k)]^(z_ik) = (1/(√(2π) σ)) exp[−(1/2σ²) Σ_k z_ik (x_i − µ_k)²]

  which leads to the following form for p(D_y /θ):

  p(D_y /θ) = Π_i (1/(√(2π) σ)) exp[−(1/2σ²) Σ_k z_ik (x_i − µ_k)²]

- Let's now compute ln p(D_y /θ):

  ln p(D_y /θ) = Σ_i ( ln(1/(√(2π) σ)) − (1/2σ²) Σ_k z_ik (x_i − µ_k)² )

(2) Take the expected value of ln p(D_y /θ):

  E_{z_i}[ln p(D_y /θ) / D_x, θ^t] = E[ Σ_i ( ln(1/(√(2π) σ)) − (1/2σ²) Σ_k z_ik (x_i − µ_k)² ) ]
                                   = Σ_i ( ln(1/(√(2π) σ)) − (1/2σ²) Σ_k E(z_ik) (x_i − µ_k)² )

- E(z_ik) is just the probability that the instance x_i was generated by the k-th component (i.e., E(z_ik) = 0 · P(z_ik = 0) + 1 · P(z_ik = 1) = P(z_ik = 1) = P(k/x_i)):

  E(z_ik) = exp[−(x_i − µ_k^t)² / 2σ²] / Σ_{j=1..K} exp[−(x_i − µ_j^t)² / 2σ²]

Derivation of the Maximization step

- Maximize Q(θ; θ^t) = E_{z_i}[ln p(D_y /θ) / D_x, θ^t]:

  ∂Q/∂µ_k = 0  ==>  µ_k^(t+1) = Σ_i E(z_ik) x_i / Σ_i E(z_ik)

Summary of the two steps

- Choose the number of components K.

Initialization step:

  θ_k^0 = µ_k^0

Expectation step:

  E(z_ik) = exp[−(x_i − µ_k^t)² / 2σ²] / Σ_{j=1..K} exp[−(x_i − µ_j^t)² / 2σ²]

Maximization step:

  µ_k^(t+1) = Σ_i E(z_ik) x_i / Σ_i E(z_ik)
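These two steps translate directly into a few lines of numpy. A minimal sketch for K = 2 one-dimensional components with known, shared σ and equal weights (the data and initial values are illustrative):

import numpy as np

rng = np.random.default_rng(1)
sigma = 1.0                                    # known, shared std. deviation
data = np.concatenate([rng.normal(-2.0, sigma, 300),
                       rng.normal( 3.0, sigma, 300)])

mu = np.array([0.0, 1.0])                      # initialization mu_k^0
for _ in range(200):
    # E-step: E(z_ik) is proportional to exp(-(x_i - mu_k^t)^2 / 2 sigma^2),
    # normalized over the K components.
    w = np.exp(-(data[:, None] - mu[None, :])**2 / (2*sigma**2))
    w /= w.sum(axis=1, keepdims=True)
    # M-step: mu_k^{t+1} = sum_i E(z_ik) x_i / sum_i E(z_ik)
    mu_new = (w * data[:, None]).sum(axis=0) / w.sum(axis=0)
    if np.max(np.abs(mu_new - mu)) < 1e-8:
        break
    mu = mu_new
print("estimated means:", mu)

Note that E(z_ik) is a soft assignment: each point contributes to every mean, weighted by the probability that the corresponding component generated it (compare k-means, which uses hard assignments).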

Estimate the mixture parameters (general case)

- If we knew which sub-model was responsible for generating each data point, then it would be easy to find the ML parameters for each sub-model. This suggests the following scheme:
  (1) Use EM to estimate which sub-model was responsible for generating each data point.
  (2) Find the ML parameters based on these estimates.
  (3) Use the new ML parameters to re-estimate the responsibilities and iterate.

Involving hidden variables

- We do not know which instance x_i was generated by which component (i.e., the missing data are the labels showing which sub-model generated each data point).
- Augment each instance x_i with the missing information: y_i = (x_i, z_i), where z_i is a class-indicator vector z_i = (z_i1, ..., z_iK):

  z_ij = 1 if x_i was generated by the j-th component, 0 otherwise

  (the x_i are observable and the z_i non-observable)

Derivation of the Expectation step

- We must derive an expression for Q(θ; θ^t) = E_{z_i}[ln p(D_y /θ) / D_x, θ^t].

(1) Derive the form of ln p(D_y /θ):

  p(D_y /θ) = Π_i p(y_i /θ)

- We can write p(y_i /θ) as follows:

  p(y_i /θ) = p(x_i, z_i /θ) = p(x_i /z_i, θ) p(z_i /θ) = p(x_i /θ_j) π_j   (assuming z_ij = 1 and z_ik = 0 for k ≠ j)

- We can rewrite the above expression as follows:

  p(y_i /θ) = Π_k [p(x_i /θ_k) π_k]^(z_ik)

- Thus, p(D_y /θ) can be written as follows:

  p(D_y /θ) = Π_i Π_k [p(x_i /θ_k) π_k]^(z_ik)

- We can now compute ln p(D_y /θ):

  ln p(D_y /θ) = Σ_i Σ_k z_ik ln(p(x_i /θ_k) π_k) = Σ_i Σ_k z_ik ln p(x_i /θ_k) + Σ_i Σ_k z_ik ln π_k

(2) Take the expected value of ln p(D_y /θ):

  E[ln p(D_y /θ) / D_x, θ^t] = Σ_i Σ_k E(z_ik) ln p(x_i /θ_k) + Σ_i Σ_k E(z_ik) ln π_k

- E(z_ik) is just the probability that instance x_i was generated by the k-th component (i.e., E(z_ik) = P(z_ik = 1) = P(k/x_i)):

  E(z_ik) = p(x_i /θ_k^t) π_k^t / Σ_{j=1..K} p(x_i /θ_j^t) π_j^t

Derivation of the Maximization step

- Maximize Q(θ; θ^t) subject to the constraint Σ_k π_k = 1:

  Q'(θ; θ^t) = Σ_i Σ_k E(z_ik) ln p(x_i /θ_k) + Σ_i Σ_k E(z_ik) ln π_k + λ (Σ_k π_k − 1)

  where λ is the Lagrange multiplier.

  ∂Q'/∂π_k = 0  ==>  Σ_i E(z_ik)/π_k + λ = 0  ==>  π_k^(t+1) = (1/n) Σ_i E(z_ik)
  (the constraint Σ_k π_k = 1 gives λ = −Σ_k Σ_i E(z_ik) = −n)

  ∂Q'/∂µ_k = 0  ==>  µ_k^(t+1) = (1/(n π_k^(t+1))) Σ_i E(z_ik) x_i

  ∂Q'/∂Σ_k = 0  ==>  Σ_k^(t+1) = (1/(n π_k^(t+1))) Σ_i E(z_ik) (x_i − µ_k^(t+1)) (x_i − µ_k^(t+1))^T

Summary of steps

- Choose the number of components K.

Initialization step:

  θ_k^0 = (π_k^0, µ_k^0, Σ_k^0)

Expectation step:

  E(z_ik) = p(x_i /θ_k^t) π_k^t / Σ_{j=1..K} p(x_i /θ_j^t) π_j^t

Maximization step:

  π_k^(t+1) = (1/n) Σ_i E(z_ik)
  µ_k^(t+1) = (1/(n π_k^(t+1))) Σ_i E(z_ik) x_i
  Σ_k^(t+1) = (1/(n π_k^(t+1))) Σ_i E(z_ik) (x_i − µ_k^(t+1)) (x_i − µ_k^(t+1))^T

Convergence step: if ||θ^(t+1) − θ^t|| < ε, stop; otherwise, go to the Expectation step.
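The general updates also fit in a short numpy sketch. For brevity this version is one-dimensional, so each Σ_k reduces to a scalar variance and the outer product (x_i − µ_k)(x_i − µ_k)^T to a square; the data and initial values are illustrative:

import numpy as np

def gauss(x, mu, var):
    # Univariate Gaussian density N(x; mu, var).
    return np.exp(-(x - mu)**2 / (2*var)) / np.sqrt(2*np.pi*var)

rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(3.0, 1.5, 400)])
n, K = len(data), 2

pi  = np.full(K, 1.0/K)                  # pi_k^0
mu  = np.array([-1.0, 1.0])              # mu_k^0
var = np.ones(K)                         # (sigma_k^2)^0

for _ in range(500):
    # E-step: E(z_ik) = pi_k p(x_i/theta_k^t) / sum_j pi_j p(x_i/theta_j^t)
    w = pi * gauss(data[:, None], mu, var)
    w /= w.sum(axis=1, keepdims=True)
    nk = w.sum(axis=0)                   # effective counts, n * pi_k^{t+1}
    # M-step
    pi_new  = nk / n
    mu_new  = (w * data[:, None]).sum(axis=0) / nk
    var_new = (w * (data[:, None] - mu_new)**2).sum(axis=0) / nk
    converged = np.max(np.abs(mu_new - mu)) < 1e-8
    pi, mu, var = pi_new, mu_new, var_new
    if converged:
        break

print("pi:", pi, "mu:", mu, "var:", var)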

Estimating the number of components

- Use EM to obtain a sequence of parameter estimates for a range of values of K: {Θ(K), K = K_min, ..., K_max}.
- The estimate of K is then defined as a minimizer of some cost function:

  K̂ = arg min_K C(Θ(K), K),  K = K_min, ..., K_max

- Most often, the cost function includes the log-likelihood ln p(D_y /θ) and an additional term whose role is to penalize large values of K.
- Several criteria have been used, e.g., Minimum Description Length (MDL).
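A sketch of the selection loop, using a BIC-style penalty as one concrete instance of "log-likelihood plus a term penalizing large K" (MDL leads to a cost of similar form). The log-likelihood values below are hypothetical placeholders; in practice each would come from an EM fit such as the one above:

import numpy as np

# Hypothetical maximized log-likelihoods ln p(D_y / Theta(K)) for K = 1..4,
# pretending they were returned by EM runs on n one-dimensional data points.
logliks = {1: -1450.2, 2: -1321.7, 3: -1318.9, 4: -1317.5}
n = 600

def cost(loglik, K):
    # Negative log-likelihood plus a penalty that grows with K.
    # A 1-D, K-component GMM has 3K - 1 free parameters
    # (K weights summing to one, K means, K variances).
    return -loglik + 0.5 * (3*K - 1) * np.log(n)

K_hat = min(logliks, key=lambda K: cost(logliks[K], K))
print("selected number of components:", K_hat)   # picks K = 2 here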

Lagrange Optimization

- Suppose we want to maximize f(x) subject to some constraint expressed in the form g(x) = 0.
- To find the maximum, first we form the Lagrangian function:

  L(x, λ) = f(x) + λ g(x)   (λ is called the Lagrange undetermined multiplier)

- Take the derivative and set it equal to zero:

  ∂L(x, λ)/∂x = ∂f(x)/∂x + λ ∂g(x)/∂x = 0

- Solve the resulting equations for λ and the value of x that maximizes f(x).
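As a worked instance, this is exactly how the mixing-weight update above was obtained. Take f(π_1, ..., π_K) = Σ_k c_k ln π_k with constants c_k = Σ_i E(z_ik), subject to g(π) = Σ_k π_k − 1 = 0. Then ∂L/∂π_k = c_k/π_k + λ = 0 gives π_k = −c_k/λ; substituting into the constraint gives λ = −Σ_j c_j = −n, and therefore π_k = c_k/n = (1/n) Σ_i E(z_ik), which is the M-step update for the mixing parameters.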
