Classification as a Regression Problem


Target variable: y ∈ {C_1, C_2, ..., C_K}. To treat classification as a regression problem we should transform the target y into numerical values. The choice of numerical class representation is quite arbitrary, but a careful numerical class representation is a critical step.

Binary Classification

Let us represent the classes C_1 and C_2 with the numerical values 1 and 0, i.e., if y ∈ C_1 then y = 1, and if y ∈ C_2 then y = 0. Since we have assigned numeric values to the classes, binary classification can be considered a regression problem. The optimal predictor for regression is

$f^*(\mathbf{x}) = E[y \mid \mathbf{x}] = 1 \cdot P(y \in C_1 \mid \mathbf{x}) + 0 \cdot P(y \in C_2 \mid \mathbf{x}) = P(y \in C_1 \mid \mathbf{x})$

Therefore, the optimal predictor outputs the posterior probability of class C_1. By applying the Bayes classification rule we can obtain the Bayes classifier!

Important Conclusion: With an appropriate class representation, optimal classification is equivalent to optimal regression.

Multi-Class Classification (K classes)

Could the previous result be generalized to multi-class classification?

Example. 3 classes: "white", "black", and "blue". Let us examine the representation y = -1 (class is "white"), y = 0 (class is "black"), and y = 1 (class is "blue").

Discussion: This representation is inappropriate since it enforces an order; it implies that "white" and "blue" are further away than, let us say, "black" and "blue". What if it is evident that an example could be either "white" or "blue", but definitely not "black"? The proposed representation would probably lead to a completely misleading answer: "black"!

Solution: Decompose the multi-class problem into K binary classification problems:

Problem j (j = 1, ..., K): if y ∈ C_j then y = 1; if y ∉ C_j then y = 0.

The regression function E[y | x] on Problem j will equal the posterior P(y ∈ C_j | x). By repeating this for all K classes, all posteriors will be available, which is sufficient to construct the Bayes classifier!
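
This decomposition is easy to demonstrate in code. Below is a minimal sketch (NumPy, with a synthetic three-class dataset invented for illustration): each binary subproblem is solved by least-squares regression on 0/1 targets, so each fitted model approximates the posterior P(y ∈ C_j | x), and the classifier picks the class with the largest estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: three Gaussian blobs in 2-D, one per class.
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 2)) for c in range(3)])
y = np.repeat(np.arange(3), 50)

Xb = np.hstack([np.ones((len(X), 1)), X])       # add a dummy attribute x0 = 1
scores = np.empty((len(X), 3))
for j in range(3):
    t = (y == j).astype(float)                  # target: 1 if y in C_j, else 0
    theta, *_ = np.linalg.lstsq(Xb, t, rcond=None)
    scores[:, j] = Xb @ theta                   # approximates P(y in C_j | x)

y_hat = scores.argmax(axis=1)                   # pick the largest posterior
print("training accuracy:", (y_hat == y).mean())
```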

Approaches to Minimizing MSE on a Finite Dataset

Goal: Given a dataset D, find a mapping f that minimizes MSE. Two extreme approaches:

1. Nearest neighbor algorithm (non-parametric approach)
2. Linear regression (parametric approach)

The non-parametric approach assumes that data points close to each other in the attribute space are similar to each other; it does not assume any functional form. The parametric approach assumes a functional form: e.g., the output is a linear function of the inputs.

Nearest Neighbor Algorithms

k-nearest neighbor (k-NN): $f(\mathbf{x}) = \frac{1}{k} \sum_{\mathbf{x}_i \in N_k(\mathbf{x})} y_i$, where $N_k(\mathbf{x})$ is the set of the k nearest neighbors of x.

Regression: the prediction is the average y among the k nearest neighbors.
Classification: the prediction is the majority class among the k nearest neighbors.

k-NN is also known as a lazy learning algorithm because it does not perform any calculations prior to seeing a new data point; it has to analyze the whole dataset for the nearest neighbors every time a new data point appears. Note: parametric learning algorithms learn a function from the dataset; they are much faster in giving predictions but need to spend some time beforehand.

Theorem: If $N \to \infty$, $k \to \infty$, and $k/N \to 0$ (a large number of neighbors in a very tight neighborhood), k-NN is an optimal predictor. Practically, if we have few dimensions (attributes), k-NN is a good try; if we have many attributes, we may run out of data points (in practice, data size is always limited).

Example: Generate an N-dimensional dataset such that all attributes $X_i$, i = 1, ..., N, are uniformly distributed in the interval [0, 1]. What is the edge length r of a hypercube that contains 10% of all data points (i.e., 10% of the unit hypercube volume)? Answer: from $r^N = 0.1$ we get $r = 0.1^{1/N}$.
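
The consequence of this answer is easiest to see numerically; the sketch below (plain Python, with illustrative values of N chosen here) simply evaluates $r = 0.1^{1/N}$ for growing dimension.

```python
# Edge length of a hypercube holding 10% of the volume of [0,1]^N: r^N = 0.1.
for N in (1, 2, 10, 100):
    r = 0.1 ** (1.0 / N)
    print(f"N = {N:3d}:  r = {r:.3f}")
```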

For example, r = 0.1 for N = 1, r ≈ 0.32 for N = 2, and r ≈ 0.79 for N = 10. In high dimensions, all neighboring points are far away and could not be used to accurately estimate values with k-NN!

Linear Regression

Linear regression assumes a functional form

$f(\mathbf{x}, \boldsymbol{\theta}) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_M x_M$   (Eq. 1)

where $\mathbf{x} = (x_1, \ldots, x_M)$ are the attributes and $\boldsymbol{\theta} = (\theta_0, \theta_1, \ldots, \theta_M)$ are the function parameters. More generally,

$f(\mathbf{x}, \boldsymbol{\theta}) = \theta_0 + \theta_1 \phi_1(\mathbf{x}) + \theta_2 \phi_2(\mathbf{x}) + \cdots + \theta_M \phi_M(\mathbf{x})$

where the $\phi_i$ are the so-called basis functions.

Example: $f(\mathbf{x}, \boldsymbol{\theta}) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^4$, where $\mathbf{x} = (x_1, x_2)$ are the attributes and $\boldsymbol{\theta} = (\theta_0, \theta_1, \theta_2, \theta_3)$ are the function parameters. Note that the function $f(\mathbf{x}, \boldsymbol{\theta})$ from the example is linear in the parameters. We can easily transform it into a function of the form (Eq. 1) by introducing the new attributes $x_0 = 1$, $x_1 = x_1$, $x_2 = x_2$, and $x_3 = x_1^4$.

Linear regression is suitable for problems where the functional form $f(\mathbf{x}, \boldsymbol{\theta})$ is known with sufficient certainty.

Learning goal: Find $\boldsymbol{\theta}$ that minimizes MSE. MSE is a function of the parameters $\boldsymbol{\theta}$, so the problem of minimizing MSE can be solved by standard methods of unconstrained optimization.

Illustration of the regression (figure): a straight-line fit $f(x, \boldsymbol{\theta}) = \theta_0 + \theta_1 x$.
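
To make the example above concrete, here is a hedged sketch (NumPy; the coefficients and data are invented for illustration) that recovers the parameters of the quartic model by an ordinary least-squares fit on the expanded attributes:

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.uniform(-1, 1, 200)
x2 = rng.uniform(-1, 1, 200)
# True model: y = 2 + 0.5*x1 - 1.0*x2 + 3.0*x1^4 + small Gaussian noise.
y = 2.0 + 0.5 * x1 - 1.0 * x2 + 3.0 * x1**4 + rng.normal(0, 0.1, 200)

# Basis expansion: x0 = 1, x1, x2, x3 = x1^4 turns this into (Eq. 1).
Phi = np.column_stack([np.ones_like(x1), x1, x2, x1**4])
theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(theta)   # approximately [2.0, 0.5, -1.0, 3.0]
```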

Linear Regression (continued)

Linear regression can be represented by the functional form

$f(\mathbf{x}; \boldsymbol{\theta}) = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_M x_M = \sum_{j=0}^{M} \theta_j x_j$

Note: $x_0$ is a dummy attribute and its value is a constant equal to 1.

Linear regression can also be represented in graphic form (figure: the inputs $x_0, x_1, \ldots, x_M$ are weighted by $\theta_0, \ldots, \theta_M$ and summed to produce the output).

Goal: Minimize the Mean Square Error (MSE):

$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \big( y_i - f(\mathbf{x}_i; \boldsymbol{\theta}) \big)^2$

MSE is a quadratic function in the parameters $\boldsymbol{\theta}$; it is a convex function; there is only one minimum, and it is the global minimum.

Solution: A sufficient condition is $\partial \mathrm{MSE} / \partial \theta_j = 0$ for $j = 0, 1, \ldots, M$. Therefore, find $\boldsymbol{\theta}$ such that

$\frac{\partial \mathrm{MSE}}{\partial \theta_j} = -\frac{2}{N} \sum_{i=1}^{N} \Big( y_i - \sum_{k=0}^{M} \theta_k x_{ik} \Big) x_{ij} = 0, \quad j = 0, 1, \ldots, M.$

These are M+1 linear equations with M+1 unknown variables, so we can get a closed-form solution. Special case: if some attribute is a linear combination of the others, there is no unique solution.

In matrix form: $X^T Y = X^T X \boldsymbol{\theta}$, where

$X \; [N \times (M+1)] = \{x_{ij}\}$, $i = 1{:}N$, $j = 1{:}(M+1)$ ($x_{ij}$ is the j-th attribute of the i-th data point),
$Y \; [N \times 1] = \{y_i\}$, $i = 1{:}N$,
$\boldsymbol{\theta} \; [(M+1) \times 1] = \{\theta_j\}$, $j = 1{:}(M+1)$.
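
As a sanity check on this derivative, the sketch below (NumPy; the helper names and the random test problem are choices made here, not from the notes) implements the MSE and its gradient and compares the formula against central finite differences:

```python
import numpy as np

def mse(theta, X, y):
    r = y - X @ theta
    return (r @ r) / len(y)

def mse_grad(theta, X, y):
    # dMSE/dtheta_j = -(2/N) * sum_i (y_i - sum_k theta_k * x_ik) * x_ij
    return -2.0 / len(y) * X.T @ (y - X @ theta)

# Compare the formula against central finite differences on a random problem.
rng = np.random.default_rng(2)
X = np.hstack([np.ones((50, 1)), rng.normal(size=(50, 2))])
y = rng.normal(size=50)
theta, eps = rng.normal(size=3), 1e-6
for j in range(3):
    d = np.zeros(3)
    d[j] = eps
    fd = (mse(theta + d, X, y) - mse(theta - d, X, y)) / (2 * eps)
    print(fd, mse_grad(theta, X, y)[j])   # the two columns agree
```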

Note: D = [X Y], i.e., [X Y] is what we previously defined as the data set. The optimal parameter choice is then

$\boldsymbol{\theta} = (X^T X)^{-1} X^T Y,$

which is a closed-form solution.

Note: the above solution exists if $X^T X$ is invertible, i.e., if its rank equals M+1, i.e., if no attribute is a linear combination of the others (in Matlab, use the function rank).

Note: using matrix derivatives we can do the optimization in a more elegant way, by defining

$\mathrm{MSE} = \frac{1}{N} (Y - X\boldsymbol{\theta})^T (Y - X\boldsymbol{\theta})$

$\nabla_{\boldsymbol{\theta}} \mathrm{MSE} = -\frac{2}{N} X^T (Y - X\boldsymbol{\theta}) = \mathbf{0} \; [(M+1) \times 1] \;\; \Rightarrow \;\; \boldsymbol{\theta} = (X^T X)^{-1} X^T Y$

Statistical results:

Assumption: the true data generating process (DGP) is $y = \sum_{j=0}^{M} \beta_j x_j + e$, where e is noise with E(e) = 0 and Var(e) = $\sigma^2$. Note: this is a big assumption!

Question: How close is the estimate $\boldsymbol{\theta}$ to the true value $\boldsymbol{\beta}$?

Answer 1: $E[\boldsymbol{\theta}] = E[(X^T X)^{-1} X^T Y] = (X^T X)^{-1} X^T E[Y]$ (remember, $Y = X\boldsymbol{\beta} + \mathbf{e}$, an N × 1 vector), so

$E[\boldsymbol{\theta}] = (X^T X)^{-1} X^T X \boldsymbol{\beta} + (X^T X)^{-1} X^T E[\mathbf{e}] = \boldsymbol{\beta} + \mathbf{0} = \boldsymbol{\beta}$

Conclusion: if we repeat linear regression on different data sets sampled according to the true DGP, the average $\boldsymbol{\theta}$ will equal $\boldsymbol{\beta}$ (i.e., $E[\boldsymbol{\theta}] = \boldsymbol{\beta}$), the true parameters. Therefore, linear regression is an unbiased predictor.

Answer 2: The variance of the parameter estimate $\boldsymbol{\theta}$ is $\mathrm{Var}[\boldsymbol{\theta}]$ = (after some calculation) = $(X^T X)^{-1} \sigma^2$.

Conclusion: $\mathrm{Var}[\boldsymbol{\theta}]$ is a measure of how different the estimate $\boldsymbol{\theta}$ is from the true parameters $\boldsymbol{\beta}$, i.e., how successful the linear regression is. Therefore, the quality of linear regression depends on the noise level (i.e., $\sigma^2$) and on the data size N: the variance increases linearly with $\sigma^2$ and decreases as 1/N with the size of the dataset.

More stringent assumption: the true DGP is $y = \sum_{j=0}^{M} \beta_j x_j + e$ with $e \sim N(0, \sigma^2)$ (i.e., e is Gaussian additive noise). If this assumption is valid, the estimate $\boldsymbol{\theta}$ can be considered a multi-dimensional Gaussian variable with $\boldsymbol{\theta} \sim N(\boldsymbol{\beta}, (X^T X)^{-1} \sigma^2)$. Therefore, we could do some nice things, such as test the hypothesis that $\beta_j = 0$ (i.e., that attribute j does not influence the target y).
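
Both statistical results can be illustrated by simulation. In the sketch below (assumptions made here: a fixed design X, a known β, and Gaussian noise), the DGP is resampled many times, θ is computed by the closed form each time, and the empirical mean and variance of the estimates are compared with β and $(X^T X)^{-1} \sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(3)
N, beta, sigma = 100, np.array([1.0, -2.0, 0.5]), 0.3
X = np.hstack([np.ones((N, 1)), rng.uniform(-1, 1, (N, 2))])  # fixed design

estimates = []
for _ in range(2000):                         # resample the DGP many times
    y = X @ beta + rng.normal(0, sigma, N)    # y = X*beta + Gaussian noise
    estimates.append(np.linalg.solve(X.T @ X, X.T @ y))  # closed-form theta
estimates = np.array(estimates)

print("mean of theta:", estimates.mean(axis=0))            # close to beta
print("empirical var:", estimates.var(axis=0))
print("(X'X)^-1 s^2: ", np.diag(np.linalg.inv(X.T @ X)) * sigma**2)
```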

Nonlinear Regression

Question: What if we know that $f(\mathbf{x}; \boldsymbol{\theta})$ is a nonlinear parametric function? For example, a function such as $f(\mathbf{x}; \boldsymbol{\theta}) = \theta_0 + \theta_1 x_1 x_2^{\theta_2}$ is nonlinear in the parameters.

Solution: Minimize $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \big( y_i - f(\mathbf{x}_i; \boldsymbol{\theta}) \big)^2$. Start from the necessary condition for a minimum:

$\frac{\partial \mathrm{MSE}}{\partial \theta_j} = -\frac{2}{N} \sum_{i=1}^{N} \big( y_i - f(\mathbf{x}_i; \boldsymbol{\theta}) \big) \frac{\partial f(\mathbf{x}_i; \boldsymbol{\theta})}{\partial \theta_j} = 0$

Again, we have to solve M+1 nonlinear equations with M+1 unknowns. But this time a closed-form solution is not easy to derive.

Math Background: Unconstrained Optimization

Problem: Given f(x), find its minimum.

Popular solution: Use the gradient descent algorithm. Idea: the gradient of f(x) at the minimum is the zero vector. So:

1. start from an initial guess $x^0$;
2. calculate the gradient $\nabla f(x^0)$;
3. move in the direction opposite to the gradient, i.e., generate a new guess $x^1$ as $x^1 = x^0 - \alpha \nabla f(x^0)$, where α is a properly selected constant;
4. repeat this process until convergence to the minimum.
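
The four steps translate directly into code. A minimal sketch (NumPy; the step size, tolerance, and test function are illustrative choices made here):

```python
import numpy as np

def gradient_descent(grad, x0, alpha=0.1, tol=1e-8, max_iter=10_000):
    x = np.asarray(x0, dtype=float)           # step 1: initial guess
    for _ in range(max_iter):
        g = grad(x)                           # step 2: gradient at the guess
        if np.linalg.norm(g) < tol:           # gradient ~ 0: (local) minimum
            break
        x = x - alpha * g                     # step 3: move against gradient
    return x                                  # step 4 is the loop itself

# Example: minimize f(x) = (x1 - 3)^2 + (x2 + 1)^2; grad f = 2*(x - [3, -1]).
print(gradient_descent(lambda x: 2 * (x - np.array([3.0, -1.0])), [0.0, 0.0]))
```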

Two problems with the gradient descent algorithm:

1. It accepts convergence to a local minimum. The simplest way to avoid a local minimum is to repeat the procedure starting from multiple initial guesses $x^0$.
2. Convergence to a minimum may be slow. There are a number of algorithms providing faster convergence (e.g., conjugate gradient; second-order methods such as Newton or quasi-Newton; non-derivative methods).

Back to solving nonlinear regression using the gradient descent procedure:

Step 0: start from an initial guess for the parameters, $\boldsymbol{\theta}^0$.
Step k: update the parameters as $\boldsymbol{\theta}^{k+1} = \boldsymbol{\theta}^k - \alpha \nabla \mathrm{MSE}(\boldsymbol{\theta}^k)$.

Special case: for linear prediction the update step would be $\boldsymbol{\theta}^{k+1} = \boldsymbol{\theta}^k + \alpha X^T (Y - X \boldsymbol{\theta}^k)$.
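
A short sketch of the linear special case (NumPy; the dataset and the step size $\alpha = 1/\lambda_{\max}(X^T X)$ are choices made here so the iteration converges) shows the update reaching the closed-form least-squares solution:

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.hstack([np.ones((200, 1)), rng.normal(size=(200, 2))])
Y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(0, 0.1, 200)

alpha = 1.0 / np.linalg.eigvalsh(X.T @ X).max()   # small enough constant step
theta = np.zeros(3)
for _ in range(5000):
    theta = theta + alpha * X.T @ (Y - X @ theta) # the update step above

print(theta)                                      # gradient descent result
print(np.linalg.solve(X.T @ X, X.T @ Y))          # closed-form solution
```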

Logistic Regression

Classification by MSE Minimization

Remember: classification can be solved by MSE minimization methods (E[y | x] can be used to derive the posteriors P(y ∈ C_1 | x)).

Question: What functional form $f(\mathbf{x}; \boldsymbol{\theta})$ is an appropriate choice for representing posterior class probabilities?

Option 1: What about the linear model $f(\mathbf{x}; \boldsymbol{\theta}) = \sum_{j=0}^{M} \theta_j x_j$? The range of this function goes beyond [0, 1], so it is not a good choice.

Option 2: We can use a sigmoid function to squeeze the output of a linear model into the range between 0 and 1: $f(\mathbf{x}; \boldsymbol{\theta}) = g\big( \sum_{j=0}^{M} \theta_j x_j \big)$. If $g(z) = e^z / (1 + e^z)$, optimizing $f(\mathbf{x}; \boldsymbol{\theta})$ is called logistic regression.

Solution: Logistic regression can be solved by minimizing MSE. The derivative $\partial \mathrm{MSE} / \partial \theta_j$ is

$\frac{\partial \mathrm{MSE}}{\partial \theta_j} = -\frac{2}{N} \sum_{i=1}^{N} \big( y_i - f(\mathbf{x}_i; \boldsymbol{\theta}) \big) \, g'\Big( \sum_{k=0}^{M} \theta_k x_{ik} \Big) \, x_{ij}$

Note: solving $\nabla_{\boldsymbol{\theta}} \mathrm{MSE} = 0$ results in M+1 nonlinear equations with M+1 unknowns; the optimization can be done using the gradient descent algorithm.
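
A minimal sketch of this gradient (NumPy; the helper names are chosen here), using the standard identity $g'(z) = g(z)(1 - g(z))$:

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid e^z/(1+e^z), rewritten

def mse_grad_logistic(theta, X, y):
    p = g(X @ theta)                  # f(x_i; theta) for all i at once
    # dMSE/dtheta = -(2/N) * X.T @ ((y - p) * g'(z)),  with g'(z) = p*(1 - p)
    return -2.0 / len(y) * X.T @ ((y - p) * p * (1 - p))
```

Plugging mse_grad_logistic into a gradient descent loop like the one sketched earlier trains the classifier.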

Maximum Likelihood (ML) Algorithm

Basic idea: given a data set D and a parametric model with parameters θ that describes the data generating process, the best solution θ* is the one that maximizes P(D | θ), i.e.,

$\theta^* = \arg\max_{\theta} P(D \mid \theta)$

P(D | θ) is called the likelihood, so the algorithm that finds the optimal solution θ* is called the maximum likelihood algorithm. This idea can be applied to both unsupervised and supervised learning problems.

ML for Unsupervised Learning: Density Estimation

Given D = {x_i, i = 1, ..., N}, and assuming the functional form p(x | θ) of the data generating process, the goal is to estimate the optimal parameters θ that maximize the likelihood P(D | θ):

$P(D \mid \theta) = P(x_1, x_2, \ldots, x_N \mid \theta)$

By assuming that the data points $x_i$ are independent and identically distributed (iid),

$P(D \mid \theta) = \prod_{i=1}^{N} p(x_i \mid \theta)$

(p is the probability density function). Since log(x) is a monotonically increasing function of x, maximization of P(D | θ) is equivalent to maximization of l = log P(D | θ); l is called the log-likelihood. So,

$l = \log \prod_{i=1}^{N} p(x_i \mid \theta) = \sum_{i=1}^{N} \log p(x_i \mid \theta)$

Example: the data set D = {x_i, i = 1, ..., N} is drawn from a Gaussian distribution with mean μ and standard deviation σ, i.e., X ~ N(μ, σ²). Therefore,

$p(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$

$l = -\frac{N}{2} \log(2\pi\sigma^2) - \sum_{i=1}^{N} \frac{(x_i - \mu)^2}{2\sigma^2}$

The values μ and σ that maximize the log-likelihood satisfy the necessary conditions for a local optimum, $\partial l / \partial \mu = 0$ and $\partial l / \partial \sigma = 0$, which give

$\hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad \hat{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{\mu})^2$

ML for Supervised Learning

Given D = {(x_i, y_i), i = 1, ..., N}, and assuming the functional form p(y | x, θ) of the data generating process, the goal is to estimate the optimal parameters θ that maximize the likelihood P(D | θ):

$P(D \mid \theta) = P(y_1, \ldots, y_N \mid x_1, \ldots, x_N, \theta) \stackrel{\text{iid}}{=} \prod_{i=1}^{N} p(y_i \mid x_i, \theta)$
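
Returning to the Gaussian example above, the ML estimators are two lines of code. A small numerical check (NumPy; the true μ and σ are invented for the test):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)   # true mu = 2.0, sigma = 1.5

mu_hat = x.mean()                        # (1/N) * sum x_i
var_hat = ((x - mu_hat) ** 2).mean()     # (1/N) * sum (x_i - mu_hat)^2, not N-1
print(mu_hat, np.sqrt(var_hat))          # approximately 2.0 and 1.5
```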

ML for Regression

Assume the data generating process corresponds to

$y = f(\mathbf{x}, \boldsymbol{\theta}) + e, \quad e \sim N(0, \sigma^2)$

Note: this is a relatively strong assumption! It follows that

$y \sim N(f(\mathbf{x}, \boldsymbol{\theta}), \sigma^2), \qquad p(y \mid \mathbf{x}, \boldsymbol{\theta}) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(y - f(\mathbf{x}, \boldsymbol{\theta}))^2}{2\sigma^2}}$

$l = \log P(D \mid \boldsymbol{\theta}) = -N \log \sqrt{2\pi\sigma^2} - \sum_{i=1}^{N} \frac{(y_i - f(\mathbf{x}_i, \boldsymbol{\theta}))^2}{2\sigma^2}$

Since σ is a constant, maximization of l is equivalent to minimization of $\sum_{i=1}^{N} (y_i - f(\mathbf{x}_i, \boldsymbol{\theta}))^2$.

Important conclusion: regression using ML under the assumption of a DGP with additive Gaussian noise is equivalent to regression using MSE minimization!

ML for Classification

There are two main approaches to classification involving ML: the Bayesian estimation approach and logistic regression.

Bayesian Estimation

Idea: given a dataset D, decompose D into datasets $\{D_j\}$, j = 1, ..., k (k = number of classes), where $\bigcup_j D_j = D$ and $D_i \cap D_j = \emptyset$ for all i ≠ j. For each $D_j$ we can estimate the pdf $p(\mathbf{x} \mid y \in C_j)$ (the class-conditional density). These densities can be estimated using the ML methods described in Lecture 3, provided we make a (strong) assumption about the functional form of the density (e.g., Gaussian). We also note that this approach is useful theoretically and when the input dimension is low, but density estimation is generally not practical in high dimensions.

In order to obtain a classifier, we want to be able to estimate the probabilities $p(y \in C_j \mid \mathbf{x})$ (the posterior class probabilities). A new input will be assigned to the class with the highest estimated posterior class probability. We can estimate these probabilities by applying Bayes' theorem:

Bayes' Theorem: if A and B are events, then $P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$

So we see that

$p(y \in C_j \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid y \in C_j)\, p(y \in C_j)}{p(\mathbf{x})}$

where $p(y \in C_j)$ (the prior class probability) may be estimated without reference to the inputs, as the relative frequency of $C_j$ in the dataset D.
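
A hedged sketch of the whole Bayesian estimation recipe on a toy problem (NumPy; a one-dimensional two-class dataset invented here, with the strong Gaussian assumption on the class-conditional densities): fit each class density by ML, estimate the priors as relative frequencies, and score classes by $p(x \mid y \in C_j)\, p(y \in C_j)$.

```python
import numpy as np

rng = np.random.default_rng(6)
# Two classes in 1-D, 300 points each; class-conditional densities are Gaussian.
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(3.0, 1.0, 300)])
y = np.repeat([0, 1], 300)

params = []
for j in (0, 1):
    xj = x[y == j]
    # ML estimates of the class-conditional Gaussian, plus the prior p(C_j)
    params.append((xj.mean(), ((xj - xj.mean()) ** 2).mean(), len(xj) / len(x)))

def class_scores(x0):
    # p(x | C_j) * p(C_j); the common factor p(x) is not needed for argmax
    return [p / np.sqrt(2 * np.pi * v) * np.exp(-(x0 - m) ** 2 / (2 * v))
            for m, v, p in params]

y_hat = np.array([int(np.argmax(class_scores(x0))) for x0 in x])
print("training accuracy:", (y_hat == y).mean())
```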

For the purpose of classification it is not necessary to compute $p(\mathbf{x})$, since we are interested only in the relative sizes of the posterior probabilities. Finally, we may define the Bayesian classifier by

$\hat{y} = \arg\max_{j = 1, \ldots, k} p(y \in C_j \mid \mathbf{x}) = \arg\max_{j = 1, \ldots, k} p(\mathbf{x} \mid y \in C_j)\, p(y \in C_j)$

We reiterate that this method is only practically applicable in low dimensions, and requires strong assumptions about the functional form of the class distributions.

Logistic Regression

The assumptions involved in logistic regression are similar to those involved in linear regression, namely the existence of a linear relationship between the inputs and the output. In the case of logistic regression, this assumption takes a somewhat different form: we assume that the posterior class probabilities can be estimated as a linear function of the inputs, passed through a sigmoidal function. Parameter estimates (the coefficients of the inputs) are then calculated by maximum likelihood, as follows.

For simplicity, assume we are doing binary classification and that $y_i \in \{0, 1\}$. Then the logistic regression model is

$\mu_i = p(y \in C_1 \mid \mathbf{x}_i) = \frac{e^{\eta_i}}{1 + e^{\eta_i}}, \quad \text{where } \eta_i = \sum_{j=0}^{M} \theta_j x_{ij}$

The likelihood function of the data D is given by

$p(D \mid \Theta) = \prod_{i=1}^{N} p(y_i \mid \mathbf{x}_i, \Theta) = \prod_{i=1}^{N} \mu_i^{y_i} (1 - \mu_i)^{1 - y_i}$

Note that the term $\mu_i^{y_i} (1 - \mu_i)^{1 - y_i}$ reduces to the posterior class probability of class 1 when $y_i = 1$, and to the posterior class probability of class 0 otherwise, so this expression makes sense. In order to find the ML estimators of the parameters, we form the log-likelihood

$l = \log p(D \mid \Theta) = \sum_{i=1}^{N} \big[ y_i \log \mu_i + (1 - y_i) \log(1 - \mu_i) \big]$

The ML estimators require us to solve $\nabla_\Theta l = 0$, which is a nonlinear system of M+1 equations in M+1 unknowns, so we do not expect a closed-form solution. Hence we would, for instance, apply the gradient descent algorithm to get the parameter estimates for the classifier.
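
A minimal sketch of the resulting procedure (NumPy; the data and step size are invented here). Differentiating l gives $\nabla_\Theta l = X^T (y - \mu)$, a standard result, so ascending l (equivalently, descending $-l$) updates the parameters as follows:

```python
import numpy as np

rng = np.random.default_rng(7)
X = np.hstack([np.ones((400, 1)), rng.normal(size=(400, 2))])
true_theta = np.array([-0.5, 2.0, -1.0])
mu_true = 1.0 / (1.0 + np.exp(-(X @ true_theta)))
y = (rng.uniform(size=400) < mu_true).astype(float)   # Bernoulli labels

theta, alpha = np.zeros(3), 0.01
for _ in range(10_000):
    mu = 1.0 / (1.0 + np.exp(-(X @ theta)))   # posterior p(y in C1 | x_i)
    theta = theta + alpha * X.T @ (y - mu)    # ascend l: grad l = X.T @ (y - mu)

print(theta)   # roughly true_theta, up to sampling noise
```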
