The conjugate prior to a Bernoulli is: A) Bernoulli B) Gaussian C) Beta D) none of the above


1 The conjugate prior to a Bernoulli is: A) Bernoulli B) Gaussian C) Beta D) none of the above

2 The conjugate prior to a Gaussian is: A) Bernoulli B) Gaussian C) Beta D) none of the above

3 MAP estimates: A) argmax_θ p(θ | D) B) argmax_θ p(D | θ) C) argmax_θ p(D | θ) p(θ) D) None of the above

4 MLE estimates: A) argmax_θ p(θ | D) B) argmax_θ p(D | θ) C) argmax_θ p(D | θ) p(θ) D) None of the above

5 Consistent estimator: A consistent estimator (or asymptotically consistent estimator) is an estimator (a rule for computing estimates of a parameter θ) having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to the true parameter θ.

6 Which is consistent for our coin-flipping example? A) MLE B) MAP C) Both D) Neither. Recall: MLE maximizes P(D | θ); MAP maximizes P(θ | D), which is proportional to P(D | θ) P(θ).
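A quick simulation sketch in Python, assuming a hypothetical true bias θ = 0.7 and a hypothetical Beta(2, 2) prior (neither choice is from the slides): as n grows, both the MLE and the MAP estimate converge to the true θ.

    import numpy as np

    rng = np.random.default_rng(0)
    theta_true = 0.7            # hypothetical true coin bias
    a, b = 2.0, 2.0             # hypothetical Beta(2, 2) prior
    for n in [10, 100, 1000, 100000]:
        heads = (rng.random(n) < theta_true).sum()
        mle = heads / n                               # argmax_theta P(D | theta)
        map_est = (heads + a - 1) / (n + a + b - 2)   # mode of Beta(a + heads, b + n - heads)
        print(n, round(mle, 3), round(map_est, 3))    # both converge to 0.7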

7 Covariance: Given random variables X and Y with joint density p(x, y) and means E(X) = µ1, E(Y) = µ2, the covariance of X and Y is cov(X, Y) = E[(X - µ1)(Y - µ2)]. Also cov(X, Y) = E(XY) - E(X) E(Y) (the proof follows easily from the definition), and cov(X, X) = var(X).

8 Covariance: If X and Y are independent then cov(X, Y) = 0. A) True B) False. If cov(X, Y) = 0 then X and Y are independent. A) True B) False

9 Covariance: If X and Y are independent then cov(X, Y) = 0. Proof: independence of X and Y implies that E(XY) = E(X) E(Y). Remark: the converse is NOT true in general. It can happen that the covariance is 0 but X and Y are highly dependent. (Try to think of an example.) For the bivariate normal case the converse does hold.
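One example of the kind the slide asks for is Y = X^2 with X symmetric about zero; a minimal numerical sketch:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(1_000_000)
    y = x ** 2                               # Y is a deterministic function of X
    cov = (x * y).mean() - x.mean() * y.mean()
    print(cov)                               # approximately 0, yet X and Y are fully dependent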

10 Note to other teachers and users of these slides. Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or a link to the source repository of Andrew's tutorials. Comments and corrections gratefully received. Predicting real-valued outputs: an introduction to regression. Mostly by Andrew W. Moore, but with modifications by Lyle Ungar. Copyright 2001, 2003, Andrew W. Moore

11 Two interpretations of regression. Linear regression: ŷ = w · x. Probabilistic/Bayesian (MLE and MAP): y ~ N(w · x, σ²), with argmax_w p(D | w) (MLE) or argmax_w p(D | w) p(w) (MAP); here p(D | w) = p(y | w, X). Error minimization: ||y - Xw||_p^p + λ ||w||_q^q. But first, we'll look at Gaussians.

12 Single-Parameter Linear Regression

13 Linear Regression. Inputs and outputs: x1 = 1, y1 = 1; x2 = 3, y2 = 2.2; x3 = 2, y3 = 2; x4 = 1.5, y4 = 1.9; x5 = 4, y5 = 3.1. Linear regression assumes that the expected value of the output given an input, E[y | x], is linear. Simplest case: Out(x) = wx for some unknown w. Given the data, we can estimate w.

14 1-parameter linear regression. Assume that the data is formed by y_i = w x_i + noise_i, where: the noise signals are independent, and the noise has a normal distribution with mean 0 and unknown variance σ². Then p(y | w, x) has a normal distribution with mean wx and variance σ².

15 Bayesian Linear Regression. p(y | w, x) = Normal(mean wx, variance σ²), i.e. y ~ N(wx, σ²). We have a set of data (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), and we want to infer w from the data: p(w | x_1, x_2, ..., x_n, y_1, y_2, ..., y_n) = P(w | D). You can use Bayes rule to work out a posterior distribution for w given the data. Or you could do Maximum Likelihood Estimation.

16 Maximum likelihood estimation of w. MLE asks: for which value of w is this data most likely to have happened? <=> For what w is p(y_1, y_2, ..., y_n | w, x_1, x_2, ..., x_n) maximized? <=> For what w is Π_{i=1}^n p(y_i | w, x_i) maximized?

17 For what w is Π_{i=1}^n p(y_i | w, x_i) maximized? For what w is Π_{i=1}^n exp(-(1/2) ((y_i - w x_i)/σ)²) maximized? For what w is Σ_{i=1}^n -(1/2) ((y_i - w x_i)/σ)² maximized? For what w is Σ_{i=1}^n (y_i - w x_i)² minimized?

18 First result: MLE with Gaussian noise is the same as minimizing the L2 error: argmin_w Σ_{i=1}^n (y_i - w x_i)²

19 Linear Regression. The maximum likelihood w is the one that minimizes the sum-of-squares of residuals: E(w) = Σ_i (y_i - w x_i)² = Σ_i y_i² - 2 (Σ_i x_i y_i) w + (Σ_i x_i²) w². We want to minimize a quadratic function of w.

20 Linear Regression. Easy to show the sum of squares is minimized when w = (Σ_i x_i y_i) / (Σ_i x_i²). The maximum likelihood model is Out(x) = wx. We can use it for prediction.

21 Linear Regression. Easy to show the sum of squares is minimized when w = (Σ_i x_i y_i) / (Σ_i x_i²). The maximum likelihood model is Out(x) = wx, and we can use it for prediction. Note: in Bayesian stats you'd have ended up with a probability distribution over w, and predictions would have given a probability distribution over the expected output. It is often useful to know your confidence; max likelihood can give some kinds of confidence too. [figure: a distribution p(w) plotted against w]
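A minimal numpy sketch of the closed form w = (Σ x_i y_i) / (Σ x_i²), using the toy dataset from slide 13 (values as reconstructed above):

    import numpy as np

    x = np.array([1.0, 3.0, 2.0, 1.5, 4.0])   # toy inputs from slide 13
    y = np.array([1.0, 2.2, 2.0, 1.9, 3.1])   # toy outputs from slide 13
    w = (x * y).sum() / (x ** 2).sum()         # argmin_w sum_i (y_i - w x_i)^2
    print(w, w * 3.0)                          # fitted slope, and the prediction Out(3)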

22 But what about MAP? MLE: argmax_w Π_{i=1}^n p(y_i | w, x_i). MAP: argmax_w Π_{i=1}^n p(y_i | w, x_i) p(w).

23 But what about MAP? We assumed y ~ N(w x, σ²). Now add the prior assumption that w ~ N(0, γ²). MAP: argmax_w Π_{i=1}^n p(y_i | w, x_i) p(w)

24 For what w is Π_{i=1}^n p(y_i | w, x_i) p(w) maximized? For what w is Π_{i=1}^n exp(-(1/2) ((y_i - w x_i)/σ)²) · exp(-(1/2) (w/γ)²) maximized? For what w is [Σ_{i=1}^n -(1/2) ((y_i - w x_i)/σ)²] - (1/2) (w/γ)² maximized? For what w is Σ_{i=1}^n (y_i - w x_i)² + (σ²/γ²) w² minimized?

25 Second result: MAP with a Gaussian prior on w is the same as minimizing the L2 error plus an L2 penalty on w: argmin_w Σ_{i=1}^n (y_i - w x_i)² + λ w². This is called Ridge regression, Shrinkage, or Regularization.
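In the single-parameter case the ridge objective also has a closed form, w = (Σ x_i y_i) / (Σ x_i² + λ), obtained by setting the derivative to zero; a sketch (the value λ = 1.0 is an arbitrary illustrative choice):

    import numpy as np

    x = np.array([1.0, 3.0, 2.0, 1.5, 4.0])
    y = np.array([1.0, 2.2, 2.0, 1.9, 3.1])
    lam = 1.0                                        # arbitrary; lambda = sigma^2 / gamma^2
    w_map = (x * y).sum() / ((x ** 2).sum() + lam)   # minimizes sum (y_i - w x_i)^2 + lam w^2
    print(w_map)                                     # shrunk toward 0 relative to the MLE slope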

26 The speed of lectures is: A) too slow B) good C) too fast

27 Multivariate Linear Regression

28 Multivariate Regression. What if the inputs are vectors? [figure: scatter plot of a 2-d input example] The dataset has the form: x_1 y_1, x_2 y_2, x_3 y_3, ..., x_n y_n.

29 Multivariate Regression. Write the matrix X and the vector y thus: X = [x_11 x_12 ... x_1p; x_21 x_22 ... x_2p; ...; x_n1 x_n2 ... x_np] (row i is the i-th input vector x_i), y = [y_1; y_2; ...; y_n]. (There are n data points; each input has p components.) The linear regression model assumes a vector w such that Out(x) = x · w = w_1 x[1] + w_2 x[2] + ... + w_p x[p]. The max. likelihood w is w = (X^T X)^{-1} (X^T y).

30 Multivariate Regression. Write the matrix X and the vector y thus: X = [x_11 x_12 ... x_1m; x_21 x_22 ... x_2m; ...; x_R1 x_R2 ... x_Rm] (row i is the i-th input vector x_i), y = [y_1; y_2; ...; y_R]. (There are R datapoints; each input has m components.) The linear regression model assumes a vector w such that Out(x) = w^T x = w_1 x[1] + w_2 x[2] + ... + w_m x[m]. The max. likelihood w is w = (X^T X)^{-1} (X^T y). IMPORTANT EXERCISE: PROVE IT!!!!!

31 Multivariate Regression (con't). The max. likelihood w is w = (X^T X)^{-1} (X^T y). X^T X is an m x m matrix: its (i, j)-th element is Σ_{k=1}^R x_ki x_kj. X^T y is an m-element vector: its i-th element is Σ_{k=1}^R x_ki y_k.
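A sketch of w = (X^T X)^{-1} (X^T y) on made-up data; solving the normal equations directly is preferable to forming the inverse explicitly:

    import numpy as np

    rng = np.random.default_rng(0)
    R, m = 100, 3                           # R datapoints, m input components
    X = rng.standard_normal((R, m))
    w_true = np.array([1.0, -2.0, 0.5])     # made-up true weights
    y = X @ w_true + 0.1 * rng.standard_normal(R)

    w_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X^T X)^{-1} (X^T y), without forming the inverse
    print(w_hat)                                # close to w_true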

32 Constant Term in Linear Regression

33 What about a constant term? We may expect linear data that does not go through the origin. Statisticians and Neural Net Folks all agree on a simple obvious hack. Can you guess??

34 The constant term. The trick is to create a fake input X_0 that always takes the value 1. Before: Y = w_1 X_1 + w_2 X_2 has to be a poor model. After: Y = w_0 X_0 + w_1 X_1 + w_2 X_2 = w_0 + w_1 X_1 + w_2 X_2 has a fine constant term. In this example, you should be able to see the MLE w_0, w_1 and w_2 by inspection. [the example data table did not survive transcription]
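The hack in code: prepend a column of ones so that w_0 becomes the constant term (a sketch on made-up data with a nonzero intercept):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 2))
    y = 5.0 + X @ np.array([1.0, -2.0]) + 0.1 * rng.standard_normal(100)

    X0 = np.column_stack([np.ones(len(X)), X])   # fake input X_0 = 1 in every row
    w = np.linalg.solve(X0.T @ X0, X0.T @ y)
    print(w)                                     # w[0] recovers the constant term (about 5.0)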

35 Heteroscedasticity... Linear Regression with varying noise

36 Regression with varying noise. Suppose you know the variance of the noise that was added to each datapoint. Assume y_i ~ N(w x_i, σ_i²). [figure and table of example (x_i, y_i, σ_i) values did not survive transcription]

37 MLE estimation with varying noise. argmax_w log p(y_1, y_2, ..., y_R | x_1, x_2, ..., x_R, σ_1, σ_2, ..., σ_R, w) = argmin_w Σ_{i=1}^R (y_i - w x_i)² / σ_i² (assuming independence among the noise and then plugging in the equation for the Gaussian and simplifying) = the w such that Σ_{i=1}^R x_i (y_i - w x_i) / σ_i² = 0 (setting dLL/dw equal to zero) = (Σ_{i=1}^R x_i y_i / σ_i²) / (Σ_{i=1}^R x_i² / σ_i²) (trivial algebra)

38 This is Weighted Regression. We are asking to minimize the weighted sum of squares argmin_w Σ_{i=1}^R (y_i - w x_i)² / σ_i², where the weight for the i-th datapoint is 1/σ_i². [figure as on the previous slide]
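A sketch of the single-parameter weighted closed form from slide 37, with hypothetical data and per-point noise levels (the values are made up, since the slide's table did not survive):

    import numpy as np

    x = np.array([0.5, 1.0, 2.0, 2.0, 3.0])      # made-up inputs
    y = np.array([0.5, 1.0, 1.0, 3.0, 2.0])      # made-up outputs
    sigma = np.array([0.5, 1.0, 2.0, 1.0, 0.5])  # known noise std dev per point

    w = (x * y / sigma ** 2).sum() / (x ** 2 / sigma ** 2).sum()
    print(w)                                     # low-noise points dominate the fit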

39 Non-linear Regression

40 Non-linear Regression. Suppose you know that y is related to a function of x in such a way that the predicted values have a non-linear dependence on w, e.g.: assume y_i ~ N(sqrt(w + x_i), σ²). [figure and table of example (x_i, y_i) values did not survive transcription]

41 Non-linear MLE estimation. argmax_w log p(y_1, y_2, ..., y_R | x_1, x_2, ..., x_R, σ, w) = argmin_w Σ_{i=1}^R (y_i - sqrt(w + x_i))² (assuming i.i.d. and then plugging in the equation for the Gaussian and simplifying) = the w such that Σ_{i=1}^R (y_i - sqrt(w + x_i)) / sqrt(w + x_i) = 0 (setting dLL/dw equal to zero)

42 Non-linear MLE estimation (continued). The condition Σ_{i=1}^R (y_i - sqrt(w + x_i)) / sqrt(w + x_i) = 0 has no closed-form solution for w. We're down the algebraic toilet. So guess what we do?

43 Non-linear MLE estimation. Common (but not only) approach: numerical solutions, e.g. Line Search, Simulated Annealing, Gradient Descent, Conjugate Gradient, Levenberg Marquardt, Newton's Method. Also, special-purpose statistical-optimization-specific tricks such as E.M. (see the Gaussian Mixtures lecture for an introduction).
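A gradient-descent sketch for the model y ~ N(sqrt(w + x), σ²); the data, initial guess, and step size are all made up for illustration:

    import numpy as np

    x = np.array([0.5, 1.0, 2.0, 3.0])     # made-up data
    y = np.array([1.0, 1.5, 1.6, 2.1])
    w, lr = 1.0, 0.05                      # initial guess and step size (arbitrary)
    for _ in range(500):
        s = np.sqrt(w + x)                 # predicted values sqrt(w + x_i)
        grad = -np.sum((y - s) / s)        # d/dw of sum_i (y_i - sqrt(w + x_i))^2
        w -= lr * grad
    print(w)                               # a stationary point of the squared error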

44 Polynomial Regression

45 Polynomial Regression. So far we've mainly been dealing with linear regression: data X = [3 2; 1 1; ...], y = [7; 3; ...], i.e. x_1 = (3, 2), y_1 = 7, etc. Form Z = [1 3 2; 1 1 1; ...] with rows z_k = (1, x_k1, x_k2); then β = (Z^T Z)^{-1} (Z^T y) and y_est = β_0 + β_1 x_1 + β_2 x_2.

46 Quadratic Regression. It's trivial to do linear fits of fixed nonlinear basis functions. Same data X = [3 2; 1 1; ...], y = [7; 3; ...], but now each row of Z is z = (1, x_1, x_2, x_1², x_1 x_2, x_2²), e.g. z_1 = (1, 3, 2, 9, 6, 4); then β = (Z^T Z)^{-1} (Z^T y) and y_est = β_0 + β_1 x_1 + β_2 x_2 + β_3 x_1² + β_4 x_1 x_2 + β_5 x_2².

47 Quadratic Regression. Each component of a z vector is called a term; each column of the Z matrix is called a term column. How many terms in a quadratic regression with m inputs? 1 constant term, m linear terms, and (m+1)-choose-2 = m(m+1)/2 quadratic terms: (m+2)-choose-2 terms in total = O(m²). Note that solving β = (Z^T Z)^{-1} (Z^T y) is thus O(m⁶).
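A sketch that builds the quadratic term vector z for one datapoint and confirms the (m+2)-choose-2 term count:

    import numpy as np
    from itertools import combinations_with_replacement
    from math import comb

    def quadratic_terms(x):
        # z = (1, the m linear terms, the m(m+1)/2 quadratic terms x_i x_j with i <= j)
        feats = [1.0] + list(x)
        feats += [x[i] * x[j]
                  for i, j in combinations_with_replacement(range(len(x)), 2)]
        return np.array(feats)

    m = 4
    z = quadratic_terms(np.arange(1.0, m + 1.0))
    print(len(z), comb(m + 2, 2))   # both print 15: (m+2)-choose-2 terms for m = 4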

48 Q-th degree polynomial Regression. Same data X = [3 2; 1 1; ...], y = [7; 3; ...], but now each row of Z is z = (all products of powers of the inputs in which the sum of the powers is Q or less); then β = (Z^T Z)^{-1} (Z^T y) and y_est = β_0 + β_1 x_1 + ...

49 m inputs, degree Q: how many terms? = the number of unique terms of the form x_1^q1 x_2^q2 ... x_m^qm where Σ_{i=1}^m q_i ≤ Q = the number of unique terms of the form 1^q0 x_1^q1 x_2^q2 ... x_m^qm where Σ_{i=0}^m q_i = Q = the number of lists of non-negative integers [q_0, q_1, q_2, ..., q_m] in which Σ q_i = Q = the number of ways of placing Q red disks on a row of squares of length Q + m = (Q+m)-choose-Q. Example: q_0 = 2, q_1 = 2, q_2 = 0, q_3 = 4, q_4 = 3 for Q = 11, m = 4.
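A brute-force check of the stars-and-bars count against (Q+m)-choose-Q:

    from math import comb
    from itertools import product

    def n_terms(m, Q):
        # brute force: count exponent vectors (q_1, ..., q_m) with sum <= Q
        return sum(1 for q in product(range(Q + 1), repeat=m) if sum(q) <= Q)

    for m, Q in [(2, 3), (3, 4), (4, 2)]:
        print(n_terms(m, Q), comb(Q + m, Q))   # the two counts agree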

50 What we have seen: MLE with Gaussian noise is the same as minimizing the L2 error; other noise models will give other loss functions. MAP with a Gaussian prior adds a penalty to the L2 error, giving Ridge regression; other priors will give different penalties. One can make nonlinear relations linear by transforming the features: polynomial regression. Radial Basis Functions (RBF) will be covered later.
