EIE6207: Maximum-Likelihood and Bayesian Estimation


1 EIE6207: Maximum-Likelihood and Bayesian Estimation

Man-Wai MAK
Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University
enmwmak@polyu.edu.hk

Reference: Steven M. Kay, Fundamentals of Statistical Signal Processing, Prentice Hall.

November 12, 2018

2 Overview

1. Introduction to ML Estimators
2. Biased and Unbiased ML Estimators
3. MLE of Transformed Parameters
4. Application: Range Estimation in Radar
5. Bayesian Estimators

3 What is an ML Estimator?

Maximum likelihood (ML) is the most popular estimation approach because it remains applicable in complicated estimation problems. The basic principle is simple: find the parameter θ under which the observed data x are most probable.

The ML estimator is in general neither unbiased nor optimal in the minimum-variance sense. However, it is asymptotically unbiased and asymptotically attains the Cramér-Rao bound.

4 Definition

The maximum-likelihood estimate of a scalar parameter θ is defined to be the value that maximizes p(x; θ). Viewed as a function of θ, log p(x; θ) is the log-likelihood function, and the ML estimate is

\theta_{ML} = \arg\max_\theta \log p(x; \theta)

The figure on the next page shows the likelihood function and the log-likelihood function for one possible realization of the data. The data consist of 50 points, with true A = 5. The likelihood function gives the probability of observing these particular points under different values of A.

5 Example 1

Consider a DC level in WGN:

x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1,

where w[n] ~ N(0, σ²). The likelihood and log-likelihood functions are shown in the figure (not reproduced in this transcription).

6 Example 1

We maximize log p(x; A) with respect to A:

\hat{A} = A_{ML} = \arg\max_A \log p(x; A) = \arg\max_A \left\{ -\frac{N}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-A)^2 \right\}

Setting ∂ log p(x; A)/∂A = 0, we have the ML estimator

\hat{A} = \frac{1}{N}\sum_{n=0}^{N-1} x[n]

Setting ∂ log p(x; A)/∂σ² = 0, we have the ML estimator for σ²:

\sigma^2_{ML} = \hat{\sigma}^2 = \frac{1}{N}\sum_{n=0}^{N-1}(x[n]-\hat{A})^2 \qquad (1)
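These closed-form MLEs are just the sample mean and the (biased) sample variance, so they are easy to check numerically. Below is a minimal Python/NumPy sketch (not part of the original slides; the values of N, A, and sigma2 are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, A, sigma2 = 50, 5.0, 2.0          # illustrative choices, not from the slides

# x[n] = A + w[n], with w[n] ~ N(0, sigma2)
x = A + rng.normal(0.0, np.sqrt(sigma2), size=N)

A_ml = x.mean()                      # MLE of A: the sample mean
sigma2_ml = np.mean((x - A_ml)**2)   # MLE of sigma^2: biased sample variance, Eq. (1)

print(A_ml, sigma2_ml)               # close to 5.0 and 2.0
```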

7 Example 1

Â is unbiased because

E\{\hat{A}\} = \frac{1}{N}\sum_{n=0}^{N-1} E\{x[n]\} = \frac{1}{N}\cdot NA = A

σ̂² is biased. Expanding the square,

E\{\hat{\sigma}^2\} = E\left\{\frac{1}{N}\sum_{n=0}^{N-1}(x[n]-\hat{A})^2\right\} = E\left\{\frac{1}{N}\sum_{n=0}^{N-1} x^2[n]\right\} - E\left\{\frac{2}{N}\sum_{n=0}^{N-1} x[n]\hat{A}\right\} + E\{\hat{A}^2\} \qquad (2)

To prove this, we need the identity

E\{z^2\} = \mathrm{cov}(z, z) + \mu_z^2 = \sigma_z^2 + \mu_z^2 \qquad (3)

8 Example 1

The first term in Eq. 2 is

E\left\{\frac{1}{N}\sum_{n=0}^{N-1} x^2[n]\right\} = \frac{1}{N}\sum_{n=0}^{N-1}\left(\mathrm{cov}(x[n], x[n]) + A^2\right) = \frac{1}{N}\sum_{n=0}^{N-1}(\sigma^2 + A^2) = \sigma^2 + A^2

The second term in Eq. 2 is

E\left\{\frac{2}{N}\sum_{n=0}^{N-1} x[n]\hat{A}\right\} = E\left\{\frac{2}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1} x[n]x[m]\right\} = \frac{2}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}\left(\mathrm{cov}(x[n], x[m]) + A^2\right) = \frac{2}{N^2}\left[N\sigma^2 + N^2 A^2\right] = 2\left(A^2 + \frac{\sigma^2}{N}\right)

9 Example 1

The third term in Eq. 2 is

E\{\hat{A}^2\} = E\left\{\left(\frac{1}{N}\sum_{n=0}^{N-1} x[n]\right)^2\right\} = \frac{1}{N^2}\left[\sum_{n=0}^{N-1} E\{x^2[n]\} + \sum_{n=0}^{N-1}\sum_{\substack{m=0 \\ m\neq n}}^{N-1} E\{x[n]x[m]\}\right]
= \frac{1}{N^2}\left[N(\sigma^2 + A^2) + N(N-1)A^2\right] = \frac{1}{N^2}\left[N^2 A^2 + N\sigma^2\right] = A^2 + \frac{\sigma^2}{N}

10 Example 1

Combining the three terms, we have

E\{\hat{\sigma}^2\} = \sigma^2 + A^2 - 2\left(A^2 + \frac{\sigma^2}{N}\right) + A^2 + \frac{\sigma^2}{N} = \sigma^2 - \frac{\sigma^2}{N} = \left(\frac{N-1}{N}\right)\sigma^2

To make the variance estimator unbiased, we need to use

\hat{\sigma}^2 = \frac{1}{N-1}\sum_{n=0}^{N-1}(x[n]-\hat{A})^2
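The factor (N−1)/N can be verified by Monte Carlo simulation. The following sketch (an illustrative addition with arbitrary N, A, and σ²) averages σ̂² over many realizations and compares it with ((N−1)/N)σ²:

```python
import numpy as np

rng = np.random.default_rng(1)
N, A, sigma2, trials = 10, 5.0, 2.0, 200_000   # illustrative values

x = A + rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))
A_hat = x.mean(axis=1, keepdims=True)           # sample mean per realization
sigma2_hat = np.mean((x - A_hat)**2, axis=1)    # biased MLE: divides by N
sigma2_unb = np.sum((x - A_hat)**2, axis=1) / (N - 1)  # unbiased version

print(sigma2_hat.mean(), (N - 1) / N * sigma2)  # both close to 1.8
print(sigma2_unb.mean())                        # close to 2.0
```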

11 Example 2

Consider a DC level in WGN whose noise variance equals the mean:

x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1,

where w[n] ~ N(0, A). The CRLB approach cannot be used because

\frac{\partial \log p(x; A)}{\partial A} = -\frac{N}{2A} + \frac{1}{A}\sum_{n=0}^{N-1}(x[n]-A) + \frac{1}{2A^2}\sum_{n=0}^{N-1}(x[n]-A)^2 \neq I(A)\,(g(x) - A)

for any functions I(A) and g(x). However, we may still use maximum likelihood and set ∂ log p(x; A)/∂A = 0, which gives

\hat{A}^2 + \hat{A} - \frac{1}{N}\sum_{n=0}^{N-1} x^2[n] = 0 \quad\Longrightarrow\quad \hat{A} = -\frac{1}{2} + \sqrt{\frac{1}{4} + \frac{1}{N}\sum_{n=0}^{N-1} x^2[n]} \qquad (4)

12 Example 2

This estimator is biased because

E\{\hat{A}\} = E\left\{\sqrt{\frac{1}{4} + \frac{1}{N}\sum_{n=0}^{N-1} x^2[n]}\right\} - \frac{1}{2} \neq \sqrt{\frac{1}{4} + E\left\{\frac{1}{N}\sum_{n=0}^{N-1} x^2[n]\right\}} - \frac{1}{2} = \sqrt{\frac{1}{4} + A + A^2} - \frac{1}{2} = A,

since the expectation cannot be carried inside the square root.

13 Example 2

However, if N is large enough, the bias is negligible.

14 Example 2

The ML estimator in Eq. 4 is a reasonable estimator because, as N → ∞,

\frac{1}{N}\sum_{n=0}^{N-1} x^2[n] \to E\{x^2[n]\} = A + A^2

Therefore,

\hat{A} \to -\frac{1}{2} + \sqrt{\frac{1}{4} + A + A^2} = -\frac{1}{2} + \left(A + \frac{1}{2}\right) = A \quad \text{as } N \to \infty

The MLE becomes asymptotically unbiased and asymptotically optimal as N → ∞.
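A quick simulation (again an illustrative sketch, not from the slides; A and the trial count are arbitrary) shows the bias of Eq. 4 shrinking as N grows:

```python
import numpy as np

rng = np.random.default_rng(2)
A, trials = 2.0, 20_000               # illustrative; w[n] ~ N(0, A)

for N in (5, 50, 500):
    x = A + rng.normal(0.0, np.sqrt(A), size=(trials, N))
    A_hat = -0.5 + np.sqrt(0.25 + np.mean(x**2, axis=1))   # Eq. (4)
    print(N, A_hat.mean())            # mean estimate approaches A = 2.0 as N grows
```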

15 MLE of Transformed Parameters

Often it is required to estimate a transformed parameter instead of the one the PDF depends on. For example, in the DC-level problem we might be interested in the power of the signal, A², instead of the mean A.

Given x[n] = A + w[n], n = 0, 1, ..., N−1, where w[n] ~ N(0, σ²), find the MLE of the transformed parameter

\alpha = \exp(A)

The log-likelihood function is

\log p_T(x; \alpha) = -\frac{N}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n] - \log\alpha)^2

16 MLE of Transformed Parameters

Setting the derivative of log p_T(x; α) to 0 yields

\frac{1}{\hat{\alpha}}\sum_{n=0}^{N-1}(x[n] - \log\hat{\alpha}) = 0 \quad\Longrightarrow\quad \hat{\alpha} = \exp(\bar{x}),

where α̂ > 0. Things get more complicated if the transformation is

\alpha = A^2 \quad\Longrightarrow\quad A = \pm\sqrt{\alpha}

We need to consider two PDFs:

\log p_{T_1}(x; \alpha) = \text{const} - \frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n] - \sqrt{\alpha})^2 \quad \text{for } \alpha \geq 0,\ A \geq 0

\log p_{T_2}(x; \alpha) = \text{const} - \frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n] + \sqrt{\alpha})^2 \quad \text{for } \alpha > 0,\ A < 0

17 MLE of Transformed Parameters

Then, we solve the ML estimation problem for both cases and choose the one with the higher maximum value:

\hat{\alpha} = \arg\max_\alpha \{p_{T_1}(x; \alpha),\ p_{T_2}(x; \alpha)\}

It can easily be shown that the MLE is α̂ = Â² = x̄².

18 Invariance Property of the MLE

Given a PDF p(x; θ) parameterized by θ, the MLE of the parameter α = g(θ) is α̂ = g(θ̂), where θ̂ is the MLE of θ, obtained by maximizing p(x; θ).

If g is not a one-to-one function, then α̂ maximizes the modified likelihood function

\bar{p}_T(x; \alpha) = \max_{\theta:\,\alpha = g(\theta)} p(x; \theta)
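The invariance property can be illustrated numerically: maximizing p_T1 and p_T2 over α on a grid gives the same answer as simply squaring the MLE of A. A hedged sketch (the data parameters and grid resolution are arbitrary choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)
N, A, sigma2 = 50, -1.5, 1.0          # illustrative values; note A < 0
x = A + rng.normal(0.0, np.sqrt(sigma2), size=N)

alpha_inv = x.mean()**2               # invariance: MLE of alpha = A^2 is xbar^2

# Direct grid maximization over the two branches A = +sqrt(alpha), A = -sqrt(alpha)
alphas = np.linspace(1e-6, 9.0, 10_001)          # arbitrary search grid
ll1 = -np.sum((x[None, :] - np.sqrt(alphas)[:, None])**2, axis=1) / (2 * sigma2)
ll2 = -np.sum((x[None, :] + np.sqrt(alphas)[:, None])**2, axis=1) / (2 * sigma2)
alpha_direct = alphas[np.argmax(np.maximum(ll1, ll2))]

print(alpha_inv, alpha_direct)        # agree up to the grid resolution
```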

19 Application: Range Estimation in Radar

In radar or sonar, a signal pulse is transmitted. The round-trip delay τ₀ from the transmitter to the target and back is related to the range R by τ₀ = 2R/c, where c is the speed of propagation.

In analog form, the received signal can be written as

x(t) = s(t - \tau_0) + w(t), \quad 0 \leq t \leq T,

where s(t) is the transmitted signal and w(t) is noise with variance σ².

20 Application: Range Estimation in Radar

After discretization, we have

x[n] = \begin{cases} w[n] & 0 \leq n \leq n_0 - 1 \\ s[n - n_0] + w[n] & n_0 \leq n \leq n_0 + M - 1 \\ w[n] & n_0 + M \leq n \leq N - 1 \end{cases}

where M is the length of the sampled signal and n₀ = F_s τ₀, with F_s the sampling rate, which must be at least twice the bandwidth of the signal.

21 Application: Range Estimation in Radar

Assuming that everything is Gaussian, the PDF is

p(x; n_0) = \prod_{n=0}^{n_0-1} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{x^2[n]}{2\sigma^2}\right\} \prod_{n=n_0}^{n_0+M-1} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x[n]-s[n-n_0])^2}{2\sigma^2}\right\} \prod_{n=n_0+M}^{N-1} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{x^2[n]}{2\sigma^2}\right\}

= \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left\{-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1} x^2[n]\right\} \exp\left\{-\frac{1}{2\sigma^2}\sum_{n=n_0}^{n_0+M-1}\left(-2x[n]s[n-n_0] + s^2[n-n_0]\right)\right\}

22 Application: Range Estimation in Radar

Considering only the term involving n₀, the MLE of n₀ can be found by maximizing

\exp\left\{-\frac{1}{2\sigma^2}\sum_{n=n_0}^{n_0+M-1}\left(-2x[n]s[n-n_0] + s^2[n-n_0]\right)\right\}

or, equivalently, by minimizing

\sum_{n=n_0}^{n_0+M-1}\left(-2x[n]s[n-n_0] + s^2[n-n_0]\right)

Note that \sum_{n=n_0}^{n_0+M-1} s^2[n-n_0] = \sum_{m=0}^{M-1} s^2[m], which is independent of n₀. So, the MLE of n₀ is found by maximizing

\sum_{n=n_0}^{n_0+M-1} x[n]\,s[n-n_0] = \sum_{m=0}^{M-1} x[m+n_0]\,s[m]

23 Application: Range Estimation in Radar

This means that the MLE of n₀ is found by correlating the received signal x[n] with the transmitted signal s[n] at every possible delay and choosing the delay that maximizes the correlation.

By the invariance principle, the MLE of the range is

\hat{R} = \frac{c\,\hat{\tau}_0}{2} = \frac{c\,\hat{n}_0}{2F_s}
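A minimal sketch of this correlation receiver (an illustrative addition; the pulse shape, noise level, sampling rate, and true delay are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
Fs, c = 1.0e6, 3.0e8                  # assumed: 1 MHz sampling, c = 3e8 m/s
N, M, n0_true, sigma = 1024, 64, 300, 0.5

s = np.sin(2 * np.pi * 0.1 * np.arange(M))      # arbitrary transmitted pulse
x = rng.normal(0.0, sigma, size=N)              # noise everywhere
x[n0_true:n0_true + M] += s                     # x[n] = s[n - n0] + w[n]

# Correlate x with s at every candidate delay n0 and pick the maximum
corr = np.array([x[n0:n0 + M] @ s for n0 in range(N - M + 1)])
n0_hat = int(np.argmax(corr))

R_hat = c * n0_hat / (2 * Fs)                   # invariance: R = c*n0 / (2*Fs)
print(n0_hat, R_hat)                            # n0_hat ~ 300, R_hat ~ 45 km
```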

24 Bayesian Estimators

Bayesian estimators differ from classical estimators in that they treat the parameters as random variables instead of unknown constants. The parameters therefore have a PDF of their own, which must be taken into account when seeking an estimator.

The PDF of the parameters can be used to incorporate any prior knowledge we may have about their values.

25 Bayesian Estimators

For example, we might know that the normalized frequency f₀ of an observed sinusoid cannot be greater than 0.1. This is ensured by choosing

p(f_0) = \begin{cases} 10 & 0 \leq f_0 \leq 0.1 \\ 0 & \text{otherwise} \end{cases}

as the prior PDF in the Bayesian framework. Differentiable PDFs are usually easier to work with, so we could approximate the uniform PDF with, e.g., a Rayleigh PDF.

26 Prior and Posterior Estimates

The Bayesian approach can be applied to small data records, and the estimate can be improved sequentially as new data arrive.

For example, consider tossing a coin and estimating the probability of a head, μ. The maximum-likelihood estimate is

\hat{\mu} = \frac{\#\text{heads}}{\#\text{tosses}}

If the number of tosses is 3 and 3 heads (no tails) are observed, then μ_ML = 1. The Bayesian approach can circumvent this problem, because the prior regularizes the likelihood and avoids overfitting to the small amount of data.

27 Prior and Posterior Estimates

[Figure: likelihood, prior, and posterior after observing 3 heads in a row]

28 Prior and Posterior Estimates

Likelihood function:

p(x|\mu) = \mu^{\#\text{heads}} (1-\mu)^{\#\text{tails}}

If x = {H, H, H}, then max_μ p(x|μ) = 1 and argmax_μ p(x|μ) = 1.

The prior p(μ) is selected to reflect our belief that we have a fair coin. The posterior density can be obtained from Bayes' formula:

p(\mu|x) = \frac{p(x|\mu)\,p(\mu)}{p(x)} \propto p(x|\mu)\,p(\mu)

The Bayesian approach is to select the maximum of the posterior (maximum a posteriori, MAP):

\hat{\mu} = \arg\max_\mu p(\mu|x) = \arg\max_\mu p(x|\mu)\,p(\mu)
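With a Beta(a, b) prior, the posterior is Beta(#heads + a, #tails + b), and the MAP estimate has the closed form (#heads + a − 1)/(#tosses + a + b − 2). A sketch with an assumed Beta(5, 5) prior (the slides do not specify the prior, so this choice is illustrative):

```python
heads, tails = 3, 0          # observed data: H, H, H
a, b = 5.0, 5.0              # assumed Beta prior centred at mu = 0.5 (fair coin)

mu_ml = heads / (heads + tails)                          # = 1.0, overfits 3 tosses
mu_map = (heads + a - 1) / (heads + tails + a + b - 2)   # = 7/11 ~ 0.64

print(mu_ml, mu_map)
```

The prior pulls the estimate from the overconfident μ_ML = 1 back toward 0.5, exactly the regularization effect described above.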

29 Average Cost

Bayesian estimators can be obtained by minimizing the average cost:

\hat{\theta} = \arg\min_{\hat{\theta}} \int_\theta \int_x C(\theta - \hat{\theta})\, p(x, \theta)\, dx\, d\theta
= \arg\min_{\hat{\theta}} \int_\theta \int_x C(\theta - \hat{\theta})\, p(\theta|x)\, p(x)\, dx\, d\theta
= \arg\min_{\hat{\theta}} \int_x \left(\int_\theta C(\theta - \hat{\theta})\, p(\theta|x)\, d\theta\right) p(x)\, dx
= \arg\min_{\hat{\theta}} \int_\theta C(\theta - \hat{\theta})\, p(\theta|x)\, d\theta

The last step holds because p(x) ≥ 0, so the average cost is minimized by minimizing the inner integral for each x.

30 Bayesian MMSE Estimator

If C(z) = z², we have the Bayesian minimum mean-square error (MMSE) estimator:

\hat{\theta}_{mmse} = \arg\min_{\hat{\theta}} \int_\theta (\theta - \hat{\theta})^2\, p(\theta|x)\, d\theta

Differentiating the integral with respect to θ̂ and setting the result to 0, we obtain

\int -2(\theta - \hat{\theta})\, p(\theta|x)\, d\theta = 0
\;\Longrightarrow\; \hat{\theta} \int p(\theta|x)\, d\theta = \int \theta\, p(\theta|x)\, d\theta
\;\Longrightarrow\; \hat{\theta} = \int \theta\, p(\theta|x)\, d\theta

31 Bayesian MMSE Estimator

Therefore, the Bayesian MMSE estimator is

\hat{\theta}_{mmse} = \int \theta\, p(\theta|x)\, d\theta = E_{\theta \sim p(\theta|x)}\{\theta\,|\,x\},

which is the mean of the posterior PDF p(θ|x).
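For the coin example, the posterior Beta(#heads + a, #tails + b) has mean (#heads + a)/(#tosses + a + b), so the MMSE estimate is available in closed form. A numerical-integration check (same assumed Beta(5, 5) prior as before; the grid size is arbitrary):

```python
import numpy as np

heads, tails, a, b = 3, 0, 5.0, 5.0       # same assumed prior as before

mu = np.linspace(0.0, 1.0, 100_001)
post = mu**(heads + a - 1) * (1.0 - mu)**(tails + b - 1)  # unnormalized Beta posterior
dmu = mu[1] - mu[0]
post /= post.sum() * dmu                  # normalize numerically

mu_mmse = (mu * post).sum() * dmu         # posterior mean E{mu | x}
print(mu_mmse, (heads + a) / (heads + tails + a + b))     # both ~ 8/13 ~ 0.615
```

Note that the MMSE estimate (posterior mean, ~0.615) differs slightly from the MAP estimate (posterior mode, ~0.64) because the Beta posterior is skewed.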
