Parameter Estimation. Industrial AI Lab.


Generative Model. Inputs $X$, outputs $Y$, and parameter $\omega$: $y = \omega^T x + \varepsilon$, with noise $\varepsilon \sim N(0, \sigma^2)$.

Maximum Likelihood Estimation (MLE). Estimate the parameters $\theta = (\omega, \sigma^2)$ of a generative model: given observed data and an assumed generative model structure, choose the $\theta$ that maximizes the likelihood of the data.

Maximum Likelihood Estimation (MLE). Find the parameters $\omega$ and $\sigma$ that maximize the likelihood over the observed data. MLE is perhaps the simplest (but most widely used) parameter estimation method.
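The likelihood expression itself did not survive the transcription; the standard definition, assuming i.i.d. observations $D = \{d_1, \dots, d_m\}$, is

$\mathcal{L}(\theta) = P(D \mid \theta) = \prod_{i=1}^{m} p(d_i \mid \theta), \qquad \hat{\theta}_{ML} = \arg\max_{\theta} \sum_{i=1}^{m} \log p(d_i \mid \theta).$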

Drawn from a Gaussian Distribution. You will often see the following derivation.
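The derivation on the slide is not legible; the usual version, for samples $x_1, \dots, x_m$ drawn i.i.d. from $N(\mu, \sigma^2)$, starts from the log-likelihood

$\ell(\mu, \sigma) = \log \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right) = -\frac{m}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{m}(x_i-\mu)^2.$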

Drawn from a Gaussian Distribution. To maximize, set $\partial \ell / \partial \mu = 0$ and $\partial \ell / \partial \sigma = 0$. BIG Lesson: we often compute a mean and a variance to represent data statistics, which implicitly assumes that the data set is Gaussian distributed. The good news: the sample mean is (approximately) Gaussian distributed by the central limit theorem.
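Solving the two first-order conditions gives the familiar closed forms (the slide's own equations are not legible; this is the standard result):

$\hat{\mu} = \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad \hat{\sigma}^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i - \hat{\mu})^2.$

Note that the MLE of the variance divides by $m$, not by the unbiased $m-1$.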

Numerical Example. Compute the likelihood function, then maximize it: adjust the mean and the variance of the Gaussian so as to maximize the product of the individual densities.

Numerical Example for Gaussian.
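The plots on these slides are not recoverable; the following is a minimal Python sketch of the same idea, with a hypothetical data set (not the slide's original code):

import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

x = np.array([1.8, 2.1, 2.4, 2.7, 3.0])    # hypothetical observations

def neg_log_likelihood(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)               # parameterize to keep sigma positive
    return -np.sum(norm.logpdf(x, loc=mu, scale=sigma))

result = minimize(neg_log_likelihood, x0=[0.0, 0.0])
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])

# The numerical optimum matches the closed-form MLE:
print(mu_hat, x.mean())                     # sample mean
print(sigma_hat, x.std())                   # sqrt of (1/m) * sum of squared deviations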

When Mean is Unknown.

When Variance is Unknown.

Probabilistic Machine Learning. I personally believe this is a more fundamental way of looking at machine learning: Maximum Likelihood Estimation (MLE), Maximum a Posteriori (MAP), probabilistic regression, probabilistic classification, probabilistic clustering, and probabilistic dimension reduction.

Maximum Likelihood Estimation (MLE).

Linear Regression: A Probabilistic View. A linear regression model with Gaussian (normal) errors.
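The model equation is not legible in the transcription; consistent with the generative model at the start, the standard form is

$y_i = \omega^T x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2), \qquad \text{i.e.} \quad p(y_i \mid x_i; \omega, \sigma^2) = N(y_i \mid \omega^T x_i, \sigma^2).$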

Linear Regression: A Probabilistic View. BIG Lesson: maximizing the likelihood is the same as least-squares optimization.
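The slide's derivation is not legible; the standard argument: the log-likelihood of the observations is

$\ell(\omega) = -\frac{m}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{m}(y_i - \omega^T x_i)^2,$

so for fixed $\sigma$, maximizing $\ell(\omega)$ is equivalent to minimizing $\sum_i (y_i - \omega^T x_i)^2 = \lVert y - X\omega \rVert_2^2$, which is exactly the least-squares objective.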


Maximum a Posteriori (MAP).

Data Fusion with Uncertainties (Lecture: Learning Theory, Reza Shadmehr, Johns Hopkins University; YouTube link). Two sensors, $a$ and $b$, measure the same unknown quantity $x$, giving readings $y_a$ and $y_b$. In matrix form: $y = Cx + \varepsilon$ with $y = [y_a, y_b]^T$, $C = [1, 1]^T$, and noise covariance $R$.

Data Fusion with Uncertainties. Find $x_{ML} = (C^T R^{-1} C)^{-1} C^T R^{-1} y$.

Data Fusion with Uncertainties.

Summary: Data Fusion with Less Uncertainty. BIG Lesson: two sensors are better than one sensor, since fusing them yields less uncertainty. Accuracy (uncertainty) information about each sensor also matters: when $\sigma_a^2 = \sigma_b^2$, the fused estimate $x_{ML}$ lies midway between the sensor means $\mu_a$ and $\mu_b$; when $\sigma_a^2 > \sigma_b^2$, $x_{ML}$ lies closer to $\mu_b$, the more accurate sensor.
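A minimal Python sketch of this fusion rule, with hypothetical readings and variances (with $C = [1,1]^T$ and diagonal $R$, the matrix formula reduces to an inverse-variance weighted average):

import numpy as np

# Hypothetical sensor readings and noise variances (not from the slides)
y_a, var_a = 10.2, 4.0     # noisier sensor
y_b, var_b = 9.6, 1.0      # more accurate sensor

# Matrix form: y = C x + eps, eps ~ N(0, R)
y = np.array([[y_a], [y_b]])
C = np.array([[1.0], [1.0]])
R = np.diag([var_a, var_b])

Rinv = np.linalg.inv(R)
x_ml = np.linalg.inv(C.T @ Rinv @ C) @ (C.T @ Rinv @ y)

# Equivalent inverse-variance weighted average
x_ml_scalar = (y_a / var_a + y_b / var_b) / (1 / var_a + 1 / var_b)
print(x_ml.item(), x_ml_scalar)   # both ~9.72, pulled toward the accurate sensor

# The fused variance is smaller than either sensor's variance
var_ml = 1 / (1 / var_a + 1 / var_b)
print(var_ml)                     # 0.8 < min(4.0, 1.0)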

Example of Two Rulers: 1D Examples. How the brain combines measurements from both the haptic and visual channels.

Data Fusion with 1D Example.

Data Fusion with 2D Example.

Maximum a Posteriori Estimation (MAP). Choose the $\theta$ that maximizes the posterior probability of $\theta$ (i.e., its probability in light of the observed data). The posterior probability of $\theta$ is given by Bayes' rule:

$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}$

where $P(\theta)$ is the prior probability of $\theta$ (before seeing any data), $P(D \mid \theta)$ is the likelihood, and $P(D)$ is the probability of the data (independent of $\theta$). Bayes' rule lets us update our belief about $\theta$ in light of the observed data.

Maximum a Posteriori Estimation (MAP). In practice, we maximize the log of the posterior probability. For multiple observations $D = \{d_1, d_2, \dots, d_m\}$:

$\theta_{MAP} = \arg\max_{\theta} \Big[ \sum_{i=1}^{m} \log P(d_i \mid \theta) + \log P(\theta) \Big]$

This is the same as MLE except for the extra log-prior term; MAP thus allows incorporating our prior knowledge about $\theta$ into its estimation.

MAP for the Mean of a Univariate Gaussian. Suppose $\theta$ is a random variable with prior $\theta \sim N(\mu, 1^2)$, where $\theta$ is unknown but $\mu$ and the observation noise $\sigma^2$ are known. The observations $D = \{d_1, d_2, \dots, d_m\}$, with $d_i \mid \theta \sim N(\theta, \sigma^2)$, are conditionally independent given $\theta$, so the joint probability factorizes as $P(D, \theta) = P(\theta) \prod_i P(d_i \mid \theta)$.

MAP for the Mean of a Univariate Gaussian. MAP: choose $\theta_{MAP} = \arg\max_\theta P(\theta \mid D)$.

MAP for the Mean of a Univariate Gaussian.
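The slide's algebra is not legible; under the setup above (prior $\theta \sim N(\mu, 1^2)$, observations $d_i \mid \theta \sim N(\theta, \sigma^2)$), setting the derivative of the log-posterior to zero gives the standard closed form

$\theta_{MAP} = \frac{\mu + \frac{1}{\sigma^2}\sum_{i=1}^{m} d_i}{1 + \frac{m}{\sigma^2}} = \frac{\sigma^2 \mu + m\bar{X}}{\sigma^2 + m}, \qquad \bar{X} = \frac{1}{m}\sum_{i=1}^{m} d_i.$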

MAP for the Mean of a Univariate Gaussian. ML interpretation. BIG Lesson: a prior acts like additional data. With no observations ($m = 0$), $\theta_{MAP} = \mu$, the prior mean; as $m \to \infty$, $\theta_{MAP} \to \bar{X}$, the sample mean, and the prior is washed out. Note: examples of prior knowledge include education, getting older, and a school's ranking.

MAP for the Mean of a Univariate Gaussian. Example: an experiment in class. Which object do you think is heavier? Try with eyes closed, with visual inspection, and with haptic (touch) inspection.

MAP Python Code (for the mean of a univariate Gaussian). Suppose $\theta$ is a random variable with prior $\theta \sim N(\mu, 1^2)$, where $\theta$ is unknown but $\mu$ and $\sigma^2$ are known.
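The code itself did not survive the transcription; the following is a minimal sketch under the setup above, with hypothetical values for $\mu$, $\sigma$, and the data (not the original slide code):

import numpy as np

np.random.seed(0)

# Prior: theta ~ N(mu, 1^2), with mu known (hypothetical value)
mu = 5.0

# Likelihood: d_i | theta ~ N(theta, sigma^2), sigma known (hypothetical)
sigma = 2.0
theta_true = 6.0
m = 10
d = theta_true + sigma * np.random.randn(m)

# Closed-form MAP estimate (derivative of the log-posterior set to zero)
theta_map = (mu + d.sum() / sigma**2) / (1 + m / sigma**2)

# Compare with the MLE (sample mean): MAP is pulled toward the prior mean
theta_mle = d.mean()
print(theta_map, theta_mle, mu)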


Optional: Object Tracking in Computer Vision (Lecture: Introduction to Computer Vision by Prof. Aaron Bobick at Georgia Tech).

Object Tracking in Computer Vision.

Kernel Density Estimation: a non-parametric estimate of a density (Lecture: Learning Theory, Reza Shadmehr, Johns Hopkins University).

Kernel Density Estimation.
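The slide's plots are not recoverable; a minimal Python sketch of a Gaussian kernel density estimate, with hypothetical data and bandwidth, is:

import numpy as np

np.random.seed(0)

# Hypothetical bimodal 1D samples (not from the slides)
x = np.concatenate([np.random.randn(50), 4 + np.random.randn(50)])
h = 0.5                                    # bandwidth (smoothing parameter)

def kde(t, x, h):
    # f_hat(t) = (1 / (m*h)) * sum_i K((t - x_i) / h), with Gaussian kernel K
    u = (t[:, None] - x[None, :]) / h
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return K.sum(axis=1) / (len(x) * h)

grid = np.linspace(-4.0, 8.0, 200)
density = kde(grid, x, h)
print(density.sum() * (grid[1] - grid[0]))  # Riemann sum, should be close to 1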