11 - The linear model

Size: px
Start display at page:

Download "11 - The linear model"

Transcription

1 11-1 The linear model S. Lall, Stanford The linear model The linear model The joint pdf and covariance Example: uniform pdfs The importance of the prior Linear measurements with Gaussian noise Example: Gaussian noise The signal-to-noise ratio Scalar systems and the SNR Example: small and large noise Posterior covariance Example: navigation Alternative formulae Weighted least squares

2 11-2 The linear model S. Lall, Stanford The linear model A very important class of estimation problems is the linear model y = Ax + w x and w are independent We have induced pdfs p x for x and p w for w The matrix A is m n We measure y = y meas and would like to estimate x

3 11-3 The linear model S. Lall, Stanford The mean Let µ x = E x and µ w = E w Then E y = Aµ x + µ w Call this µ y

4 11-4 The linear model S. Lall, Stanford The linear map Since y = Ax + w, we have [ ] x y = [ ] I 0 x A I][ w We will measure y = y meas and estimate x To do this, we would like the conditional pdf of x y = y meas For this, we need the joint pdf of x and y

5 11-5 The linear model S. Lall, Stanford The joint pdf The joint pdf p of x and y is p(x, y) = p x (x)p w (y Ax) because the joint pdf of x, w is p 1 ([ x w ]) = p x (x)p w (w) and we know [ ] x y = [ ] I 0 x A I][ w z = Hu implies p z (a) = det H 1 p u (H 1 (u)), so ([ ]) I 0 x p(x, y) = p 1 A I][ y

6 11-6 The linear model S. Lall, Stanford The covariance Let Σ w = cov(w) and Σ x = cov(x). We have [ x cov y] = [ Σx Σ x A T AΣ x AΣ x A T + Σ w ] Call this [ ] [ Σx Σ xy x. Above holds because cov Σ yx Σ y y] = [ ][ ] [ ] T I 0 Σx 0 I 0 A I 0 Σ w A I Then cov(y) = AΣ x A T + Σ w AΣ x A T is signal covariance Σ w is noise covariance

7 11-7 The linear model S. Lall, Stanford Example: uniform pdfs Suppose x U[ 2, 2] and w U[ 1, 1] and we measure y = x + w. The joint pdf and MMSE estimator are Notice the estimator is not x est = y meas, because of the prior information that x [ 2, 2].

8 11-8 The linear model S. Lall, Stanford The importance of the prior x U[ 2, 2] as before w U[ 0.3, 0.3]; signal x is large relative to the noise w The joint pdf and MMSE estimator are The estimator is almost x est = y meas

9 11-9 The linear model S. Lall, Stanford The importance of the prior x U[ 2, 2] as before w U[ 10, 10]; signal x is small relative to the noise w 10 5 The joint pdf and MMSE estimator are shown. The estimator mostly ignores the measurement 0 The estimate is almost the prior mean E x =

10 11-10 The linear model S. Lall, Stanford Linear measurements with Gaussian noise We have the linear model y = Ax + w x N(0, Σ x ) and w N(0, Σ w ) are independent So [ x y] is Gaussian, with mean and covariance [ x E y] = [ 0 0] [ x cov y] = [ Σx Σ x A T AΣ x AΣ x A T + Σ w ]

11 11-11 The linear model S. Lall, Stanford Linear measurements with Gaussian noise The MMSE estimate of x given y = y meas is ˆx mmse = Σ x A T (AΣ x A T + Σ w ) 1 y meas because we know ˆx mmse = Σ xy Σ 1 y y meas The matrix L = Σ x A T (AΣ x A T + Σ w ) 1 is called the estimator gain

12 11-12 The linear model S. Lall, Stanford Example: linear measurements with Gaussian noise Suppose y = 2x + w, with prior covariance cov(x) = 1 noise covariance cov(w) = y=ax x=ly the estimator is x mmse = 2y meas The MMSE estimator gives a smaller answer than just inverting A, x mmse A 1 y meas since we have prior information about x

13 11-13 The linear model S. Lall, Stanford Non-zero means Suppose x N(µ x, Σ x ) and w N(µ w, Σ w ). The MMSE estimate of x given y = y meas is ˆx mmse = µ x + Σ x A T (AΣ x A T + Σ w ) 1 (y meas Aµ x µ w )

14 11-14 The linear model S. Lall, Stanford The signal to noise ratio Suppose where x, y and w are scalar, and y = Ax + w. The signal-to-noise ratio is s = A2 Σ x Σw Commonly used for scalar w, x, y; no use in vector case In terms of s, the MMSE estimate is x mmse = µ x + AΣ x A 2 Σ x + Σ w (y meas Aµ x ) = s 2µ x + s2 1 + s 2A 1 y meas

15 11-15 The linear model S. Lall, Stanford Scalar systems and the SNR The MMSE estimate is x mmse = s 2µ x + s2 1 + s 2A 1 y meas let θ = s 2, then x mmse = θµ x + (1 θ)a 1 y a convex linear combination of the prior mean and the least-squares estimate when s is small, x mmse µ x, the prior mean when s is large, x mmse A 1 y, the least-squares estimate of y

16 11-16 The linear model S. Lall, Stanford Example: small noise Suppose y = 2x + w, with prior covariance cov(x) = 1 noise covariance cov(w) = 0.4; signal is large compared to noise y=ax x=ly Hence SNR s = A2 Σ x Σw Estimate is 2 i.e., close to y meas /2 x mmse = s2 y 1 + s 2A 1 meas 0.9A 1 y meas

17 11-17 The linear model S. Lall, Stanford Example: large noise Suppose y = 2x + w, with prior covariance cov(x) = 1 noise covariance cov(w) = 20; signal is small compared to noise y=ax x=ly Hence SNR s = A2 Σ x Σw Estimate is 2 x mmse = s2 y 1 + s 2A 1 meas 0.17A 1 y meas i.e., closer to 0 for all y meas

18 11-18 The linear model S. Lall, Stanford The posterior covariance The posterior covariance of x given y = y meas is cov ( x y = ymeas ) = Σx Σ x A T (AΣ x A T + Σ w ) 1 AΣ x above follows because cov ( x y = ymeas ) = Σx Σ xy Σ 1 y Σ yx We can use this to compute the MSE since E ( x ˆx mmse 2 y = ymeas ) = tracecov ( x y = ymeas )

19 11-19 The linear model S. Lall, Stanford The posterior covariance and SNR For scalar problems, the posterior covariance of x given y = y meas is cov ( x y = ymeas ) = Σ x 1 + s 2 The uncertainty (covariance) in x is reduced by the factor 1 by measurement 1 + s2

20 11-20 The linear model S. Lall, Stanford Example: navigation [ p x = our location, we measure distances r q] i to m beacons at points (u i, v i ) (u 3 ;v 3 ) r 2 (u 2 ;v 2 ) (u 4 ;v 4 ) r 3 r 1 (p;q) r 4 (u 1 ;v 1 ) assume p, q are small compared to u i, v i. then, approximately y = Ax A R m 2, ith row of A is the transpose of unit vector in the direction of beacon i u v1 2 y = r 1. measured vector of distances u 2 m + vm 2 r m

21 11-21 The linear model S. Lall, Stanford Example: navigation 5 here A R 3 2 with A = b 1 b b 2 b 1 b 3 and y = Ax. Each b i is a unit vector. 2 b 3 ([ 4 Prior information is x N, 4] [ ]) Prior mean actual location y is measured; y i is range measurement in the direction b i with noise w added [ ] [ ] [ ] beacons at,, figure shows prior 90% confidence ellipsoid

22 11-22 The linear model S. Lall, Stanford Example: posterior confidence ellipsoids Posterior confidence ellipsoids for two different possible noise covariances. Σ w = Σ w = b 2 4 b 2 3 b 1 3 b 1 2 b 3 2 b 3 1 actual location MMSE estimate actual location MMSE estimate

23 11-23 The linear model S. Lall, Stanford Alternative formula There is another way to write the posterior covariance: cov ( x y = y meas ) = ( Σ 1 x + A T Σ 1 w A ) 1 follows from the Sherman-Morrison-Woodbury formula (A BD 1 C) 1 = A 1 + A 1 B(D CA 1 B) 1 CA 1 This is very useful when we have fewer unknowns than measurements; i.e., Σ x is smaller that AΣ x A T

24 11-24 The linear model S. Lall, Stanford Alternative formula There is also an alternative formula for the estimator gain L = (Σ 1 x + A T Σ 1 w A) 1 A T Σ 1 w Because L = Σ x A T (AΣ x A T + Σ w ) 1 = Σ x A T (Σ 1 w AΣ x A T + I) 1 Σ 1 w = (Σ x A T Σ 1 w A + I) 1 Σ x A T Σ 1 w by push-through identity = (A T Σ 1 w A + Σ 1 x ) 1 A T Σ 1 w

25 11-25 The linear model S. Lall, Stanford Comparison with least-squares The least-squares approach minimizes y Ax 2 = m ( yi a T i x ) 2 i=1 where A = [ a 1 a 2... a m ] T Suppose instead we minimize m ( w i yi a T i x ) 2 i=1 where w 1, w 2,..., w m are positive weights

26 11-26 The linear model S. Lall, Stanford Weighted norms More generally, let s look at weighted norms 3 2 contours of the 2-norm x 2 = x T x contours of the weighted-norm x W = x T Wx = W 1 2x 2 0 where W = [ ]

27 11-27 The linear model S. Lall, Stanford Weighted least squares the weighted least-squares problem; given y meas R m, minimize y meas Ax W assume A R m n, skinny, full rank, and W R m m and W > 0 then (by differentiating) the optimum x is x wls = (A T WA) 1 A T Wy meas

28 11-28 The linear model S. Lall, Stanford Weighted least squares R m y meas range(a) Ax est if there is no noise, y lies in range A the weighted least-squares estimate x wls minimizes y meas Ax W Ax wls is the closest (in weighted-norm) point in rangea to y meas

29 11-29 The linear model S. Lall, Stanford MMSE and weighted least squares suppose we choose weight W = Σ 1 w ; then WLS solution is x wls = (A T Σ 1 w A) 1 A T Σ 1 w y meas compare with MMSE estimate when x N(0, Σ x ) and w N(0, Σ w ) x mmse = (Σ 1 x + A T Σ 1 w A) 1 A T Σ 1 w y meas as the prior covariance Σ x, the MMSE estimate tends to the WLS estimate if Σ w = I then MMSE tends to usual least-squares solution as Σ x the weighted norm heavily penalizes the residual y Ax in low-noise directions

8 - Continuous random vectors

8 - Continuous random vectors 8-1 Continuous random vectors S. Lall, Stanford 2011.01.25.01 8 - Continuous random vectors Mean-square deviation Mean-variance decomposition Gaussian random vectors The Gamma function The χ 2 distribution

More information

14 - Gaussian Stochastic Processes

14 - Gaussian Stochastic Processes 14-1 Gaussian Stochastic Processes S. Lall, Stanford 211.2.24.1 14 - Gaussian Stochastic Processes Linear systems driven by IID noise Evolution of mean and covariance Example: mass-spring system Steady-state

More information

Solutions to Homework Set #6 (Prepared by Lele Wang)

Solutions to Homework Set #6 (Prepared by Lele Wang) Solutions to Homework Set #6 (Prepared by Lele Wang) Gaussian random vector Given a Gaussian random vector X N (µ, Σ), where µ ( 5 ) T and 0 Σ 4 0 0 0 9 (a) Find the pdfs of i X, ii X + X 3, iii X + X

More information

Lecture 5 Least-squares

Lecture 5 Least-squares EE263 Autumn 2008-09 Stephen Boyd Lecture 5 Least-squares least-squares (approximate) solution of overdetermined equations projection and orthogonality principle least-squares estimation BLUE property

More information

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ). .8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics

More information

4 Derivations of the Discrete-Time Kalman Filter

4 Derivations of the Discrete-Time Kalman Filter Technion Israel Institute of Technology, Department of Electrical Engineering Estimation and Identification in Dynamical Systems (048825) Lecture Notes, Fall 2009, Prof N Shimkin 4 Derivations of the Discrete-Time

More information

ECE 541 Stochastic Signals and Systems Problem Set 9 Solutions

ECE 541 Stochastic Signals and Systems Problem Set 9 Solutions ECE 541 Stochastic Signals and Systems Problem Set 9 Solutions Problem Solutions : Yates and Goodman, 9.5.3 9.1.4 9.2.2 9.2.6 9.3.2 9.4.2 9.4.6 9.4.7 and Problem 9.1.4 Solution The joint PDF of X and Y

More information

A Probability Review

A Probability Review A Probability Review Outline: A probability review Shorthand notation: RV stands for random variable EE 527, Detection and Estimation Theory, # 0b 1 A Probability Review Reading: Go over handouts 2 5 in

More information

Preliminaries. Copyright c 2018 Dan Nettleton (Iowa State University) Statistics / 38

Preliminaries. Copyright c 2018 Dan Nettleton (Iowa State University) Statistics / 38 Preliminaries Copyright c 2018 Dan Nettleton (Iowa State University) Statistics 510 1 / 38 Notation for Scalars, Vectors, and Matrices Lowercase letters = scalars: x, c, σ. Boldface, lowercase letters

More information

y(x) = x w + ε(x), (1)

y(x) = x w + ε(x), (1) Linear regression We are ready to consider our first machine-learning problem: linear regression. Suppose that e are interested in the values of a function y(x): R d R, here x is a d-dimensional vector-valued

More information

Introduction to Probability and Stocastic Processes - Part I

Introduction to Probability and Stocastic Processes - Part I Introduction to Probability and Stocastic Processes - Part I Lecture 2 Henrik Vie Christensen vie@control.auc.dk Department of Control Engineering Institute of Electronic Systems Aalborg University Denmark

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

TSRT14: Sensor Fusion Lecture 8

TSRT14: Sensor Fusion Lecture 8 TSRT14: Sensor Fusion Lecture 8 Particle filter theory Marginalized particle filter Gustaf Hendeby gustaf.hendeby@liu.se TSRT14 Lecture 8 Gustaf Hendeby Spring 2018 1 / 25 Le 8: particle filter theory,

More information

ME 597: AUTONOMOUS MOBILE ROBOTICS SECTION 2 PROBABILITY. Prof. Steven Waslander

ME 597: AUTONOMOUS MOBILE ROBOTICS SECTION 2 PROBABILITY. Prof. Steven Waslander ME 597: AUTONOMOUS MOBILE ROBOTICS SECTION 2 Prof. Steven Waslander p(a): Probability that A is true 0 pa ( ) 1 p( True) 1, p( False) 0 p( A B) p( A) p( B) p( A B) A A B B 2 Discrete Random Variable X

More information

Multivariate Gaussians. Sargur Srihari

Multivariate Gaussians. Sargur Srihari Multivariate Gaussians Sargur srihari@cedar.buffalo.edu 1 Topics 1. Multivariate Gaussian: Basic Parameterization 2. Covariance and Information Form 3. Operations on Gaussians 4. Independencies in Gaussians

More information

Lecture 8: Signal Detection and Noise Assumption

Lecture 8: Signal Detection and Noise Assumption ECE 830 Fall 0 Statistical Signal Processing instructor: R. Nowak Lecture 8: Signal Detection and Noise Assumption Signal Detection : X = W H : X = S + W where W N(0, σ I n n and S = [s, s,..., s n ] T

More information

UCSD ECE153 Handout #34 Prof. Young-Han Kim Tuesday, May 27, Solutions to Homework Set #6 (Prepared by TA Fatemeh Arbabjolfaei)

UCSD ECE153 Handout #34 Prof. Young-Han Kim Tuesday, May 27, Solutions to Homework Set #6 (Prepared by TA Fatemeh Arbabjolfaei) UCSD ECE53 Handout #34 Prof Young-Han Kim Tuesday, May 7, 04 Solutions to Homework Set #6 (Prepared by TA Fatemeh Arbabjolfaei) Linear estimator Consider a channel with the observation Y XZ, where the

More information

(Multivariate) Gaussian (Normal) Probability Densities

(Multivariate) Gaussian (Normal) Probability Densities (Multivariate) Gaussian (Normal) Probability Densities Carl Edward Rasmussen, José Miguel Hernández-Lobato & Richard Turner April 20th, 2018 Rasmussen, Hernàndez-Lobato & Turner Gaussian Densities April

More information

Recursive Estimation

Recursive Estimation Recursive Estimation Raffaello D Andrea Spring 08 Problem Set 3: Extracting Estimates from Probability Distributions Last updated: April 9, 08 Notes: Notation: Unless otherwise noted, x, y, and z denote

More information

Final Examination Solutions (Total: 100 points)

Final Examination Solutions (Total: 100 points) Final Examination Solutions (Total: points) There are 4 problems, each problem with multiple parts, each worth 5 points. Make sure you answer all questions. Your answer should be as clear and readable

More information

Lecture Notes 4 Vector Detection and Estimation. Vector Detection Reconstruction Problem Detection for Vector AGN Channel

Lecture Notes 4 Vector Detection and Estimation. Vector Detection Reconstruction Problem Detection for Vector AGN Channel Lecture Notes 4 Vector Detection and Estimation Vector Detection Reconstruction Problem Detection for Vector AGN Channel Vector Linear Estimation Linear Innovation Sequence Kalman Filter EE 278B: Random

More information

Solutions to Homework Set #5 (Prepared by Lele Wang) MSE = E [ (sgn(x) g(y)) 2],, where f X (x) = 1 2 2π e. e (x y)2 2 dx 2π

Solutions to Homework Set #5 (Prepared by Lele Wang) MSE = E [ (sgn(x) g(y)) 2],, where f X (x) = 1 2 2π e. e (x y)2 2 dx 2π Solutions to Homework Set #5 (Prepared by Lele Wang). Neural net. Let Y X + Z, where the signal X U[,] and noise Z N(,) are independent. (a) Find the function g(y) that minimizes MSE E [ (sgn(x) g(y))

More information

UCSD ECE153 Handout #30 Prof. Young-Han Kim Thursday, May 15, Homework Set #6 Due: Thursday, May 22, 2011

UCSD ECE153 Handout #30 Prof. Young-Han Kim Thursday, May 15, Homework Set #6 Due: Thursday, May 22, 2011 UCSD ECE153 Handout #30 Prof. Young-Han Kim Thursday, May 15, 2014 Homework Set #6 Due: Thursday, May 22, 2011 1. Linear estimator. Consider a channel with the observation Y = XZ, where the signal X and

More information

Ch. 12 Linear Bayesian Estimators

Ch. 12 Linear Bayesian Estimators Ch. 1 Linear Bayesian Estimators 1 In chapter 11 we saw: the MMSE estimator takes a simple form when and are jointly Gaussian it is linear and used only the 1 st and nd order moments (means and covariances).

More information

Appendix A : Introduction to Probability and stochastic processes

Appendix A : Introduction to Probability and stochastic processes A-1 Mathematical methods in communication July 5th, 2009 Appendix A : Introduction to Probability and stochastic processes Lecturer: Haim Permuter Scribe: Shai Shapira and Uri Livnat The probability of

More information

Lecture Note 1: Probability Theory and Statistics

Lecture Note 1: Probability Theory and Statistics Univ. of Michigan - NAME 568/EECS 568/ROB 530 Winter 2018 Lecture Note 1: Probability Theory and Statistics Lecturer: Maani Ghaffari Jadidi Date: April 6, 2018 For this and all future notes, if you would

More information

Introduction to Computational Finance and Financial Econometrics Matrix Algebra Review

Introduction to Computational Finance and Financial Econometrics Matrix Algebra Review You can t see this text! Introduction to Computational Finance and Financial Econometrics Matrix Algebra Review Eric Zivot Spring 2015 Eric Zivot (Copyright 2015) Matrix Algebra Review 1 / 54 Outline 1

More information

Lecture 4: Least Squares (LS) Estimation

Lecture 4: Least Squares (LS) Estimation ME 233, UC Berkeley, Spring 2014 Xu Chen Lecture 4: Least Squares (LS) Estimation Background and general solution Solution in the Gaussian case Properties Example Big picture general least squares estimation:

More information

Econ 371 Problem Set #1 Answer Sheet

Econ 371 Problem Set #1 Answer Sheet Econ 371 Problem Set #1 Answer Sheet 2.1 In this question, you are asked to consider the random variable Y, which denotes the number of heads that occur when two coins are tossed. a. The first part of

More information

Bayes Decision Theory

Bayes Decision Theory Bayes Decision Theory Minimum-Error-Rate Classification Classifiers, Discriminant Functions and Decision Surfaces The Normal Density 0 Minimum-Error-Rate Classification Actions are decisions on classes

More information

11. Regression and Least Squares

11. Regression and Least Squares 11. Regression and Least Squares Prof. Tesler Math 186 Winter 2016 Prof. Tesler Ch. 11: Linear Regression Math 186 / Winter 2016 1 / 23 Regression Given n points ( 1, 1 ), ( 2, 2 ),..., we want to determine

More information

Problem Solving. Correlation and Covariance. Yi Lu. Problem Solving. Yi Lu ECE 313 2/51

Problem Solving. Correlation and Covariance. Yi Lu. Problem Solving. Yi Lu ECE 313 2/51 Yi Lu Correlation and Covariance Yi Lu ECE 313 2/51 Definition Let X and Y be random variables with finite second moments. the correlation: E[XY ] Yi Lu ECE 313 3/51 Definition Let X and Y be random variables

More information

Review (Probability & Linear Algebra)

Review (Probability & Linear Algebra) Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint

More information

Introduction to Computational Finance and Financial Econometrics Probability Review - Part 2

Introduction to Computational Finance and Financial Econometrics Probability Review - Part 2 You can t see this text! Introduction to Computational Finance and Financial Econometrics Probability Review - Part 2 Eric Zivot Spring 2015 Eric Zivot (Copyright 2015) Probability Review - Part 2 1 /

More information

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester Physics 403 Parameter Estimation, Correlations, and Error Bars Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Best Estimates and Reliability

More information

P (x). all other X j =x j. If X is a continuous random vector (see p.172), then the marginal distributions of X i are: f(x)dx 1 dx n

P (x). all other X j =x j. If X is a continuous random vector (see p.172), then the marginal distributions of X i are: f(x)dx 1 dx n JOINT DENSITIES - RANDOM VECTORS - REVIEW Joint densities describe probability distributions of a random vector X: an n-dimensional vector of random variables, ie, X = (X 1,, X n ), where all X is are

More information

Nonlinear Programming Models

Nonlinear Programming Models Nonlinear Programming Models Fabio Schoen 2008 http://gol.dsi.unifi.it/users/schoen Nonlinear Programming Models p. Introduction Nonlinear Programming Models p. NLP problems minf(x) x S R n Standard form:

More information

DS-GA 1002 Lecture notes 10 November 23, Linear models

DS-GA 1002 Lecture notes 10 November 23, Linear models DS-GA 2 Lecture notes November 23, 2 Linear functions Linear models A linear model encodes the assumption that two quantities are linearly related. Mathematically, this is characterized using linear functions.

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

Derivation of the Kalman Filter

Derivation of the Kalman Filter Derivation of the Kalman Filter Kai Borre Danish GPS Center, Denmark Block Matrix Identities The key formulas give the inverse of a 2 by 2 block matrix, assuming T is invertible: T U 1 L M. (1) V W N P

More information

Fast Direct Methods for Gaussian Processes

Fast Direct Methods for Gaussian Processes Fast Direct Methods for Gaussian Processes Mike O Neil Departments of Mathematics New York University oneil@cims.nyu.edu December 12, 2015 1 Collaborators This is joint work with: Siva Ambikasaran Dan

More information

CS 591, Lecture 2 Data Analytics: Theory and Applications Boston University

CS 591, Lecture 2 Data Analytics: Theory and Applications Boston University CS 591, Lecture 2 Data Analytics: Theory and Applications Boston University Charalampos E. Tsourakakis January 25rd, 2017 Probability Theory The theory of probability is a system for making better guesses.

More information

ECE 275A Homework 6 Solutions

ECE 275A Homework 6 Solutions ECE 275A Homework 6 Solutions. The notation used in the solutions for the concentration (hyper) ellipsoid problems is defined in the lecture supplement on concentration ellipsoids. Note that θ T Σ θ =

More information

Joint Distributions. (a) Scalar multiplication: k = c d. (b) Product of two matrices: c d. (c) The transpose of a matrix:

Joint Distributions. (a) Scalar multiplication: k = c d. (b) Product of two matrices: c d. (c) The transpose of a matrix: Joint Distributions Joint Distributions A bivariate normal distribution generalizes the concept of normal distribution to bivariate random variables It requires a matrix formulation of quadratic forms,

More information

[POLS 8500] Review of Linear Algebra, Probability and Information Theory

[POLS 8500] Review of Linear Algebra, Probability and Information Theory [POLS 8500] Review of Linear Algebra, Probability and Information Theory Professor Jason Anastasopoulos ljanastas@uga.edu January 12, 2017 For today... Basic linear algebra. Basic probability. Programming

More information

Introduction to Probabilistic Graphical Models: Exercises

Introduction to Probabilistic Graphical Models: Exercises Introduction to Probabilistic Graphical Models: Exercises Cédric Archambeau Xerox Research Centre Europe cedric.archambeau@xrce.xerox.com Pascal Bootcamp Marseille, France, July 2010 Exercise 1: basics

More information

Lecture Note 12: Kalman Filter

Lecture Note 12: Kalman Filter ECE 645: Estimation Theory Spring 2015 Instructor: Prof. Stanley H. Chan Lecture Note 12: Kalman Filter LaTeX prepared by Stylianos Chatzidakis) May 4, 2015 This lecture note is based on ECE 645Spring

More information

ECE534, Spring 2018: Solutions for Problem Set #3

ECE534, Spring 2018: Solutions for Problem Set #3 ECE534, Spring 08: Solutions for Problem Set #3 Jointly Gaussian Random Variables and MMSE Estimation Suppose that X, Y are jointly Gaussian random variables with µ X = µ Y = 0 and σ X = σ Y = Let their

More information

Consider the joint probability, P(x,y), shown as the contours in the figure above. P(x) is given by the integral of P(x,y) over all values of y.

Consider the joint probability, P(x,y), shown as the contours in the figure above. P(x) is given by the integral of P(x,y) over all values of y. ATMO/OPTI 656b Spring 009 Bayesian Retrievals Note: This follows the discussion in Chapter of Rogers (000) As we have seen, the problem with the nadir viewing emission measurements is they do not contain

More information

ECE531 Lecture 11: Dynamic Parameter Estimation: Kalman-Bucy Filter

ECE531 Lecture 11: Dynamic Parameter Estimation: Kalman-Bucy Filter ECE531 Lecture 11: Dynamic Parameter Estimation: Kalman-Bucy Filter D. Richard Brown III Worcester Polytechnic Institute 09-Apr-2009 Worcester Polytechnic Institute D. Richard Brown III 09-Apr-2009 1 /

More information

ECE531 Screencast 5.5: Bayesian Estimation for the Linear Gaussian Model

ECE531 Screencast 5.5: Bayesian Estimation for the Linear Gaussian Model ECE53 Screencast 5.5: Bayesian Estimation for the Linear Gaussian Model D. Richard Brown III Worcester Polytechnic Institute Worcester Polytechnic Institute D. Richard Brown III / 8 Bayesian Estimation

More information

Recall that if X 1,...,X n are random variables with finite expectations, then. The X i can be continuous or discrete or of any other type.

Recall that if X 1,...,X n are random variables with finite expectations, then. The X i can be continuous or discrete or of any other type. Expectations of Sums of Random Variables STAT/MTHE 353: 4 - More on Expectations and Variances T. Linder Queen s University Winter 017 Recall that if X 1,...,X n are random variables with finite expectations,

More information

Physics 403. Segev BenZvi. Propagation of Uncertainties. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Propagation of Uncertainties. Department of Physics and Astronomy University of Rochester Physics 403 Propagation of Uncertainties Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Maximum Likelihood and Minimum Least Squares Uncertainty Intervals

More information

Lecture 6: Gaussian Channels. Copyright G. Caire (Sample Lectures) 157

Lecture 6: Gaussian Channels. Copyright G. Caire (Sample Lectures) 157 Lecture 6: Gaussian Channels Copyright G. Caire (Sample Lectures) 157 Differential entropy (1) Definition 18. The (joint) differential entropy of a continuous random vector X n p X n(x) over R is: Z h(x

More information

Machine Learning and Pattern Recognition, Tutorial Sheet Number 3 Answers

Machine Learning and Pattern Recognition, Tutorial Sheet Number 3 Answers Machine Learning and Pattern Recognition, Tutorial Sheet Number 3 Answers School of Informatics, University of Edinburgh, Instructor: Chris Williams 1. Given a dataset {(x n, y n ), n 1,..., N}, where

More information

01 Probability Theory and Statistics Review

01 Probability Theory and Statistics Review NAVARCH/EECS 568, ROB 530 - Winter 2018 01 Probability Theory and Statistics Review Maani Ghaffari January 08, 2018 Last Time: Bayes Filters Given: Stream of observations z 1:t and action data u 1:t Sensor/measurement

More information

Strong Lens Modeling (II): Statistical Methods

Strong Lens Modeling (II): Statistical Methods Strong Lens Modeling (II): Statistical Methods Chuck Keeton Rutgers, the State University of New Jersey Probability theory multiple random variables, a and b joint distribution p(a, b) conditional distribution

More information

Chapter 5 continued. Chapter 5 sections

Chapter 5 continued. Chapter 5 sections Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

Minimum Error Rate Classification

Minimum Error Rate Classification Minimum Error Rate Classification Dr. K.Vijayarekha Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur-613 401 Table of Contents 1.Minimum Error Rate Classification...

More information

Vector Derivatives and the Gradient

Vector Derivatives and the Gradient ECE 275AB Lecture 10 Fall 2008 V1.1 c K. Kreutz-Delgado, UC San Diego p. 1/1 Lecture 10 ECE 275A Vector Derivatives and the Gradient ECE 275AB Lecture 10 Fall 2008 V1.1 c K. Kreutz-Delgado, UC San Diego

More information

EEL 5544 Noise in Linear Systems Lecture 30. X (s) = E [ e sx] f X (x)e sx dx. Moments can be found from the Laplace transform as

EEL 5544 Noise in Linear Systems Lecture 30. X (s) = E [ e sx] f X (x)e sx dx. Moments can be found from the Laplace transform as L30-1 EEL 5544 Noise in Linear Systems Lecture 30 OTHER TRANSFORMS For a continuous, nonnegative RV X, the Laplace transform of X is X (s) = E [ e sx] = 0 f X (x)e sx dx. For a nonnegative RV, the Laplace

More information

(Extended) Kalman Filter

(Extended) Kalman Filter (Extended) Kalman Filter Brian Hunt 7 June 2013 Goals of Data Assimilation (DA) Estimate the state of a system based on both current and all past observations of the system, using a model for the system

More information

Multivariate probability distributions and linear regression

Multivariate probability distributions and linear regression Multivariate probability distributions and linear regression Patrik Hoyer 1 Contents: Random variable, probability distribution Joint distribution Marginal distribution Conditional distribution Independence,

More information

Probability and Distributions

Probability and Distributions Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated

More information

Estimation Tasks. Short Course on Image Quality. Matthew A. Kupinski. Introduction

Estimation Tasks. Short Course on Image Quality. Matthew A. Kupinski. Introduction Estimation Tasks Short Course on Image Quality Matthew A. Kupinski Introduction Section 13.3 in B&M Keep in mind the similarities between estimation and classification Image-quality is a statistical concept

More information

Convex Optimization M2

Convex Optimization M2 Convex Optimization M2 Lecture 8 A. d Aspremont. Convex Optimization M2. 1/57 Applications A. d Aspremont. Convex Optimization M2. 2/57 Outline Geometrical problems Approximation problems Combinatorial

More information

Chapter 4 continued. Chapter 4 sections

Chapter 4 continued. Chapter 4 sections Chapter 4 sections Chapter 4 continued 4.1 Expectation 4.2 Properties of Expectations 4.3 Variance 4.4 Moments 4.5 The Mean and the Median 4.6 Covariance and Correlation 4.7 Conditional Expectation SKIP:

More information

Ellipsoidal Methods for Adaptive Choice-based Conjoint Analysis (CBCA)

Ellipsoidal Methods for Adaptive Choice-based Conjoint Analysis (CBCA) Ellipsoidal Methods for Adaptive Choice-based Conjoint Analysis (CBCA) Juan Pablo Vielma Massachusetts Institute of Technology Joint work with Denis Saure Operations Management Seminar, Rotman School of

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida First Year Examination Department of Statistics, University of Florida August 19, 010, 8:00 am - 1:00 noon Instructions: 1. You have four hours to answer questions in this examination.. You must show your

More information

Multivariate Statistics

Multivariate Statistics Multivariate Statistics Chapter 2: Multivariate distributions and inference Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2016/2017 Master in Mathematical

More information

1 Introduction. Systems 2: Simulating Errors. Mobile Robot Systems. System Under. Environment

1 Introduction. Systems 2: Simulating Errors. Mobile Robot Systems. System Under. Environment Systems 2: Simulating Errors Introduction Simulating errors is a great way to test you calibration algorithms, your real-time identification algorithms, and your estimation algorithms. Conceptually, the

More information

Linear Least Square Problems Dr.-Ing. Sudchai Boonto

Linear Least Square Problems Dr.-Ing. Sudchai Boonto Dr-Ing Sudchai Boonto Department of Control System and Instrumentation Engineering King Mongkuts Unniversity of Technology Thonburi Thailand Linear Least-Squares Problems Given y, measurement signal, find

More information

Statistics STAT:5100 (22S:193), Fall Sample Final Exam B

Statistics STAT:5100 (22S:193), Fall Sample Final Exam B Statistics STAT:5 (22S:93), Fall 25 Sample Final Exam B Please write your answers in the exam books provided.. Let X, Y, and Y 2 be independent random variables with X N(µ X, σ 2 X ) and Y i N(µ Y, σ 2

More information

Least Squares. Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 175A Winter UCSD

Least Squares. Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 175A Winter UCSD Least Squares Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 75A Winter 0 - UCSD (Unweighted) Least Squares Assume linearity in the unnown, deterministic model parameters Scalar, additive noise model: y f (

More information

9 Multi-Model State Estimation

9 Multi-Model State Estimation Technion Israel Institute of Technology, Department of Electrical Engineering Estimation and Identification in Dynamical Systems (048825) Lecture Notes, Fall 2009, Prof. N. Shimkin 9 Multi-Model State

More information

Inverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1

Inverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1 Inverse of a Square Matrix For an N N square matrix A, the inverse of A, 1 A, exists if and only if A is of full rank, i.e., if and only if no column of A is a linear combination 1 of the others. A is

More information

Topics in Probability and Statistics

Topics in Probability and Statistics Topics in Probability and tatistics A Fundamental Construction uppose {, P } is a sample space (with probability P), and suppose X : R is a random variable. The distribution of X is the probability P X

More information

Lecture 22: A Review of Linear Algebra and an Introduction to The Multivariate Normal Distribution

Lecture 22: A Review of Linear Algebra and an Introduction to The Multivariate Normal Distribution Department of Mathematics Ma 3/103 KC Border Introduction to Probability and Statistics Winter 2017 Lecture 22: A Review of Linear Algebra and an Introduction to The Multivariate Normal Distribution Relevant

More information

Kalman Filtering. Namrata Vaswani. March 29, Kalman Filter as a causal MMSE estimator

Kalman Filtering. Namrata Vaswani. March 29, Kalman Filter as a causal MMSE estimator Kalman Filtering Namrata Vaswani March 29, 2018 Notes are based on Vincent Poor s book. 1 Kalman Filter as a causal MMSE estimator Consider the following state space model (signal and observation model).

More information

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015 Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)

More information

B4 Estimation and Inference

B4 Estimation and Inference B4 Estimation and Inference 6 Lectures Hilary Term 27 2 Tutorial Sheets A. Zisserman Overview Lectures 1 & 2: Introduction sensors, and basics of probability density functions for representing sensor error

More information

Midterm. Introduction to Machine Learning. CS 189 Spring Please do not open the exam before you are instructed to do so.

Midterm. Introduction to Machine Learning. CS 189 Spring Please do not open the exam before you are instructed to do so. CS 89 Spring 07 Introduction to Machine Learning Midterm Please do not open the exam before you are instructed to do so. The exam is closed book, closed notes except your one-page cheat sheet. Electronic

More information

Statistical Filtering and Control for AI and Robotics. Part II. Linear methods for regression & Kalman filtering

Statistical Filtering and Control for AI and Robotics. Part II. Linear methods for regression & Kalman filtering Statistical Filtering and Control for AI and Robotics Part II. Linear methods for regression & Kalman filtering Riccardo Muradore 1 / 66 Outline Linear Methods for Regression Gaussian filter Stochastic

More information

WLS and BLUE (prelude to BLUP) Prediction

WLS and BLUE (prelude to BLUP) Prediction WLS and BLUE (prelude to BLUP) Prediction Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark April 21, 2018 Suppose that Y has mean X β and known covariance matrix V (but Y need

More information

7 Multivariate Statistical Models

7 Multivariate Statistical Models 7 Multivariate Statistical Models 7.1 Introduction Often we are not interested merely in a single random variable but rather in the joint behavior of several random variables, for example, returns on several

More information

A Bayesian Treatment of Linear Gaussian Regression

A Bayesian Treatment of Linear Gaussian Regression A Bayesian Treatment of Linear Gaussian Regression Frank Wood December 3, 2009 Bayesian Approach to Classical Linear Regression In classical linear regression we have the following model y β, σ 2, X N(Xβ,

More information

A Theoretical Overview on Kalman Filtering

A Theoretical Overview on Kalman Filtering A Theoretical Overview on Kalman Filtering Constantinos Mavroeidis Vanier College Presented to professors: IVANOV T. IVAN STAHN CHRISTIAN Email: cmavroeidis@gmail.com June 6, 208 Abstract Kalman filtering

More information

Statistical Techniques in Robotics (16-831, F12) Lecture#17 (Wednesday October 31) Kalman Filters. Lecturer: Drew Bagnell Scribe:Greydon Foil 1

Statistical Techniques in Robotics (16-831, F12) Lecture#17 (Wednesday October 31) Kalman Filters. Lecturer: Drew Bagnell Scribe:Greydon Foil 1 Statistical Techniques in Robotics (16-831, F12) Lecture#17 (Wednesday October 31) Kalman Filters Lecturer: Drew Bagnell Scribe:Greydon Foil 1 1 Gauss Markov Model Consider X 1, X 2,...X t, X t+1 to be

More information

The Multivariate Gaussian Distribution

The Multivariate Gaussian Distribution The Multivariate Gaussian Distribution Chuong B. Do October, 8 A vector-valued random variable X = T X X n is said to have a multivariate normal or Gaussian) distribution with mean µ R n and covariance

More information

Lecture 11. Multivariate Normal theory

Lecture 11. Multivariate Normal theory 10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances

More information

2 Statistical Estimation: Basic Concepts

2 Statistical Estimation: Basic Concepts Technion Israel Institute of Technology, Department of Electrical Engineering Estimation and Identification in Dynamical Systems (048825) Lecture Notes, Fall 2009, Prof. N. Shimkin 2 Statistical Estimation:

More information

1 Class Organization. 2 Introduction

1 Class Organization. 2 Introduction Time Series Analysis, Lecture 1, 2018 1 1 Class Organization Course Description Prerequisite Homework and Grading Readings and Lecture Notes Course Website: http://www.nanlifinance.org/teaching.html wechat

More information

Kalman Filter. Man-Wai MAK

Kalman Filter. Man-Wai MAK Kalman Filter Man-Wai MAK Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University enmwmak@polyu.edu.hk http://www.eie.polyu.edu.hk/ mwmak References: S. Gannot and A. Yeredor,

More information

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna

More information

Lecture 7 MIMO Communica2ons

Lecture 7 MIMO Communica2ons Wireless Communications Lecture 7 MIMO Communica2ons Prof. Chun-Hung Liu Dept. of Electrical and Computer Engineering National Chiao Tung University Fall 2014 1 Outline MIMO Communications (Chapter 10

More information

Informatics 2B: Learning and Data Lecture 10 Discriminant functions 2. Minimal misclassifications. Decision Boundaries

Informatics 2B: Learning and Data Lecture 10 Discriminant functions 2. Minimal misclassifications. Decision Boundaries Overview Gaussians estimated from training data Guido Sanguinetti Informatics B Learning and Data Lecture 1 9 March 1 Today s lecture Posterior probabilities, decision regions and minimising the probability

More information

Vectors and Matrices Statistics with Vectors and Matrices

Vectors and Matrices Statistics with Vectors and Matrices Vectors and Matrices Statistics with Vectors and Matrices Lecture 3 September 7, 005 Analysis Lecture #3-9/7/005 Slide 1 of 55 Today s Lecture Vectors and Matrices (Supplement A - augmented with SAS proc

More information

Uncorrelatedness and Independence

Uncorrelatedness and Independence Uncorrelatedness and Independence Uncorrelatedness:Two r.v. x and y are uncorrelated if C xy = E[(x m x )(y m y ) T ] = 0 or equivalently R xy = E[xy T ] = E[x]E[y T ] = m x m T y White random vector:this

More information

Econ 424 Review of Matrix Algebra

Econ 424 Review of Matrix Algebra Econ 424 Review of Matrix Algebra Eric Zivot January 22, 205 Matrices and Vectors Matrix 2 A = 2 22 2 ( )...... 2 = of rows, = of columns Square matrix : = Vector x ( ) = 2. Remarks R is a matrix oriented

More information

Jointly Distributed Random Variables

Jointly Distributed Random Variables Jointly Distributed Random Variables CE 311S What if there is more than one random variable we are interested in? How should you invest the extra money from your summer internship? To simplify matters,

More information