B4 Estimation and Inference
B4 Estimation and Inference
6 lectures, Hilary Term 2007, 2 tutorial sheets. A. Zisserman

Overview
Lectures 1 & 2: Introduction. Sensors, and basics of probability density functions for representing sensor error and uncertainty.
Lectures 3 & 4: Estimators. Maximum Likelihood (ML) and Maximum a Posteriori (MAP).
Lectures 5 & 6: Decisions and classification. Loss functions, discriminant functions, linear classifiers.

Textbooks 1
Estimation with Applications to Tracking and Navigation: Theory, Algorithms and Software. Yaakov Bar-Shalom, Xiao-Rong Li, Thiagalingam Kirubarajan, Wiley, 2001. Covers probability and estimation, but contains much more material than required for this course. Use also for C4B Mobile Robots.
Textbooks 2
Pattern Classification. Richard O. Duda, Peter E. Hart, David G. Stork, Wiley, 2001. Covers classification, but also contains much more material than required for this course.

Background reading and web resources
Information Theory, Inference, and Learning Algorithms. David J. C. MacKay, CUP, 2003. Covers all the course material, though at an advanced level. Available online.
Introduction to Random Signals and Applied Kalman Filtering. R. Brown and P. Hwang, Wiley, 1997. Good review of probability and random variables in the first chapter.
Further reading (web addresses) and the lecture notes are on the course web page.
One more book for background reading
Pattern Recognition and Machine Learning. Christopher Bishop, Springer, 2006. Excellent on classification and regression. Quite advanced.
Introduction: Sensors and estimation
Sensors are used to give information about the state of some system. Our aim in this course is to develop methods which can be used to combine multiple sensor measurements (possibly from multiple sensors, possibly over time) with prior information, to obtain accurate estimates of a system's state.

Noise and uncertainty in sensors
Real sensors give inexact measurements for a variety of reasons:
- discretization error (e.g. measurements on a grid)
- calibration error
- quantization noise (e.g. CCD)
Successive observations of a system or phenomenon do not produce exactly the same result. Statistical methods are used to describe and understand this variability, and to incorporate variability into the decision-making process.
Examples: ultrasound, laser range scanner, CCD images, GPS.
Ultrasound. Objective: diagnose heart disease. Shown: an axial view (brightness codes depth), a contrast-agent-enhanced image, and a Doppler velocity image. A prior shape model for the heart is used for inference.

Laser range scanner. Objective: build a map of a room in (x, y). Good quality data, up to discretization on a grid.
CCD sensor. Objective: read a number plate from a low-light sequence (1 frame shown). Temporal averaging suppresses zero-mean additive noise: I(t) = I + n(t). Time-averaged, histogram-equalized frames are shown for 2, 8, and more frames. Averaging N noise samples with zero mean and variance \sigma^2 gives a result with zero mean and variance \sigma^2 / N.
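The variance-reduction claim above is easy to check numerically. A minimal sketch in NumPy (the frame count, intensity, and noise level are arbitrary choices, not the slide's values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a constant "true" intensity corrupted by zero-mean
# additive Gaussian noise, as in the model I(t) = I + n(t).
true_I = 100.0
sigma = 10.0
N = 64  # number of frames to average
frames = true_I + rng.normal(0.0, sigma, size=(N, 10000))

# Averaging N frames leaves the mean unchanged but divides the
# noise variance by N.
averaged = frames.mean(axis=0)
var_single = frames[0].var()
var_averaged = averaged.var()
print(var_single, var_averaged, sigma**2 / N)
```

The printed empirical variances should sit close to 100 for a single frame and 100/64 after averaging.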
GPS. Objective: estimate a car trajectory. GPS (Global Positioning System) data collected from a car is shown, with a close-up view.
GPS: error sources. Fit (estimate) a curve and lines to reduce random errors.

Two canonical estimation problems:
1. Regression: estimate parameters, e.g. of a line fit to the trajectory of a car.
2. Classification: estimate a class, e.g. handwritten digit classification (is this digit a 1 or a 7?).
The need for probability
We have seen that there is uncertainty in the measurements due to sensor noise. There may also be uncertainty in the model we fit. Finally, we often want more than just a single value for an estimate; we would also like to model the confidence or uncertainty of the estimate. Probability theory, the calculus of uncertainty, provides the mathematical machinery.

Revision of Probability Theory
- probability density functions (pdfs)
- joint and conditional probability
- independence
- Normal (Gaussian) distributions of one and two variables
1D Probability: a brief review

Discrete sets. Suppose an experiment has a set of possible outcomes S, and an event E is a subset of S. Then

  probability of E = relative frequency of the event = (number of outcomes in E) / (total number of outcomes in S)

If S = {a_1, a_2, ..., a_n} has probabilities {p_1, p_2, ..., p_n}, then

  \sum_{i=1}^n p_i = 1

e.g. throw a die: the probability of any particular number is 1/6, and the probability of an even number is 1/2.

Probability density function (pdf). For a continuous variable,

  P(x < X < x + dx) = p(x) dx,    \int p(x) dx = 1

so the probability over a range of x is given by the area under the curve p(x).
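The relative-frequency view can be illustrated with a quick simulation (a sketch; the sample size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)

# Throw a fair die many times and compare relative frequencies
# with the theoretical probabilities 1/6 and 1/2.
throws = rng.integers(1, 7, size=100000)
p_three = np.mean(throws == 3)     # P(any particular number) = 1/6
p_even = np.mean(throws % 2 == 0)  # P(even number) = 1/2
print(p_three, p_even)
```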
PDF Example 1: Univariate Normal Distribution
The most important example of a continuous density/distribution is the normal or Gaussian distribution:

  p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right),    x ~ N(\mu, \sigma^2)

e.g. model the sensor response for measured x, given the true value of x.

PDF Example 2: A Uniform Distribution

  x ~ U(a, b),    p(x) = \frac{1}{b - a} for a \le x \le b, and 0 otherwise

example: laser range finder.
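As a sanity check, the Gaussian density above integrates to one. A small numerical sketch (grid limits and the choice mu = 0, sigma = 1 are arbitrary):

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Univariate Gaussian density N(mu, sigma^2) evaluated at x."""
    return np.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

# Riemann-sum approximation of the integral of p(x) over [-10, 10];
# for mu = 0, sigma = 1 the tails outside this range are negligible,
# so the area should be very close to 1.
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
area = float(np.sum(normal_pdf(x, 0.0, 1.0)) * dx)
print(area)
```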
PDF Example 3: A histogram
An image intensity histogram records the frequency of each intensity; normalize it (so it sums to one) to obtain a pdf over intensity.

Joint and conditional probability
Consider two (discrete) variables A and B.
Joint probability distribution of A and B:
  P(A, B) = probability of A and B both occurring,    \sum_{i,j} P(A_i, B_j) = 1
Conditional probability distribution of A given B:
  P(A | B) = P(A, B) / P(B)
Marginal (unconditional) distribution of A:
  P(A) = \sum_j P(A, B_j)
Example
Consider two sensors measuring the x and y coordinates of an object in 2D. The joint distribution is given by a spreadsheet, where the array entry (i, j) is P(X = i, Y = j), for X = 0, 1, 2 and Y = 0, 1, 2.

To compute the joint distribution:
1. Count the number of times the measured location is at (X, Y), for X = 0, ..., 2 and Y = 0, ..., 2.
2. Normalize the count matrix so that \sum_{i,j} P(i, j) = 1.

Exercise: compute the marginals and the conditional P(Y | X = 1).

Marginal:  P(X = 1) = \sum_Y P(X = 1, Y)

Conditional P(Y | X = 1): normalize P(X = 1, Y) so that it is a probability:

  P(Y | X = 1) = P(X = 1, Y) / P(X = 1)

For the spreadsheet values P(X = 1, Y = 0) = 0.03, P(X = 1, Y = 1) = 0.24, P(X = 1, Y = 2) = 0.03, we have P(X = 1) = 0.30, so P(Y | X = 1) takes the values 0.1, 0.8, 0.1, which sum to 1. In words: the probability of Y given that X = 1.
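The marginalization and normalization steps above can be written directly in NumPy. A sketch with an illustrative joint table (only the X = 1 column matches the slide's worked values; the other entries are invented so the table sums to 1):

```python
import numpy as np

# Joint distribution P(Y = i, X = j): rows indexed by Y, columns by X.
P = np.array([[0.10, 0.03, 0.10],
              [0.05, 0.24, 0.05],
              [0.20, 0.03, 0.20]])

# Marginals: sum out the other variable.
P_X = P.sum(axis=0)  # P(X)
P_Y = P.sum(axis=1)  # P(Y)

# Conditional P(Y | X = 1): normalize the X = 1 column by P(X = 1).
P_Y_given_X1 = P[:, 1] / P_X[1]
print(P_X, P_Y, P_Y_given_X1)
```

The conditional column necessarily sums to 1, whatever the joint values are.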
Bayes' Rule (or Bayes' Theorem)
From the definition of the conditional:

  P(A, B) = P(A | B) P(B)
  P(A, B) = P(B | A) P(A)

Hence

  P(A | B) = P(B | A) P(A) / P(B)

Bayes' rule lets the dependence of the conditionals be reversed; this will be important later for Maximum a Posteriori (MAP) estimation. Writing P(B) = \sum_i P(A_i, B) = \sum_i P(B | A_i) P(A_i),

  P(A | B) = \frac{P(B | A) P(A)}{P(B)} = \frac{P(B | A) P(A)}{\sum_i P(B | A_i) P(A_i)}

with similar expressions for P(B | A).

Independent variables
Two variables are independent if (and only if)

  p(A, B) = p(A) p(B)

i.e. all joint values equal the product of the marginals. Compare with conditional probability, p(A, B) = p(A | B) p(B); so p(A | B) = p(A), and similarly p(B | A) = p(B).
e.g. two throws of a die or coin are independent; two cards picked without replacement from the same pack are not independent. For two independent coin tosses, each entry of the joint table over heads/tails equals the product of its row and column marginals.
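A numerical sketch of Bayes' rule with the total-probability expansion (the prior and likelihood values are invented for illustration):

```python
import numpy as np

# Two hypotheses A_1, A_2 with priors P(A_i), and a measurement B
# with known likelihoods P(B | A_i).
prior = np.array([0.7, 0.3])       # P(A_1), P(A_2)
likelihood = np.array([0.2, 0.9])  # P(B | A_1), P(B | A_2)

# Total probability: P(B) = sum_i P(B | A_i) P(A_i)
P_B = np.sum(likelihood * prior)

# Bayes' rule: P(A_i | B) = P(B | A_i) P(A_i) / P(B)
posterior = likelihood * prior / P_B
print(P_B, posterior)
```

Note how the unlikely-a-priori hypothesis A_2 dominates the posterior once the measurement B strongly favours it.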
Examples: are these joint distributions independent? (Check whether every entry of each joint table equals the product of its row and column sums; the accompanying scatter plots show the corresponding samples in the (x, y) plane.)

Joint, conditional and independence for pdfs
Similar results apply for pdfs of continuous variables x and y:

  \int \int p(x, y) \, dx \, dy = 1

and the probability over a range of x and y is given by the volume under p(x, y).
Bivariate normal distribution

  N(x | \mu, \Sigma) = \frac{1}{2\pi |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^\top \Sigma^{-1} (x - \mu) \right\}

with mean \mu and covariance \Sigma, where

  x = (x, y)^\top,    \mu = (\mu_x, \mu_y)^\top,    \Sigma = [ \Sigma_{11}  \Sigma_{12} ; \Sigma_{21}  \Sigma_{22} ]

Example: \mu = 0 and

  \Sigma = [ \sigma_x^2  0 ; 0  \sigma_y^2 ] = [ 4  0 ; 0  1 ]

so

  p(x, y) = \frac{1}{2\pi \sigma_x \sigma_y} \exp\left\{ -\frac{1}{2} \left( \frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} \right) \right\}

Note: the iso-probability contour curves x^2/\sigma_x^2 + y^2/\sigma_y^2 = d^2 are ellipses, and p(x, y) = p(x) p(y), so x is independent of y.
Conditional distribution
For the example above,

  p(x | y) = \frac{p(x, y)}{p(y)}
           = \frac{ \frac{1}{2\pi \sigma_x \sigma_y} \exp\left\{ -\frac{1}{2}\left( \frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} \right) \right\} }{ \frac{1}{\sqrt{2\pi} \sigma_y} \exp\left\{ -\frac{y^2}{2\sigma_y^2} \right\} }
           = \frac{1}{\sqrt{2\pi} \sigma_x} \exp\left\{ -\frac{x^2}{2\sigma_x^2} \right\}

i.e. x and y are independent.

Normal distribution of n variables

  N(x | \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^\top \Sigma^{-1} (x - \mu) \right\}

where x is an n-component column vector, \mu is the n-component mean vector, and \Sigma is an n x n covariance matrix.
Lecture 2: Describing and manipulating pdfs
- Expectations and moments in 1D: mean, variance, skew, kurtosis
- Expectations and moments in 2D: covariance and correlation
- Transforming variables
- Combining pdfs
- Introduction to Maximum Likelihood estimation

Describing distributions
Repeated measurements with the same sensor, in 1D and 2D. Sensor model: one could store the original measurements (x_i, y_i), or store a histogram of the measurements p_i, or compute (fit) a compact representation of the distribution.
Expectations and moments in 1D
The expected value of a scalar random variable, also called its mean, average, or first moment, is:

  Discrete case:    E[x] = \sum_{i=1}^n p_i x_i
  Continuous case:  E[x] = \int x \, p(x) \, dx = \mu

Note that E is a linear operator, i.e. E[ax + by] = a E[x] + b E[y].

Moments
The nth moment is E[x^n] = \int x^n p(x) dx. The second central moment, or variance, is

  var(x) = E[(x - \mu)^2] = \int (x - \mu)^2 p(x) \, dx = E[x^2] - \mu^2 = \sigma_x^2

The square root \sigma of the variance is the standard deviation.
Example: Gaussian pdf

  p(x) = N(\mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left\{ -\frac{(x - \mu)^2}{2\sigma^2} \right\}

  mean:     E[x] = \int x \, p(x) \, dx = \mu
  variance: var(x) = E[(x - \mu)^2] = \int (x - \mu)^2 p(x) \, dx = \sigma^2

A Normal distribution is defined by its first and second moments.

Fitting models by moment matching
Example: fit a Normal distribution to measured samples.
Sketch algorithm:
1. Compute the mean \mu of the samples.
2. Compute the variance \sigma^2 of the samples.
3. Represent the fitted model by the Normal distribution N(\mu, \sigma^2).
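The sketch algorithm above is two lines of NumPy. A minimal version (the synthetic sensor parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic measurements from a known Gaussian sensor model.
samples = rng.normal(loc=3.0, scale=2.0, size=100000)

# Moment matching: the fitted N(mu, sigma^2) uses the sample
# mean and sample variance.
mu_hat = samples.mean()
var_hat = samples.var()
print(mu_hat, var_hat)
```

With enough samples the fitted moments recover the generating parameters (mean 3, variance 4) closely.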
Mean and variance of a discrete random variable
Two probability distributions can differ even though they have identical means and variances. Mean and variance are summary values; more is needed to know the distribution (e.g. that it is a normal distribution).
What model should be fitted to this measured distribution? What does the fitted Normal distribution look like?
Higher moments: skewness

  skew(x) = E\left[ \left( \frac{x - \mu}{\sigma} \right)^3 \right] = \int \left( \frac{x - \mu}{\sigma} \right)^3 p(x) \, dx

A symmetric distribution (e.g. Gaussian) has skew = 0; a distribution with a longer right tail has positive skew.

Higher moments: kurtosis

  kurt(x) = E\left[ \left( \frac{x - \mu}{\sigma} \right)^4 \right] - 3 = \int \left( \frac{x - \mu}{\sigma} \right)^4 p(x) \, dx - 3

Positive kurtosis: narrow peak with long tails. Negative kurtosis: flat peak and little tail. A Gaussian has kurt = 0.
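The standardized-moment definitions translate directly into code. A sketch using NumPy only (the distributions and sample sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

def skew(x):
    """Sample skewness: third moment of the standardized variable."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 3)

def kurt(x):
    """Sample excess kurtosis: fourth standardized moment minus 3."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0

gaussian = rng.normal(size=200000)
exponential = rng.exponential(size=200000)  # long right tail

print(skew(gaussian), kurt(gaussian))  # both near 0 for a Gaussian
print(skew(exponential))               # clearly positive
```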
Expectations and moments in 2D
Suppose we have two random variables with bivariate joint density p(x, y). Define the moment (the expectation of their product):

  E[xy] = \int \int x y \, p(x, y) \, dx \, dy

If x and y are independent, E[xy] = E[x] E[y]. NB: this is not an if-and-only-if.

Covariance and correlation measure behaviour about the mean. The covariance is defined as

  cov(x, y) = \sigma_{xy} = \int \int (x - \mu_x)(y - \mu_y) \, p(x, y) \, dx \, dy = E[xy] - E[x] E[y]   [proof: exercise]

Summarize as a 2 x 2 symmetric covariance matrix:

  \Sigma = [ var(x)  cov(x, y) ; cov(x, y)  var(y) ] = E\left[ (x - \mu)(x - \mu)^\top \right]

In n dimensions the covariance is an n x n symmetric matrix.
The correlation is defined as

  \rho(x, y) = \frac{cov(x, y)}{\sqrt{var(x)} \sqrt{var(y)}}

It measures the normalized correlation of two random variables (c.f. the correlation of two signals). In the discrete sample case,

  \rho(x, y) = \frac{ \sum_i (x_i - \mu_x)(y_i - \mu_y) }{ \sqrt{\sum_i (x_i - \mu_x)^2} \sqrt{\sum_i (y_i - \mu_y)^2} }

with |\rho(x, y)| \le 1. e.g. if x = y then \rho(x, y) = 1; if x = -y then \rho(x, y) = -1; if x and y are independent then \rho(x, y) = cov(x, y) = 0.
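The three example cases can be checked with NumPy's built-in correlation coefficient (a sketch; sample size arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

x = rng.normal(size=100000)
noise = rng.normal(size=100000)

# Perfectly correlated, anti-correlated, and independent cases.
rho_same = np.corrcoef(x, x)[0, 1]       # x = y  -> rho = 1
rho_anti = np.corrcoef(x, -x)[0, 1]      # x = -y -> rho = -1
rho_indep = np.corrcoef(x, noise)[0, 1]  # independent -> rho near 0
print(rho_same, rho_anti, rho_indep)
```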
Fitting a bivariate Normal distribution

  N(x | \mu, \Sigma) = \frac{1}{2\pi |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^\top \Sigma^{-1} (x - \mu) \right\}

A Normal distribution in 2D is defined by its first and second moments (mean and covariance matrix). In a similar manner to the 1D case, a 2D Gaussian is fitted by computing the mean and covariance matrix of the samples.

If x and y are not independent and have correlation \rho, then

  \Sigma = [ \sigma_x^2  \rho \sigma_x \sigma_y ; \rho \sigma_x \sigma_y  \sigma_y^2 ]

Let S = \Sigma^{-1}; then the iso-probability curves are x^\top S x = d^2, i.e. ellipses.
Transformation of random variables
Problem: suppose the pdf for a dart thrower is

  p(x, y) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)}

Express this pdf in polar coordinates. The coordinates are related as

  x = x(r, \theta) = r \cos\theta,    y = y(r, \theta) = r \sin\theta

and, taking account of the area change,

  p(x, y) \, dx \, dy = p(x(r, \theta), y(r, \theta)) \, |J| \, dr \, d\theta

where J is the Jacobian, and |J| = r in this case:

  p(x(r, \theta), y(r, \theta)) \, |J| \, dr \, d\theta = \frac{1}{2\pi\sigma^2} e^{-r^2/(2\sigma^2)} \, r \, dr \, d\theta

Marginalize p_{r\theta}(r, \theta) over \theta to get p(r):

  p(r) = \int_0^{2\pi} p(r, \theta) \, d\theta = \int_0^{2\pi} \frac{r}{2\pi\sigma^2} e^{-r^2/(2\sigma^2)} \, d\theta = \frac{r}{\sigma^2} e^{-r^2/(2\sigma^2)}
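The marginal p(r) derived above is a Rayleigh density, whose mean is \sigma \sqrt{\pi/2}; this can be checked by Monte Carlo. A sketch (the value of sigma is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
sigma = 1.5

# Sample dart positions from the 2D Gaussian and convert to radius.
x = rng.normal(0.0, sigma, size=500000)
y = rng.normal(0.0, sigma, size=500000)
r = np.hypot(x, y)

# Compare the empirical mean of r with the Rayleigh mean
# E[r] = sigma * sqrt(pi / 2).
expected_mean = sigma * np.sqrt(np.pi / 2)
print(r.mean(), expected_mean)
```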
Example 1: Linear transformation of Normal distributions
If the pdf of x is a Normal distribution

  p(x) = N(\mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^\top \Sigma^{-1} (x - \mu) \right\}

then under the linear transformation y = A x + t, the pdf of y is also a Normal distribution, with

  \mu_y = A \mu_x + t,    \Sigma_y = A \Sigma_x A^\top

To see this, consider the quadratic term. Under the transformation y = A x + t, we have x = A^{-1}(y - t), and developing the quadratic:

  (x - \mu_x)^\top \Sigma_x^{-1} (x - \mu_x)
    = (A^{-1}(y - t) - \mu_x)^\top \Sigma_x^{-1} (A^{-1}(y - t) - \mu_x)
    = \left( A^{-1}(y - t - A\mu_x) \right)^\top \Sigma_x^{-1} \left( A^{-1}(y - t - A\mu_x) \right)
    = (y - t - A\mu_x)^\top A^{-\top} \Sigma_x^{-1} A^{-1} (y - t - A\mu_x)
    = (y - \mu_y)^\top \Sigma_y^{-1} (y - \mu_y)

with \mu_y = A\mu_x + t and \Sigma_y^{-1} = A^{-\top} \Sigma_x^{-1} A^{-1}, i.e. \Sigma_y = A \Sigma_x A^\top.
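The moment formulas can be verified by sampling. A sketch with invented A, t, and source moments:

```python
import numpy as np

rng = np.random.default_rng(6)

# Source Gaussian: mean mu_x, covariance Sigma_x (illustrative values).
mu_x = np.array([1.0, -2.0])
Sigma_x = np.array([[4.0, 1.0],
                    [1.0, 2.0]])

A = np.array([[2.0, 0.5],
              [0.0, 1.0]])
t = np.array([3.0, -1.0])

# Sample x ~ N(mu_x, Sigma_x) and apply y = A x + t.
x = rng.multivariate_normal(mu_x, Sigma_x, size=200000)
y = x @ A.T + t

# The sample moments of y should match A mu_x + t and A Sigma_x A^T.
mu_y_pred = A @ mu_x + t
Sigma_y_pred = A @ Sigma_x @ A.T
print(y.mean(axis=0), mu_y_pred)
print(np.cov(y.T), Sigma_y_pred)
```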
Example 2: Sum of random variables
Problem: suppose z = x + y, and p_{xy}(x, y) is the joint distribution of x and y; find p(z).
Let t = x - y. Then

  p_{tz}(t, z) \, dt \, dz = p_{xy}(x, y) \, |J| \, dt \, dz = \frac{1}{2} p_{xy}\!\left( \frac{z + t}{2}, \frac{z - t}{2} \right) dt \, dz

and, marginalizing over t (substituting u = (z - t)/2),

  p(z) = \int p_{tz}(t, z) \, dt = \int \frac{1}{2} p_{xy}\!\left( \frac{z + t}{2}, \frac{z - t}{2} \right) dt = \int p_{xy}(z - u, u) \, du

If x and y are independent,

  p(z) = \int p_x(z - u) \, p_y(u) \, du

which is the convolution of p_x(x) and p_y(y).

Example: 1D Gaussians. If p(x) = N(\mu_x, \sigma_x^2) and p(y) = N(\mu_y, \sigma_y^2), then [proof: exercise]

  p(x + y) = N(\mu_x, \sigma_x^2) * N(\mu_y, \sigma_y^2) = N(\mu_x + \mu_y, \sigma_x^2 + \sigma_y^2)

i.e. add the means and the variances (covariances, in the multivariate case).
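The add-the-means, add-the-variances rule is easy to confirm by simulation (a sketch; the two Gaussians' parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)

# Independent Gaussians x ~ N(1, 2^2) and y ~ N(-3, 1.5^2).
x = rng.normal(1.0, 2.0, size=500000)
y = rng.normal(-3.0, 1.5, size=500000)
z = x + y

# The sum should be N(mu_x + mu_y, sigma_x^2 + sigma_y^2).
print(z.mean(), 1.0 + (-3.0))    # means add
print(z.var(), 2.0**2 + 1.5**2)  # variances add
```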
Maximum Likelihood Estimation (informal)
Estimation is the process by which we infer the value of a quantity of interest, \theta, by processing data z that is in some way dependent on \theta.

Simple example: fitting a line y = ax + b to measured data. Estimate the line parameters \theta = (a, b), given measurements z_i = (x_i, y_i) and a model of the sensor noise.

Least squares solution:

  \min_{a,b} \sum_{i=1}^n \left( y_i - (a x_i + b) \right)^2

Consider a generative model for the data: no noise on x_i; on y, the measured value is the true value plus noise,

  y_i = \tilde{y}_i + n_i,    n_i ~ N(0, \sigma^2)

so the probability of measuring y_i, given that the true value is \tilde{y}_i, is

  p(y_i | \tilde{y}_i) \propto e^{ -(y_i - \tilde{y}_i)^2 / (2\sigma^2) }

With the model to be estimated, \tilde{y}_i = a x_i + b, this gives

  p(y_i | a x_i + b) \propto e^{ -(y_i - (a x_i + b))^2 / (2\sigma^2) }
For n points, assuming independence, the likelihood of the measured data given the parameters is

  p(y_1, y_2, ..., y_n | x_1, x_2, ..., x_n; a, b) = p(y | a, b) = \prod_{i=1}^n p(y_i | a x_i + b) \propto \prod_{i=1}^n e^{ -(y_i - (a x_i + b))^2 / (2\sigma^2) }

The Maximum Likelihood (ML) estimate is obtained as

  \{\hat{a}, \hat{b}\} = \arg\max_{a,b} p(y | a, b),    p(y | a, b) \propto \prod_i e^{ -(y_i - (a x_i + b))^2 / (2\sigma^2) } = e^{ -\sum_i (y_i - (a x_i + b))^2 / (2\sigma^2) }

Taking the negative log,

  -\log p(y | a, b) = \sum_i \frac{(y_i - (a x_i + b))^2}{2\sigma^2} + \text{const}

so the ML estimate is equivalent to

  \{\hat{a}, \hat{b}\} = \arg\min_{a,b} \sum_i \frac{(y_i - (a x_i + b))^2}{2\sigma^2}

i.e. to least squares.
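The ML-equals-least-squares conclusion can be exercised end to end: generate data from the generative model and recover (a, b) by linear least squares. A sketch (the true parameters and noise level are arbitrary; the solver is NumPy's `lstsq`, one of several equivalent ways to minimize the sum of squares):

```python
import numpy as np

rng = np.random.default_rng(8)

# Generate data from a known line with Gaussian noise on y only,
# matching the generative model above.
a_true, b_true, sigma = 2.0, -1.0, 0.5
x = np.linspace(0.0, 10.0, 200)
y = a_true * x + b_true + rng.normal(0.0, sigma, size=x.size)

# ML estimate under Gaussian noise = least squares: solve for (a, b)
# with the design matrix [x, 1].
X = np.column_stack([x, np.ones_like(x)])
(a_hat, b_hat), *_ = np.linalg.lstsq(X, y, rcond=None)
print(a_hat, b_hat)
```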
More informationMachine Learning for Large-Scale Data Analysis and Decision Making A. Week #1
Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Week #1 Today Introduction to machine learning The course (syllabus) Math review (probability + linear algebra) The future
More informationconditional cdf, conditional pdf, total probability theorem?
6 Multiple Random Variables 6.0 INTRODUCTION scalar vs. random variable cdf, pdf transformation of a random variable conditional cdf, conditional pdf, total probability theorem expectation of a random
More informationChapter 5 continued. Chapter 5 sections
Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationParametric Techniques Lecture 3
Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to
More informationMath 180B Problem Set 3
Math 180B Problem Set 3 Problem 1. (Exercise 3.1.2) Solution. By the definition of conditional probabilities we have Pr{X 2 = 1, X 3 = 1 X 1 = 0} = Pr{X 3 = 1 X 2 = 1, X 1 = 0} Pr{X 2 = 1 X 1 = 0} = P
More informationJoint Gaussian Graphical Model Review Series I
Joint Gaussian Graphical Model Review Series I Probability Foundations Beilun Wang Advisor: Yanjun Qi 1 Department of Computer Science, University of Virginia http://jointggm.org/ June 23rd, 2017 Beilun
More informationDiscrete Mathematics and Probability Theory Fall 2015 Lecture 21
CS 70 Discrete Mathematics and Probability Theory Fall 205 Lecture 2 Inference In this note we revisit the problem of inference: Given some data or observations from the world, what can we infer about
More informationProbability Models for Bayesian Recognition
Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIAG / osig Second Semester 06/07 Lesson 9 0 arch 07 Probability odels for Bayesian Recognition Notation... Supervised Learning for Bayesian
More informationSDS 321: Introduction to Probability and Statistics
SDS 321: Introduction to Probability and Statistics Lecture 13: Expectation and Variance and joint distributions Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin
More informationChapter 4. Chapter 4 sections
Chapter 4 sections 4.1 Expectation 4.2 Properties of Expectations 4.3 Variance 4.4 Moments 4.5 The Mean and the Median 4.6 Covariance and Correlation 4.7 Conditional Expectation SKIP: 4.8 Utility Expectation
More informationECE 450 Homework #3. 1. Given the joint density function f XY (x,y) = 0.5 1<x<2, 2<y< <x<4, 2<y<3 0 else
ECE 450 Homework #3 0. Consider the random variables X and Y, whose values are a function of the number showing when a single die is tossed, as show below: Exp. Outcome 1 3 4 5 6 X 3 3 4 4 Y 0 1 3 4 5
More informationSingle Maths B: Introduction to Probability
Single Maths B: Introduction to Probability Overview Lecturer Email Office Homework Webpage Dr Jonathan Cumming j.a.cumming@durham.ac.uk CM233 None! http://maths.dur.ac.uk/stats/people/jac/singleb/ 1 Introduction
More informationIntroduction to Probability and Stocastic Processes - Part I
Introduction to Probability and Stocastic Processes - Part I Lecture 2 Henrik Vie Christensen vie@control.auc.dk Department of Control Engineering Institute of Electronic Systems Aalborg University Denmark
More informationParametric Techniques
Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure
More informationMultiple Random Variables
Multiple Random Variables This Version: July 30, 2015 Multiple Random Variables 2 Now we consider models with more than one r.v. These are called multivariate models For instance: height and weight An
More informationIntroduction to Bayesian Statistics
School of Computing & Communication, UTS January, 207 Random variables Pre-university: A number is just a fixed value. When we talk about probabilities: When X is a continuous random variable, it has a
More informationChapter 2. Probability
2-1 Chapter 2 Probability 2-2 Section 2.1: Basic Ideas Definition: An experiment is a process that results in an outcome that cannot be predicted in advance with certainty. Examples: rolling a die tossing
More informationFundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner
Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization
More information3. Probability and Statistics
FE661 - Statistical Methods for Financial Engineering 3. Probability and Statistics Jitkomut Songsiri definitions, probability measures conditional expectations correlation and covariance some important
More informationBayesian statistics, simulation and software
Module 1: Course intro and probability brush-up Department of Mathematical Sciences Aalborg University 1/22 Bayesian Statistics, Simulations and Software Course outline Course consists of 12 half-days
More informationData Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber
Data Modeling & Analysis Techniques Probability & Statistics Manfred Huber 2017 1 Probability and Statistics Probability and statistics are often used interchangeably but are different, related fields
More informationJoint Distributions. (a) Scalar multiplication: k = c d. (b) Product of two matrices: c d. (c) The transpose of a matrix:
Joint Distributions Joint Distributions A bivariate normal distribution generalizes the concept of normal distribution to bivariate random variables It requires a matrix formulation of quadratic forms,
More informationUniversity of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout 2:. The Multivariate Gaussian & Decision Boundaries
University of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout :. The Multivariate Gaussian & Decision Boundaries..15.1.5 1 8 6 6 8 1 Mark Gales mjfg@eng.cam.ac.uk Lent
More information5. Random Vectors. probabilities. characteristic function. cross correlation, cross covariance. Gaussian random vectors. functions of random vectors
EE401 (Semester 1) 5. Random Vectors Jitkomut Songsiri probabilities characteristic function cross correlation, cross covariance Gaussian random vectors functions of random vectors 5-1 Random vectors we
More informationSDS 321: Introduction to Probability and Statistics
SDS 321: Introduction to Probability and Statistics Lecture 14: Continuous random variables Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin www.cs.cmu.edu/
More informationLet X and Y denote two random variables. The joint distribution of these random
EE385 Class Notes 9/7/0 John Stensby Chapter 3: Multiple Random Variables Let X and Y denote two random variables. The joint distribution of these random variables is defined as F XY(x,y) = [X x,y y] P.
More informationEXPECTED VALUE of a RV. corresponds to the average value one would get for the RV when repeating the experiment, =0.
EXPECTED VALUE of a RV corresponds to the average value one would get for the RV when repeating the experiment, independently, infinitely many times. Sample (RIS) of n values of X (e.g. More accurately,
More informationProbability and Distributions
Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated
More informationOutline. Random Variables. Examples. Random Variable
Outline Random Variables M. Sami Fadali Professor of Electrical Engineering University of Nevada, Reno Random variables. CDF and pdf. Joint random variables. Correlated, independent, orthogonal. Correlation,
More informationLecture 2: From Linear Regression to Kalman Filter and Beyond
Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation
More information1 Probability theory. 2 Random variables and probability theory.
Probability theory Here we summarize some of the probability theory we need. If this is totally unfamiliar to you, you should look at one of the sources given in the readings. In essence, for the major
More informationBayesian Learning (II)
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Bayesian Learning (II) Niels Landwehr Overview Probabilities, expected values, variance Basic concepts of Bayesian learning MAP
More information