A6523 Modeling, Inference, and Mining Jim Cordes, Cornell University


Lecture 19: Modeling

Topics plan:
  Modeling (linear/non-linear least squares)
  Bayesian inference
  Bayesian approaches to spectral estimation; also prewhitening methods
  Optimization methods (needed for posterior PDFs, Bayes factors)

Reading: Ch 10, 11 and 3 in Gregory
For next week: Assignment 3a = group Bayesian project
References: Webpage:

ASTRONOMY 6523 Spring 2017 Problem Set 3a

This is a class project in its initial setup stages but each of you should do the numerical part individually. Here is the assignment in general terms:

1. Assemble a data set.
2. Pose two (or more!) hypotheses (models) for the data.
3. Set up the Bayesian inference problem for each model, i.e. set up the posterior PDF using reasonable prior PDFs for each model.
4. Compare models using the Bayesian odds ratio.

The method you should use is described in Chapter 3 of Gregory and in particular Section 3.5 about model comparison.

The particulars: Construct a data set comprising birthdays of your peers. Express these as a day number of the year. If you like, you could express this as a phase in the interval [0, 1]. (Don't get hung up on leap days.) I recommend that each of you get the birthdays of 10 of your family and friends, etc. Then pool them together into an aggregate data set.

Given the data set, compare two hypotheses about the occurrence of birthdays. As just an example, you might have these two hypotheses:

i) H1: that birthdays occur uniformly through the year.
ii) H2: that birthdays occur according to a PDF that is a constant + sinusoidal part; this is a three-parameter model (constant, amplitude and phase of the sinusoid), which can be challenging. You may want to come up with some two-parameter distribution.

You should begin your analysis by calculating, plotting and interpreting (in words) a histogram of the birthdays. You will have to choose a suitable binning interval. You should also calculate, plot and analyze the CDF of the data. For H2, calculate the marginalized PDFs and the 95% confidence interval for each parameter. Compare the two hypotheses by calculating the odds ratio

$$O_{12} = \frac{P(H_1 \mid D, I)}{P(H_2 \mid D, I)}.$$

Which hypothesis is favored by the data?
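The odds-ratio step can be sketched numerically. The following is only an illustration under stated assumptions, not the assignment's prescribed solution: H2 is taken to be the normalized density $1 + a\cos[2\pi(\phi - \phi_0)]$ on the phase interval [0, 1] (normalization fixes the constant, leaving amplitude $a$ and phase $\phi_0$ free), both parameters get uniform priors on [0, 1], the prior odds are taken as 1, and the marginalization is a simple grid average. The function name `odds_ratio_uniform_vs_sinusoid` and the two test data sets are hypothetical.

```python
import numpy as np

def odds_ratio_uniform_vs_sinusoid(phases, n_amp=101, n_phase=200):
    """Approximate O12 = P(H1|D,I) / P(H2|D,I) for birthday phases in [0, 1].

    Assumptions (not from the lecture): H2 density is 1 + a*cos(2*pi*(phi - phi0)),
    uniform priors on a and phi0 over [0, 1], and equal prior odds.
    """
    phases = np.asarray(phases, dtype=float)
    amp = np.linspace(0.0, 1.0, n_amp)
    phi0 = np.linspace(0.0, 1.0, n_phase, endpoint=False)
    # H2 density at every datum for every (a, phi0) grid point:
    # array shape is (n_amp, n_phase, n_data).
    arg = 2.0 * np.pi * (phases[None, None, :] - phi0[None, :, None])
    pdf = 1.0 + amp[:, None, None] * np.cos(arg)
    likelihood_h2 = pdf.prod(axis=2)
    # With unit-area uniform priors the evidence is the grid average.
    evidence_h2 = likelihood_h2.mean()
    evidence_h1 = 1.0  # uniform density on [0, 1] gives likelihood 1
    return evidence_h1 / evidence_h2

# Hypothetical data: evenly spread phases versus phases piled up at one date.
evenly_spread = (np.arange(24) + 0.5) / 24
clustered = np.full(24, 0.25)
odds_even = odds_ratio_uniform_vs_sinusoid(evenly_spread)
odds_clustered = odds_ratio_uniform_vs_sinusoid(clustered)
print(odds_even, odds_clustered)
```

An odds ratio above 1 favors the uniform hypothesis; for the evenly spread phases the extra parameters of H2 are penalized, while the clustered phases favor H2. For large data sets the product should be replaced by a sum of logarithms to avoid overflow.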

Linear Least Squares

Least Squares Fitting

Consider a general model for an observable quantity:

Data (n data points) = model for observable quantity (theory, k parameters) + additive errors (PDF possibly known)

For least squares, we need to know just the 1st and 2nd moments of the error PDF but for maximum likelihood analysis we need to know the PDF.

Symbolically,

$$y_i = \sum_{j=1}^{k} \theta_j X_{ij} + \epsilon_i, \qquad i = 1, \ldots, n.$$

In vector notation:

$$y = X\theta + \epsilon,$$

where

y = n × 1 vector
X = n × k matrix
θ = k × 1 vector
ε = n × 1 vector

Excluding the noise part, $y = X\theta$ is a set of n equations in k unknowns.

Nomenclature: X is the design matrix; θ is the parameter vector.

Example: A parabolic model for a time series $y_i$ with errors $\epsilon_i$:

$$y_i = \theta_1 + \theta_2 t_i + \theta_3 t_i^2 + \epsilon_i.$$

The model is linear in the parameters even though it is nonlinear in the independent variable, $t_i$. The design matrix and parameter vector are:

$$X = \begin{pmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ \vdots & \vdots & \vdots \\ 1 & t_n & t_n^2 \end{pmatrix}, \qquad \theta = \begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{pmatrix}.$$

Solution 1: No Errors

Suppose there are no errors. Then we must solve $y = X\theta$, a general class of matrix problems. The circumstances in which there are solutions to the problem depend on the rank of the matrix X and on the rank of the augmented matrix [X y].

If X is square with non-zero determinant, then the solution is simply $\theta = X^{-1} y$.

Usually, however, we (better) have more data points than parameters ($n > k$ or $n \gg k$). Then X is rectangular. For rectangular matrices we cannot find the simple inverse of X. But if $\det(X^\dagger X) \neq 0$, then $X^\dagger X$ has an inverse.
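As a small illustration of the design matrix and the rank conditions just described, the sketch below builds X for the parabolic model and checks the ranks of X and of the augmented matrix [X y] for noise-free data. The sample times, the parameter values in `theta_true`, and the helper name `parabola_design_matrix` are illustrative assumptions.

```python
import numpy as np

def parabola_design_matrix(t):
    """Design matrix with columns 1, t, t^2 (one row per sample time)."""
    t = np.asarray(t, dtype=float)
    return np.column_stack([np.ones_like(t), t, t**2])

t = np.linspace(0.0, 4.0, 9)              # n = 9 sample times (illustrative)
theta_true = np.array([1.0, -2.0, 0.5])   # k = 3 parameters (illustrative)
X = parabola_design_matrix(t)
y = X @ theta_true                        # noise-free data

r = np.linalg.matrix_rank(X)
r_aug = np.linalg.matrix_rank(np.column_stack([X, y]))
det_XtX = np.linalg.det(X.T @ X)
print(X.shape, r, r_aug, det_XtX)
```

Here the ranks of X and [X y] both equal k = 3 and the determinant of the k × k product matrix is non-zero, which is the situation in which a unique solution exists.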

Therefore the solution to $y = X\theta$ is found by premultiplying by $X^\dagger$,

$$X^\dagger y = X^\dagger X \theta,$$

and then multiplying by the inverse of $X^\dagger X$,

$$(X^\dagger X)^{-1} X^\dagger y = (X^\dagger X)^{-1} (X^\dagger X)\,\theta = \theta,$$

which yields the unique solution

$$\theta = (X^\dagger X)^{-1} X^\dagger y.$$

Solution 2: The case with measurement errors

The actual problem is

$$y = X\theta + \epsilon \equiv \hat{y} + \epsilon.$$

The errors break down the uniqueness of the solution. There is possibly an infinite number of solutions for θ. In some sense we want the best estimate, according to criteria we have already discussed.

Typical situation: $n \gg k$. There are many more data points than parameters ⇒ solvable problem.

We now obtain an estimate $\hat\theta$ for θ based on least squares. By estimating θ we are also, in effect, estimating the errors, $\epsilon = y - X\theta$. We want to minimize the errors in a statistical sense.
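A minimal numerical check of the noise-free solution $\theta = (X^\dagger X)^{-1} X^\dagger y$ follows. It is a sketch under assumed values: the sample times and `theta_true` are made up, and `solve_normal_equations` is a hypothetical helper that solves the linear system rather than forming the inverse explicitly, which is the usual numerical practice.

```python
import numpy as np

def solve_normal_equations(X, y):
    """Solve (X^T X) theta = X^T y for theta (real-valued X, so dagger = transpose)."""
    return np.linalg.solve(X.T @ X, X.T @ y)

t = np.arange(6, dtype=float)
X = np.column_stack([np.ones_like(t), t, t**2])   # n = 6, k = 3
theta_true = np.array([2.0, 0.5, -0.25])          # illustrative values
y = X @ theta_true                                # no errors added

theta = solve_normal_equations(X, y)
print(theta)
```

With no errors in the data the recovered parameters agree with the input values to rounding precision.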

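Therefore, we minimize the inner product (a scalar),

$$Q(\theta) = \sum_{j=1}^{n} \epsilon_j^2 \equiv \sum_{j=1}^{n} (y - \hat{y})_j^2.$$

We can write this as a quadratic form using the identity matrix I:

$$Q(\theta) = \epsilon^\dagger I \epsilon.$$

We have put the identity matrix into the equation. We can put any Toeplitz matrix here to get a general quadratic form and we will see later (for weighted least squares) that the covariance matrix will appear in this form.

Solution:

$$\begin{aligned}
Q(\theta) &= (y - X\theta)^\dagger (y - X\theta) \\
&= [y^\dagger - (X\theta)^\dagger](y - X\theta) \\
&= y^\dagger y - (X\theta)^\dagger y - y^\dagger X\theta + (X\theta)^\dagger X\theta \\
&= y^\dagger y - \theta^\dagger X^\dagger y - y^\dagger X\theta + \theta^\dagger X^\dagger X\theta.
\end{aligned}$$

Note that $(y^\dagger X\theta)^\dagger = \theta^\dagger X^\dagger y$ and the transpose of a scalar is the same scalar. Thus

$$Q(\theta) = y^\dagger y - 2\theta^\dagger X^\dagger y + \theta^\dagger (X^\dagger X)\theta.$$

Now minimize with respect to θ to get the estimator $\hat\theta$:

$$\frac{dQ}{d\theta} = \left(\text{vector of } \frac{dQ}{d\theta_i},\ i = 1, \ldots, k\right) = \text{gradient}.$$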

Thus

$$\left.\frac{dQ}{d\theta}\right|_{\theta = \hat\theta} = 0 \;\Rightarrow\; -2X^\dagger y + 2(X^\dagger X)\hat\theta = 0 \;\Rightarrow\; (X^\dagger X)\hat\theta = X^\dagger y$$

and, if the inverse exists,

$$\hat\theta = (X^\dagger X)^{-1} X^\dagger y.$$

This solution is the same answer as for the error-less case. But here, we have an estimate that is not unique; it is one among many models that may be consistent with the data; it just happens to be the one with the minimum least-squares error over an ensemble. For a given data set (a specific realization), the best parameter set may differ from the one that gives the least-squares error.

Notes: when the errors have Gaussian statistics, the least-squares solution is identical to the maximum-likelihood solution.

We can also write the equation to be solved as the normal equations, where normal here means orthogonal:

$$X^\dagger (y - X\hat\theta) = X^\dagger (y - \hat{y}) \equiv X^\dagger R = 0, \qquad \text{or equivalently the inner product } R^\dagger X = 0,$$

where $R = y - \hat{y}$ are the residuals. Thus the residuals R, which are the errors in estimating the data, are orthogonal to the columns of X.
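The orthogonality of the residuals to the columns of X can be verified numerically. The sketch below uses synthetic data with an assumed noise level of 0.1 and a fixed random seed; it also compares the normal-equations estimate with `numpy.linalg.lstsq`, which solves the same least-squares problem by a different (SVD-based) route.

```python
import numpy as np

rng = np.random.default_rng(0)                    # fixed seed for repeatability
t = np.linspace(-1.0, 1.0, 50)
X = np.column_stack([np.ones_like(t), t, t**2])   # parabolic design matrix
theta_true = np.array([1.0, 2.0, -3.0])           # illustrative values
y = X @ theta_true + rng.normal(scale=0.1, size=t.size)

# Least-squares estimate from the normal equations (X^T X) theta = X^T y.
theta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Residuals R = y - y_hat; X^T R should vanish to rounding error.
residuals = y - X @ theta_hat
orthogonality = X.T @ residuals

# Cross-check against NumPy's least-squares solver.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta_hat, orthogonality)
```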

Matrices:

matrix rank = dimension of the largest square submatrix with determinant ≠ 0. (An n × n matrix with det = 0 has rank < n and is said to be singular.)

Augmented matrix: [X y]

Let r = rank X and $r_{\rm aug}$ = rank [X y]. Then if

i. $r_{\rm aug} > r$ ⇒ no solution (we will assume we never have this case)
ii. $r_{\rm aug} = r = k$ = no. of unknowns (θ) ⇒ one solution
iii. $r_{\rm aug} = r < k$ ⇒ can solve for r unknowns after assigning arbitrary values to k − r of the unknowns

More specifically, we can have

1. $n < k$ ⇒ rank of X = $r \le n < k$ ⇒ infinite number of solutions (not enough equations for the number of unknowns).
2. $n = k$ (square matrix X). Now a possibility is $r = k$: if $\det X \neq 0$ ⇒ $X^{-1}$ exists ⇒ unique solution $\theta = X^{-1} y$.
3. $n > k$ ⇒ r = rank of X ≤ k (more data points than parameters): $r = k$ is again possible but now X does not have an inverse because it is not square. However, the matrix $X^\dagger X$ = (k × n)(n × k) = k × k matrix (square), where † simply means transpose, has an inverse if $\det(X^\dagger X) \neq 0$.

Derivatives: We use the results

$$\frac{d}{db}(b^\dagger c) = c \qquad \text{and} \qquad \frac{d}{db}(b^\dagger A b) = 2Ab \quad \text{if } A = A^\dagger.$$

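Linear Least Squares: Parameter Errors

First consider the special case where errors in the data, ε, have a diagonal covariance matrix with all diagonal entries equal:

$$\langle \epsilon \epsilon^\dagger \rangle = \sigma^2 I,$$

where I is the identity matrix. [Note that ε is an n × 1 matrix so the covariance matrix, $\langle \epsilon \epsilon^\dagger \rangle$, is an n × n matrix.]

We want to find the estimation errors in the parameters $\hat\theta$. Let

$$P \equiv \langle (\hat\theta - \theta)(\hat\theta - \theta)^\dagger \rangle,$$

which is a k × k matrix of correlation values between the different parameters; i.e. this is the covariance matrix of the parameters.

We have $\hat\theta = (X^\dagger X)^{-1} X^\dagger y$, as before. Substituting $y = X\theta + \epsilon$ we find that

$$\hat\theta = \theta + (X^\dagger X)^{-1} X^\dagger \epsilon.$$

Defining $B \equiv X^\dagger X$, we find that

$$\hat\theta - \theta = B^{-1} X^\dagger \epsilon.$$

Therefore the covariance matrix of the parameters is

$$\begin{aligned}
P &= \langle (B^{-1} X^\dagger \epsilon)(B^{-1} X^\dagger \epsilon)^\dagger \rangle \\
&= \langle (B^{-1} X^\dagger \epsilon)(\epsilon^\dagger X B^{-1}) \rangle \\
&= B^{-1} X^\dagger \langle \epsilon \epsilon^\dagger \rangle X B^{-1} \\
&= B^{-1} X^\dagger \sigma^2 I X B^{-1} \\
&= \sigma^2 B^{-1} = \sigma^2 (X^\dagger X)^{-1},
\end{aligned}$$

where we have used $B^\dagger = B$, which implies $(B^{-1})^\dagger = B^{-1}$. Thus,

$$P = \sigma^2 (X^\dagger X)^{-1}.$$

The error in each parameter with respect to the true parameter value is

$$\sigma_{\theta_j} = \sqrt{P_{jj}} = \sigma \left[ (X^\dagger X)^{-1} \right]_{jj}^{1/2}$$

and the correlation coefficient between two parameters is

$$\rho_{\theta_j \theta_k} \equiv \frac{P_{jk}}{\sigma_{\theta_j} \sigma_{\theta_k}} = \frac{P_{jk}}{\sqrt{P_{jj} P_{kk}}}.$$

In the ideal case, the parameters would be uncorrelated, so $\rho_{\theta_j \theta_k} = 0$ for $j \neq k$.

Modeling Examples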
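The parameter covariance $P = \sigma^2 (X^\dagger X)^{-1}$ and the correlation coefficients can be computed directly. The following sketch assumes a straight-line model on made-up sample times with an assumed noise level σ = 0.5; `parameter_covariance` and `correlation_matrix` are hypothetical helper names.

```python
import numpy as np

def parameter_covariance(X, sigma):
    """P = sigma^2 (X^T X)^{-1}, valid for white errors with variance sigma^2."""
    return sigma**2 * np.linalg.inv(X.T @ X)

def correlation_matrix(P):
    """rho_jk = P_jk / sqrt(P_jj P_kk)."""
    s = np.sqrt(np.diag(P))
    return P / np.outer(s, s)

t = np.linspace(0.0, 10.0, 21)               # illustrative sample times
X = np.column_stack([np.ones_like(t), t])    # straight-line design matrix
sigma = 0.5                                  # assumed noise standard deviation

P = parameter_covariance(X, sigma)
param_errors = np.sqrt(np.diag(P))           # sigma_theta_j = sqrt(P_jj)
rho = correlation_matrix(P)
print(param_errors, rho[0, 1])
```

For these strictly positive sample times the intercept and slope come out negatively correlated, and the slope error equals σ divided by the root of the summed squared deviations of t from its mean.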

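Least Squares Examples

I. A polynomial model for a time series $y_i$ with errors $\epsilon_i$:

$$y_i = \sum_{j=1}^{k} X_{ij}\theta_j + \epsilon_i = \sum_{j=1}^{k} t_i^{\,j-1}\theta_j + \epsilon_i.$$

The model is linear in the parameters even though it is nonlinear in the independent variable, $t_i$. If the polynomial order is p, then $k = p + 1$ and the design matrix and parameter vector are:

$$X = \begin{pmatrix} 1 & t_1 & t_1^2 & \cdots & t_1^p \\ 1 & t_2 & t_2^2 & \cdots & t_2^p \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & t_n & t_n^2 & \cdots & t_n^p \end{pmatrix}, \qquad \theta = \begin{pmatrix} \theta_1 \\ \theta_2 \\ \vdots \\ \theta_{p+1} \end{pmatrix}.$$

Define

$$T_k = \sum_{j=1}^{n} t_j^k \qquad \text{and} \qquad \overline{t^k y} = \frac{1}{n}\sum_{i=1}^{n} t_i^k y_i.$$

Then the product of the design matrix with itself is the k × k = (p + 1) × (p + 1) matrix

$$X^\dagger X = \begin{pmatrix} T_0 & T_1 & T_2 & \cdots & T_p \\ T_1 & T_2 & T_3 & \cdots & T_{p+1} \\ T_2 & T_3 & T_4 & \cdots & T_{p+2} \\ \vdots & \vdots & \vdots & & \vdots \\ T_p & T_{p+1} & T_{p+2} & \cdots & T_{2p} \end{pmatrix}$$

and

$$X^\dagger y = X^\dagger \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} t_i y_i \\ \vdots \\ \sum_{i=1}^{n} t_i^p y_i \end{pmatrix} = n \begin{pmatrix} \overline{y} \\ \overline{t y} \\ \vdots \\ \overline{t^p y} \end{pmatrix}.$$

The least-squares solution $\hat\theta = (X^\dagger X)^{-1} X^\dagger y$ requires the inverse of $X^\dagger X$, which will exist if the determinant is nonzero.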

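First-order polynomial: something we can easily solve.

$$y_i = \theta_1 + \theta_2 t_i$$

$$X^\dagger X = \begin{pmatrix} T_0 & T_1 \\ T_1 & T_2 \end{pmatrix}$$

$$(X^\dagger X)^{-1} = \frac{1}{\det(X^\dagger X)} \times (\text{matrix of cofactors}) = \frac{1}{T_0 T_2 - T_1^2} \begin{pmatrix} T_2 & -T_1 \\ -T_1 & T_0 \end{pmatrix}$$

Then, using $T_0 = n$,

$$\hat\theta = (X^\dagger X)^{-1} X^\dagger y = \frac{n}{T_0 T_2 - T_1^2} \begin{pmatrix} T_2 & -T_1 \\ -T_1 & T_0 \end{pmatrix} \begin{pmatrix} \overline{y} \\ \overline{ty} \end{pmatrix} = \frac{n}{T_0 T_2 - T_1^2} \begin{pmatrix} \overline{y}\,T_2 - \overline{ty}\,T_1 \\ -\overline{y}\,T_1 + \overline{ty}\,T_0 \end{pmatrix} = \frac{n}{n T_2 - T_1^2} \begin{pmatrix} \overline{y}\,T_2 - \overline{ty}\,T_1 \\ -\overline{y}\,T_1 + \overline{ty}\,n \end{pmatrix}.$$

So the individual parameters are

$$\hat\theta_1 = \frac{n\,(\overline{y}\,T_2 - \overline{ty}\,T_1)}{n T_2 - T_1^2} \qquad \text{and} \qquad \hat\theta_2 = \frac{n\,(n\,\overline{ty} - \overline{y}\,T_1)}{n T_2 - T_1^2}.$$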
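The closed-form expressions for $\hat\theta_1$ and $\hat\theta_2$ translate directly into code. This is a sketch with made-up data values; `fit_line_closed_form` is a hypothetical name, and `numpy.polyfit` is used only as an independent cross-check (it returns the slope first, then the intercept).

```python
import numpy as np

def fit_line_closed_form(t, y):
    """Intercept and slope from the closed-form first-order least-squares solution."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = t.size
    T1, T2 = t.sum(), (t**2).sum()          # T_k = sum of t^k
    ybar, tybar = y.mean(), (t * y).mean()  # averages carry the 1/n factor
    denom = n * T2 - T1**2
    theta1 = n * (ybar * T2 - tybar * T1) / denom
    theta2 = n * (n * tybar - ybar * T1) / denom
    return theta1, theta2

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.9, 5.2, 7.1, 8.8])     # illustrative, roughly 1 + 2 t
theta1, theta2 = fit_line_closed_form(t, y)
slope_ref, intercept_ref = np.polyfit(t, y, 1)
print(theta1, theta2)
```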

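Assuming the errors $\epsilon_i$ are stationary and statistically independent with variance $\langle \epsilon_i^2 \rangle = \sigma^2$, the covariance matrix of the parameters is

$$P = \sigma^2 (X^\dagger X)^{-1} = \frac{\sigma^2}{n T_2 - T_1^2} \begin{pmatrix} T_2 & -T_1 \\ -T_1 & n \end{pmatrix},$$

where

$$\sigma_{\theta_1} = \sigma \left( \frac{T_2}{n T_2 - T_1^2} \right)^{1/2}, \qquad \sigma_{\theta_2} = \sigma \left( \frac{n}{n T_2 - T_1^2} \right)^{1/2}, \qquad \rho_{\theta_1 \theta_2} = -\frac{T_1}{\sqrt{n T_2}} \quad \text{(negatively correlated)}.$$

For $n \gg 1$ and uniform sampling $t_i = i$, $i = 1, \ldots, n$, we have $T_1 \approx n^2/2$ and $T_2 \approx n^3/3$, so

$$\rho_{\theta_1 \theta_2} \approx -\frac{n^2/2}{\sqrt{n \cdot n^3/3}} = -\frac{\sqrt{3}}{2} \approx -0.87 \quad \text{(highly anticorrelated)}.$$

The anticorrelation means that any error in one parameter is compensated by the error in the other.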
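The large-n limit of the correlation coefficient can be checked numerically. The sketch below takes n = 1000 as an illustrative value and evaluates $\rho_{\theta_1\theta_2}$ both from the closed form $-T_1/\sqrt{nT_2}$ and from the inverse of the product matrix; `line_fit_correlation` is a hypothetical helper name.

```python
import numpy as np

def line_fit_correlation(t):
    """Correlation of intercept and slope errors, rho = -T1 / sqrt(n T2)."""
    t = np.asarray(t, dtype=float)
    return -t.sum() / np.sqrt(t.size * (t**2).sum())

n = 1000
t = np.arange(1, n + 1, dtype=float)     # uniform sampling t_i = i
rho_closed_form = line_fit_correlation(t)

# Same quantity from the covariance matrix (sigma^2 cancels in the ratio).
X = np.column_stack([np.ones(n), t])
P = np.linalg.inv(X.T @ X)
rho_from_matrix = P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])
print(rho_closed_form, rho_from_matrix, -np.sqrt(3) / 2)
```

Both routes give a value within about 10^-3 of −√3/2 for this n, independent of the noise level.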

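Better parameterization for the first-order polynomial: orthogonal polynomials. E.g.

$$y_i = \theta_1 + \theta_2 (t_i - \bar{t}) \qquad \text{where} \qquad \bar{t} = \frac{1}{n}\sum_{i=1}^{n} t_i.$$

Now the design matrix and the various products are

$$X = \begin{pmatrix} 1 & t_1 - \bar{t} \\ 1 & t_2 - \bar{t} \\ \vdots & \vdots \\ 1 & t_n - \bar{t} \end{pmatrix}, \qquad X^\dagger X = \begin{pmatrix} T_0 & T_1 \\ T_1 & T_2 \end{pmatrix} = \begin{pmatrix} T_0 & 0 \\ 0 & T_2 \end{pmatrix}, \qquad (X^\dagger X)^{-1} = \frac{1}{T_0 T_2} \begin{pmatrix} T_2 & 0 \\ 0 & T_0 \end{pmatrix},$$

where the $T_k$ are now sums of powers of $t_i - \bar{t}$, so that $T_1 = 0$, and the solution is now

$$\hat\theta_1 = \overline{y}, \qquad \hat\theta_2 = \frac{n\,\overline{(t - \bar{t})\,y}}{T_2}.$$

The error on the slope $\theta_2$ is the same as before (the error on $\theta_1$, now the model value at $t = \bar{t}$, is $\sigma/\sqrt{n}$), and the parameters are now uncorrelated, $\rho_{\theta_1 \theta_2} = 0$.

II. Sinusoids

Consider the linear model $y = X\theta + \epsilon$ where X comprises complex exponentials and the parameter vector comprises Fourier amplitudes:

$$X_{nm} = e^{2\pi i n m / N} \equiv W_N^{nm}, \qquad n = 0, \ldots, N-1, \quad m = 0, \ldots, k-1,$$

where $W_N \equiv e^{2\pi i / N}$ is the Nth root of 1 on the unit circle.

$$X = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & W_N & W_N^2 & \cdots & W_N^{k-1} \\ 1 & W_N^2 & W_N^4 & \cdots & W_N^{2(k-1)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & W_N^{N-1} & W_N^{2(N-1)} & \cdots & W_N^{(N-1)(k-1)} \end{pmatrix}$$

For k = N we have

$$X = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & W_N & W_N^2 & \cdots & W_N^{N-1} \\ 1 & W_N^2 & W_N^4 & \cdots & W_N^{2(N-1)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & W_N^{N-1} & W_N^{2(N-1)} & \cdots & W_N^{(N-1)^2} \end{pmatrix}.$$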

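The product matrix is

$$X^\dagger X = \begin{pmatrix} s_0 & s_1 & \cdots & s_{N-1} \\ s_{-1} & s_0 & \cdots & s_{N-2} \\ s_{-2} & s_{-1} & \cdots & s_{N-3} \\ \vdots & \vdots & & \vdots \\ s_{-(N-1)} & s_{-(N-2)} & \cdots & s_0 \end{pmatrix} \qquad \text{where} \qquad s_p \equiv \sum_{j=0}^{N-1} W_N^{pj}.$$

The off-diagonal terms all sum to zero because the sums are over integer multiples of the periods of the complex sinusoids. Therefore

$$X^\dagger X = N I \qquad \text{and} \qquad (X^\dagger X)^{-1} = N^{-1} I.$$

We also have

$$X^\dagger y = \begin{pmatrix} \sum_{j=0}^{N-1} y_j \\ \sum_{j=0}^{N-1} W_N^{-j}\, y_j \\ \sum_{j=0}^{N-1} W_N^{-2j}\, y_j \\ \vdots \\ \sum_{j=0}^{N-1} W_N^{-(N-1)j}\, y_j \end{pmatrix}.$$

The least-squares coefficients are then

$$\hat\theta = (X^\dagger X)^{-1} X^\dagger y = N^{-1} X^\dagger y,$$

which is just the DFT of y expressed in vector form. The parameter covariance matrix is

$$P = \sigma^2 (X^\dagger X)^{-1} = N^{-1} \sigma^2 I.$$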
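The identification of the least-squares coefficients with the DFT can be confirmed numerically. The sketch below takes N = 8 and random test data as illustrative choices; it relies on the fact that `numpy.fft.fft` uses the exponent sign convention $e^{-2\pi i nm/N}$ with no 1/N factor, which matches $X^\dagger y$ for the design matrix defined above.

```python
import numpy as np

N = 8
n = np.arange(N)
X = np.exp(2j * np.pi * np.outer(n, n) / N)   # X_nm = W_N^(n m), with k = N

rng = np.random.default_rng(1)                # fixed seed; data are arbitrary
y = rng.normal(size=N)

gram = X.conj().T @ X                         # X-dagger X, expected to equal N * I
theta_hat = X.conj().T @ y / N                # N^-1 X-dagger y
dft = np.fft.fft(y) / N                       # DFT with the same normalization
print(np.abs(theta_hat - dft).max())
```

Because k = N here, the model reproduces the data exactly: `X @ theta_hat` returns `y` to rounding error.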

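Example of a bad model: Consider

$$y_i = \theta_1 + \theta_2 x_i + a \sin(\theta_3 x_i) \;\xrightarrow{\text{linearize}}\; \theta_1 + \theta_2 x_i + a\, x_i \cos(\theta_3 x_i)\, \theta_3,$$

where the parameters of the linearized function are $\theta_1$, $\theta_2$, $\theta_3' = a\,\theta_3$. The design matrix and product matrix are

$$X = \begin{pmatrix} 1 & x_1 & x_1 \cos\theta_3 x_1 \\ 1 & x_2 & x_2 \cos\theta_3 x_2 \\ \vdots & \vdots & \vdots \\ 1 & x_N & x_N \cos\theta_3 x_N \end{pmatrix}$$

$$X^\dagger X = \begin{pmatrix} N & \sum_i x_i & \sum_i x_i \cos\theta_3 x_i \\ \sum_i x_i & \sum_i x_i^2 & \sum_i x_i^2 \cos\theta_3 x_i \\ \sum_i x_i \cos\theta_3 x_i & \sum_i x_i^2 \cos\theta_3 x_i & \sum_i x_i^2 (\cos\theta_3 x_i)^2 \end{pmatrix}$$

For the case where $\theta_3 x_i \ll 1$ for all $x_i$, the elements involving the cosine satisfy $\cos\theta_3 x_i \approx 1 - (\theta_3 x_i)^2/2$, so for very small $\theta_3 x_i$ the cosine factors will be very close to unity. The elements in the matrix are then degenerate with neighboring elements because the sine term in the model is degenerate with the linear term. In this case the design matrix is ill-conditioned and the determinant of $X^\dagger X$ is ≈ 0.

For cases like this the fitting function should be redefined or singular value decomposition may be used.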
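The degeneracy can be made quantitative with the condition number of the design matrix. In the sketch below the sample points, the two values of $\theta_3$ (10^-3 and 3), and the data `y` are illustrative assumptions. The last step uses `numpy.linalg.lstsq`, which is SVD-based and discards singular values smaller than `rcond` times the largest one; the cutoff 10^-6 is an arbitrary choice.

```python
import numpy as np

def bad_model_design_matrix(x, theta3):
    """Columns 1, x, x*cos(theta3*x) of the linearized model."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, x * np.cos(theta3 * x)])

x = np.linspace(0.1, 1.0, 50)                     # illustrative sample points

# theta3 * x << 1: the third column nearly duplicates the second.
cond_small = np.linalg.cond(bad_model_design_matrix(x, 1e-3))
# theta3 * x of order unity: the columns are clearly distinct.
cond_large = np.linalg.cond(bad_model_design_matrix(x, 3.0))

# SVD-based least squares on the ill-conditioned matrix.
X = bad_model_design_matrix(x, 1e-3)
y = 1.0 + 2.0 * x                                 # illustrative noise-free data
theta_svd, _, rank, sing_vals = np.linalg.lstsq(X, y, rcond=1e-6)
print(cond_small, cond_large, rank, sing_vals)
```

The fit still reproduces the data, but the coefficients of the two nearly identical columns are not separately meaningful, which is the practical symptom of the degeneracy described above.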

Frequentist-Bayesian Model Comparisons: A Simple Example

Frequentist-Bayesian Model Comparisons: A Simple Example Frequentist-Bayesian Model Comparisons: A Simple Example Consider data that consist of a signal y with additive noise: Data vector (N elements): D = y + n The additive noise n has zero mean and diagonal

More information

1 Determinants. 1.1 Determinant

1 Determinants. 1.1 Determinant 1 Determinants [SB], Chapter 9, p.188-196. [SB], Chapter 26, p.719-739. Bellow w ll study the central question: which additional conditions must satisfy a quadratic matrix A to be invertible, that is to

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 1 x 2. x n 8 (4) 3 4 2

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 1 x 2. x n 8 (4) 3 4 2 MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS SYSTEMS OF EQUATIONS AND MATRICES Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of Index* The Statistical Analysis of Time Series by T. W. Anderson Copyright 1971 John Wiley & Sons, Inc. Aliasing, 387-388 Autoregressive {continued) Amplitude, 4, 94 case of first-order, 174 Associated

More information

VAR Model. (k-variate) VAR(p) model (in the Reduced Form): Y t-2. Y t-1 = A + B 1. Y t + B 2. Y t-p. + ε t. + + B p. where:

VAR Model. (k-variate) VAR(p) model (in the Reduced Form): Y t-2. Y t-1 = A + B 1. Y t + B 2. Y t-p. + ε t. + + B p. where: VAR Model (k-variate VAR(p model (in the Reduced Form: where: Y t = A + B 1 Y t-1 + B 2 Y t-2 + + B p Y t-p + ε t Y t = (y 1t, y 2t,, y kt : a (k x 1 vector of time series variables A: a (k x 1 vector

More information

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna

More information

ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process

ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process Department of Electrical Engineering University of Arkansas ELEG 3143 Probability & Stochastic Process Ch. 6 Stochastic Process Dr. Jingxian Wu wuj@uark.edu OUTLINE 2 Definition of stochastic process (random

More information

Lecture 6. Numerical methods. Approximation of functions

Lecture 6. Numerical methods. Approximation of functions Lecture 6 Numerical methods Approximation of functions Lecture 6 OUTLINE 1. Approximation and interpolation 2. Least-square method basis functions design matrix residual weighted least squares normal equation

More information

Review (Probability & Linear Algebra)

Review (Probability & Linear Algebra) Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint

More information

Lecture 15 Review of Matrix Theory III. Dr. Radhakant Padhi Asst. Professor Dept. of Aerospace Engineering Indian Institute of Science - Bangalore

Lecture 15 Review of Matrix Theory III. Dr. Radhakant Padhi Asst. Professor Dept. of Aerospace Engineering Indian Institute of Science - Bangalore Lecture 15 Review of Matrix Theory III Dr. Radhakant Padhi Asst. Professor Dept. of Aerospace Engineering Indian Institute of Science - Bangalore Matrix An m n matrix is a rectangular or square array of

More information

Introduction to Mobile Robotics Compact Course on Linear Algebra. Wolfram Burgard, Bastian Steder

Introduction to Mobile Robotics Compact Course on Linear Algebra. Wolfram Burgard, Bastian Steder Introduction to Mobile Robotics Compact Course on Linear Algebra Wolfram Burgard, Bastian Steder Reference Book Thrun, Burgard, and Fox: Probabilistic Robotics Vectors Arrays of numbers Vectors represent

More information

Chapter Two Elements of Linear Algebra

Chapter Two Elements of Linear Algebra Chapter Two Elements of Linear Algebra Previously, in chapter one, we have considered single first order differential equations involving a single unknown function. In the next chapter we will begin to

More information

Ch. 12 Linear Bayesian Estimators

Ch. 12 Linear Bayesian Estimators Ch. 1 Linear Bayesian Estimators 1 In chapter 11 we saw: the MMSE estimator takes a simple form when and are jointly Gaussian it is linear and used only the 1 st and nd order moments (means and covariances).

More information

Gaussian Elimination and Back Substitution

Gaussian Elimination and Back Substitution Jim Lambers MAT 610 Summer Session 2009-10 Lecture 4 Notes These notes correspond to Sections 31 and 32 in the text Gaussian Elimination and Back Substitution The basic idea behind methods for solving

More information

Math Camp II. Basic Linear Algebra. Yiqing Xu. Aug 26, 2014 MIT

Math Camp II. Basic Linear Algebra. Yiqing Xu. Aug 26, 2014 MIT Math Camp II Basic Linear Algebra Yiqing Xu MIT Aug 26, 2014 1 Solving Systems of Linear Equations 2 Vectors and Vector Spaces 3 Matrices 4 Least Squares Systems of Linear Equations Definition A linear

More information

Linear Algebra Review (Course Notes for Math 308H - Spring 2016)

Linear Algebra Review (Course Notes for Math 308H - Spring 2016) Linear Algebra Review (Course Notes for Math 308H - Spring 2016) Dr. Michael S. Pilant February 12, 2016 1 Background: We begin with one of the most fundamental notions in R 2, distance. Letting (x 1,

More information

Review of Linear Algebra

Review of Linear Algebra Review of Linear Algebra Definitions An m n (read "m by n") matrix, is a rectangular array of entries, where m is the number of rows and n the number of columns. 2 Definitions (Con t) A is square if m=

More information

ANALYTICAL MATHEMATICS FOR APPLICATIONS 2018 LECTURE NOTES 3

ANALYTICAL MATHEMATICS FOR APPLICATIONS 2018 LECTURE NOTES 3 ANALYTICAL MATHEMATICS FOR APPLICATIONS 2018 LECTURE NOTES 3 ISSUED 24 FEBRUARY 2018 1 Gaussian elimination Let A be an (m n)-matrix Consider the following row operations on A (1) Swap the positions any

More information

EE731 Lecture Notes: Matrix Computations for Signal Processing

EE731 Lecture Notes: Matrix Computations for Signal Processing EE731 Lecture Notes: Matrix Computations for Signal Processing James P. Reilly c Department of Electrical and Computer Engineering McMaster University September 22, 2005 0 Preface This collection of ten

More information

Probability Space. J. McNames Portland State University ECE 538/638 Stochastic Signals Ver

Probability Space. J. McNames Portland State University ECE 538/638 Stochastic Signals Ver Stochastic Signals Overview Definitions Second order statistics Stationarity and ergodicity Random signal variability Power spectral density Linear systems with stationary inputs Random signal memory Correlation

More information

Conceptual Questions for Review

Conceptual Questions for Review Conceptual Questions for Review Chapter 1 1.1 Which vectors are linear combinations of v = (3, 1) and w = (4, 3)? 1.2 Compare the dot product of v = (3, 1) and w = (4, 3) to the product of their lengths.

More information

Chapter 7. Linear Algebra: Matrices, Vectors,

Chapter 7. Linear Algebra: Matrices, Vectors, Chapter 7. Linear Algebra: Matrices, Vectors, Determinants. Linear Systems Linear algebra includes the theory and application of linear systems of equations, linear transformations, and eigenvalue problems.

More information

Linear Systems and Matrices

Linear Systems and Matrices Department of Mathematics The Chinese University of Hong Kong 1 System of m linear equations in n unknowns (linear system) a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.......

More information

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis Lecture 3 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2015 http://www.astro.cornell.edu/~cordes/a6523 Lecture 23:! Nonlinear least squares!! Notes Modeling2015.pdf on course

More information

Topics. Vectors (column matrices): Vector addition and scalar multiplication The matrix of a linear function y Ax The elements of a matrix A : A ij

Topics. Vectors (column matrices): Vector addition and scalar multiplication The matrix of a linear function y Ax The elements of a matrix A : A ij Topics Vectors (column matrices): Vector addition and scalar multiplication The matrix of a linear function y Ax The elements of a matrix A : A ij or a ij lives in row i and column j Definition of a matrix

More information

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2015 http://www.astro.cornell.edu/~cordes/a6523 Lecture 4 See web page later tomorrow Searching for Monochromatic Signals

More information

Vectors and Matrices Statistics with Vectors and Matrices

Vectors and Matrices Statistics with Vectors and Matrices Vectors and Matrices Statistics with Vectors and Matrices Lecture 3 September 7, 005 Analysis Lecture #3-9/7/005 Slide 1 of 55 Today s Lecture Vectors and Matrices (Supplement A - augmented with SAS proc

More information

Chapter 6. Random Processes

Chapter 6. Random Processes Chapter 6 Random Processes Random Process A random process is a time-varying function that assigns the outcome of a random experiment to each time instant: X(t). For a fixed (sample path): a random process

More information

1 Multiply Eq. E i by λ 0: (λe i ) (E i ) 2 Multiply Eq. E j by λ and add to Eq. E i : (E i + λe j ) (E i )

1 Multiply Eq. E i by λ 0: (λe i ) (E i ) 2 Multiply Eq. E j by λ and add to Eq. E i : (E i + λe j ) (E i ) Direct Methods for Linear Systems Chapter Direct Methods for Solving Linear Systems Per-Olof Persson persson@berkeleyedu Department of Mathematics University of California, Berkeley Math 18A Numerical

More information

Introduction to Mobile Robotics Compact Course on Linear Algebra. Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz

Introduction to Mobile Robotics Compact Course on Linear Algebra. Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Introduction to Mobile Robotics Compact Course on Linear Algebra Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Vectors Arrays of numbers Vectors represent a point in a n dimensional space

More information

Stochastic Processes. M. Sami Fadali Professor of Electrical Engineering University of Nevada, Reno

Stochastic Processes. M. Sami Fadali Professor of Electrical Engineering University of Nevada, Reno Stochastic Processes M. Sami Fadali Professor of Electrical Engineering University of Nevada, Reno 1 Outline Stochastic (random) processes. Autocorrelation. Crosscorrelation. Spectral density function.

More information

1 Linear Regression and Correlation

1 Linear Regression and Correlation Math 10B with Professor Stankova Worksheet, Discussion #27; Tuesday, 5/1/2018 GSI name: Roy Zhao 1 Linear Regression and Correlation 1.1 Concepts 1. Often when given data points, we want to find the line

More information

Introduction to Mobile Robotics Compact Course on Linear Algebra. Wolfram Burgard, Cyrill Stachniss, Maren Bennewitz, Diego Tipaldi, Luciano Spinello

Introduction to Mobile Robotics Compact Course on Linear Algebra. Wolfram Burgard, Cyrill Stachniss, Maren Bennewitz, Diego Tipaldi, Luciano Spinello Introduction to Mobile Robotics Compact Course on Linear Algebra Wolfram Burgard, Cyrill Stachniss, Maren Bennewitz, Diego Tipaldi, Luciano Spinello Vectors Arrays of numbers Vectors represent a point

More information

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring Lecture 9 A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2015 http://www.astro.cornell.edu/~cordes/a6523 Applications: Comparison of Frequentist and Bayesian inference

More information

Cheat Sheet for MATH461

Cheat Sheet for MATH461 Cheat Sheet for MATH46 Here is the stuff you really need to remember for the exams Linear systems Ax = b Problem: We consider a linear system of m equations for n unknowns x,,x n : For a given matrix A

More information

F & B Approaches to a simple model

F & B Approaches to a simple model A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys

More information

1 Last time: determinants

1 Last time: determinants 1 Last time: determinants Let n be a positive integer If A is an n n matrix, then its determinant is the number det A = Π(X, A)( 1) inv(x) X S n where S n is the set of n n permutation matrices Π(X, A)

More information

OR MSc Maths Revision Course

OR MSc Maths Revision Course OR MSc Maths Revision Course Tom Byrne School of Mathematics University of Edinburgh t.m.byrne@sms.ed.ac.uk 15 September 2017 General Information Today JCMB Lecture Theatre A, 09:30-12:30 Mathematics revision

More information

Probability, CLT, CLT counterexamples, Bayes. The PDF file of this lecture contains a full reference document on probability and random variables.

Probability, CLT, CLT counterexamples, Bayes. The PDF file of this lecture contains a full reference document on probability and random variables. Lecture 5 A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2015 http://www.astro.cornell.edu/~cordes/a6523 Probability, CLT, CLT counterexamples, Bayes The PDF file of

More information

CHAPTER 3. Matrix Eigenvalue Problems

CHAPTER 3. Matrix Eigenvalue Problems A SERIES OF CLASS NOTES FOR 2005-2006 TO INTRODUCE LINEAR AND NONLINEAR PROBLEMS TO ENGINEERS, SCIENTISTS, AND APPLIED MATHEMATICIANS DE CLASS NOTES 3 A COLLECTION OF HANDOUTS ON SYSTEMS OF ORDINARY DIFFERENTIAL

More information

Linear Algebra Primer

Linear Algebra Primer Linear Algebra Primer David Doria daviddoria@gmail.com Wednesday 3 rd December, 2008 Contents Why is it called Linear Algebra? 4 2 What is a Matrix? 4 2. Input and Output.....................................

More information

Algebra & Trig. I. For example, the system. x y 2 z. may be represented by the augmented matrix

Algebra & Trig. I. For example, the system. x y 2 z. may be represented by the augmented matrix Algebra & Trig. I 8.1 Matrix Solutions to Linear Systems A matrix is a rectangular array of elements. o An array is a systematic arrangement of numbers or symbols in rows and columns. Matrices (the plural

More information

Multivariate Regression

Multivariate Regression Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the

More information

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88 Math Camp 2010 Lecture 4: Linear Algebra Xiao Yu Wang MIT Aug 2010 Xiao Yu Wang (MIT) Math Camp 2010 08/10 1 / 88 Linear Algebra Game Plan Vector Spaces Linear Transformations and Matrices Determinant

More information

A Introduction to Matrix Algebra and the Multivariate Normal Distribution

A Introduction to Matrix Algebra and the Multivariate Normal Distribution A Introduction to Matrix Algebra and the Multivariate Normal Distribution PRE 905: Multivariate Analysis Spring 2014 Lecture 6 PRE 905: Lecture 7 Matrix Algebra and the MVN Distribution Today s Class An

More information

Background Mathematics (2/2) 1. David Barber

Background Mathematics (2/2) 1. David Barber Background Mathematics (2/2) 1 David Barber University College London Modified by Samson Cheung (sccheung@ieee.org) 1 These slides accompany the book Bayesian Reasoning and Machine Learning. The book and

More information

Queens College, CUNY, Department of Computer Science Numerical Methods CSCI 361 / 761 Spring 2018 Instructor: Dr. Sateesh Mane.

Queens College, CUNY, Department of Computer Science Numerical Methods CSCI 361 / 761 Spring 2018 Instructor: Dr. Sateesh Mane. Queens College, CUNY, Department of Computer Science Numerical Methods CSCI 361 / 761 Spring 2018 Instructor: Dr. Sateesh Mane c Sateesh R. Mane 2018 8 Lecture 8 8.1 Matrices July 22, 2018 We shall study

More information

Exercises * on Principal Component Analysis

Exercises * on Principal Component Analysis Exercises * on Principal Component Analysis Laurenz Wiskott Institut für Neuroinformatik Ruhr-Universität Bochum, Germany, EU 4 February 207 Contents Intuition 3. Problem statement..........................................

More information

Lecture Notes in Linear Algebra
