Stat751 / CSI771 Midterm October 15, 2015 Solutions, Comments


1. (13 pts) Consider the beta distribution with PDF

   f(x) = Γ(α+β)/(Γ(α)Γ(β)) x^(α−1) (1−x)^(β−1),  0 ≤ x < 1,
   f(x) = 0 otherwise,

for fixed constants 0 < α, β. Now, assume that you can generate random deviates U_i from a U(0,1) distribution. These are the standard things you can get from a simple random number generator in almost any programming system; in R it is what we get from runif. Describe very carefully how you would generate one random deviate from this beta distribution using an acceptance/rejection method with one or more U_i from U(0,1). You can use a uniform distribution as the majorizing distribution. You can also assume that you can evaluate the gamma function, so just use expressions such as Γ(α+β), Γ(α), and Γ(β).

First, we determine a good distribution to use as the majorizing distribution. Any distribution with finite range that we can generate variates from would work. How good a given distribution is for this purpose would depend on α and β, which are not given in this problem. The problem states that you can just use the uniform distribution. In the absence of knowledge of α and β, that's probably as good as anything. In the common notation that I used in class, this means g(y) = I_[0,1](y).

Now, we need a number c such that

   c ≥ Γ(α+β)/(Γ(α)Γ(β)) x^(α−1) (1−x)^(β−1)  for 0 ≤ x ≤ 1.

Finding such a c is not hard; since x^(α−1)(1−x)^(β−1) ≤ 1 (assuming α, β ≥ 1, so that the density is bounded), one such c is just Γ(α+β)/(Γ(α)Γ(β)). Of course, we want the smallest such c. Even this is not hard. In grading your work, I only looked for your c, however you chose it or defined it.

Once we've got the majorizing function and the c, we're ready to go. I subtracted 3 points if you were not explicit in defining your c (though not necessarily its actual value) and in defining your g(y); there really is not much to g(y) in this problem, since it is just the uniform density.

Now, the steps are straightforward:

1. Generate u_1 and u_2 from U(0,1).
2. If u_2 ≤ f(u_1)/c, then accept u_1 as the desired variate; otherwise, go back to 1.
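As a concrete illustration, here is a minimal R sketch of these steps. The function name rbeta_ar is mine, not part of the problem, and the sketch assumes α, β ≥ 1 so that the uniform majorization above is valid:

rbeta_ar <- function(alpha, beta) {
  # c = Gamma(alpha+beta)/(Gamma(alpha)Gamma(beta)) bounds f on [0,1]
  cc <- gamma(alpha + beta)/(gamma(alpha)*gamma(beta))
  repeat {
    u1 <- runif(1)                                 # step 1: candidate from U(0,1)
    u2 <- runif(1)                                 # step 1: accept/reject uniform
    f  <- cc*u1^(alpha - 1)*(1 - u1)^(beta - 1)    # beta PDF at u1
    if (u2 <= f/cc) return(u1)                     # step 2: accept, or repeat
  }
}

With this (non-optimal) c, the expected number of (u_1, u_2) pairs per accepted deviate is c; the smallest valid c, the maximum of the PDF, would accept more often.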

2. (13 pts) Show that the least-squares estimator for β in the linear model y ≈ Xβ, where y is an observed n-vector and X is a corresponding n × m matrix of observations, is given by β̂ = X⁺y, where X⁺ is the Moore-Penrose inverse of X.

I like this formula! It looks like the solution to a full-rank, consistent system:

   c = Az  ⟹  z = A⁻¹c,   and   y ≈ Xβ  ⟹  β̂ = X⁺y.

Although the problem did not specify that X be of full column rank (and the result is true even if X is not of full rank), the steps are simpler if we assume that, and in the following I will make that assumption. No one had any problems with the rank of X, one way or the other. Also, in the following, let n and m represent the dimensions; that is, assume that X is n × m and the other matrices and vectors are of the implied sizes.

The least-squares estimator for β is the value β̂ that minimizes the expression

   (y − Xβ)ᵀ(y − Xβ),

which is the residual sum of squares. The optimal value for β can be obtained in different ways. One way is by using calculus to show that a minimizer β̂ must satisfy the condition XᵀXβ̂ = Xᵀy. Another way is by expanding the expression for the residual sum of squares and showing that if Xᵀ(y − Xβ̂) = 0, then β̂ must be the optimal solution. Following this approach, we take a candidate solution (the one we want to prove) and show that the residuals are orthogonal to the columns of X.

With all of that as a preface, I will give three different proofs that the optimal value can be expressed as β̂ = X⁺y.

(a) We obtain an expression for β̂ = Ay using calculus, and then show that A = X⁺. There are two ways to do this:
    i. Show that A satisfies the four properties that define X⁺.
    ii. Use the QR decomposition of X to show that A = X⁺.
(b) Take β̂ = X⁺y and show that it satisfies a necessary and sufficient condition for it to be the minimizer. The condition is the orthogonality of the residuals to the columns of X.

**************** Now let's do it each way ****************

(a) First, by taking the first and second derivatives with respect to β, we find that a minimizer β̂ must satisfy the condition

   XᵀXβ̂ = Xᵀy,

which yields the solution

   β̂ = (XᵀX)⁻¹Xᵀy;

that is, A = (XᵀX)⁻¹Xᵀ.

i. The four properties uniquely determine X⁺:
   A. XX⁺X = X
   B. X⁺XX⁺ = X⁺
   C. XX⁺ is symmetric.
   D. X⁺X is symmetric.
So here we go, with A = (XᵀX)⁻¹Xᵀ in the role of X⁺:
   A. X(XᵀX)⁻¹XᵀX = X
   B. (XᵀX)⁻¹XᵀX(XᵀX)⁻¹Xᵀ = (XᵀX)⁻¹Xᵀ
   C. X(XᵀX)⁻¹Xᵀ is symmetric (take its transpose).
   D. (XᵀX)⁻¹XᵀX is symmetric.
Therefore, (XᵀX)⁻¹Xᵀ = X⁺.

ii. Now, we use the QR decomposition of X to show that A = X⁺. This is the way I did it in class. Form the QR decomposition of X, X = QR, where we can write

   R = [ R₁ ]
       [ 0  ],

where R₁ is an m × m upper triangular matrix. The squared residual norm can now be written as

   (y − Xb)ᵀ(y − Xb) = (y − QRb)ᵀ(y − QRb)
                     = (Qᵀy − Rb)ᵀ(Qᵀy − Rb)
                     = (c₁ − R₁b)ᵀ(c₁ − R₁b) + c₂ᵀc₂,

where c₁ is a vector with m elements and c₂ is a vector with n − m elements, such that Qᵀy = (c₁, c₂). Because the squared norm is nonnegative, the minimum of the residual norm occurs when (c₁ − R₁b)ᵀ(c₁ − R₁b) = 0; that is, when c₁ − R₁b = 0, or

   R₁β̂ = c₁.

Because R₁ is triangular, the system is easy to solve:

   β̂ = R₁⁻¹c₁.

Now,

   X⁺ = [ R₁⁻¹  0 ] Qᵀ.

This important expression of the Moore-Penrose inverse was the key to solving this problem in this way. Therefore,

   β̂ = X⁺y.

We also see that the minimum of the residual norm, or the residual sum of squares, is c₂ᵀc₂.

(b) Finally, as another way, we take β̂ = X⁺y as a candidate and show that it satisfies the condition of the orthogonality of the residuals to the columns of X; that is,

   Xᵀ(y − XX⁺y) = 0.

Using the properties of X⁺, we have

   Xᵀ(y − XX⁺y) = Xᵀy − XᵀXX⁺y
                = Xᵀy − Xᵀ(XX⁺)ᵀy        (because of symmetry)
                = Xᵀy − Xᵀ(X⁺)ᵀXᵀy
                = Xᵀy − Xᵀ(Xᵀ)⁺Xᵀy       (property of Moore-Penrose inverses and transposes)
                = Xᵀy − Xᵀy              (property of Moore-Penrose inverses)
                = 0.
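As a quick numerical sanity check of the result (not part of the required proof), the following R sketch compares the QR-based least-squares solution with (XᵀX)⁻¹Xᵀy for an arbitrary full-column-rank X:

set.seed(1)
n <- 10; m <- 3
X <- matrix(rnorm(n*m), n, m)          # random full-column-rank X
y <- rnorm(n)
betaQR <- qr.coef(qr(X), y)            # least squares via QR
Xplus  <- solve(t(X) %*% X) %*% t(X)   # (X^T X)^{-1} X^T, which equals X^+ here
round(betaQR - Xplus %*% y, 10)        # agrees to rounding error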

3. (13 pts) Describe how you would evaluate the integrals below using Monte Carlo. Assume that you have a source of uniform U(0,1) random numbers; that is, you can get a sample x_1, x_2, ..., x_m. Since your result is an estimate, also give a formula for an estimate of the variance of your estimator. (Although I have told you not to use Monte Carlo when you can evaluate something analytically, and these simple integrals could be evaluated analytically, use Monte Carlo anyway.) Give formulas for your estimates and for your estimates of the variance of your estimator.

(a) ∫₀² x² e^(−x/2) dx

You should, of course, use a good PDF decomposition of x² e^(−x/2) so as to be more efficient. The simplest decomposition is just

   2x² e^(−x/2) · (1/2),

where the second factor is just the uniform PDF over [0, 2]. This is the one I'd probably use. Using the uniform, a Monte Carlo estimate of the integral is just

   t = (2/m) Σ_{i=1}^m (2u_i)² e^(−u_i),

where the u_i are iid U(0,1). An estimate of the variance is

   (1/m) Σ_{i=1}^m ( 2(2u_i)² e^(−u_i) − t )² / (m − 1),

where t is the estimate of the integral. Notice also that the problem stated only that you have a source of U(0,1) random numbers, so this PDF requires no real transformation.

Other possibilities would be to use the exponential(2) distribution truncated at 2, or to use the gamma(3,2) distribution, also truncated at 2. In either case, the first question would be how to get random variables from the distribution of interest. The exponential is easy, just using the inverse CDF; but the gamma is rather difficult. Of course, if you assume that you have R, you could use qgamma(u, 3, 2). The truncation involved with each of the latter distributions would likely make them less efficient. Use of the exponential would certainly be more efficient than use of the gamma, however.

(b) ∫₀^∞ sin(x) e^(−x) dx

Because the integral is improper, you must use a distribution with infinite support. A simple one is the exponential with parameter 1. We can generate an exponential from a uniform u as −log(u). Hence, a Monte Carlo estimate of the integral is

   t = (1/m) Σ_{i=1}^m sin(−log(u_i)),

where the u_i are iid U(0,1), and an estimate of the variance is

   (1/m) Σ_{i=1}^m ( sin(−log(u_i)) − t )² / (m − 1).
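For concreteness, here is a small R sketch of both estimates and their variance estimates; the sample size m = 100000 is an arbitrary choice:

m <- 100000
u <- runif(m)
# (a): x = 2u ~ U(0,2); integrand/PDF ratio is 2(2u)^2 exp(-u)
h1 <- 2*(2*u)^2*exp(-u)
t1 <- mean(h1)        # estimate of the integral
v1 <- var(h1)/m       # estimate of the variance of t1
# (b): x = -log(u) ~ exponential(1); ratio is sin(-log(u))
h2 <- sin(-log(u))
t2 <- mean(h2)        # estimate of the integral (true value is 1/2)
v2 <- var(h2)/m
c(t1, v1, t2, v2)

Note that var(h1)/m matches the variance formula above, since R's var uses the m − 1 divisor.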

4. (13 pts) Outline how you would design and conduct a Monte Carlo study to compare the performance of the standard two-sample t test for equality of means of two normal populations with Welch's test when the variances of the underlying distributions are unequal. You do not need to know how to perform these two tests; just assume you have programs that will perform the two tests at a given significance level α. That is, given two datasets, your programs will return a value of "reject" or "don't reject". Treat this as a factorial experiment. Identify the factors and the factor levels you would use (just make some reasonable choices). Then describe the steps you would follow.

The response of interest is the performance of the tests. The subject of the tests is the difference in the means. The difference in the means can be measured in various ways, such as an arithmetic difference or a ratio, in either case possibly scaled by a standard deviation. The performance of either test is its power over some range of differences. The treatments are the two tests. The factors of interest are

a. the differences in the means (possibly scaled),
b. the differences in the variances,
c. the sample sizes.

The general approach would be to choose one population as N(0,1) and the second as N(µ, σ²). We see that all possible ranges are encompassed by the ranges [0, ∞) for µ and (0, ∞) for σ². More realistically, we may choose [0, 3σ] for µ after choosing [1/9, 9] for σ². A sketch of such a study is given below.
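Here is a rough R sketch of the factorial study. The factor levels, the number of replications, and α = 0.05 are illustrative choices, not the only reasonable ones; R's t.test happens to perform both tests via its var.equal argument:

set.seed(1)
nrep  <- 1000                  # Monte Carlo replications per cell
alpha <- 0.05
mus <- c(0, 0.5, 1)            # factor a: difference in means
s2s <- c(1/9, 1, 9)            # factor b: variance of second population
ns  <- c(10, 30)               # factor c: common sample size
for (mu in mus) for (s2 in s2s) for (n in ns) {
  rej <- matrix(0, nrep, 2)
  for (r in 1:nrep) {
    x <- rnorm(n)
    y <- rnorm(n, mu, sqrt(s2))
    rej[r,1] <- t.test(x, y, var.equal = TRUE)$p.value  < alpha   # standard t
    rej[r,2] <- t.test(x, y, var.equal = FALSE)$p.value < alpha   # Welch
  }
  cat(mu, s2, n, colMeans(rej), "\n")  # estimated size (mu = 0) or power
}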

5. (22 pts) Consider the model

   y_i = α e^(βx_i) + ɛ_i,

where α and β are unknown constants, and ɛ_i is a random variable with expected value of 0 and constant variance. Assume that we have pairs of observations (y_1, x_1), ..., (y_n, x_n), and that the ɛ_i's for the observations are independent.

(a) Estimation by least squares.

i. What is the objective function; that is, what is the function of α and β that is to be minimized?

First of all, notice that if you linearize this by taking logs, you are changing the model. We can write the objective function in the form of sums of individual elements or in a vector notation, where, when x is an n-vector, we adopt the notation e^(bx) to represent the n-vector whose i-th element is e^(bx_i). In vector notation, the objective function is

   f(a, b) = (y − a e^(bx))ᵀ(y − a e^(bx)).

In the form of sums of individual elements, the objective function is

   f(a, b) = Σ_{i=1}^n (y_i − a e^(bx_i))².

ii. What is the gradient of the objective function?

Using the vector form, the gradient is

   g_f = ∇f = [ ∂f/∂a ]   [ −2 (e^(bx))ᵀ (y − a e^(bx))          ]
              [ ∂f/∂b ] = [ −2a (diag(x) e^(bx))ᵀ (y − a e^(bx)) ].

iii. What is the Hessian of the objective function?

   H_f = ∇g_f = [ ∂²f/∂a²    ∂²f/∂a∂b ]
                [ ∂²f/∂b∂a   ∂²f/∂b²  ],

whose entries are the partial derivatives with respect to a and b of the two gradient components above. This was messy, and if you had the formulas right, I gave full credit.

iv. Given a starting point, what is the Newton step to move to a new solution?

Let (a_0, b_0) be given. Then

   (a_1, b_1)ᵀ = (a_0, b_0)ᵀ − H_f(a_0, b_0)⁻¹ g_f(a_0, b_0).
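To avoid coding the messy Hessian by hand, one could carry out the Newton step numerically. The following R sketch uses the numDeriv package (assumed installed); the function name newton_step is mine:

library(numDeriv)
f <- function(p, xdat, ydat) sum((ydat - p[1]*exp(p[2]*xdat))^2)  # objective in (a, b)
newton_step <- function(p0, xdat, ydat) {
  g <- grad(f, p0, xdat = xdat, ydat = ydat)     # numerical gradient at (a0, b0)
  H <- hessian(f, p0, xdat = xdat, ydat = ydat)  # numerical Hessian at (a0, b0)
  p0 - solve(H, g)                               # (a1, b1) = (a0, b0) - H^{-1} g
}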

(b) Estimation by maximum likelihood.

i. What else would you need to know or assume?

You would need to know the distribution of the random variables ɛ_i. This means the multivariate distribution; notice that nothing was stated about the relationships of the ɛ_i to each other.

Make an appropriate assumption to satisfy the need referred to in the previous question (the specific assumption is not important). Assume that they are iid N(0, σ²); that is, the multivariate distribution is N_n(0, σ²I_n). We can represent the PDF of this distribution as

   f(ɛ) = (2σ²π)^(−n/2) e^(−ɛᵀɛ/(2σ²)).

Now, based on that assumption, in the following, describe how you would proceed to compute the MLEs of α and β.

ii. What is the objective function; that is, what is the function of α and β that is to be optimized?

There are actually three variables, α, β, and σ². As it turns out, however, the optimal values of α and β are not affected by the value of σ². The objective function is the likelihood function:

   L(α, β; x, y) = (2σ²π)^(−n/2) e^(−(y − αe^(βx))ᵀ(y − αe^(βx))/(2σ²)),

or, equivalently, the log-likelihood

   l_L(α, β; x, y) = const − (y − αe^(βx))ᵀ(y − αe^(βx))/(2σ²).

Maximizing either one in α and β is the same as minimizing (y − αe^(βx))ᵀ(y − αe^(βx)), the least-squares objective.

iii. What is the gradient of the objective function? This is the same as for least squares.

iv. What is the Hessian of the objective function? This is the same as for least squares.

v. Given a starting point, what is the Newton step to move to a new solution? This is the same as for least squares.

It is well known that least squares is ML if the distribution is normal, the error is additive, and the model is linear. It is also the case here because of the form of the model.

6. (13 pts) Given the three linearly independent vectors in 5-space:

   x_1 = (1, 1, 1, 2, 0)
   x_2 = (1, 0, 0, 1, 0)
   x_3 = (1, 0, 1, 1, 1)

Form three orthonormal vectors z_1, z_2, and z_3 that span the same space. (I intended for my numbers to work out evenly, but they don't, so when you have a square root, just show it as such, and don't worry about the computations.)

The method to use is Gram-Schmidt. I was very lenient in grading this one. It gets pretty messy, but here are some expressions:

   z_1 = (1, 1, 1, 2, 0)/√7,
   z_2 = ((1, 0, 0, 1, 0) − 3(1, 1, 1, 2, 0)/7)/a,
   z_3 = ((1, 0, 1, 1, 1) − 4(1, 1, 1, 2, 0)/7 − b z_2)/c,

where a and c are the norms of the respective numerators, and b is the inner product of z_2 and the z_3 before adjustment.

Some of you made the first vector from the third vector; that is, you smartly chose

   z_1 = x_3/‖x_3‖,

because ‖x_3‖ is an integer.

Rather than doing it in the manner indicated above, it is actually better to accumulate the third vector in two steps (and the fourth, if there were one, in three steps, and so on). Here's some R code to do it:

m <- 3
n <- 5
z1 <- c(1,1,1,2,0)
z2 <- c(1,0,0,1,0)
z3 <- c(1,0,1,1,1)
Z <- cbind(z1,z2,z3)
Z[1:n,1] <- Z[1:n,1]/sqrt(sum(Z[1:n,1]^2))
for (k in 2:m) {
  for (j in k:m) {
    Z[1:n,j] <- Z[1:n,j] - sum(Z[1:n,k-1]*Z[1:n,j])*Z[1:n,k-1]
  }
  Z[1:n,k] <- Z[1:n,k]/sqrt(sum(Z[1:n,k]^2))
}
# Check it:
round(t(Z) %*% Z, 10)
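As a cross-check (an alternative, not the method asked for), R's QR factorization also produces an orthonormal basis for the same column space, though some columns may come out with opposite signs:

qr.Q(qr(cbind(c(1,1,1,2,0), c(1,0,0,1,0), c(1,0,1,1,1))))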

7. (13 pts) Given the vector x = (1, 2, 0, 2, 0), describe how you would reflect this vector into the vector x̃ = (3, 0, 0, 0, 0).

The reflection is achieved by the Householder matrix I − 2uuᵀ, where

   u = (1 − ‖x‖, 2, 0, 2, 0)/‖(1 − ‖x‖, 2, 0, 2, 0)‖ = (−2, 2, 0, 2, 0)/√12,

so x̃ = (I − 2uuᵀ)x. In R,

x <- c(1,2,0,2,0)
u <- c(-2,2,0,2,0)/sqrt(12)
H <- matrix(c(rep(c(1,0,0,0,0,0),4),1),nrow=5) - 2*u%*%t(u)
round(H %*% x, 10)

yields

     [,1]
[1,]    3
[2,]    0
[3,]    0
[4,]    0
[5,]    0
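The same construction reflects any vector onto a multiple of the first unit vector e_1. Here is a small general R sketch; the function name householder is mine:

householder <- function(x) {
  v <- x
  v[1] <- v[1] - sqrt(sum(x^2))   # v = x - ||x|| e_1
  v/sqrt(sum(v^2))                # unit Householder vector u
}
householder(c(1,2,0,2,0))         # recovers (-2, 2, 0, 2, 0)/sqrt(12)

(Numerical libraries usually choose the sign of ‖x‖ to avoid cancellation in v[1]; the plain choice here matches the problem.)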
