Stat751 / CSI771 Midterm, October 15, 2015: Solutions and Comments
1. [13 pts] Consider the beta distribution with PDF

    f(x) = Γ(α+β)/(Γ(α)Γ(β)) x^(α-1) (1-x)^(β-1),  0 ≤ x < 1,
    f(x) = 0 otherwise,

for fixed constants α, β > 0. Now, assume that you can generate random deviates U_i from a U(0,1) distribution. These are the standard things you can get from a simple random-number generator in almost any programming system. In R it is what we get from runif. Describe very carefully how you would generate one random deviate from this beta distribution using an acceptance/rejection method with one or more U_i from U(0,1). You can use a uniform distribution as the majorizing distribution. You can also assume that you can evaluate the gamma function, so just use expressions such as Γ(α+β), Γ(α), and Γ(β).

First, we determine a good distribution to use as the majorizing distribution. Any distribution with finite range that we can generate variates from would work. How good a given distribution is for this purpose would depend on α and β, which are not given in this problem. The problem states that you can just use the uniform distribution. In the absence of knowledge of α and β, that's probably as good as anything. In the common notation that I used in class, this means g(y) = I_[0,1](y).

Now, we need a number c such that c ≥ Γ(α+β)/(Γ(α)Γ(β)) x^(α-1) (1-x)^(β-1) for 0 ≤ x ≤ 1. Finding such a c is not hard (assuming α, β ≥ 1, so that the density is bounded); since x^(α-1) (1-x)^(β-1) ≤ 1, one such c is just Γ(α+β)/(Γ(α)Γ(β)). Of course, we want the smallest such c. Even this is not hard. In grading your work, I only looked for your c, however you chose it or defined it. Once we've got the majorizing function and the c, we're ready to go. I subtracted 3 points if you were not explicit in defining your c (though not necessarily its actual value) and in defining your g(y); there really is nothing to g(y) in this problem.

Now, the steps are straightforward:

1) Generate u_1 and u_2 from U(0,1).
2) If u_2 ≤ f(u_1)/c, then accept u_1 as the desired variate; otherwise, go back to step 1).
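The steps above can be sketched in R as follows. The function name and the values of alpha and beta are illustrative (the problem leaves α and β unspecified), and c is taken as the simple bound Γ(α+β)/(Γ(α)Γ(β)), which requires α, β ≥ 1:

```r
# Acceptance/rejection for the beta(alpha, beta) density with a
# uniform majorizing distribution; requires alpha, beta >= 1.
rbeta_ar <- function(alpha, beta) {
  cc <- gamma(alpha + beta) / (gamma(alpha) * gamma(beta))  # simple bound c
  f  <- function(x) cc * x^(alpha - 1) * (1 - x)^(beta - 1)
  repeat {
    u1 <- runif(1)                  # candidate from the majorizing U(0,1)
    u2 <- runif(1)                  # acceptance-test variate
    if (u2 <= f(u1) / cc) return(u1)
  }
}

set.seed(1)
x <- replicate(10000, rbeta_ar(2, 3))
mean(x)   # should be near alpha/(alpha+beta) = 0.4
```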
2. [13 pts] Show that the least-squares estimator for β in the linear model y ≈ Xβ, where y is an observed n-vector and X is a corresponding n×m matrix of observations, is given by β̂ = X⁺y, where X⁺ is the Moore-Penrose inverse of X.

I like this formula! It looks like the solution to a full-rank, consistent system: c = Az ⇒ z = A⁻¹c, and y ≈ Xβ ⇒ β̂ = X⁺y. Although the problem did not specify that X be of full column rank, and the result is true even if X is not of full rank, the steps are simpler if we assume that, and in the following I will make that assumption. No one had any problems with the rank of X, one way or the other. Also, in the following, let n and m represent the dimensions; that is, assume that X is n×m and the other matrices and vectors are of the implied sizes.

The least-squares estimator for β is the value β̂ that minimizes the expression (y − Xβ)^T (y − Xβ), which is the residual sum of squares. The optimal value for β can be obtained in different ways. One way is to use calculus to show that a minimizing β̂ must satisfy the condition X^T X β̂ = X^T y. Another way is to expand the expression for the residual sum of squares and show that if X^T (y − X β̂) = 0, then β̂ must be the optimal solution. Following this approach, we take a candidate solution (the one we want to prove) and show that the residuals are orthogonal to the columns of X.

With all of that as a preface, I will give three different proofs that the optimal value can be expressed as β̂ = X⁺y.

(a) We obtain an expression β̂ = Ay using calculus, and then show that A = X⁺. There are two ways to do this:
  i. Show that A satisfies the four properties that define X⁺.
  ii. Use the QR decomposition of X to show that A = X⁺.
(b) Take β̂ = X⁺y and show that it satisfies a necessary and sufficient condition for it to be the minimizer. The condition is the orthogonality of the residuals to the columns of X.
**************** Now let's do it each way **************************

(a) First, by taking the first and second derivatives with respect to β, we find that a minimizing β̂ must satisfy the condition X^T X β̂ = X^T y, which yields the solution

    β̂ = (X^T X)⁻¹ X^T y;

that is, A = (X^T X)⁻¹ X^T.

i. The four properties uniquely determine X⁺:

  A. X X⁺ X = X
  B. X⁺ X X⁺ = X⁺
  C. X X⁺ is symmetric.
  D. X⁺ X is symmetric.

So here we go:

  A. X (X^T X)⁻¹ X^T X = X
  B. (X^T X)⁻¹ X^T X (X^T X)⁻¹ X^T = (X^T X)⁻¹ X^T
  C. X (X^T X)⁻¹ X^T is symmetric (take its transpose).
  D. (X^T X)⁻¹ X^T X is symmetric.

Therefore, (X^T X)⁻¹ X^T = X⁺.
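The four conditions can also be checked numerically. The small random X below is purely illustrative; with any full-column-rank X, A = (X^T X)⁻¹ X^T should satisfy all four properties up to rounding:

```r
# Numerical check that A = (X'X)^{-1} X' satisfies the four
# Moore-Penrose conditions when X has full column rank.
set.seed(1)
X <- matrix(rnorm(15), nrow = 5, ncol = 3)
A <- solve(t(X) %*% X) %*% t(X)

max(abs(X %*% A %*% X - X))       # (A) X A X = X
max(abs(A %*% X %*% A - A))       # (B) A X A = A
max(abs(X %*% A - t(X %*% A)))    # (C) X A symmetric
max(abs(A %*% X - t(A %*% X)))    # (D) A X symmetric
# all four maxima should be ~ 0 (up to floating-point rounding)
```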
ii. Now, we use the QR decomposition of X to show that A = X⁺. This is the way I did it in class. Form the QR decomposition of X, X = QR, where we can write

    R = [ R_1 ]
        [  0  ],

where R_1 is an m×m upper triangular matrix. The squared residual norm can now be written as

    (y − Xb)^T (y − Xb) = (y − QRb)^T (y − QRb)
                        = (Q^T y − Rb)^T (Q^T y − Rb)
                        = (c_1 − R_1 b)^T (c_1 − R_1 b) + c_2^T c_2,

where c_1 is a vector with m elements and c_2 is a vector with n−m elements, such that

    Q^T y = [ c_1 ]
            [ c_2 ].

Because the squared norm is nonnegative, the minimum of the residual norm occurs when (c_1 − R_1 b)^T (c_1 − R_1 b) = 0; that is, when c_1 − R_1 b = 0, or R_1 β̂ = c_1. Because R_1 is triangular, the system is easy to solve: β̂ = R_1⁻¹ c_1. Now,

    X⁺ = [ R_1⁻¹  0 ] Q^T.

This important expression of the Moore-Penrose inverse was the key to solving this problem in this way. Therefore, β̂ = X⁺y. We also see that the minimum of the residual norm, or the residual sum of squares, is c_2^T c_2.

(b) Finally, as another way, we take β̂ = X⁺y as a candidate and show that it satisfies the condition of the orthogonality of the residuals to the columns of X; that is, X^T (y − X X⁺ y) = 0. Using the properties of X⁺, we have

    X^T (y − X X⁺ y)
      = X^T y − X^T X X⁺ y
      = X^T y − X^T (X X⁺)^T y     (because of symmetry)
      = X^T y − X^T (X⁺)^T X^T y
      = X^T y − X^T (X^T)⁺ X^T y   (property of Moore-Penrose inverses and transposes)
      = X^T y − X^T y              (property of Moore-Penrose inverses)
      = 0.
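As a sanity check, the three routes to β̂ agree numerically. This sketch assumes the MASS package is available for its ginv function (a Moore-Penrose inverse); the data are simulated for illustration:

```r
# Compare beta-hat computed three ways for a full-column-rank X:
# normal equations, QR decomposition, and the Moore-Penrose inverse.
library(MASS)
set.seed(1)
n <- 8; m <- 3
X <- matrix(rnorm(n * m), n, m)
y <- rnorm(n)

b_ne <- solve(t(X) %*% X) %*% t(X) %*% y   # (X'X)^{-1} X'y
b_qr <- qr.coef(qr(X), y)                  # via X = QR
b_mp <- ginv(X) %*% y                      # X+ y

max(abs(b_ne - b_mp))   # ~ 0
max(abs(b_qr - b_mp))   # ~ 0
```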
3. [13 pts] Describe how you would evaluate the integrals below using Monte Carlo. Assume that you have a source of uniform U(0,1) random numbers; that is, you can get a sample x_1, x_2, ..., x_m. Since your result is an estimate, also give a formula for an estimate of the variance of your estimator. Although I have told you not to use Monte Carlo when you can evaluate something analytically, and these simple integrals could be evaluated analytically, use Monte Carlo anyway. Give formulas for your estimates and for your estimates of the variance of your estimator.

(a) ∫_0^2 x² e^(−x/2) dx

You should, of course, use a good PDF decomposition of x² e^(−x/2) so as to be more efficient. The simplest decomposition is just 2x² e^(−x/2) · (1/2), where the second factor is just the uniform PDF over [0,2]. This is the one I'd probably use. Using the uniform, a Monte Carlo estimate of the integral is just

    t = (2/m) Σ_{i=1}^m (2u_i)² e^(−u_i),

where the u_i are iid U(0,1). An estimate of the variance is

    (1/m) Σ_{i=1}^m (2(2u_i)² e^(−u_i) − t)² / (m−1),

where t is the estimate of the integral. Notice also that the problem stated only that you have a source of U(0,1) random numbers, and this PDF requires no real transformation.

Other possibilities would be to use the exponential(2) distribution truncated at 2, or to use the gamma(3,2) distribution, also truncated at 2. In either case, the first question would be how to get random variables from the distribution of interest. The exponential is easy, just using the inverse CDF; but the gamma is rather difficult. Of course, if you assume that you have R, you could use qgamma(u_i, 3, scale=2). The truncation involved with each of the latter distributions would likely make them less efficient. Use of the exponential would certainly be more efficient than use of the gamma, however.

(b) ∫_0^∞ sin(x) e^(−x) dx

Because the integral is improper, you must use a distribution with infinite support. A simple one is the exponential with parameter 1. We can generate an exponential from a uniform u as −log(u).
Hence, a Monte Carlo estimate of the integral is

    t = (1/m) Σ_{i=1}^m sin(−log u_i),

where the u_i are iid U(0,1), and an estimate of the variance is

    (1/m) Σ_{i=1}^m (sin(−log u_i) − t)² / (m−1).
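Both estimates, with their variance estimates, can be computed directly from a single uniform sample; the sample size m below is an illustrative choice:

```r
# Monte Carlo estimates of the two integrals, with variance estimates.
set.seed(1)
m <- 100000
u <- runif(m)

# (a) integral of x^2 exp(-x/2) over [0,2], via the uniform on [0,2]
h_a <- 2 * (2 * u)^2 * exp(-u)
t_a <- mean(h_a)
v_a <- var(h_a) / m        # same as (1/m) * sum((h_a - t_a)^2)/(m-1)

# (b) integral of sin(x) exp(-x) over [0, inf), via the exponential(1)
h_b <- sin(-log(u))
t_b <- mean(h_b)
v_b <- var(h_b) / m

c(t_a, sqrt(v_a))   # exact value of (a) is 16 - 40/e, about 1.285
c(t_b, sqrt(v_b))   # exact value of (b) is 1/2
```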
4. [13 pts] Outline how you would design and conduct a Monte Carlo study to compare the performance of the standard two-sample t test for equality of means of two normal populations with Welch's test when the variances of the underlying distributions are unequal. You do not need to know how to perform these two tests; just assume you have programs that will perform the two tests at a given significance level α. That is, given two datasets, your programs will return a value of "reject" or "don't reject". Treat this as a factorial experiment. Identify the factors and the factor levels you would use (just make some reasonable choices). Then describe the steps you would follow.

The response of interest is the performance of the tests. The subject of the tests is the difference in the means. The difference in the means can be measured in various ways, such as an arithmetic difference or a ratio, in either case possibly scaled by a standard deviation. The performance of either test is its power over some range of differences. The treatments are the two tests. The factors of interest are

a. the differences in the means, possibly scaled;
b. the differences in the variances;
c. the sample sizes.

The general approach would be to choose one population as N(0,1) and the second as N(µ, σ²). We see that all possible ranges are encompassed by the ranges [0, ∞) for µ and (0, ∞) for σ². More realistically, we may choose [0, 3σ] for µ after choosing [1/9, 9] for σ².
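A minimal sketch of such a study in R follows. The factor levels, number of replications, and use of t.test (whose var.equal argument switches between the standard and Welch tests) are illustrative choices, not the only reasonable ones:

```r
# Factorial Monte Carlo study: standard t test vs. Welch's test.
set.seed(1)
mu_levels <- c(0, 0.5, 1, 2)    # difference in means (first population is N(0,1))
s2_levels <- c(1/9, 1, 9)       # variance of the second population
n_levels  <- c(10, 30)          # common sample size
nrep      <- 1000               # Monte Carlo replications per cell
alpha     <- 0.05

design <- expand.grid(mu = mu_levels, s2 = s2_levels, n = n_levels)
design$standard <- design$welch <- NA

for (i in seq_len(nrow(design))) {
  rej <- replicate(nrep, {
    x <- rnorm(design$n[i])
    y <- rnorm(design$n[i], design$mu[i], sqrt(design$s2[i]))
    c(t.test(x, y, var.equal = TRUE)$p.value  <= alpha,   # standard t test
      t.test(x, y, var.equal = FALSE)$p.value <= alpha)   # Welch's test
  })
  design$standard[i] <- mean(rej[1, ])  # estimated power (size, when mu = 0)
  design$welch[i]    <- mean(rej[2, ])
}
design
```

The rows with mu = 0 estimate each test's actual significance level under unequal variances; the remaining rows trace out the power curves.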
5. [22 pts] Consider the model y_i = α e^(β x_i) + ε_i, where α and β are unknown constants, and ε_i is a random variable with expected value 0 and constant variance. Assume that we have pairs of observations (y_1, x_1), ..., (y_n, x_n), and that the ε_i's for the observations are independent.

(a) Estimation by least squares.

i. What is the objective function; that is, what is the function of α and β that is to be minimized?

First of all, notice that if you linearize this by taking logs, you are changing the model. We can write this in the form of sums of individual elements or in a vector notation, where, when x is an n-vector, we adopt the notation e^(βx) to represent the n-vector whose i-th element is e^(β x_i). In vector notation, the objective function is

    f(a,b) = (y − a e^(bx))^T (y − a e^(bx)).

In the form of sums of individual elements, the objective function is

    f(a,b) = Σ_{i=1}^n (y_i − a e^(b x_i))².

ii. What is the gradient of the objective function?

Using the vector form, the gradient is

    g_f = ∇f = [ ∂f/∂a ]  =  [ −2 (e^(bx))^T (y − a e^(bx))           ]
               [ ∂f/∂b ]     [ −2a (diag(x) e^(bx))^T (y − a e^(bx))  ].

iii. What is the Hessian of the objective function?

    H_f = ∇g_f = [ ∂²f/∂a²   ∂²f/∂a∂b ]
                 [ ∂²f/∂b∂a  ∂²f/∂b²  ],

that is, the matrix of partial derivatives with respect to a and b of the two elements of the gradient above. This was messy, and if you had the formulas right, I gave full credit.

iv. Given a starting point, what is the Newton step to move to a new solution?

Let (a_0, b_0) be given. Then

    [ a_1 ]   [ a_0 ]
    [ b_1 ] = [ b_0 ] − H_f(a_0, b_0)⁻¹ g_f(a_0, b_0).
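The Newton iteration can be sketched in R with the analytic gradient and Hessian worked out from f(a,b) = Σ (y_i − a e^(b x_i))². The data, starting point, and iteration count below are illustrative assumptions:

```r
# Newton iteration for f(a,b) = sum((y - a*exp(b*x))^2),
# using the analytic gradient and Hessian.
set.seed(1)
x <- seq(0, 2, length.out = 25)
y <- 2 * exp(0.7 * x) + rnorm(25, sd = 0.1)   # simulated: alpha = 2, beta = 0.7

ab <- c(1.5, 0.5)                             # starting point (a0, b0)
for (iter in 1:20) {
  a <- ab[1]; b <- ab[2]
  e <- exp(b * x)
  r <- y - a * e                              # residuals
  g <- c(-2 * sum(e * r),                     # df/da
         -2 * a * sum(x * e * r))             # df/db
  dab <- -2 * sum(x * e * r) + 2 * a * sum(x * e^2)      # d2f/da db
  H <- matrix(c(2 * sum(e^2),                 # d2f/da2
                dab, dab,
                -2 * a * sum(x^2 * e * r) +
                  2 * a^2 * sum(x^2 * e^2)),  # d2f/db2
              2, 2)
  ab <- ab - solve(H, g)                      # the Newton step
}
ab   # final (a, b) estimates
```

With well-behaved data like these, the iteration settles near the true (α, β); a poor starting point can make full Newton diverge, which is why damped or Gauss-Newton variants are used in practice.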
(b) Estimation by maximum likelihood.

i. What else would you need to know or assume?

You would need to know the distribution of the random variables ε_i. This means the multivariate (joint) distribution; notice that nothing was stated about the relationships of the ε_i to each other. Make an appropriate assumption to satisfy the need referred to in the previous question (the specific assumption is not important). Assume that they are iid N(0, σ²); that is, the multivariate distribution is N_n(0, σ² I_n). We can represent the PDF of this distribution as

    f(ε) = (2σ²π)^(−n/2) e^(−ε^T ε / (2σ²)).

Now, based on that assumption, in the following, describe how you would proceed to compute the MLEs of α and β.

ii. What is the objective function; that is, what is the function of α and β that is to be minimized?

There are actually three variables: α, β, and σ². As it turns out, however, the optimal values of α and β are not affected by the value of σ². The objective function is the likelihood function:

    L(α, β; x, y) = (2σ²π)^(−n/2) e^(−(y − α e^(βx))^T (y − α e^(βx)) / (2σ²)),

or, equivalently (dropping constants and changing the sign of the log-likelihood so that we minimize),

    l(α, β; x, y) = (y − α e^(βx))^T (y − α e^(βx)).

iii. What is the gradient of the objective function? This is the same as least squares.

iv. What is the Hessian of the objective function? This is the same as least squares.

v. Given a starting point, what is the Newton step to move to a new solution? This is the same as least squares.

It is well known that least squares is ML if the distribution is normal, the error is additive, and the model is linear. It is also the case here, because of the form of the model.
6. [13 pts] Given the three linearly independent vectors in 5-space:

    x_1 = (1, 1, 1, 2, 0)
    x_2 = (1, 0, 0, 1, 0)
    x_3 = (1, 0, 1, 1, 1)

Form three orthonormal vectors z_1, z_2, and z_3 that span the same space.

I intended for my numbers to work out evenly, but they don't, so when you have a square root, just show it as such, and don't worry about the computations. The method to use is Gram-Schmidt. I was very lenient in grading this one. It gets pretty messy, but here are some expressions:

    z_1 = (1, 1, 1, 2, 0)/√7
    z_2 = ((1, 0, 0, 1, 0) − (3/7)(1, 1, 1, 2, 0))/a,
    z_3 = ((1, 0, 1, 1, 1) − (4/7)(1, 1, 1, 2, 0) − b z_2)/c,

where a and c are the norms, and b is the inner product of z_2 and the third vector before adjustment. Some of you made the first vector from the third vector; that is, you smartly chose

    z_1 = x_3 / ||x_3||,

because ||x_3|| is an integer (it is 2).

Rather than doing it in the manner indicated above, it is actually better to accumulate the third vector in two steps (and the fourth, if there were one, in three steps, and so on). Here's some R code to do it:

    m <- 3
    n <- 5
    z1 <- c(1,1,1,2,0)
    z2 <- c(1,0,0,1,0)
    z3 <- c(1,0,1,1,1)
    Z <- cbind(z1, z2, z3)
    Z[1:n,1] <- Z[1:n,1] / sqrt(sum(Z[1:n,1]^2))
    for (k in 2:m) {
      for (j in k:m) {
        Z[1:n,j] <- Z[1:n,j] - sum(Z[1:n,k-1] * Z[1:n,j]) * Z[1:n,k-1]
      }
      Z[1:n,k] <- Z[1:n,k] / sqrt(sum(Z[1:n,k]^2))
    }

Check it:

    round(t(Z) %*% Z, 10)
7. [13 pts] Given the vector x = (1, 2, 0, 2, 0), describe how you would reflect this vector into the vector x̃ = (3, 0, 0, 0, 0).

The reflection is achieved by the Householder matrix I − 2uu^T, where

    u = (x − x̃)/||x − x̃|| = (−2, 2, 0, 2, 0)/√12.

So x̃ = (I − 2uu^T) x.

    x <- c(1,2,0,2,0)
    u <- c(-2,2,0,2,0)/sqrt(12)
    H <- matrix(c(rep(c(1,0,0,0,0,0), 4), 1), nrow=5) - 2 * u %*% t(u)
    round(H %*% x, 10)

yields

         [,1]
    [1,]    3
    [2,]    0
    [3,]    0
    [4,]    0
    [5,]    0
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More informationFinite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product
Chapter 4 Hilbert Spaces 4.1 Inner Product Spaces Inner Product Space. A complex vector space E is called an inner product space (or a pre-hilbert space, or a unitary space) if there is a mapping (, )
More informationMATH 167: APPLIED LINEAR ALGEBRA Least-Squares
MATH 167: APPLIED LINEAR ALGEBRA Least-Squares October 30, 2014 Least Squares We do a series of experiments, collecting data. We wish to see patterns!! We expect the output b to be a linear function of
More informationLecture 4 Orthonormal vectors and QR factorization
Orthonormal vectors and QR factorization 4 1 Lecture 4 Orthonormal vectors and QR factorization EE263 Autumn 2004 orthonormal vectors Gram-Schmidt procedure, QR factorization orthogonal decomposition induced
More informationApplied Numerical Linear Algebra. Lecture 8
Applied Numerical Linear Algebra. Lecture 8 1/ 45 Perturbation Theory for the Least Squares Problem When A is not square, we define its condition number with respect to the 2-norm to be k 2 (A) σ max (A)/σ
More informationReview problems for MA 54, Fall 2004.
Review problems for MA 54, Fall 2004. Below are the review problems for the final. They are mostly homework problems, or very similar. If you are comfortable doing these problems, you should be fine on
More informationt x 1 e t dt, and simplify the answer when possible (for example, when r is a positive even number). In particular, confirm that EX 4 = 3.
Mathematical Statistics: Homewor problems General guideline. While woring outside the classroom, use any help you want, including people, computer algebra systems, Internet, and solution manuals, but mae
More information10-701/ Recitation : Linear Algebra Review (based on notes written by Jing Xiang)
10-701/15-781 Recitation : Linear Algebra Review (based on notes written by Jing Xiang) Manojit Nandi February 1, 2014 Outline Linear Algebra General Properties Matrix Operations Inner Products and Orthogonal
More informationSTAT 135 Lab 3 Asymptotic MLE and the Method of Moments
STAT 135 Lab 3 Asymptotic MLE and the Method of Moments Rebecca Barter February 9, 2015 Maximum likelihood estimation (a reminder) Maximum likelihood estimation Suppose that we have a sample, X 1, X 2,...,
More information2. Signal Space Concepts
2. Signal Space Concepts R.G. Gallager The signal-space viewpoint is one of the foundations of modern digital communications. Credit for popularizing this viewpoint is often given to the classic text of
More informationSolutions Serie 1 - preliminary exercises
D-MAVT D-MATL Prof. A. Iozzi ETH Zürich Analysis III Autumn 08 Solutions Serie - preliminary exercises. Compute the following primitive integrals using partial integration. a) cos(x) cos(x) dx cos(x) cos(x)
More informationLinear Least Squares Problems
Linear Least Squares Problems Introduction We have N data points (x 1,y 1 ),...(x N,y N ). We assume that the data values are given by y j = g(x j ) + e j, j = 1,...,N where g(x) = c 1 g 1 (x) + + c n
More informationc 1 v 1 + c 2 v 2 = 0 c 1 λ 1 v 1 + c 2 λ 1 v 2 = 0
LECTURE LECTURE 2 0. Distinct eigenvalues I haven t gotten around to stating the following important theorem: Theorem: A matrix with n distinct eigenvalues is diagonalizable. Proof (Sketch) Suppose n =
More informationPseudoinverse & Moore-Penrose Conditions
ECE 275AB Lecture 7 Fall 2008 V1.0 c K. Kreutz-Delgado, UC San Diego p. 1/1 Lecture 7 ECE 275A Pseudoinverse & Moore-Penrose Conditions ECE 275AB Lecture 7 Fall 2008 V1.0 c K. Kreutz-Delgado, UC San Diego
More informationProbability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014
Probability Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh August 2014 (All of the slides in this course have been adapted from previous versions
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth
More informationContinuous Optimization
Continuous Optimization Sanzheng Qiao Department of Computing and Software McMaster University March, 2009 Outline 1 Introduction 2 Golden Section Search 3 Multivariate Functions Steepest Descent Method
More informationContinuous Random Variables
Continuous Random Variables Recall: For discrete random variables, only a finite or countably infinite number of possible values with positive probability. Often, there is interest in random variables
More information5601 Notes: The Sandwich Estimator
560 Notes: The Sandwich Estimator Charles J. Geyer December 6, 2003 Contents Maximum Likelihood Estimation 2. Likelihood for One Observation................... 2.2 Likelihood for Many IID Observations...............
More informationTheorems. Least squares regression
Theorems In this assignment we are trying to classify AML and ALL samples by use of penalized logistic regression. Before we indulge on the adventure of classification we should first explain the most
More informationApplied Linear Algebra in Geoscience Using MATLAB
Applied Linear Algebra in Geoscience Using MATLAB Contents Getting Started Creating Arrays Mathematical Operations with Arrays Using Script Files and Managing Data Two-Dimensional Plots Programming in
More informationPreface to Second Edition... vii. Preface to First Edition...
Contents Preface to Second Edition..................................... vii Preface to First Edition....................................... ix Part I Linear Algebra 1 Basic Vector/Matrix Structure and
More informationComputational Methods. Least Squares Approximation/Optimization
Computational Methods Least Squares Approximation/Optimization Manfred Huber 2011 1 Least Squares Least squares methods are aimed at finding approximate solutions when no precise solution exists Find the
More information8.3 Partial Fraction Decomposition
8.3 partial fraction decomposition 575 8.3 Partial Fraction Decomposition Rational functions (polynomials divided by polynomials) and their integrals play important roles in mathematics and applications,
More informationPARAMETER ESTIMATION: BAYESIAN APPROACH. These notes summarize the lectures on Bayesian parameter estimation.
PARAMETER ESTIMATION: BAYESIAN APPROACH. These notes summarize the lectures on Bayesian parameter estimation.. Beta Distribution We ll start by learning about the Beta distribution, since we end up using
More informationFurther Mathematical Methods (Linear Algebra) 2002
Further Mathematical Methods (Linear Algebra) Solutions For Problem Sheet 9 In this problem sheet, we derived a new result about orthogonal projections and used them to find least squares approximations
More informationMidterm Examination. STA 205: Probability and Measure Theory. Thursday, 2010 Oct 21, 11:40-12:55 pm
Midterm Examination STA 205: Probability and Measure Theory Thursday, 2010 Oct 21, 11:40-12:55 pm This is a closed-book examination. You may use a single sheet of prepared notes, if you wish, but you may
More informationMath 61CM - Solutions to homework 6
Math 61CM - Solutions to homework 6 Cédric De Groote November 5 th, 2018 Problem 1: (i) Give an example of a metric space X such that not all Cauchy sequences in X are convergent. (ii) Let X be a metric
More information