STAT 350. Assignment 6 Solutions
- Leslie Jennings
1. For the Nitrogen Output in Wallabies data set from Assignment 3 do forward, backward, stepwise and all subsets regression.

Here is code for all the methods, with all subsets done both using C(p) and using adjusted R².

data nit;
   infile 'nit.dat';
   input nitexc weight dryin wetin nitin;
proc reg data=nit;
   model nitexc = weight dryin wetin nitin / selection=forward;
run;
proc reg data=nit;
   model nitexc = weight dryin wetin nitin / selection=backward;
run;
proc reg data=nit;
   model nitexc = weight dryin wetin nitin / selection=stepwise;
run;
proc reg data=nit;
   model nitexc = weight dryin wetin nitin / selection=cp;
run;
proc reg data=nit;
   model nitexc = weight dryin wetin nitin / selection=adjrsq;
run;

The output (its numerical values did not survive transcription) begins with the Forward Selection Procedure for Dependent Variable NITEXC: Step 1, variable NITIN entered.
Forward selection continues with Step 2, variable WETIN entered; no other variable met the significance level for entry into the model. Summary of Forward Selection: Step 1 NITIN, Step 2 WETIN.

Backward Elimination Procedure for Dependent Variable NITEXC: Step 0, all variables entered (WEIGHT, DRYIN, WETIN, NITIN); Step 1, variable WEIGHT removed; Step 2, variable DRYIN removed; Step 3, variable WETIN removed. All variables left in the model (INTERCEP and NITIN) are significant. Summary of Backward Elimination: WEIGHT, DRYIN and WETIN removed in turn.

Stepwise Procedure for Dependent Variable NITEXC: Step 1, variable NITIN entered; all variables left in the model are significant and no other variable met the significance level for entry. Summary of Stepwise: Step 1 NITIN.

All-subsets output (N = 25), Regression Models for Dependent Variable NITEXC ranked by C(p): the model with NITIN alone comes first, followed by the other subsets of WEIGHT, DRYIN, WETIN and NITIN.
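The C(p) and adjusted R² rankings that SELECTION=CP and SELECTION=ADJRSQ produce can be sketched in a few lines of Python. This is illustrative only: numpy is assumed available and synthetic data stand in for the wallaby data, with only the last predictor (playing the role of NITIN) truly driving the response.

```python
import itertools
import numpy as np

# Synthetic stand-in for the wallaby data: only x4 ("nitin") matters.
rng = np.random.default_rng(0)
n, names = 25, ["weight", "dryin", "wetin", "nitin"]
X = rng.normal(size=(n, 4))
y = 3.0 * X[:, 3] + rng.normal(scale=0.5, size=n)

def fit_sse(cols):
    """Error sum of squares from regressing y on an intercept plus the given columns."""
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return float(resid @ resid)

full_sse = fit_sse(range(4))
sigma2_full = full_sse / (n - 5)                  # MSE of the full model
results = []
for k in range(1, 5):
    for cols in itertools.combinations(range(4), k):
        sse = fit_sse(cols)
        p = k + 1                                  # parameters incl. intercept
        cp = sse / sigma2_full - (n - 2 * p)       # Mallows' C(p)
        adjr2 = 1 - (sse / (n - p)) / np.var(y, ddof=1)
        results.append((cp, adjr2, [names[j] for j in cols]))

best_cp = min(results)[2]                          # subset with smallest C(p)
print(best_cp)
```

With real data the ranked listing of all 15 subsets is what PROC REG prints; here only the winning subset is shown.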
A second all-subsets listing (N = 25) ranks the same models by adjusted R-square; there the model with WETIN and NITIN comes first, followed by the remaining subsets.

The conclusion of the output is that BACKWARD and STEPWISE settle for the model containing only Nitrogen Intake as a predictor. The forward selection method also includes Wet Intake because of the very high level of α (0.5) to enter. The all-subsets method using C(p) would settle on the model using only nitrogen intake, but the adjusted R² method also includes Wet Intake. However, overall there seems little reason to include Wet Intake since it improves the fit very little and is not very significant at all.

2. Suppose X1, X2, X3 are independent N(µ, σ²) random variables, so that Xi = µ + σZi with Z1, Z2, Z3 independent standard normals.

(a) If Xᵀ = (X1, X2, X3) and Zᵀ = (Z1, Z2, Z3), express X in the form AZ + b for a suitable matrix A and vector b.

We have A = σI and bᵀ = [µ, µ, µ].

(b) Show that X is MVN3(µ_X, Σ_X) and identify µ_X and Σ_X.

The definition of MVN is that X be of the form AZ + b, and then µ_X = b and Σ_X = AAᵀ. So µ_Xᵀ = [µ, µ, µ] and Σ_X = σ²I.

(c) Let Yi = Xi − X̄ for i = 1, 2, 3 and Y4 = X̄. Show that Y ~ MVN4(µ_Y, Σ_Y) and find µ_Y and Σ_Y.
The definition of MVN is that X be of the form AZ + b. Then if Y = BX we have Y = B(AZ + b) = (BA)Z + Bb, so that Y is MVN with mean µ_Y = Bb = E(Y) and Σ_Y = (BA)(BA)ᵀ = B A Aᵀ Bᵀ. In this case we find

B = [  2/3  −1/3  −1/3 ]
    [ −1/3   2/3  −1/3 ]
    [ −1/3  −1/3   2/3 ]
    [  1/3   1/3   1/3 ]

So µ_Y = E(Y) = (0, 0, 0, µ)ᵀ and

Σ_Y = σ² B Bᵀ = σ² [  2/3  −1/3  −1/3   0  ]
                   [ −1/3   2/3  −1/3   0  ]
                   [ −1/3  −1/3   2/3   0  ]
                   [  0     0     0    1/3 ]

(d) In class I may have stated that if the covariance between two components of a multivariate normal vector is 0 then the components are independent, but I indicated a proof only when the multivariate normal distribution in question has a density. In this case the variance matrix is singular so there is no density. However, in terms of the original Z it is possible to find two independent functions of Z such that Y1, Y2, Y3 are a function of the first while Y4 is a function of the second.

i. Let U1 = (Z1 − Z2)/√2, U2 = (Z1 + Z2 − 2Z3)/√6 and U3 = (Z1 + Z2 + Z3)/3. Show that U = (U1, U2, U3)ᵀ has a multivariate normal distribution and identify the mean and variance of U.

We have U = AZ where

A = [ 1/√2  −1/√2    0    ]
    [ 1/√6   1/√6  −2/√6  ]
    [ 1/3    1/3    1/3   ]

Thus U is multivariate normal with mean 0 and variance-covariance matrix AAᵀ. Multiply this out to check that this is

Σ = AAᵀ = [ 1  0   0  ]
          [ 0  1   0  ]
          [ 0  0  1/3 ]

ii. Use the result in class, for multivariate normals which have a density, to show that (U1, U2) is independent of U3.

Since the variance-covariance matrix is not singular we need only check that Cov(U1, U3) = Cov(U2, U3) = 0. These two entries in Σ are, indeed, 0.
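The matrix identities above are easy to verify numerically. The following sketch (assuming numpy is available; it is a check, not part of the solution) builds B and A as written and confirms B Bᵀ and A Aᵀ, taking σ = 1 without loss of generality:

```python
import numpy as np

# Centering-plus-mean map: (Y1, Y2, Y3, Y4) = (X1 - Xbar, X2 - Xbar, X3 - Xbar, Xbar)
B = np.array([[ 2/3, -1/3, -1/3],
              [-1/3,  2/3, -1/3],
              [-1/3, -1/3,  2/3],
              [ 1/3,  1/3,  1/3]])

# With Sigma_X = sigma^2 I and sigma = 1, Sigma_Y = B B^T.
Sigma_Y = B @ B.T
expected = np.array([[ 2/3, -1/3, -1/3,  0 ],
                     [-1/3,  2/3, -1/3,  0 ],
                     [-1/3, -1/3,  2/3,  0 ],
                     [ 0,    0,    0,   1/3]])
assert np.allclose(Sigma_Y, expected)

# Transformation U = A Z from part (d)(i)
A = np.array([[1/np.sqrt(2), -1/np.sqrt(2),  0],
              [1/np.sqrt(6),  1/np.sqrt(6), -2/np.sqrt(6)],
              [1/3,           1/3,           1/3]])
Sigma_U = A @ A.T
assert np.allclose(Sigma_U, np.diag([1.0, 1.0, 1/3]))
print("checks pass")
```

In particular the last three columns of Σ_Y's fourth row being 0 is exactly the zero covariance between (Y1, Y2, Y3) and Y4 that part (d) is about.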
iii. Express Y3 as a function of U.

We have Y3 = X3 − X̄ = σ(Z3 − Z̄). Now 3U3 − √6 U2 = 3Z3 and Z̄ = U3. Thus

Y3 = σ(U3 − √6 U2/3 − U3) = −σ√6 U2/3.

iv. Use the fact that if X1 and X2 are independent then so are G(X1) and H(X2) for any functions G and H to show that (Y1, Y2, Y3) is independent of Y4.

We see that Y4 is a function of U3. I claim that Y1, Y2 and Y3 are each functions of U1 and U2. You did Y3 in the last part. Notice that

Y2 = X2 − X̄ = σ(Z2 − Z̄) and Y1 = X1 − X̄ = σ(Z1 − Z̄).

Write

2U3 + √6 U2/3 = Z1 + Z2

and then

2U3 + √6 U2/3 + √2 U1 = 2Z1
2U3 + √6 U2/3 − √2 U1 = 2Z2

These show

Z1 − Z̄ = √6 U2/6 + √2 U1/2 and Z2 − Z̄ = √6 U2/6 − √2 U1/2

so both Y1 and Y2 are functions of U1 and U2. Since (Y1, Y2, Y3) is a function of (U1, U2) and Y4 is a function of U3 we see the desired independence.

v. Express the sample variance of the Xi, i = 1, 2, 3 in terms of U and use this to show that (n − 1)s_X²/σ² has a χ² distribution on 2 degrees of freedom (with n = 3). Note: in fact the sample variance of X1, X2 is a function of U1. Generalizations of this idea can be used to develop an identity of the form (n − 1)s_n² = (n − 2)s_{n−1}² + U_n² for a suitable U_n, where s_n² is the sample variance of X1, ..., X_n.
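The algebraic identities in parts (iii) and (iv) can be checked on a random draw of Z (a numerical sketch assuming numpy; the seed and draw are arbitrary):

```python
import numpy as np

# Check the identities expressing Z_i - Zbar in terms of U1 and U2.
rng = np.random.default_rng(1)
Z = rng.normal(size=3)
Zbar = Z.mean()
U1 = (Z[0] - Z[1]) / np.sqrt(2)
U2 = (Z[0] + Z[1] - 2 * Z[2]) / np.sqrt(6)
U3 = Z.sum() / 3

assert np.isclose(Z[0] - Zbar,  np.sqrt(6) * U2 / 6 + np.sqrt(2) * U1 / 2)
assert np.isclose(Z[1] - Zbar,  np.sqrt(6) * U2 / 6 - np.sqrt(2) * U1 / 2)
assert np.isclose(Z[2] - Zbar, -np.sqrt(6) * U2 / 3)
assert np.isclose(Zbar, U3)
# The sum of squares identity used in part (v):
assert np.isclose(((Z - Zbar) ** 2).sum(), U1 ** 2 + U2 ** 2)
print("identities hold")
```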
The sample variance is

s² = (Y1² + Y2² + Y3²)/2.

Replace each Yi by the formulas in the previous part to get

2s²/σ² = (Z1 − Z̄)² + (Z2 − Z̄)² + (Z3 − Z̄)².

Write this in terms of U1 and U2 to get

2s²/σ² = {√6 U2/6 + √2 U1/2}² + {√6 U2/6 − √2 U1/2}² + {√6 U2/3}² = U1² + U2².

This is a sum of squares of two independent normals each with mean 0 and variance 1 so we are done.

3. In class I discussed the general formula for a multivariate normal density. Suppose that Z1 and Z2 are independent standard normal variables. Assume that X1 = aZ1 + bZ2 + c and X2 = dZ1 + eZ2 + f. Find the joint density of X1 and X2 by evaluating the formulas I gave in class. Express P(X1 ≤ t) as a double integral. I want to see the integrand and the limits of integration but you need not try to do the integral.

The density in class was

1/(2π √det(AAᵀ)) exp[−(x − µ)ᵀ Σ⁻¹ (x − µ)/2]

where Σ = AAᵀ. The entries in µ are c and f and we have

A = [ a  b ]
    [ d  e ]

Then

Σ = AAᵀ = [ a² + b²   ad + be ]
          [ ad + be   d² + e² ]

and

(AAᵀ)⁻¹ = 1/(ae − bd)² [  d² + e²    −(ad + be) ]
                        [ −(ad + be)   a² + b²  ]

Putting together all the algebra gives

f(x1, x2) = 1/(2π |ae − bd|) exp[−q(x1, x2, a, b, c, d, e, f)/2]

where the exponent involves the quadratic function

q(x1, x2, a, b, c, d, e, f) = [(x1 − c)²(d² + e²) − 2(ad + be)(x1 − c)(x2 − f) + (x2 − f)²(a² + b²)]/(ae − bd)².
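This closed form can be cross-checked numerically against a direct evaluation of the bivariate normal density using the determinant and inverse of Σ = AAᵀ (a sketch assuming numpy; the values of a through f are arbitrary illustrative choices, not from the problem):

```python
import numpy as np

# Compare the hand-derived density f(x1, x2) with a direct evaluation.
a, b, c, d, e, f = 1.0, 0.5, 2.0, -0.3, 1.2, -1.0   # arbitrary, ae - bd != 0
A = np.array([[a, b], [d, e]])
mu = np.array([c, f])
Sigma = A @ A.T

def closed_form(x1, x2):
    """The density as derived by hand above."""
    q = ((x1 - c) ** 2 * (d * d + e * e)
         - 2 * (a * d + b * e) * (x1 - c) * (x2 - f)
         + (x2 - f) ** 2 * (a * a + b * b)) / (a * e - b * d) ** 2
    return np.exp(-q / 2) / (2 * np.pi * abs(a * e - b * d))

def direct(x1, x2):
    """The general MVN density evaluated with det and inverse of Sigma."""
    x = np.array([x1, x2]) - mu
    return np.exp(-x @ np.linalg.inv(Sigma) @ x / 2) / (
        2 * np.pi * np.sqrt(np.linalg.det(Sigma)))

pt = (0.7, 0.2)
assert np.isclose(closed_form(*pt), direct(*pt))
print("densities agree")
```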
To compute P(X1 ≤ t) we take the joint density of X1 and X2 and integrate it over the set of (x1, x2) such that x1 ≤ t to get

P(X1 ≤ t) = ∫_{−∞}^{t} ∫_{−∞}^{∞} f(x1, x2) dx2 dx1.

4. Power and sample size calculations must be done before the data are gathered. However, pilot studies are often used to determine the size of the unknown parameters which are needed for these calculations. Use the Sand and Fibre Hardness data discussed in class as follows.

(a) Consider the model

Yi = β0 + β1 Si + β2 Fi + β3 Fi² + εi.

Fit this model to get estimates of all the βs and of σ. Use these fitted values as if they were the true parameter values in the following.

i. Compute the power of a two-sided t test (at the 5% level) of the hypothesis that β3 = 0 for

β3 ∈ {−0.006, −0.003, 0, 0.003, 0.006}.

We need the non-centrality parameter

δ = β3 / (σ √(aᵀ(XᵀX)⁻¹a))

where aᵀ = [0 0 0 1]. We get β3 values from the question and the denominator from the standard error of β̂3. I find that standard error to be about 0.002. This gives δ values equal to −3, −1.5, 0, 1.5, 3 (close enough). Turning to Table B.5 with 14 degrees of freedom for error I get powers 0.8, somewhere between two tabled values, 0.05, somewhere between the same two values, and 0.8 again (the intermediate figures did not survive transcription).

ii. Find a number m of copies of the basic design (each combination of Sand and Fibre tried twice) to guarantee that the power of the t-test of the hypothesis β3 = 0 is 0.9 when the true parameter values are as in the fit above.

The fitted value of β3 is … and the corresponding value of δ is just the t statistic in the computer output for testing β3 = 0. This is about 1.87 in absolute value, which is what we use in the tables because our tests are to be two-sided. We need m to give us a power of 0.9. For large numbers of degrees of freedom for error this power lies about half way between the figure under δ = 4 and the figure under δ = 3, so we solve 1.87√m = 3.5 and get m ≈ 3.5.
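The √m scaling behind this calculation is that stacking m copies of a fixed design multiplies XᵀX by m, so aᵀ(XᵀX)⁻¹a shrinks by a factor of m and δ grows like √m. A sketch (assuming numpy; the 3×3 grid of S and F levels is hypothetical, standing in for the hardness design):

```python
import numpy as np

# delta = beta3 / (sigma * sqrt(a^T (X^T X)^{-1} a)) grows like sqrt(m)
# when the design is replicated m times.  Hypothetical 3x3 grid in S and F,
# each combination tried twice, model 1, S, F, F^2.
S, F = np.meshgrid([-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0])
base = np.column_stack([np.ones(9), S.ravel(), F.ravel(), F.ravel() ** 2])
base = np.vstack([base, base])              # "each combination tried twice"
a = np.array([0.0, 0.0, 0.0, 1.0])          # picks out beta3

def delta(m, beta3=1.0, sigma=1.0):
    """Noncentrality parameter for m stacked copies of the basic design."""
    X = np.vstack([base] * m)
    v = a @ np.linalg.inv(X.T @ X) @ a      # Var(beta3-hat) / sigma^2
    return beta3 / (sigma * np.sqrt(v))

assert np.isclose(delta(4), 2 * delta(1))   # quadrupling m doubles delta
print(delta(1), delta(4))
```

This is exactly why m = 4 turns a per-copy δ of 1.87 into 3.74 in the next step.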
We would have to use m = 4, giving δ = 3.74 and a power somewhat above 0.9. However, we could instead treat the basic design as having 9 points and try 63 points in total (7 at each combination of Sand and Fibre) and get a power quite close to 0.9.
(b) Now consider the model

Yi = β0 + β1 Si + β2 Fi + β3 Fi² + β4 Si² + β5 Si Fi + εi.

Fit this model to get estimates of all the βs and of σ. Use these fitted values as if they were the true parameter values in the following. Find a number m of copies of the basic design (each combination of Sand and Fibre tried twice) to guarantee that the power of the F-test of the hypothesis β3 = β4 = β5 = 0 is 0.9 when the true parameter values are as in this fit.

We can use Table B.11 on page 1338 since we will have 3 numerator degrees of freedom for this F test. We will have 18m − 6 degrees of freedom for error and so will need a φ, in the notation of the table, quite close to 2. This makes δ² = (p + 1)φ² = 4 × 2² = 16. The estimates are β̂3 = …, β̂4 = … and β̂5 = …. To get our value of δ² for the basic design I regress β̂3 F² + β̂4 S² + β̂5 SF on S and F and find the error sum of squares is …. Divide this by σ̂² = 6.77 to find δ² = …. I need to get up to 16 so I need m = 4.
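The regression trick described here can be sketched as follows. This is illustrative only: numpy is assumed, the 3×3 design grid is hypothetical, and the β̂ values are made up (the fitted values did not survive in the source); only σ̂² = 6.77 is taken from the text, so the resulting m will not match the m = 4 of the real data.

```python
import numpy as np

# Regress the omitted quadratic terms on the reduced model (1, S, F); the
# error sum of squares over sigma^2 is delta^2 for one copy of the design,
# and m copies multiply delta^2 by m.
S, F = np.meshgrid([-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0])
S, F = np.tile(S.ravel(), 2), np.tile(F.ravel(), 2)  # each combination twice
b3, b4, b5 = 0.5, -0.3, 0.2                          # hypothetical beta-hats
sigma2 = 6.77                                        # sigma-hat^2 from the text

target = b3 * F ** 2 + b4 * S ** 2 + b5 * S * F
Z = np.column_stack([np.ones_like(S), S, F])         # reduced model terms
coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
sse = float(((target - Z @ coef) ** 2).sum())
delta2_one_copy = sse / sigma2

# Smallest m with m * delta2_one_copy >= 16:
m = int(np.ceil(16 / delta2_one_copy))
print(delta2_one_copy, m)
```

With the real fitted β̂s the per-copy δ² is large enough that m = 4 suffices; with these made-up values the required m is of course different.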
Class 7: Review Problems for Final Exam 8.5 Spring 7 This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry
More informationProblem Set #6: OLS. Economics 835: Econometrics. Fall 2012
Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.
More informationLecture 5: Clustering, Linear Regression
Lecture 5: Clustering, Linear Regression Reading: Chapter 10, Sections 3.1-3.2 STATS 202: Data mining and analysis October 4, 2017 1 / 22 Hierarchical clustering Most algorithms for hierarchical clustering
More informationLecture 6: Linear Regression
Lecture 6: Linear Regression Reading: Sections 3.1-3 STATS 202: Data mining and analysis Jonathan Taylor, 10/5 Slide credits: Sergio Bacallado 1 / 30 Simple linear regression Model: y i = β 0 + β 1 x i
More informationCovariance and Correlation
Covariance and Correlation ST 370 The probability distribution of a random variable gives complete information about its behavior, but its mean and variance are useful summaries. Similarly, the joint probability
More informationTopics in Probability and Statistics
Topics in Probability and tatistics A Fundamental Construction uppose {, P } is a sample space (with probability P), and suppose X : R is a random variable. The distribution of X is the probability P X
More informationThe program for the following sections follows.
Homework 6 nswer sheet Page 31 The program for the following sections follows. dm'log;clear;output;clear'; *************************************************************; *** EXST734 Homework Example 1
More informationAlgebra I Vocabulary Cards
Algebra I Vocabulary Cards Table of Contents Expressions and Operations Natural Numbers Whole Numbers Integers Rational Numbers Irrational Numbers Real Numbers Order of Operations Expression Variable Coefficient
More informationCOS 424: Interacting with Data
COS 424: Interacting with Data Lecturer: Rob Schapire Lecture #14 Scribe: Zia Khan April 3, 2007 Recall from previous lecture that in regression we are trying to predict a real value given our data. Specically,
More informationAsymptotic standard errors of MLE
Asymptotic standard errors of MLE Suppose, in the previous example of Carbon and Nitrogen in soil data, that we get the parameter estimates For maximum likelihood estimation, we can use Hessian matrix
More informationNext is material on matrix rank. Please see the handout
B90.330 / C.005 NOTES for Wednesday 0.APR.7 Suppose that the model is β + ε, but ε does not have the desired variance matrix. Say that ε is normal, but Var(ε) σ W. The form of W is W w 0 0 0 0 0 0 w 0
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More informationRegression, Ridge Regression, Lasso
Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationStat 5101 Notes: Algorithms (thru 2nd midterm)
Stat 5101 Notes: Algorithms (thru 2nd midterm) Charles J. Geyer October 18, 2012 Contents 1 Calculating an Expectation or a Probability 2 1.1 From a PMF........................... 2 1.2 From a PDF...........................
More informationPreliminary Statistics. Lecture 3: Probability Models and Distributions
Preliminary Statistics Lecture 3: Probability Models and Distributions Rory Macqueen (rm43@soas.ac.uk), September 2015 Outline Revision of Lecture 2 Probability Density Functions Cumulative Distribution
More informationTable 1: Fish Biomass data set on 26 streams
Math 221: Multiple Regression S. K. Hyde Chapter 27 (Moore, 5th Ed.) The following data set contains observations on the fish biomass of 26 streams. The potential regressors from which we wish to explain
More informationST505/S697R: Fall Homework 2 Solution.
ST505/S69R: Fall 2012. Homework 2 Solution. 1. 1a; problem 1.22 Below is the summary information (edited) from the regression (using R output); code at end of solution as is code and output for SAS. a)
More information2018 2019 1 9 sei@mistiu-tokyoacjp http://wwwstattu-tokyoacjp/~sei/lec-jhtml 11 552 3 0 1 2 3 4 5 6 7 13 14 33 4 1 4 4 2 1 1 2 2 1 1 12 13 R?boxplot boxplotstats which does the computation?boxplotstats
More information