LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity
- Christian Hudson
- 6 years ago
Transcription
LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 9 Multicollinearity Dr. Shalabh, Department of Mathematics and Statistics, Indian Institute of Technology Kanpur
Multicollinearity diagnostics

An important question is how to diagnose the presence of multicollinearity in the data on the basis of the given sample information. Several diagnostic measures are available, each based on a particular approach, and it is difficult to say which of them is best. Some popular and important diagnostics are described below. Detecting multicollinearity involves three aspects:
(i) Determining its presence
(ii) Determining its severity
(iii) Determining its form or location

1. Determinant of $X'X$ ($|X'X|$)

This measure is based on the fact that the matrix $X'X$ becomes ill-conditioned in the presence of multicollinearity. The value of the determinant $|X'X|$ declines as the degree of multicollinearity increases. If $\operatorname{rank}(X'X) < k$, where $k$ is the number of explanatory variables, then $X'X$ is singular and so $|X'X| = 0$. So as $|X'X| \to 0$, the degree of multicollinearity increases, and it becomes exact or perfect at $|X'X| = 0$. Thus $|X'X|$ serves as a measure of multicollinearity, and $|X'X| = 0$ indicates that perfect multicollinearity exists.
Limitations: This measure has the following limitations.
(i) It is not bounded, as $0 < |X'X| < \infty$.
(ii) It is affected by the dispersion of the explanatory variables. For example, if $k = 2$, then
$$|X'X| = \begin{vmatrix} \sum_{i=1}^{n} x_{1i}^2 & \sum_{i=1}^{n} x_{1i}x_{2i} \\ \sum_{i=1}^{n} x_{2i}x_{1i} & \sum_{i=1}^{n} x_{2i}^2 \end{vmatrix} = \left(\sum_{i=1}^{n} x_{1i}^2\right)\left(\sum_{i=1}^{n} x_{2i}^2\right)\left(1 - r_{12}^2\right),$$
where $r_{12}$ is the correlation coefficient between $X_1$ and $X_2$. So $|X'X|$ depends on both the correlation coefficient and the variability of the explanatory variables. If the explanatory variables have very low variability, then $|X'X|$ may tend to zero, which would suggest multicollinearity even when it is not present.
(iii) It gives no idea about the relative effects on individual coefficients. If multicollinearity is present, $|X'X|$ does not indicate which variable is causing it, and this is hard to determine.
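Limitation (ii) can be checked numerically. The sketch below is only an illustration under assumed data: the simulated variables, the seed, and the helper name `det_xtx` are not from the lecture. It centres two mildly correlated variables, compares $|X'X|$ with the closed form for $k = 2$, and then rescales the same data to smaller units.

```python
import numpy as np

rng = np.random.default_rng(0)          # seed chosen arbitrarily
n = 200
x1 = rng.normal(size=n)
x2 = 0.3 * x1 + rng.normal(size=n)      # only mildly correlated with x1

def det_xtx(a, b):
    """Return |X'X| and r12 for two centred explanatory variables (hypothetical helper)."""
    X = np.column_stack([a - a.mean(), b - b.mean()])
    r12 = np.corrcoef(a, b)[0, 1]
    return np.linalg.det(X.T @ X), r12

det_big, r_big = det_xtx(x1, x2)
det_small, r_small = det_xtx(0.01 * x1, 0.01 * x2)   # same data in smaller units

# Closed form for k = 2: (sum x1^2)(sum x2^2)(1 - r12^2), using centred values.
c1, c2 = x1 - x1.mean(), x2 - x2.mean()
closed_form = (c1 @ c1) * (c2 @ c2) * (1 - r_big**2)

print(det_big, closed_form)   # the two agree
print(r_big, r_small)         # the correlation is unchanged by rescaling
print(det_small / det_big)    # the determinant shrinks by the factor 0.01**4
```

The correlation between the two variables is the same in both runs, yet the determinant falls by eight orders of magnitude, so a small $|X'X|$ alone cannot be read as evidence of multicollinearity.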
2. Inspection of correlation matrix

Inspecting the off-diagonal elements $r_{ij}$ of $X'X$ gives an idea about the presence of multicollinearity. If $X_i$ and $X_j$ are nearly linearly dependent, then $|r_{ij}|$ will be close to 1. Note that the observations in $X$ are standardized here: the mean of the variable is subtracted from each observation, and the result is divided by the square root of the corrected sum of squares of that variable, so that $X'X$ is the correlation matrix.

When more than two explanatory variables are involved in a near-linear dependency, it is not necessary that any of the $r_{ij}$ will be large. In general, pairwise inspection of correlation coefficients is not sufficient for detecting multicollinearity in the data.

3. Determinant of correlation matrix

Let $D$ be the determinant of the correlation matrix; then $0 \le D \le 1$. If $D = 0$, it indicates the existence of exact linear dependence among the explanatory variables. If $D = 1$, the columns of the $X$ matrix are orthonormal. Thus a value close to 0 indicates a high degree of multicollinearity, and any value of $D$ between 0 and 1 gives an idea of the degree of multicollinearity.

Limitation: It gives no information about the number of linear dependencies among the explanatory variables.
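The point that pairwise correlations can miss a dependency involving several variables, while $D$ still detects it, can be illustrated with a small simulation. This is a sketch under assumptions of my own (nine independent variables plus a tenth that is almost their sum, an arbitrary seed), not an example from the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)                        # arbitrary seed
n, m = 500, 9
Z = rng.normal(size=(n, m))                           # nine unrelated variables
x_last = Z.sum(axis=1) + 0.01 * rng.normal(size=n)    # nearly their exact sum
X = np.column_stack([Z, x_last])

# Standardize as described in the text: centre each column and divide by the
# square root of its corrected sum of squares, so that W'W is the correlation matrix.
Xc = X - X.mean(axis=0)
W = Xc / np.sqrt((Xc**2).sum(axis=0))
R = W.T @ W

off_diag = R[~np.eye(R.shape[0], dtype=bool)]
max_abs_r = np.abs(off_diag).max()                    # largest pairwise |r_ij|
D = np.linalg.det(R)                                  # determinant of the correlation matrix

print(max_abs_r)   # moderate: no single pair looks alarming
print(D)           # very close to 0: a near-exact linear dependency exists
```

Here no pairwise correlation is close to 1, but $D$ is nearly zero because the last variable is almost a linear combination of the other nine.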
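Advantages over $|X'X|$:
(i) It is a bounded measure, $0 \le D \le 1$.
(ii) It is not affected by the dispersion of the explanatory variables. For example, when $k = 2$ and the variables are standardized so that $\sum_i x_{1i}^2 = \sum_i x_{2i}^2 = 1$ and $\sum_i x_{1i}x_{2i} = r_{12}$,
$$|X'X| = \begin{vmatrix} \sum_{i=1}^{n} x_{1i}^2 & \sum_{i=1}^{n} x_{1i}x_{2i} \\ \sum_{i=1}^{n} x_{1i}x_{2i} & \sum_{i=1}^{n} x_{2i}^2 \end{vmatrix} = \begin{vmatrix} 1 & r_{12} \\ r_{12} & 1 \end{vmatrix} = 1 - r_{12}^2.$$

4. Measure based on partial regression

A measure of multicollinearity can be obtained from the coefficients of determination of partial regressions. Let $R^2$ be the coefficient of determination in the full model, i.e., based on all explanatory variables, let $R_i^2$ be the coefficient of determination in the model when the $i$th explanatory variable is dropped, $i = 1, 2, \ldots, k$, and let $R_L^2 = \max(R_1^2, R_2^2, \ldots, R_k^2)$.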
Procedure:
(i) Drop one of the explanatory variables among the $k$ variables, say $X_1$.
(ii) Run the regression of $y$ on the remaining $(k-1)$ variables $X_2, X_3, \ldots, X_k$.
(iii) Calculate $R_1^2$.
(iv) Similarly calculate $R_2^2, R_3^2, \ldots, R_k^2$.
(v) Find $R_L^2 = \max(R_1^2, R_2^2, \ldots, R_k^2)$.
(vi) Determine $R^2 - R_L^2$.

The quantity $(R^2 - R_L^2)$ provides a measure of multicollinearity. If multicollinearity is present, $R_L^2$ will be high: the higher the degree of multicollinearity, the higher the value of $R_L^2$. So in the presence of multicollinearity, $(R^2 - R_L^2)$ will be low. Thus if $(R^2 - R_L^2)$ is close to 0, it indicates a high degree of multicollinearity.

Limitations:
(i) It gives no information about the underlying relations among the explanatory variables, i.e., how many relationships are present or how many explanatory variables are responsible for the multicollinearity.
(ii) A small value of $(R^2 - R_L^2)$ may also occur because of poor specification of the model, in which case it may be wrongly inferred that multicollinearity is present.
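The six-step procedure above can be sketched as follows. The function names `r_squared` and `partial_regression_measure`, the simulated designs, and the choice to include an intercept in every fit are assumptions made for illustration only.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an OLS fit of y on X, with an intercept added (an assumption)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

def partial_regression_measure(y, X):
    """Return R^2 of the full model, the R_i^2 values, and R^2 - R_L^2."""
    k = X.shape[1]
    r2_full = r_squared(y, X)
    # Steps (i)-(iv): drop each explanatory variable in turn and refit.
    r2_drop = np.array([r_squared(y, np.delete(X, i, axis=1)) for i in range(k)])
    r2_L = r2_drop.max()                       # step (v)
    return r2_full, r2_drop, r2_full - r2_L    # step (vi)

rng = np.random.default_rng(2)                 # arbitrary seed
n = 400
x1, x3, e = rng.normal(size=(3, n))
noise = rng.normal(size=n)

# Collinear design: the second variable is almost a copy of the first.
X_col = np.column_stack([x1, x1 + 0.01 * e, x3])
y_col = X_col.sum(axis=1) + noise

# Comparison design: three unrelated explanatory variables.
X_ind = np.column_stack([x1, e, x3])
y_ind = X_ind.sum(axis=1) + noise

_, _, gap_col = partial_regression_measure(y_col, X_col)
_, _, gap_ind = partial_regression_measure(y_ind, X_ind)
print(gap_col, gap_ind)   # the gap is near 0 only for the collinear design
```

In the collinear design, dropping either of the two nearly identical variables costs almost nothing, so $R_L^2$ is close to $R^2$; in the comparison design every variable carries its own information and the gap is clearly positive.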
5. Variance inflation factors (VIF)

The matrix $X'X$ becomes ill-conditioned in the presence of multicollinearity in the data, so the diagonal elements of $C = (X'X)^{-1}$ help in detecting multicollinearity. If $R_j^2$ denotes the coefficient of determination obtained when $X_j$ is regressed on the remaining $(k-1)$ variables excluding $X_j$, then, with $X$ standardized so that $X'X$ is the correlation matrix, the $j$th diagonal element of $C$ is
$$C_{jj} = \frac{1}{1 - R_j^2}.$$
If $X_j$ is nearly orthogonal to the remaining explanatory variables, then $R_j^2$ is small and consequently $C_{jj}$ is close to 1. If $X_j$ is nearly linearly dependent on a subset of the remaining explanatory variables, then $R_j^2$ is close to 1 and consequently $C_{jj}$ is large.

Since the variance of the $j$th OLSE of $\beta_j$ is
$$\operatorname{Var}(b_j) = \sigma^2 C_{jj},$$
$C_{jj}$ is the factor by which the variance of $b_j$ increases when the explanatory variables are near-linearly dependent. Based on this concept, the variance inflation factor for the $j$th explanatory variable is defined as
$$VIF_j = \frac{1}{1 - R_j^2}.$$
This is the factor responsible for inflating the sampling variance. The combined effect of dependencies among the explanatory variables on the variance of a term is measured by the VIF of that term in the model. One or more large VIFs indicate the presence of multicollinearity in the data.

In practice, a VIF > 5 or 10 usually indicates that the associated regression coefficients are poorly estimated because of multicollinearity. When the regression coefficients are estimated by OLS, their covariance matrix is $\sigma^2 (X'X)^{-1}$, so $VIF_j$ is the part of the variance of $b_j$ that multiplies $\sigma^2$.
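The two descriptions of the VIF given above, the diagonal of the inverse correlation matrix and $1/(1 - R_j^2)$ from auxiliary regressions, can be compared numerically. The following is a minimal sketch on simulated data; the function names and the data are illustrative assumptions.

```python
import numpy as np

def vif_from_correlation(X):
    """Diagonal of the inverse of the correlation matrix of the columns of X."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

def vif_from_auxiliary_regressions(X):
    """1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the other columns."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ beta
        r2_j = 1.0 - (resid @ resid) / ((target - target.mean()) ** 2).sum()
        out[j] = 1.0 / (1.0 - r2_j)
    return out

rng = np.random.default_rng(3)                     # arbitrary seed
n = 500
x1, x3, e = rng.normal(size=(3, n))
X = np.column_stack([x1, x1 + 0.1 * e, x3])        # columns 0 and 1 nearly dependent

vif_a = vif_from_correlation(X)
vif_b = vif_from_auxiliary_regressions(X)
print(vif_a)
print(vif_b)
```

Both routes give the same values. The first two VIFs are far above the rule-of-thumb levels of 5 or 10, while the third stays close to 1.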
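Limitations:
(i) It sheds no light on the number of dependencies among the explanatory variables.
(ii) The rule of VIF > 5 or 10 is a rule of thumb, which may differ from one situation to another.

Another interpretation of $VIF_j$

The VIFs can also be viewed as follows. The confidence interval for $\beta_j$ based on the $j$th OLSE is
$$b_j \pm \sqrt{\hat{\sigma}^2 C_{jj}}\; t_{\alpha/2},$$
where $t_{\alpha/2}$ is the critical value of the $t$ distribution with the residual degrees of freedom. The length of the confidence interval is
$$L_j = 2\sqrt{\hat{\sigma}^2 C_{jj}}\; t_{\alpha/2}.$$
Now consider a situation where $X$ is an orthogonal matrix, i.e., $X'X = I$ so that $C_{jj} = 1$, with the same sample size as before and the same root mean squares $\frac{1}{n}\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)^2$. Then the length of the confidence interval becomes
$$L^* = 2\hat{\sigma}\, t_{\alpha/2}.$$
Consider the ratio
$$\frac{L_j}{L^*} = \sqrt{C_{jj}}.$$
Thus $\sqrt{VIF_j}$ indicates the increase in the length of the confidence interval of the $j$th regression coefficient due to the presence of multicollinearity.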
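As a quick worked illustration of this ratio (the VIF values 5 and 10 are simply the rule-of-thumb thresholds quoted earlier, not results from any data set):

```latex
\[
\frac{L_j}{L^*} = \sqrt{VIF_j}:
\qquad
VIF_j = 5 \;\Rightarrow\; \frac{L_j}{L^*} = \sqrt{5} \approx 2.24,
\qquad
VIF_j = 10 \;\Rightarrow\; \frac{L_j}{L^*} = \sqrt{10} \approx 3.16 .
\]
```

So a coefficient at the upper threshold has a confidence interval roughly three times as long as it would be in an orthogonal design with the same $\hat{\sigma}$ and sample size.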
6. Condition number and condition index

Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be the eigenvalues (or characteristic roots) of $X'X$. Let
$$\lambda_{\max} = \max(\lambda_1, \lambda_2, \ldots, \lambda_k), \qquad \lambda_{\min} = \min(\lambda_1, \lambda_2, \ldots, \lambda_k).$$
The condition number (CN) is defined as
$$CN = \frac{\lambda_{\max}}{\lambda_{\min}}, \qquad 0 < CN < \infty.$$
Small values of the characteristic roots indicate the presence of near-linear dependencies in the data. The CN measures the spread in the spectrum of characteristic roots of $X'X$ and so provides a measure of multicollinearity.

If CN < 100, the multicollinearity is considered non-harmful.
If 100 < CN < 1000, the multicollinearity is moderate to severe (or strong). This range is referred to as the danger level.
If CN > 1000, it indicates severe (or strong) multicollinearity.

The condition number is based on only two eigenvalues, $\lambda_{\min}$ and $\lambda_{\max}$. Other measures are the condition indices, which use information on the other eigenvalues as well. The condition indices of $X'X$ are defined as
$$C_j = \frac{\lambda_{\max}}{\lambda_j}, \qquad j = 1, 2, \ldots, k.$$
In fact, the largest $C_j$ equals CN. The number of condition indices that are large, say more than 1000, indicates the number of near-linear dependencies in $X'X$.

A limitation of CN and $C_j$ is that they are unbounded measures, as $0 < CN < \infty$ and $0 < C_j < \infty$.
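The condition number and condition indices are straightforward to compute from the eigenvalues of $X'X$. The sketch below assumes the columns are first centred and scaled to unit length (one possible convention; the helper names and simulated data are illustrative).

```python
import numpy as np

def unit_length(X):
    """Centre each column and scale it to unit length (an assumed convention)."""
    Xc = X - X.mean(axis=0)
    return Xc / np.sqrt((Xc**2).sum(axis=0))

def condition_diagnostics(W):
    """Eigenvalues of W'W, condition indices C_j = lambda_max / lambda_j, and CN."""
    lam = np.linalg.eigvalsh(W.T @ W)      # W'W is symmetric, so eigvalsh applies
    indices = lam.max() / lam
    return lam, indices, indices.max()

rng = np.random.default_rng(4)             # arbitrary seed
n = 500
x1, x3, e = rng.normal(size=(3, n))
W = unit_length(np.column_stack([x1, x1 + 0.01 * e, x3]))

lam, indices, cn = condition_diagnostics(W)
n_dependencies = int((indices > 1000).sum())   # threshold taken from the text

print(lam)
print(indices)
print(cn, n_dependencies)
```

With one built-in near-linear dependency, exactly one condition index exceeds 1000, and the largest index coincides with the condition number.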
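7. Measure based on characteristic roots and proportion of variances

Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be the eigenvalues of $X'X$, let $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_k)$ be a $k \times k$ matrix, and let $V$ be the $k \times k$ matrix constructed from the eigenvectors of $X'X$. Obviously, $V$ is an orthogonal matrix. Then $X'X$ can be decomposed as
$$X'X = V \Lambda V'.$$
Let $V_1, V_2, \ldots, V_k$ be the columns of $V$. If there is a near-linear dependency in the data, then $\lambda_j$ is close to zero and the nature of the linear dependency is described by the elements of the associated eigenvector $V_j$.

The covariance matrix of the OLSE is
$$V(b) = \sigma^2 (X'X)^{-1} = \sigma^2 (V \Lambda V')^{-1} = \sigma^2 V \Lambda^{-1} V',$$
so that
$$\operatorname{Var}(b_i) = \sigma^2 \left( \frac{v_{i1}^2}{\lambda_1} + \frac{v_{i2}^2}{\lambda_2} + \cdots + \frac{v_{ik}^2}{\lambda_k} \right),$$
where $v_{i1}, v_{i2}, \ldots, v_{ik}$ are the elements in the $i$th row of $V$.

The condition indices are
$$C_j = \frac{\lambda_{\max}}{\lambda_j}, \qquad j = 1, 2, \ldots, k.$$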
Procedure:
(i) Find the condition indices $C_1, C_2, \ldots, C_k$.
(ii) (a) Identify those $\lambda_j$'s for which $C_j$ is greater than the danger level 1000.
(b) This gives the number of linear dependencies.
(c) Do not consider those $C_j$'s which are below the danger level.
(iii) For such $\lambda$'s with condition index above the danger level, choose one such eigenvalue, say $\lambda_j$.
(iv) Find the proportion of variance corresponding to $\lambda_j$ in $\operatorname{Var}(b_1), \operatorname{Var}(b_2), \ldots, \operatorname{Var}(b_k)$ as
$$p_{ij} = \frac{v_{ij}^2/\lambda_j}{VIF_i} = \frac{v_{ij}^2/\lambda_j}{\sum_{l=1}^{k} v_{il}^2/\lambda_l}.$$
Note that $v_{ij}^2/\lambda_j$ can be found from the expression
$$\operatorname{Var}(b_i) = \sigma^2 \left( \frac{v_{i1}^2}{\lambda_1} + \frac{v_{i2}^2}{\lambda_2} + \cdots + \frac{v_{ik}^2}{\lambda_k} \right),$$
i.e., it is the term corresponding to the $j$th factor.

The proportion of variance $p_{ij}$ provides a measure of multicollinearity. If $p_{ij} > 0.5$, it indicates that $b_i$ is adversely affected by the multicollinearity, i.e., the estimate of $\beta_i$ is influenced by the presence of multicollinearity.

This is a good diagnostic tool in the sense that it reveals the presence of harmful multicollinearity and also indicates the number of linear dependencies responsible for it. In this respect it is more informative than the other diagnostics described above.
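The procedure above can be sketched in a few lines once the eigen-decomposition of $X'X$ is available. The helper names, the simulated data, and the unit-length scaling of the columns are assumptions for illustration; the thresholds 1000 and 0.5 are the ones quoted in the text.

```python
import numpy as np

def unit_length(X):
    """Centre each column and scale it to unit length (an assumed convention)."""
    Xc = X - X.mean(axis=0)
    return Xc / np.sqrt((Xc**2).sum(axis=0))

def variance_proportions(W):
    """Condition indices, sum_j v_ij^2 / lambda_j, and the proportions p_ij."""
    lam, V = np.linalg.eigh(W.T @ W)       # columns of V are the eigenvectors V_j
    phi = V**2 / lam                       # phi[i, j] = v_ij^2 / lambda_j
    vif_like = phi.sum(axis=1)             # i-th diagonal element of (W'W)^{-1}
    p = phi / vif_like[:, None]            # each row of p sums to 1
    cond_idx = lam.max() / lam
    return cond_idx, vif_like, p

rng = np.random.default_rng(5)             # arbitrary seed
n = 500
x1, x3, e = rng.normal(size=(3, n))
W = unit_length(np.column_stack([x1, x1 + 0.01 * e, x3]))

cond_idx, vif_like, p = variance_proportions(W)
j = int(np.argmax(cond_idx))               # eigenvalue with the largest condition index
affected = np.where(p[:, j] > 0.5)[0]      # coefficients flagged for that eigenvalue

print(cond_idx)
print(p[:, j])
print(affected)
```

For the eigenvalue whose condition index exceeds the danger level, the proportions for the first two coefficients are close to 1 and the third is close to 0, which points to the pair of variables involved in the dependency.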
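The condition indices can also be defined through the singular value decomposition of the $X$ matrix,
$$X = U D V',$$
where $U$ is an $n \times k$ matrix, $V$ is a $k \times k$ matrix, $U'U = I$, $V'V = I$, and $D$ is a $k \times k$ matrix, $D = \operatorname{diag}(\mu_1, \mu_2, \ldots, \mu_k)$, where $\mu_1, \mu_2, \ldots, \mu_k$ are the singular values of $X$. $V$ is a matrix whose columns are the eigenvectors corresponding to the eigenvalues of $X'X$, and $U$ is a matrix whose columns are the eigenvectors associated with the $k$ nonzero eigenvalues of $XX'$.

The condition indices of the $X$ matrix are defined as
$$\eta_j = \frac{\mu_{\max}}{\mu_j}, \qquad j = 1, 2, \ldots, k,$$
where $\mu_{\max} = \max(\mu_1, \mu_2, \ldots, \mu_k)$.

If $\lambda_1, \lambda_2, \ldots, \lambda_k$ are the eigenvalues of $X'X$, then
$$X'X = (UDV')'(UDV') = V D^2 V' = V \Lambda V',$$
so $\mu_j^2 = \lambda_j$, $j = 1, 2, \ldots, k$.

Note that with $\mu_j^2 = \lambda_j$,
$$\operatorname{Var}(b_i) = \sigma^2 \sum_{j=1}^{k} \frac{v_{ij}^2}{\mu_j^2}, \qquad VIF_i = \sum_{j=1}^{k} \frac{v_{ij}^2}{\mu_j^2}, \qquad p_{ij} = \frac{v_{ij}^2/\mu_j^2}{VIF_i}.$$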
The ill-conditioning in $X$ is reflected in the size of the singular values. There will be one small singular value for each near-linear dependency. The extent of ill-conditioning is described by how small $\mu_j$ is relative to $\mu_{\max}$.

It is suggested that the explanatory variables should be scaled to unit length but should not be centered when computing $p_{ij}$. This helps in diagnosing the role of the intercept term in a near-linear dependence. No unique guidance is available in the literature on the issue of centering the explanatory variables. Centering makes the intercept orthogonal to the explanatory variables, so it may remove the ill-conditioning due to the intercept term in the model.
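The effect of centering on the singular-value diagnostics can be illustrated with a variable whose mean is large relative to its spread, so that it is nearly proportional to the intercept column. This is a sketch under assumed data; the helper `condition_indices_svd` is hypothetical and simply scales columns to unit length before calling `numpy.linalg.svd`.

```python
import numpy as np

def condition_indices_svd(X):
    """Singular values of X (columns scaled to unit length) and eta_j = mu_max / mu_j."""
    W = X / np.sqrt((X**2).sum(axis=0))
    mu = np.linalg.svd(W, compute_uv=False)
    return mu, mu.max() / mu

rng = np.random.default_rng(6)                 # arbitrary seed
n = 300
x1 = 50.0 + rng.normal(size=n)                 # large mean, small spread
x2 = rng.normal(size=n)

# Uncentred design with an explicit intercept column.
X_unc = np.column_stack([np.ones(n), x1, x2])
mu_unc, eta_unc = condition_indices_svd(X_unc)

# Centred design: the intercept is dropped and each column has mean zero.
X_cen = np.column_stack([x1, x2])
X_cen = X_cen - X_cen.mean(axis=0)
mu_cen, eta_cen = condition_indices_svd(X_cen)

# Check mu_j^2 = lambda_j for the uncentred, unit-length-scaled matrix.
W = X_unc / np.sqrt((X_unc**2).sum(axis=0))
lam = np.linalg.eigvalsh(W.T @ W)              # ascending order

print(eta_unc.max(), eta_cen.max())
print(np.sort(mu_unc**2), lam)
```

Without centering, one singular value is very small because `x1` is almost a multiple of the intercept column; after centering, that dependency disappears and the condition indices are close to 1.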
More informationENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Solutions Descriptive Statistics. None at all!
ENGI 44 Probability ad Statistics Faculty of Egieerig ad Applied Sciece Problem Set Solutios Descriptive Statistics. If, i the set of values {,, 3, 4, 5, 6, 7 } a error causes the value 5 to be replaced
More informationNumber of fatalities X Sunday 4 Monday 6 Tuesday 2 Wednesday 0 Thursday 3 Friday 5 Saturday 8 Total 28. Day
LECTURE # 8 Mea Deviatio, Stadard Deviatio ad Variace & Coefficiet of variatio Mea Deviatio Stadard Deviatio ad Variace Coefficiet of variatio First, we will discuss it for the case of raw data, ad the
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationConfidence Interval for Standard Deviation of Normal Distribution with Known Coefficients of Variation
Cofidece Iterval for tadard Deviatio of Normal Distributio with Kow Coefficiets of Variatio uparat Niwitpog Departmet of Applied tatistics, Faculty of Applied ciece Kig Mogkut s Uiversity of Techology
More informationOpen book and notes. 120 minutes. Cover page and six pages of exam. No calculators.
IE 330 Seat # Ope book ad otes 120 miutes Cover page ad six pages of exam No calculators Score Fial Exam (example) Schmeiser Ope book ad otes No calculator 120 miutes 1 True or false (for each, 2 poits
More informationMachine Learning for Data Science (CS 4786)
Machie Learig for Data Sciece CS 4786) Lecture 9: Pricipal Compoet Aalysis The text i black outlies mai ideas to retai from the lecture. The text i blue give a deeper uderstadig of how we derive or get
More informationSection 1.1. Calculus: Areas And Tangents. Difference Equations to Differential Equations
Differece Equatios to Differetial Equatios Sectio. Calculus: Areas Ad Tagets The study of calculus begis with questios about chage. What happes to the velocity of a swigig pedulum as its positio chages?
More informationIndices of Distances: Characteristics and Detection of Abnormal Points
Iteratioal Joural of Mathematics ad Computer Sciece, 8(2013), o. 2, 55 68 M CS Idices of Distaces: Characteristics ad Detectio of Abormal Poits Hicham Y. Abdallah Departmet of Applied Mathematics Faculty
More information1 Last time: similar and diagonalizable matrices
Last time: similar ad diagoalizable matrices Let be a positive iteger Suppose A is a matrix, v R, ad λ R Recall that v a eigevector for A with eigevalue λ if v ad Av λv, or equivaletly if v is a ozero
More informationCov(aX, cy ) Var(X) Var(Y ) It is completely invariant to affine transformations: for any a, b, c, d R, ρ(ax + b, cy + d) = a.s. X i. as n.
CS 189 Itroductio to Machie Learig Sprig 218 Note 11 1 Caoical Correlatio Aalysis The Pearso Correlatio Coefficiet ρ(x, Y ) is a way to measure how liearly related (i other words, how well a liear model
More informationhttp://www.xelca.l/articles/ufo_ladigsbaa_houte.aspx imulatio Output aalysis 3/4/06 This lecture Output: A simulatio determies the value of some performace measures, e.g. productio per hour, average queue
More informationCorrelation and Covariance
Correlatio ad Covariace Tom Ilveto FREC 9 What is Next? Correlatio ad Regressio Regressio We specify a depedet variable as a liear fuctio of oe or more idepedet variables, based o co-variace Regressio
More informationSoo King Lim Figure 1: Figure 2: Figure 3: Figure 4: Figure 5: Figure 6: Figure 7:
0 Multivariate Cotrol Chart 3 Multivariate Normal Distributio 5 Estimatio of the Mea ad Covariace Matrix 6 Hotellig s Cotrol Chart 6 Hotellig s Square 8 Average Value of k Subgroups 0 Example 3 3 Value
More informationA proposed discrete distribution for the statistical modeling of
It. Statistical Ist.: Proc. 58th World Statistical Cogress, 0, Dubli (Sessio CPS047) p.5059 A proposed discrete distributio for the statistical modelig of Likert data Kidd, Marti Cetre for Statistical
More informationResampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.
Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator
More informationEconomics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator
Ecoomics 24B Relatio to Method of Momets ad Maximum Likelihood OLSE as a Maximum Likelihood Estimator Uder Assumptio 5 we have speci ed the distributio of the error, so we ca estimate the model parameters
More informationSTA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:
STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio
More informationChapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more
More informationChimica Inorganica 3
himica Iorgaica Irreducible Represetatios ad haracter Tables Rather tha usig geometrical operatios, it is ofte much more coveiet to employ a ew set of group elemets which are matrices ad to make the rule
More information