1 Inferential Methods for Correlation and Regression Analysis


 Gladys Chase
 9 months ago
 Views:
Transcription
1 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet ad the sample Regressio Lie were obtaied for describig ad measurig the quality ad stregth of the liear relatioship betwee two cotiuous variables. I this sectio a model for the uderlyig populatio will be developed. The this model will be used to obtai iferetial methods for this kid of data. 1.1 The Simple Liear Regressio Model Defiitio: The simple liear regressio model assumes that there is a lie with itercept α ad slope β, called the true populatio regressio lie, that describes the relatioship betwee the variable x ad y. Whe a value of the idepedet variable x is fixed ad a observatio o the depedet variable y ca be writte as, y = α + βx + e e is the error, ad we assume e N (0, σ). Basic Assumptios of the Simple Liear Regressio Model 1. The distributio of the radom deviatio e has populatio mea µ e =0. 2. The stadard deviatio of e is the same for ay particular value of x. It is deoted by σ. 3. The distributio of e is ormal. The model implies for the mea of the respose variable µ y = α + βx Radomess i e implies that the outcome of y for a give x is also radom, so that y is a radom variable. To summarize: That for every x, y is a ormal distributed radom variable with mea α + βx ad stadard deviatio σ. Iterpretatio: The slope β of the populatio regressio lie is the average chage i y associated with a 1uit icrease i x. The vertical itercept α is the mea of y whe x = 0. The value of σ describes the extet to which (x, y) observatios deviate vertically from the regressio lie. Diagram (compare Textbook pg.598) Diagram show small ad large σ. 1
2 1.2 Estimatig the Parameter of the Model The parameter of the model are α, β, ad σ. They ca be estimated (poit estimator ad cofidece iterval) by usig sample data (x 1, y 1 ), (x 2, y 2 ),..., (x, y ). Assume that the relatioship of x ad y ca be described by a Simple Liear Regressio Model ad that the observatios were draw idepedetly. Uder these assumptios poit estimates of α ad β are the itercept ad the slope of the least squares lie, respectively. That is poit estimate of β = b = S xy S xx poit estimate of α = a = ȳ b x where as before S xy = xy ( x)( y) ad S xx = x 2 ( x) 2. The estimated regressio lie is idetical to the least squares lie: ŷ = a + bx I additio to estimatig α ad β there is also a iterest i estimatig σ the spread of the radom deviatio of the regressio lie. This estimate will be based o the Sum of Squares for Error: SSE = (y ŷ) 2 where ŷ 1 = a + bx 1, ŷ 2 = a + bx 2,..., ŷ = a + bx are the fitted or predicted y values ad the residuals are y 1 ŷ 1, y 2 ŷ 2,..., y ŷ The residuals give the vertical distace of the observatios to the regressio lie. This makes SSE a measure of the extet to which the sample data spreads out about the estimated regressio lie. Estimatio of σ The statistic for estimatig the variace σ 2 is where The estimate of the stadard deviatio σ is s 2 = SSE 2 SSE = (y ŷ) 2 = y 2 a y b xy s = s 2 The Coefficiet of Determiatio ca ow be rewritte as r 2 = 1 SSE SS yy 2
3 where SS yy is SS yy = (y ȳ) 2 = y 2 ( y) 2 = S yy This equatio ow shows the iterpretatio of r 2 as the proportio of y variatio that ca be explaied by the model relatioship. σ measures the usual vertical deviatio of a poit (x, y) i the populatio from the populatio regressio lie. Similarly, s e measures the typical deviatio of a sample poit (x, y) of the sample regressio lie. 1.3 Iferece Cocerig the Slope of the Populatio Regressio Lie The slope B is the average chage i y, whe x is icreased by 1 uit. This iterpretatio makes B a iterestig parameter. I additio we fid, that if β = 0 the Populatio Regressio Lie would be parallel to the xaxis, thus a chage i x would t have a impact o y, which would mea that x ad y are idepedet. These two properties lead us to ask for a cofidece iterval for B, i order to be able to estimate B properly ad for a statistical test cocerig B. Before we ca obtai these methods we have to study the samplig distributio of the statistic to be used for estimatig B, which is b. Properties of the Samplig Distributio of b Whe the basic assumptios of the simple liear regressio model are met, the followig claims hold: 1. The populatio mea of b is β: µ b = β. Thus, b is a ubiased estimate of β. 2. The populatio stadard deviatio of b is σ b = σ Sxx 3. The statistic b has a ormal distributio (which is a cosequece of e beig ormally distributed). If σ b is large the estimate b does ot have to be close to the true β. I order for σ b to be small σ should be small, that meas little variatio about the populatio regressio lie, ad (or) S xx = (x x) 2 should be large, that meas far spread out x values are i favor for a smaller variatio i the estimate b for B. Stadardizig b Sice σ b depeds o the ukow σ it ca ot be used for a stadardizig b, i order to develop a statistic that ca be used for developig a cofidece iterval or test. 3
4 Istead use SE b = as estimate of σ b the stadard deviatio of b. This leads to: t = b β SE b which is t distributed with df = 2 (for a Simple Liear Regressio Model). This tstatistic does ot deped o ay ukow parameters beside β, so it ca be used for developig a cofidece iterval ad a test cocerig B. Cofidece Iterval for β A (1 α)% Cofidece Iterval for the slope β from a simple liear regressio model is give by: b ± (t critical value) 2 1 α SE b Where the t critical value is based o df = 2 ad C = 1 α (Table C). Example: Is cardiovascular fitess related to a athlete s performace i a 20km ski race? s S xx x= time to exhaustio ruig o a treadmill (i miutes) y= 20km ski time(i miutes) x y A scatterplot of this data shows a liear decrease i ski time by icreasig treadmill time. Straightforward calculatio gives: = 11 x = y = x 2 = xy = y 2 = 48,
5 From these calculate ad so that: S xy = xy ( x)( y) S xx = x 2 ( x) 2 = (106.3)(728.70) 11 = (106.3)2 11 = = b = S xy = = ad a = ȳ b x = = S xx The poit estimate for the decrease i ski time for every miute icrease i treadmill time is 2.33 miutes. Further more we get SSE = y 2 a y b xy = SS yy = y 2 ( y) 2 = r 2 = 1 SSE = SS yy s 2 = SSE 2 = s = s 2 = SE b = s Sxx = r 2 = is ot very high, but shows that about 63% of the variatio i skitime ca be explaied by the Liear Regressio Model. The estimate s = 2.19 shows that there is quite some variatio i the skitime for ay give treadmill time. The calculatio of a 95% cofidece iterval for β requires the t critical value with 2=9 degrees of freedom ad 1 α = 0.95: The resultig cofidece iterval is the b ± t critical value SE b = ± (2.26) (0.591) = ± = ( 3.671; 0.999) Based o the sample, we are 95% cofidet that the true average decrease i ski time associated with a oe miute icrease i treadmill time is betwee 1 ad 3.7 miutes. Hypothesis Test Cocerig β Hypotheses: Test type Upper tail H 0 : β 0 H a : β > 0 Lower tail H 0 : β 0 H a : β < 0 Two tail H 0 : β = 0 H a : β 0 5
6 Assumptios: Radom sample ad the Simple Liear Regressio Model applies to x ad y. Test statistic: Pvalue: t 0 = b SE b with df = 2 Test type Pvalue Upper tail P (t > t 0 ) Lower tail P (t < t 0 ) Two tail 2 P (t > abs(t 0 )) The two tail test of H 0 : β = 0 versus H a : β 0 is a test to assure that there is i fact a liear coectio betwee x ad y, it is also called Model Utility Test. Cotiue Example: Let s do the Model Utility Test for the example above at a sigificace level of α = Hypotheses: ad α = Do Model Utility Test. H 0 : β = 0 versus H a : β 0 2. Assumptios: Radom sample. Assume the relatioship of x ad y ca be described by a Simple Liear Regressio Model: 3. Test statistic: t 0 = b = = with df = 9 SE b Pvalue: This is a two tail test, thus pvalue= 2 P (t > abs(t 0 )) = = Decisio: Sice the pvalue α the ull hypothesis is rejected. 6. Coclusio: At sigificace level of 5% we coclude that β 0 ad there is a liear relatioship betwee ski time ad treadmill time. The Test Procedure: 1. Hypotheses: State the hypotheses ad the sigificace level, ad choose the appropriate test. 2. Assumptios: Check if the assumptios of that test are met. 3. Test statistic: Calculate the test statistic icludig the degrees of freedom if ecessary. 4. Pvalue: Obtai the pvalue, accordig to test type. 5. Decisio: Decide if the ull hypothesis ca be rejected, or ot. 6. Coclusio: Aswer the questio cocerig the problem stated for the situatio. 6
7 1.4 Checkig Model Adequacy Before relyig o the results based o a simple Liear Regressio Model, the model assumptios should be checked. These assumptios are mostly related to the distributio of the error e i the model equatio For the radom error e, we have to assume 1. For all x, e is ormally distributed, ad y = α + βx + e 2. The mea ad the stadard deviatio of e do ot deped o the value of x. If these assumptio are seriously violated the results based o the Simple Liear Regressio Model are ot reliable, ad may ot be true. This fact makes it ecessary to check for these assumptios. Residual Aalysis If the residuals e 1 to e from the populatio regressio lie α + βx would be available, we could use them to check for those assumptios. I ay applied situatio these values are ot kow, because the populatio regressio lie is ot kow. We have to rely o their estimates, the sample residuals (residuals from the least squares regressio lie, the estimated lie) y 1 ŷ 1 = y 1 (a + bx 1 ) y 2 ŷ 2 = y 2 (a + bx 2 )... y ŷ = y (a + bx ) Ay observatio that leads to very large residual should be examied for mistakes. How to determie if a residual is large? By ow we kow this is best be doe with by stadardizig the values ad tha judge them. How do we stadardize the residuals? We stadardize by subtractig the mea ad dividig by the stadard deviatio. But we kow that the populatio mea of the residuals is zero (µ e = 0). Ad it ca be determied that the stadard deviatio of the i th residual y i ŷ i is equal to ad the stadardized residual is equal to s 1 1 (x i x) 2 S xx. y i ŷ i s 1 1 (x i x) 2 S xx 7
8 Example 1 (Treadmill time ad ski time) For this data set the stadardized residuals are Treadmill Ski st. Res Is ay of these values ureasoably large? First do a histogram for the residuals to check if it is reasoable to assume they come from a ormal populatio. This histogram shows a uimodal ad symmetric distributio, without ay outliers. assumptio of ormality seems to be reasoable. The Plottig stadardized residuals versus tread mill time gives a visual presetatio of this data ad idicates if there is cocer that the stadard deviatio might vary for differet x. 8
9 This graph show that the stadardized residuals are evely spread above ad below the 0. The spread does ot chage with x. This plot does ot give ay idicatio that the Simple Liear Regressio Model assumptios are violated Applyig the Model Estimatio ad Predictio Oce we established that the model we foud describes the relatioship properly, we ca use the model for estimatio: estimate mea values of y, E(y) for give values of x. For example we might be iterested i the mea time a perso eed to fiish the ski race whe doig 10 miutes o the treadmill. predictio: ew idividual value based o the kowledge of x, For example, we wat to give the rage of ski times for people who ca do 10 miutes o the treadmill. The key to both applicatios of the model is the least squares lie, ŷ = a+bx, we foud before, which gives us poit estimators for both applicatios The ext step is to fid cofidece itervals for E(y) for a give x, ad for a idividual value y, for a give x. I order to do this we will have to discuss the distributio of ŷ for the two applicatios. Samplig distributio of ŷ = a + bx 1. The stadard deviatio of the distributio of the estimator ŷ of the mea value E(y) at a specific value of x, say x p 1 σŷ = σ + (x p x) 2 SS xx This is the stadard error of ŷ. 2. The stadard deviatio of the distributio of the predictio error, y ŷ, for the predictor ŷ of a idividual ew y at a specific value of x, say x p σ (y ŷ) = σ This is the stadard error of predictio (x p x) 2 SS xx
10 A (1 α)100% CI for E(y) at x = x p ŷ ± t s 1 + (x p x) 2 SS xx A (1 α)100% Predictio Iterval for y at x = x p ŷ ± t s (x p x) 2 SS xx Commet: Do a cofidece iterval, if iterested i the mea of y for a certai value of x, but if you wat to get a rage where a future value of y might fall for a certai value of x, the do a predictio iterval. Cotiue example: Accordig to the liear regressio model predict the middle 95% of ski times for people who do x p = 10 miutes o the treadmill? Fid a predictio iterval. For x p = 10 fid ŷ = (10) = With df = 9 t = ± 2.262(2.188) ( /11)2 +, gives ± which is [60.27, 70.65]. We expect that 95% of people who ca do 10 miutes o the treadmill to fiish the ski race i a time betwee ad miutes. Estimate the mea ski time for people who do x = 10 miutes o the treadmill at cofidece level of 95% ± 2.262(2.188) 1 11 ( /11)2 +, gives ± which is [63.90, 67.02]. We are 95% cofidet that the mea time to fiish the ski race for people who ca do 10 miutes o the treadmill falls betwee 63.9 ad 67 miutes. 10
Linear Regression Models
Liear Regressio Models Dr. Joh MellorCrummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect
More informationProperties and Hypothesis Testing
Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Crosssectioal data. 2. Time series data.
More informationSimple Linear Regression
Simple Liear Regressio 1. Model ad Parameter Estimatio (a) Suppose our data cosist of a collectio of pairs (x i, y i ), where x i is a observed value of variable X ad y i is the correspodig observatio
More informationSTA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:
STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform largesample ifereces (hypothesis test ad cofidece itervals) to compare two populatio
More informationFinal Examination Solutions 17/6/2010
The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 00900 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:
More informationMOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.
XI1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI2 (1075) STATISTICAL DECISION MAKING Advaced
More informationStatistics 20: Final Exam Solutions Summer Session 2007
1. 20 poits Testig for Diabetes. Statistics 20: Fial Exam Solutios Summer Sessio 2007 (a) 3 poits Give estimates for the sesitivity of Test I ad of Test II. Solutio: 156 patiets out of total 223 patiets
More informationSIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS
SIMPLE LINEAR REGRESSION AND CORRELATION ANALSIS INTRODUCTION There are lot of statistical ivestigatio to kow whether there is a relatioship amog variables Two aalyses: (1) regressio aalysis; () correlatio
More information71. Chapter 4. Part I. Sampling Distributions and Confidence Intervals
71 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7 Sectio 1. Samplig Distributio 73 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses
More information3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N.
3/3/04 CDS M Phil Old Least Squares (OLS) Vijayamohaa Pillai N CDS M Phil Vijayamoha CDS M Phil Vijayamoha Types of Relatioships Oly oe idepedet variable, Relatioship betwee ad is Liear relatioships Curviliear
More informationChapter 1 (Definitions)
FINAL EXAM REVIEW Chapter 1 (Defiitios) Qualitative: Nomial: Ordial: Quatitative: Ordial: Iterval: Ratio: Observatioal Study: Desiged Experimet: Samplig: Cluster: Stratified: Systematic: Coveiece: Simple
More informationII. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation
II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso productmomet correlatio
More informationAssessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions
Assessmet ad Modelig of Forests FR 48 Sprig Assigmet Solutios. The first part of the questio asked that you calculate the average, stadard deviatio, coefficiet of variatio, ad 9% cofidece iterval of the
More informationStat 200 Testing Summary Page 1
Stat 00 Testig Summary Page 1 Mathematicias are like Frechme; whatever you say to them, they traslate it ito their ow laguage ad forthwith it is somethig etirely differet Goethe 1 Large Sample Cofidece
More information(all terms are scalars).the minimization is clearer in sum notation:
7 Multiple liear regressio: with predictors) Depedet data set: y i i = 1, oe predictad, predictors x i,k i = 1,, k = 1, ' The forecast equatio is ŷ i = b + Use matrix otatio: k =1 b k x ik Y = y 1 y 1
More informationMath 140 Introductory Statistics
8.2 Testig a Proportio Math 1 Itroductory Statistics Professor B. Abrego Lecture 15 Sectios 8.2 People ofte make decisios with data by comparig the results from a sample to some predetermied stadard. These
More informationRegression. Correlation vs. regression. The parameters of linear regression. Regression assumes... Random sample. Y = α + β X.
Regressio Correlatio vs. regressio Predicts Y from X Liear regressio assumes that the relatioship betwee X ad Y ca be described by a lie Regressio assumes... Radom sample Y is ormally distributed with
More informationSampling Distributions, ZTests, Power
Samplig Distributios, ZTests, Power We draw ifereces about populatio parameters from sample statistics Sample proportio approximates populatio proportio Sample mea approximates populatio mea Sample variace
More informationSection 14. Simple linear regression.
Sectio 14 Simple liear regressio. Let us look at the cigarette dataset from [1] (available to dowload from joural s website) ad []. The cigarette dataset cotais measuremets of tar, icotie, weight ad carbo
More informationEcon 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chisquare Distribution, Student s t distribution 1.
Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chisquare Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio
More informationTable 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab
Sectio 12 Tests of idepedece ad homogeeity I this lecture we will cosider a situatio whe our observatios are classified by two differet features ad we would like to test if these features are idepedet
More informationSampling, Sampling Distribution and Normality
4/17/11 Tools of Busiess Statistics Samplig, Samplig Distributio ad ormality Preseted by: Mahedra Adhi ugroho, M.Sc Descriptive statistics Collectig, presetig, ad describig data Iferetial statistics Drawig
More informationStatistical inference: example 1. Inferential Statistics
Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either
More information1036: Probability & Statistics
036: Probability & Statistics Lecture 0 Oe ad TwoSample Tests of Hypotheses 0 Statistical Hypotheses Decisio based o experimetal evidece whether Coffee drikig icreases the risk of cacer i humas. A perso
More informationChapter 13, Part A Analysis of Variance and Experimental Design
Slides Prepared by JOHN S. LOUCKS St. Edward s Uiversity Slide 1 Chapter 13, Part A Aalysis of Variace ad Eperimetal Desig Itroductio to Aalysis of Variace Aalysis of Variace: Testig for the Equality of
More informationInstructor: Judith Canner Spring 2010 CONFIDENCE INTERVALS How do we make inferences about the population parameters?
CONFIDENCE INTERVALS How do we make ifereces about the populatio parameters? The samplig distributio allows us to quatify the variability i sample statistics icludig how they differ from the parameter
More informationStatistics. Chapter 10 TwoSample Tests. Copyright 2013 Pearson Education, Inc. publishing as Prentice Hall. Chap 101
Statistics Chapter 0 TwoSample Tests Copyright 03 Pearso Educatio, Ic. publishig as Pretice Hall Chap 0 Learig Objectives I this chapter, you lear How to use hypothesis testig for comparig the differece
More informationChapter 4  Summarizing Numerical Data
Chapter 4  Summarizig Numerical Data 15.075 Cythia Rudi Here are some ways we ca summarize data umerically. Sample Mea: i=1 x i x :=. Note: i this class we will work with both the populatio mea µ ad the
More informationCorrelation and Covariance
Correlatio ad Covariace Tom Ilveto FREC 9 What is Next? Correlatio ad Regressio Regressio We specify a depedet variable as a liear fuctio of oe or more idepedet variables, based o covariace Regressio
More informationY i n. i=1. = 1 [number of successes] number of successes = n
Eco 371 Problem Set # Aswer Sheet 3. I this questio, you are asked to cosider a Beroulli radom variable Y, with a success probability P ry 1 p. You are told that you have draws from this distributio ad
More informationChapter 22: What is a Test of Significance?
Chapter 22: What is a Test of Sigificace? Thought Questio Assume that the statemet If it s Saturday, the it s the weeked is true. followig statemets will also be true? Which of the If it s the weeked,
More informationTopic 6 Sampling, hypothesis testing, and the central limit theorem
CSE 103: Probability ad statistics Fall 2010 Topic 6 Samplig, hypothesis testig, ad the cetral limit theorem 61 The biomial distributio Let X be the umberofheadswhe acoiofbiaspistossedtimes The distributio
More informationWHAT IS THE PROBABILITY FUNCTION FOR LARGE TSUNAMI WAVES? ABSTRACT
WHAT IS THE PROBABILITY FUNCTION FOR LARGE TSUNAMI WAVES? Harold G. Loomis Hoolulu, HI ABSTRACT Most coastal locatios have few if ay records of tsuami wave heights obtaied over various time periods. Still
More informationREVIEW OF SIMPLE LINEAR REGRESSION SIMPLE LINEAR REGRESSION
REVIEW OF SIMPLE LINEAR REGRESSION SIMPLE LINEAR REGRESSION I liear regreio, we coider the frequecy ditributio of oe variable (Y) at each of everal level of a ecod variable (X). Y i kow a the depedet variable.
More informationSTATISTICAL INFERENCE
STATISTICAL INFERENCE POPULATION AND SAMPLE Populatio = all elemets of iterest Characterized by a distributio F with some parameter θ Sample = the data X 1,..., X, selected subset of the populatio = sample
More informationCTL.SC0x Supply Chain Analytics
CTL.SC0x Supply Chai Aalytics Key Cocepts Documet V1.1 This documet cotais the Key Cocepts documets for week 6, lessos 1 ad 2 withi the SC0x course. These are meat to complemet, ot replace, the lesso videos
More informationConfidence Intervals QMET103
Cofidece Itervals QMET103 Library, Teachig ad Learig CONFIDENCE INTERVALS provide a iterval estimate of the ukow populatio parameter. What is a cofidece iterval? Statisticias have a habit of hedgig their
More information1 Constructing and Interpreting a Confidence Interval
Itroductory Applied Ecoometrics EEP/IAS 118 Sprig 2014 WARM UP: Match the terms i the table with the correct formula: Adrew CraeDroesch Sectio #6 5 March 2014 ˆ Let X be a radom variable with mea µ ad
More informationIntroduction to Econometrics (3 rd Updated Edition) Solutions to Odd Numbered End of Chapter Exercises: Chapter 4
Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd Numbered Ed of Chapter Exercises: Chapter 4 (This versio August 7, 204) 205 Pearso Educatio, Ic. Stock/Watso
More informationThe standard deviation of the mean
Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider
More informationSolutions to Odd Numbered End of Chapter Exercises: Chapter 4
Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd Numbered Ed of Chapter Exercises: Chapter 4 (This versio July 2, 24) Stock/Watso  Itroductio to Ecoometrics
More informationLINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity
LINEAR REGRESSION ANALYSIS MODULE IX Lecture  9 Multicolliearity Dr Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Multicolliearity diagostics A importat questio that
More informationTables and Formulas for Sullivan, Fundamentals of Statistics, 2e Pearson Education, Inc.
Table ad Formula for Sulliva, Fudametal of Statitic, e. 008 Pearo Educatio, Ic. CHAPTER Orgaizig ad Summarizig Data Relative frequecy frequecy um of all frequecie Cla midpoit: The um of coecutive lower
More informationRecall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and nonusers, x  y.
Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad ousers, x  y. Such studies are sometimes viewed
More informationThe variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.
SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample
More informationBasis for simulation techniques
Basis for simulatio techiques M. Veeraraghava, March 7, 004 Estimatio is based o a collectio of experimetal outcomes, x, x,, x, where each experimetal outcome is a value of a radom variable. x i. Defiitios
More informationUnit 9 Regression and Correlation
BIOSTATS 540  Fall 05 Regressio ad Correlatio Page of 44 Uit 9 Regressio ad Correlatio Assume that a statistical model such as a liear model is a good first start oly  Gerald va Belle Is higher blood
More informationIE 230 Probability & Statistics in Engineering I. Closed book and notes. No calculators. 120 minutes.
Closed book ad otes. No calculators. 120 miutes. Cover page, five pages of exam, ad tables for discrete ad cotiuous distributios. Score X i =1 X i / S X 2 i =1 (X i X ) 2 / ( 1) = [i =1 X i 2 X 2 ] / (
More informationProbability, Expectation Value and Uncertainty
Chapter 1 Probability, Expectatio Value ad Ucertaity We have see that the physically observable properties of a quatum system are represeted by Hermitea operators (also referred to as observables ) such
More informationStatisticians use the word population to refer the total number of (potential) observations under consideration
6 Samplig Distributios Statisticias use the word populatio to refer the total umber of (potetial) observatios uder cosideratio The populatio is just the set of all possible outcomes i our sample space
More informationConfidence Intervals for the Population Proportion p
Cofidece Itervals for the Populatio Proportio p The cocept of cofidece itervals for the populatio proportio p is the same as the oe for, the samplig distributio of the mea, x. The structure is idetical:
More informationKLMED8004 Medical statistics. Part I, autumn Estimation. We have previously learned: Population and sample. New questions
We have previously leared: KLMED8004 Medical statistics Part I, autum 00 How kow probability distributios (e.g. biomial distributio, ormal distributio) with kow populatio parameters (mea, variace) ca give
More informationQuestion 1: Exercise 8.2
Questio 1: Exercise 8. (a) Accordig to the regressio results i colum (1), the house price is expected to icrease by 1% ( 100% 0.0004 500 ) with a additioal 500 square feet ad other factors held costat.
More informationChapter 4 Tests of Hypothesis
Dr. Moa Elwakeel [ 5 TAT] Chapter 4 Tests of Hypothesis 4. statistical hypothesis more. A statistical hypothesis is a statemet cocerig oe populatio or 4.. The Null ad The Alterative Hypothesis: The structure
More informationNumber of fatalities X Sunday 4 Monday 6 Tuesday 2 Wednesday 0 Thursday 3 Friday 5 Saturday 8 Total 28. Day
LECTURE # 8 Mea Deviatio, Stadard Deviatio ad Variace & Coefficiet of variatio Mea Deviatio Stadard Deviatio ad Variace Coefficiet of variatio First, we will discuss it for the case of raw data, ad the
More informationCORRELATION AND REGRESSION
the Further Mathematics etwork www.fmetwork.org.uk V 7 1 1 REVISION SHEET STATISTICS 1 (Ed) CORRELATION AND REGRESSION The mai ideas are: Scatter Diagrams ad Lies of Best Fit Pearso s Product Momet Correlatio
More informationSTAT 203 Chapter 18 Sampling Distribution Models
STAT 203 Chapter 18 Samplig Distributio Models Populatio vs. sample, parameter vs. statistic Recall that a populatio cotais the etire collectio of idividuals that oe wats to study, ad a sample is a subset
More informationHomework for 4/9 Due 4/16
Name: ID: Homework for 4/9 Due 4/16 1. [ 136] It is covetioal wisdom i military squadros that pilots ted to father more girls tha boys. Syder 1961 gathered data for military fighter pilots. The sex of
More informationActivity 3: Length Measurements with the FourSided Meter Stick
Activity 3: Legth Measuremets with the FourSided Meter Stick OBJECTIVE: The purpose of this experimet is to study errors ad the propagatio of errors whe experimetal data derived usig a foursided meter
More informationChapter 6 Principles of Data Reduction
Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a
More informationESTIMATION AND PREDICTION BASED ON KRECORD VALUES FROM NORMAL DISTRIBUTION
STATISTICA, ao LXXIII,. 4, 013 ESTIMATION AND PREDICTION BASED ON KRECORD VALUES FROM NORMAL DISTRIBUTION Maoj Chacko Departmet of Statistics, Uiversity of Kerala, Trivadrum 695581, Kerala, Idia M. Shy
More informationPH 425 Quantum Measurement and Spin Winter SPINS Lab 1
PH 425 Quatum Measuremet ad Spi Witer 23 SPIS Lab Measure the spi projectio S z alog the zaxis This is the experimet that is ready to go whe you start the program, as show below Each atom is measured
More informationBHW #13 1/ Cooper. ENGR 323 Probabilistic Analysis Beautiful Homework # 13
BHW # /5 ENGR Probabilistic Aalysis Beautiful Homework # Three differet roads feed ito a particular freeway etrace. Suppose that durig a fixed time period, the umber of cars comig from each road oto the
More informationNCSS Statistical Software. Tolerance Intervals
Chapter 585 Itroductio This procedure calculates oe, ad two, sided tolerace itervals based o either a distributiofree (oparametric) method or a method based o a ormality assumptio (parametric). A twosided
More informationMeasures of Spread: Variance and Standard Deviation
Lesso 16 Measures of Spread: Variace ad Stadard Deviatio BIG IDEA Variace ad stadard deviatio deped o the mea of a set of umbers. Calculatig these measures of spread depeds o whether the set is a sample
More informationIntroducing Sample Proportions
Itroducig Sample Proportios Probability ad statistics Aswers & Notes TINspire Ivestigatio Studet 60 mi 7 8 9 0 Itroductio A 00 survey of attitudes to climate chage, coducted i Australia by the CSIRO,
More informationMA 575, Linear Models : Homework 3
MA 575, Liear Models : Homework 3 Questio 1 RSS( ˆβ 0, ˆβ 1 ) (ŷ i y i ) Problem.7 Questio.7.1 ( ˆβ 0 + ˆβ 1 x i y i ) (ȳ SXY SXY x + SXX SXX x i y i ) ((ȳ y i ) + SXY SXX (x i x)) (ȳ y i ) SXY SXX SY
More informationDS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10
DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set
More informationTopics Machine learning: lecture 2. Review: the learning problem. Hypotheses and estimation. Estimation criterion cont d. Estimation criterion
.87 Machie learig: lecture Tommi S. Jaakkola MIT CSAIL tommi@csail.mit.edu Topics The learig problem hypothesis class, estimatio algorithm loss ad estimatio criterio samplig, empirical ad epected losses
More informationUCLA STAT 110B Applied Statistics for Engineering and the Sciences
UCLA STAT 110B Applied Statistics for Egieerig ad the Scieces Istructor: Ivo Diov, Asst. Prof. I Statistics ad Neurology Teachig Assistats: Bria Ng, UCLA Statistics Uiversity of Califoria, Los Ageles,
More informationSuccessful HE applicants. Information sheet A Number of applicants. Gender Applicants Accepts Applicants Accepts. Age. Domicile
Successful HE applicats Sigificace tests use data from samples to test hypotheses. You will use data o successful applicatios for courses i higher educatio to aswer questios about proportios, for example,
More informationFirst Year Quantitative Comp Exam Spring, Part I  203A. f X (x) = 0 otherwise
First Year Quatitative Comp Exam Sprig, 2012 Istructio: There are three parts. Aswer every questio i every part. Questio I1 Part I  203A A radom variable X is distributed with the margial desity: >
More informationImportant Concepts not on the AP Statistics Formula Sheet
Part I: IQR = Q 3 Q 1 Test for a outlier: 1.5(IQR) above Q 3 or below Q 1 The calculator will ru the test for you as log as you choose the boplot with the oulier o it i STATPLOT Importat Cocepts ot o the
More informationECE 901 Lecture 12: Complexity Regularization and the Squared Loss
ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality
More informationStatistical Inference Procedures
Statitical Iferece Procedure Cofidece Iterval Hypothei Tet Statitical iferece produce awer to pecific quetio about the populatio of iteret baed o the iformatio i a ample. Iferece procedure mut iclude a
More informationG. R. Pasha Department of Statistics Bahauddin Zakariya University Multan, Pakistan
Deviatio of the Variaces of Classical Estimators ad Negative Iteger Momet Estimator from Miimum Variace Boud with Referece to Maxwell Distributio G. R. Pasha Departmet of Statistics Bahauddi Zakariya Uiversity
More informationThe Sampling Distribution of the Maximum. Likelihood Estimators for the Parameters of. BetaBinomial Distribution
Iteratioal Mathematical Forum, Vol. 8, 2013, o. 26, 12631277 HIKARI Ltd, www.mhikari.com http://d.doi.org/10.12988/imf.2013.3475 The Samplig Distributio of the Maimum Likelihood Estimators for the Parameters
More informationPaired Data and Linear Correlation
Paired Data ad Liear Correlatio Example. A group of calculus studets has take two quizzes. These are their scores: Studet st Quiz Score ( data) d Quiz Score ( data) 7 5 5 0 3 0 3 4 0 5 5 5 5 6 0 8 7 0
More informationSome Properties of the Exact and Score Methods for Binomial Proportion and Sample Size Calculation
Some Properties of the Exact ad Score Methods for Biomial Proportio ad Sample Size Calculatio K. KRISHNAMOORTHY AND JIE PENG Departmet of Mathematics, Uiversity of Louisiaa at Lafayette Lafayette, LA 705041010,
More informationOutput Analysis and RunLength Control
IEOR E4703: Mote Carlo Simulatio Columbia Uiversity c 2017 by Marti Haugh Output Aalysis ad RuLegth Cotrol I these otes we describe how the Cetral Limit Theorem ca be used to costruct approximate (1 α%
More informationCH19 Confidence Intervals for Proportions. Confidence intervals Construct confidence intervals for population proportions
CH19 Cofidece Itervals for Proportios Cofidece itervals Costruct cofidece itervals for populatio proportios Motivatio Motivatio We are iterested i the populatio proportio who support Mr. Obama. This sample
More informationReview Questions, Chapters 8, 9. f(y) = 0, elsewhere. F (y) = f Y(1) = n ( e y/θ) n 1 1 θ e y/θ = n θ e yn
Stat 366 Lab 2 Solutios (September 2, 2006) page TA: Yury Petracheko, CAB 484, yuryp@ualberta.ca, http://www.ualberta.ca/ yuryp/ Review Questios, Chapters 8, 9 8.5 Suppose that Y, Y 2,..., Y deote a radom
More informationy ij = µ + α i + ɛ ij,
STAT 4 ANOVA Cotrasts ad Multiple Comparisos /3/04 Plaed comparisos vs uplaed comparisos Cotrasts Cofidece Itervals Multiple Comparisos: HSD Remark Alterate form of Model I y ij = µ + α i + ɛ ij, a i
More informationBinomial Distribution
0.0 0.5 1.0 1.5 2.0 2.5 3.0 0 1 2 3 4 5 6 7 0.0 0.5 1.0 1.5 2.0 2.5 3.0 Overview Example: coi tossed three times Defiitio Formula Recall that a r.v. is discrete if there are either a fiite umber of possible
More informationChapter 11 Output Analysis for a Single Model. Banks, Carson, Nelson & Nicol DiscreteEvent System Simulation
Chapter Output Aalysis for a Sigle Model Baks, Carso, Nelso & Nicol DiscreteEvet System Simulatio Error Estimatio If {,, } are ot statistically idepedet, the S / is a biased estimator of the true variace.
More informationDiscrete probability distributions
Discrete probability distributios I the chapter o probability we used the classical method to calculate the probability of various values of a radom variable. I some cases, however, we may be able to develop
More informationREGRESSION (Physics 1210 Notes, Partial Modified Appendix A)
REGRESSION (Physics 0 Notes, Partial Modified Appedix A) HOW TO PERFORM A LINEAR REGRESSION Cosider the followig data poits ad their graph (Table I ad Figure ): X Y 0 3 5 3 7 4 9 5 Table : Example Data
More informationA LARGER SAMPLE SIZE IS NOT ALWAYS BETTER!!!
A LARGER SAMLE SIZE IS NOT ALWAYS BETTER!!! Nagaraj K. Neerchal Departmet of Mathematics ad Statistics Uiversity of Marylad Baltimore Couty, Baltimore, MD 2250 Herbert Lacayo ad Barry D. Nussbaum Uited
More informationSection 11.8: Power Series
Sectio 11.8: Power Series 1. Power Series I this sectio, we cosider geeralizig the cocept of a series. Recall that a series is a ifiite sum of umbers a. We ca talk about whether or ot it coverges ad i
More informationElement sampling: Part 2
Chapter 4 Elemet samplig: Part 2 4.1 Itroductio We ow cosider uequal probability samplig desigs which is very popular i practice. I the uequal probability samplig, we ca improve the efficiecy of the resultig
More informationParameter, Statistic and Random Samples
Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,
More informationLecture 4. Random variable and distribution of probability
Itroductio to theory of probability ad statistics Lecture. Radom variable ad distributio of probability dr hab.iż. Katarzya Zarzewsa, prof.agh Katedra Eletroii, AGH email: za@agh.edu.pl http://home.agh.edu.pl/~za
More informationConfidence Level We want to estimate the true mean of a random variable X economically and with confidence.
Cofidece Iterval 700 Samples Sample Mea 03 Cofidece Level 095 Margi of Error 0037 We wat to estimate the true mea of a radom variable X ecoomically ad with cofidece True Mea μ from the Etire Populatio
More informationMedian and IQR The median is the value which divides the ordered data values in half.
STA 666 Fall 2007 Webbased Course Notes 4: Describig Distributios Numerically Numerical summaries for quatitative variables media ad iterquartile rage (IQR) 5umber summary mea ad stadard deviatio Media
More information4.1 Sigma Notation and Riemann Sums
0 the itegral. Sigma Notatio ad Riema Sums Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each simple shape, ad the add these smaller areas
More informationAsymptotic Results for the Linear Regression Model
Asymptotic Results for the Liear Regressio Model C. Fli November 29, 2000 1. Asymptotic Results uder Classical Assumptios The followig results apply to the liear regressio model y = Xβ + ε, where X is
More informationChapter VII Measures of Correlation
Chapter VII Measures of Correlatio A researcher may be iterested i fidig out whether two variables are sigificatly related or ot. For istace, he may be iterested i kowig whether metal ability is sigificatly
More informationThe Sample Variance Formula: A Detailed Study of an Old Controversy
The Sample Variace Formula: A Detailed Study of a Old Cotroversy Ky M. Vu PhD. AuLac Techologies Ic. c 00 Email: kymvu@aulactechologies.com Abstract The two biased ad ubiased formulae for the sample variace
More informationMAT1026 Calculus II Basic Convergence Tests for Series
MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real
More informationTesting Statistical Hypotheses for Compare. Means with Vague Data
Iteratioal Mathematical Forum 5 o. 3 656 Testig Statistical Hypotheses for Compare Meas with Vague Data E. Baloui Jamkhaeh ad A. adi Ghara Departmet of Statistics Islamic Azad iversity Ghaemshahr Brach
More information