1 Inferential Methods for Correlation and Regression Analysis
|
|
- Gladys Chase
- 6 years ago
- Views:
Transcription
1 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet ad the sample Regressio Lie were obtaied for describig ad measurig the quality ad stregth of the liear relatioship betwee two cotiuous variables. I this sectio a model for the uderlyig populatio will be developed. The this model will be used to obtai iferetial methods for this kid of data. 1.1 The Simple Liear Regressio Model Defiitio: The simple liear regressio model assumes that there is a lie with itercept α ad slope β, called the true populatio regressio lie, that describes the relatioship betwee the variable x ad y. Whe a value of the idepedet variable x is fixed ad a observatio o the depedet variable y ca be writte as, y = α + βx + e e is the error, ad we assume e N (0, σ). Basic Assumptios of the Simple Liear Regressio Model 1. The distributio of the radom deviatio e has populatio mea µ e =0. 2. The stadard deviatio of e is the same for ay particular value of x. It is deoted by σ. 3. The distributio of e is ormal. The model implies for the mea of the respose variable µ y = α + βx Radomess i e implies that the outcome of y for a give x is also radom, so that y is a radom variable. To summarize: That for every x, y is a ormal distributed radom variable with mea α + βx ad stadard deviatio σ. Iterpretatio: The slope β of the populatio regressio lie is the average chage i y associated with a 1-uit icrease i x. The vertical itercept α is the mea of y whe x = 0. The value of σ describes the extet to which (x, y) observatios deviate vertically from the regressio lie. Diagram (compare Textbook pg.598) Diagram show small ad large σ. 1
2 1.2 Estimatig the Parameter of the Model The parameter of the model are α, β, ad σ. They ca be estimated (poit estimator ad cofidece iterval) by usig sample data (x 1, y 1 ), (x 2, y 2 ),..., (x, y ). Assume that the relatioship of x ad y ca be described by a Simple Liear Regressio Model ad that the observatios were draw idepedetly. Uder these assumptios poit estimates of α ad β are the itercept ad the slope of the least squares lie, respectively. That is poit estimate of β = b = S xy S xx poit estimate of α = a = ȳ b x where as before S xy = xy ( x)( y) ad S xx = x 2 ( x) 2. The estimated regressio lie is idetical to the least squares lie: ŷ = a + bx I additio to estimatig α ad β there is also a iterest i estimatig σ the spread of the radom deviatio of the regressio lie. This estimate will be based o the Sum of Squares for Error: SSE = (y ŷ) 2 where ŷ 1 = a + bx 1, ŷ 2 = a + bx 2,..., ŷ = a + bx are the fitted or predicted y values ad the residuals are y 1 ŷ 1, y 2 ŷ 2,..., y ŷ The residuals give the vertical distace of the observatios to the regressio lie. This makes SSE a measure of the extet to which the sample data spreads out about the estimated regressio lie. Estimatio of σ The statistic for estimatig the variace σ 2 is where The estimate of the stadard deviatio σ is s 2 = SSE 2 SSE = (y ŷ) 2 = y 2 a y b xy s = s 2 The Coefficiet of Determiatio ca ow be rewritte as r 2 = 1 SSE SS yy 2
3 where SS yy is SS yy = (y ȳ) 2 = y 2 ( y) 2 = S yy This equatio ow shows the iterpretatio of r 2 as the proportio of y variatio that ca be explaied by the model relatioship. σ measures the usual vertical deviatio of a poit (x, y) i the populatio from the populatio regressio lie. Similarly, s e measures the typical deviatio of a sample poit (x, y) of the sample regressio lie. 1.3 Iferece Cocerig the Slope of the Populatio Regressio Lie The slope B is the average chage i y, whe x is icreased by 1 uit. This iterpretatio makes B a iterestig parameter. I additio we fid, that if β = 0 the Populatio Regressio Lie would be parallel to the x-axis, thus a chage i x would t have a impact o y, which would mea that x ad y are idepedet. These two properties lead us to ask for a cofidece iterval for B, i order to be able to estimate B properly ad for a statistical test cocerig B. Before we ca obtai these methods we have to study the samplig distributio of the statistic to be used for estimatig B, which is b. Properties of the Samplig Distributio of b Whe the basic assumptios of the simple liear regressio model are met, the followig claims hold: 1. The populatio mea of b is β: µ b = β. Thus, b is a ubiased estimate of β. 2. The populatio stadard deviatio of b is σ b = σ Sxx 3. The statistic b has a ormal distributio (which is a cosequece of e beig ormally distributed). If σ b is large the estimate b does ot have to be close to the true β. I order for σ b to be small σ should be small, that meas little variatio about the populatio regressio lie, ad (or) S xx = (x x) 2 should be large, that meas far spread out x values are i favor for a smaller variatio i the estimate b for B. Stadardizig b Sice σ b depeds o the ukow σ it ca ot be used for a stadardizig b, i order to develop a statistic that ca be used for developig a cofidece iterval or test. 3
4 Istead use SE b = as estimate of σ b the stadard deviatio of b. This leads to: t = b β SE b which is t distributed with df = 2 (for a Simple Liear Regressio Model). This t-statistic does ot deped o ay ukow parameters beside β, so it ca be used for developig a cofidece iterval ad a test cocerig B. Cofidece Iterval for β A (1 α)% Cofidece Iterval for the slope β from a simple liear regressio model is give by: b ± (t critical value) 2 1 α SE b Where the t critical value is based o df = 2 ad C = 1 α (Table C). Example: Is cardiovascular fitess related to a athlete s performace i a 20km ski race? s S xx x= time to exhaustio ruig o a treadmill (i miutes) y= 20km ski time(i miutes) x y A scatterplot of this data shows a liear decrease i ski time by icreasig treadmill time. Straightforward calculatio gives: = 11 x = y = x 2 = xy = y 2 = 48,
5 From these calculate ad so that: S xy = xy ( x)( y) S xx = x 2 ( x) 2 = (106.3)(728.70) 11 = (106.3)2 11 = = b = S xy = = ad a = ȳ b x = = S xx The poit estimate for the decrease i ski time for every miute icrease i treadmill time is 2.33 miutes. Further more we get SSE = y 2 a y b xy = SS yy = y 2 ( y) 2 = r 2 = 1 SSE = SS yy s 2 = SSE 2 = s = s 2 = SE b = s Sxx = r 2 = is ot very high, but shows that about 63% of the variatio i ski-time ca be explaied by the Liear Regressio Model. The estimate s = 2.19 shows that there is quite some variatio i the ski-time for ay give treadmill time. The calculatio of a 95% cofidece iterval for β requires the t critical value with -2=9 degrees of freedom ad 1 α = 0.95: The resultig cofidece iterval is the b ± t critical value SE b = ± (2.26) (0.591) = ± = ( 3.671; 0.999) Based o the sample, we are 95% cofidet that the true average decrease i ski time associated with a oe miute icrease i treadmill time is betwee 1 ad 3.7 miutes. Hypothesis Test Cocerig β Hypotheses: Test type Upper tail H 0 : β 0 H a : β > 0 Lower tail H 0 : β 0 H a : β < 0 Two tail H 0 : β = 0 H a : β 0 5
6 Assumptios: Radom sample ad the Simple Liear Regressio Model applies to x ad y. Test statistic: P-value: t 0 = b SE b with df = 2 Test type P-value Upper tail P (t > t 0 ) Lower tail P (t < t 0 ) Two tail 2 P (t > abs(t 0 )) The two tail test of H 0 : β = 0 versus H a : β 0 is a test to assure that there is i fact a liear coectio betwee x ad y, it is also called Model Utility Test. Cotiue Example: Let s do the Model Utility Test for the example above at a sigificace level of α = Hypotheses: ad α = Do Model Utility Test. H 0 : β = 0 versus H a : β 0 2. Assumptios: Radom sample. Assume the relatioship of x ad y ca be described by a Simple Liear Regressio Model: 3. Test statistic: t 0 = b = = with df = 9 SE b P-value: This is a two tail test, thus p-value= 2 P (t > abs(t 0 )) = = Decisio: Sice the p-value α the ull hypothesis is rejected. 6. Coclusio: At sigificace level of 5% we coclude that β 0 ad there is a liear relatioship betwee ski time ad treadmill time. The Test Procedure: 1. Hypotheses: State the hypotheses ad the sigificace level, ad choose the appropriate test. 2. Assumptios: Check if the assumptios of that test are met. 3. Test statistic: Calculate the test statistic icludig the degrees of freedom if ecessary. 4. P-value: Obtai the p-value, accordig to test type. 5. Decisio: Decide if the ull hypothesis ca be rejected, or ot. 6. Coclusio: Aswer the questio cocerig the problem stated for the situatio. 6
7 1.4 Checkig Model Adequacy Before relyig o the results based o a simple Liear Regressio Model, the model assumptios should be checked. These assumptios are mostly related to the distributio of the error e i the model equatio For the radom error e, we have to assume 1. For all x, e is ormally distributed, ad y = α + βx + e 2. The mea ad the stadard deviatio of e do ot deped o the value of x. If these assumptio are seriously violated the results based o the Simple Liear Regressio Model are ot reliable, ad may ot be true. This fact makes it ecessary to check for these assumptios. Residual Aalysis If the residuals e 1 to e from the populatio regressio lie α + βx would be available, we could use them to check for those assumptios. I ay applied situatio these values are ot kow, because the populatio regressio lie is ot kow. We have to rely o their estimates, the sample residuals (residuals from the least squares regressio lie, the estimated lie) y 1 ŷ 1 = y 1 (a + bx 1 ) y 2 ŷ 2 = y 2 (a + bx 2 )... y ŷ = y (a + bx ) Ay observatio that leads to very large residual should be examied for mistakes. How to determie if a residual is large? By ow we kow this is best be doe with by stadardizig the values ad tha judge them. How do we stadardize the residuals? We stadardize by subtractig the mea ad dividig by the stadard deviatio. But we kow that the populatio mea of the residuals is zero (µ e = 0). Ad it ca be determied that the stadard deviatio of the i th residual y i ŷ i is equal to ad the stadardized residual is equal to s 1 1 (x i x) 2 S xx. y i ŷ i s 1 1 (x i x) 2 S xx 7
8 Example 1 (Treadmill time ad ski time) For this data set the stadardized residuals are Treadmill Ski st. Res Is ay of these values ureasoably large? First do a histogram for the residuals to check if it is reasoable to assume they come from a ormal populatio. This histogram shows a uimodal ad symmetric distributio, without ay outliers. assumptio of ormality seems to be reasoable. The Plottig stadardized residuals versus tread mill time gives a visual presetatio of this data ad idicates if there is cocer that the stadard deviatio might vary for differet x. 8
9 This graph show that the stadardized residuals are evely spread above ad below the 0. The spread does ot chage with x. This plot does ot give ay idicatio that the Simple Liear Regressio Model assumptios are violated Applyig the Model Estimatio ad Predictio Oce we established that the model we foud describes the relatioship properly, we ca use the model for estimatio: estimate mea values of y, E(y) for give values of x. For example we might be iterested i the mea time a perso eed to fiish the ski race whe doig 10 miutes o the treadmill. predictio: ew idividual value based o the kowledge of x, For example, we wat to give the rage of ski times for people who ca do 10 miutes o the treadmill. The key to both applicatios of the model is the least squares lie, ŷ = a+bx, we foud before, which gives us poit estimators for both applicatios The ext step is to fid cofidece itervals for E(y) for a give x, ad for a idividual value y, for a give x. I order to do this we will have to discuss the distributio of ŷ for the two applicatios. Samplig distributio of ŷ = a + bx 1. The stadard deviatio of the distributio of the estimator ŷ of the mea value E(y) at a specific value of x, say x p 1 σŷ = σ + (x p x) 2 SS xx This is the stadard error of ŷ. 2. The stadard deviatio of the distributio of the predictio error, y ŷ, for the predictor ŷ of a idividual ew y at a specific value of x, say x p σ (y ŷ) = σ This is the stadard error of predictio (x p x) 2 SS xx
10 A (1 α)100% CI for E(y) at x = x p ŷ ± t s 1 + (x p x) 2 SS xx A (1 α)100% Predictio Iterval for y at x = x p ŷ ± t s (x p x) 2 SS xx Commet: Do a cofidece iterval, if iterested i the mea of y for a certai value of x, but if you wat to get a rage where a future value of y might fall for a certai value of x, the do a predictio iterval. Cotiue example: Accordig to the liear regressio model predict the middle 95% of ski times for people who do x p = 10 miutes o the treadmill? Fid a predictio iterval. For x p = 10 fid ŷ = (10) = With df = 9 t = ± 2.262(2.188) ( /11)2 +, gives ± which is [60.27, 70.65]. We expect that 95% of people who ca do 10 miutes o the treadmill to fiish the ski race i a time betwee ad miutes. Estimate the mea ski time for people who do x = 10 miutes o the treadmill at cofidece level of 95% ± 2.262(2.188) 1 11 ( /11)2 +, gives ± which is [63.90, 67.02]. We are 95% cofidet that the mea time to fiish the ski race for people who ca do 10 miutes o the treadmill falls betwee 63.9 ad 67 miutes. 10
Chapters 5 and 13: REGRESSION AND CORRELATION. Univariate data: x, Bivariate data (x,y).
Chapters 5 ad 13: REGREION AND CORRELATION (ectios 5.5 ad 13.5 are omitted) Uivariate data: x, Bivariate data (x,y). Example: x: umber of years studets studied paish y: score o a proficiecy test For each
More informationLinear Regression Models
Liear Regressio Models Dr. Joh Mellor-Crummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect
More information11 Correlation and Regression
11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record
More informationTABLES AND FORMULAS FOR MOORE Basic Practice of Statistics
TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +
More informationProperties and Hypothesis Testing
Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.
More informationRegression, Inference, and Model Building
Regressio, Iferece, ad Model Buildig Scatter Plots ad Correlatio Correlatio coefficiet, r -1 r 1 If r is positive, the the scatter plot has a positive slope ad variables are said to have a positive relatioship
More informationResponse Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable
Statistics Chapter 4 Correlatio ad Regressio If we have two (or more) variables we are usually iterested i the relatioship betwee the variables. Associatio betwee Variables Two variables are associated
More informationCorrelation. Two variables: Which test? Relationship Between Two Numerical Variables. Two variables: Which test? Contingency table Grouped bar graph
Correlatio Y Two variables: Which test? X Explaatory variable Respose variable Categorical Numerical Categorical Cotigecy table Cotigecy Logistic Grouped bar graph aalysis regressio Mosaic plot Numerical
More informationStat 139 Homework 7 Solutions, Fall 2015
Stat 139 Homework 7 Solutios, Fall 2015 Problem 1. I class we leared that the classical simple liear regressio model assumes the followig distributio of resposes: Y i = β 0 + β 1 X i + ɛ i, i = 1,...,,
More informationS Y Y = ΣY 2 n. Using the above expressions, the correlation coefficient is. r = SXX S Y Y
1 Sociology 405/805 Revised February 4, 004 Summary of Formulae for Bivariate Regressio ad Correlatio Let X be a idepedet variable ad Y a depedet variable, with observatios for each of the values of these
More informationCorrelation Regression
Correlatio Regressio While correlatio methods measure the stregth of a liear relatioship betwee two variables, we might wish to go a little further: How much does oe variable chage for a give chage i aother
More informationSimple Linear Regression
Simple Liear Regressio 1. Model ad Parameter Estimatio (a) Suppose our data cosist of a collectio of pairs (x i, y i ), where x i is a observed value of variable X ad y i is the correspodig observatio
More informationChapter 23: Inferences About Means
Chapter 23: Ifereces About Meas Eough Proportios! We ve spet the last two uits workig with proportios (or qualitative variables, at least) ow it s time to tur our attetios to quatitative variables. For
More informationST 305: Exam 3 ( ) = P(A)P(B A) ( ) = P(A) + P(B) ( ) = 1 P( A) ( ) = P(A) P(B) σ X 2 = σ a+bx. σ ˆp. σ X +Y. σ X Y. σ X. σ Y. σ n.
ST 305: Exam 3 By hadig i this completed exam, I state that I have either give or received assistace from aother perso durig the exam period. I have used o resources other tha the exam itself ad the basic
More informationSection 9.2. Tests About a Population Proportion 12/17/2014. Carrying Out a Significance Test H A N T. Parameters & Hypothesis
Sectio 9.2 Tests About a Populatio Proportio P H A N T O M S Parameters Hypothesis Assess Coditios Name the Test Test Statistic (Calculate) Obtai P value Make a decisio State coclusio Sectio 9.2 Tests
More informationOverview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions
Chapter 9 Slide Ifereces from Two Samples 9- Overview 9- Ifereces about Two Proportios 9- Ifereces about Two Meas: Idepedet Samples 9-4 Ifereces about Matched Pairs 9-5 Comparig Variatio i Two Samples
More informationBIOS 4110: Introduction to Biostatistics. Breheny. Lab #9
BIOS 4110: Itroductio to Biostatistics Brehey Lab #9 The Cetral Limit Theorem is very importat i the realm of statistics, ad today's lab will explore the applicatio of it i both categorical ad cotiuous
More informationSimple Regression. Acknowledgement. These slides are based on presentations created and copyrighted by Prof. Daniel Menasce (GMU) CS 700
Simple Regressio CS 7 Ackowledgemet These slides are based o presetatios created ad copyrighted by Prof. Daiel Measce (GMU) Basics Purpose of regressio aalysis: predict the value of a depedet or respose
More informationFinal Examination Solutions 17/6/2010
The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 009-00 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationInferential Statistics. Inference Process. Inferential Statistics and Probability a Holistic Approach. Inference Process.
Iferetial Statistics ad Probability a Holistic Approach Iferece Process Chapter 8 Poit Estimatio ad Cofidece Itervals This Course Material by Maurice Geraghty is licesed uder a Creative Commos Attributio-ShareAlike
More informationDescribing the Relation between Two Variables
Copyright 010 Pearso Educatio, Ic. Tables ad Formulas for Sulliva, Statistics: Iformed Decisios Usig Data 010 Pearso Educatio, Ic Chapter Orgaizig ad Summarizig Data Relative frequecy = frequecy sum of
More informationSTA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:
STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio
More informationMOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.
XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced
More informationChapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo
More informationChapter 8: Estimating with Confidence
Chapter 8: Estimatig with Cofidece Sectio 8.2 The Practice of Statistics, 4 th editio For AP* STARNES, YATES, MOORE Chapter 8 Estimatig with Cofidece 8.1 Cofidece Itervals: The Basics 8.2 8.3 Estimatig
More informationStatistics 20: Final Exam Solutions Summer Session 2007
1. 20 poits Testig for Diabetes. Statistics 20: Fial Exam Solutios Summer Sessio 2007 (a) 3 poits Give estimates for the sesitivity of Test I ad of Test II. Solutio: 156 patiets out of total 223 patiets
More informationChapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more
More information7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals
7-1 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7- Sectio 1. Samplig Distributio 7-3 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses
More informationFinal Review. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech
Fial Review Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milto Stewart School of Idustrial Systems & Egieerig Georgia Tech 1 Radom samplig model radom samples populatio radom samples: x 1,..., x
More informationSIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS
SIMPLE LINEAR REGRESSION AND CORRELATION ANALSIS INTRODUCTION There are lot of statistical ivestigatio to kow whether there is a relatioship amog variables Two aalyses: (1) regressio aalysis; () correlatio
More informationSTP 226 ELEMENTARY STATISTICS
TP 6 TP 6 ELEMENTARY TATITIC CHAPTER 4 DECRIPTIVE MEAURE IN REGREION AND CORRELATION Liear Regressio ad correlatio allows us to examie the relatioship betwee two or more quatitative variables. 4.1 Liear
More informationLecture 11 Simple Linear Regression
Lecture 11 Simple Liear Regressio Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milto Stewart School of Idustrial Systems & Egieerig Georgia Tech Midterm 2 mea: 91.2 media: 93.75 std: 6.5 2 Meddicorp
More informationAssessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions
Assessmet ad Modelig of Forests FR 48 Sprig Assigmet Solutios. The first part of the questio asked that you calculate the average, stadard deviatio, coefficiet of variatio, ad 9% cofidece iterval of the
More informationFACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures
FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals
More informationUNIVERSITY OF TORONTO Faculty of Arts and Science APRIL/MAY 2009 EXAMINATIONS ECO220Y1Y PART 1 OF 2 SOLUTIONS
PART of UNIVERSITY OF TORONTO Faculty of Arts ad Sciece APRIL/MAY 009 EAMINATIONS ECO0YY PART OF () The sample media is greater tha the sample mea whe there is. (B) () A radom variable is ormally distributed
More informationMathematical Notation Math Introduction to Applied Statistics
Mathematical Notatio Math 113 - Itroductio to Applied Statistics Name : Use Word or WordPerfect to recreate the followig documets. Each article is worth 10 poits ad ca be prited ad give to the istructor
More informationTABLES AND FORMULAS FOR MOORE Basic Practice of Statistics
TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +
More informationCommon Large/Small Sample Tests 1/55
Commo Large/Small Sample Tests 1/55 Test of Hypothesis for the Mea (σ Kow) Covert sample result ( x) to a z value Hypothesis Tests for µ Cosider the test H :μ = μ H 1 :μ > μ σ Kow (Assume the populatio
More informationContinuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised
Questio 1. (Topics 1-3) A populatio cosists of all the members of a group about which you wat to draw a coclusio (Greek letters (μ, σ, Ν) are used) A sample is the portio of the populatio selected for
More information3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N.
3/3/04 CDS M Phil Old Least Squares (OLS) Vijayamohaa Pillai N CDS M Phil Vijayamoha CDS M Phil Vijayamoha Types of Relatioships Oly oe idepedet variable, Relatioship betwee ad is Liear relatioships Curviliear
More informationThis is an introductory course in Analysis of Variance and Design of Experiments.
1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class
More informationII. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation
II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso product-momet correlatio
More informationChapter 1 (Definitions)
FINAL EXAM REVIEW Chapter 1 (Defiitios) Qualitative: Nomial: Ordial: Quatitative: Ordial: Iterval: Ratio: Observatioal Study: Desiged Experimet: Samplig: Cluster: Stratified: Systematic: Coveiece: Simple
More informationA quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population
A quick activity - Cetral Limit Theorem ad Proportios Lecture 21: Testig Proportios Statistics 10 Coli Rudel Flip a coi 30 times this is goig to get loud! Record the umber of heads you obtaied ad calculate
More informationComparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading
Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual
More informationStatistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.
Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized
More informationInterval Estimation (Confidence Interval = C.I.): An interval estimate of some population parameter is an interval of the form (, ),
Cofidece Iterval Estimatio Problems Suppose we have a populatio with some ukow parameter(s). Example: Normal(,) ad are parameters. We eed to draw coclusios (make ifereces) about the ukow parameters. We
More informationStatistics 511 Additional Materials
Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability
More informationOpen book and notes. 120 minutes. Cover page and six pages of exam. No calculators.
IE 330 Seat # Ope book ad otes 120 miutes Cover page ad six pages of exam No calculators Score Fial Exam (example) Schmeiser Ope book ad otes No calculator 120 miutes 1 True or false (for each, 2 poits
More informationStatistics Lecture 27. Final review. Administrative Notes. Outline. Experiments. Sampling and Surveys. Administrative Notes
Admiistrative Notes s - Lecture 7 Fial review Fial Exam is Tuesday, May 0th (3-5pm Covers Chapters -8 ad 0 i textbook Brig ID cards to fial! Allowed: Calculators, double-sided 8.5 x cheat sheet Exam Rooms:
More informationStat 200 -Testing Summary Page 1
Stat 00 -Testig Summary Page 1 Mathematicias are like Frechme; whatever you say to them, they traslate it ito their ow laguage ad forthwith it is somethig etirely differet Goethe 1 Large Sample Cofidece
More informationDiscrete Mathematics for CS Spring 2008 David Wagner Note 22
CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig
More informationWorksheet 23 ( ) Introduction to Simple Linear Regression (continued)
Worksheet 3 ( 11.5-11.8) Itroductio to Simple Liear Regressio (cotiued) This worksheet is a cotiuatio of Discussio Sheet 3; please complete that discussio sheet first if you have ot already doe so. This
More informationFrequentist Inference
Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for
More informationEstimation for Complete Data
Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of
More information(all terms are scalars).the minimization is clearer in sum notation:
7 Multiple liear regressio: with predictors) Depedet data set: y i i = 1, oe predictad, predictors x i,k i = 1,, k = 1, ' The forecast equatio is ŷ i = b + Use matrix otatio: k =1 b k x ik Y = y 1 y 1
More informationChapter 13: Tests of Hypothesis Section 13.1 Introduction
Chapter 13: Tests of Hypothesis Sectio 13.1 Itroductio RECAP: Chapter 1 discussed the Likelihood Ratio Method as a geeral approach to fid good test procedures. Testig for the Normal Mea Example, discussed
More informationChapter 6 Sampling Distributions
Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to
More information- E < p. ˆ p q ˆ E = q ˆ = 1 - p ˆ = sample proportion of x failures in a sample size of n. where. x n sample proportion. population proportion
1 Chapter 7 ad 8 Review for Exam Chapter 7 Estimates ad Sample Sizes 2 Defiitio Cofidece Iterval (or Iterval Estimate) a rage (or a iterval) of values used to estimate the true value of the populatio parameter
More informationSample Size Determination (Two or More Samples)
Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie
More informationClass 23. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 23 Daiel B. Rowe, Ph.D. Departmet of Mathematics, Statistics, ad Computer Sciece Copyright 2017 by D.B. Rowe 1 Ageda: Recap Chapter 9.1 Lecture Chapter 9.2 Review Exam 6 Problem Solvig Sessio. 2
More informationComputing Confidence Intervals for Sample Data
Computig Cofidece Itervals for Sample Data Topics Use of Statistics Sources of errors Accuracy, precisio, resolutio A mathematical model of errors Cofidece itervals For meas For variaces For proportios
More informationStat 421-SP2012 Interval Estimation Section
Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationAP Statistics Review Ch. 8
AP Statistics Review Ch. 8 Name 1. Each figure below displays the samplig distributio of a statistic used to estimate a parameter. The true value of the populatio parameter is marked o each samplig distributio.
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationImportant Formulas. Expectation: E (X) = Σ [X P(X)] = n p q σ = n p q. P(X) = n! X1! X 2! X 3! X k! p X. Chapter 6 The Normal Distribution.
Importat Formulas Chapter 3 Data Descriptio Mea for idividual data: X = _ ΣX Mea for grouped data: X= _ Σf X m Stadard deviatio for a sample: _ s = Σ(X _ X ) or s = 1 (Σ X ) (Σ X ) ( 1) Stadard deviatio
More informationResampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.
Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator
More informationHypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance
Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?
More informationSTAT 350 Handout 19 Sampling Distribution, Central Limit Theorem (6.6)
STAT 350 Hadout 9 Samplig Distributio, Cetral Limit Theorem (6.6) A radom sample is a sequece of radom variables X, X 2,, X that are idepedet ad idetically distributed. o This property is ofte abbreviated
More information1 Models for Matched Pairs
1 Models for Matched Pairs Matched pairs occur whe we aalyse samples such that for each measuremet i oe of the samples there is a measuremet i the other sample that directly relates to the measuremet i
More informationCHAPTER 8 FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS. 8.1 Random Sampling. 8.2 Some Important Statistics
CHAPTER 8 FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS 8.1 Radom Samplig The basic idea of the statistical iferece is that we are allowed to draw ifereces or coclusios about a populatio based
More informationExpectation and Variance of a random variable
Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio
More informationSampling Distributions, Z-Tests, Power
Samplig Distributios, Z-Tests, Power We draw ifereces about populatio parameters from sample statistics Sample proportio approximates populatio proportio Sample mea approximates populatio mea Sample variace
More informationRandom Variables, Sampling and Estimation
Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig
More informationFirst, note that the LS residuals are orthogonal to the regressors. X Xb X y = 0 ( normal equations ; (k 1) ) So,
0 2. OLS Part II The OLS residuals are orthogoal to the regressors. If the model icludes a itercept, the orthogoality of the residuals ad regressors gives rise to three results, which have limited practical
More informationOctober 25, 2018 BIM 105 Probability and Statistics for Biomedical Engineers 1
October 25, 2018 BIM 105 Probability ad Statistics for Biomedical Egieers 1 Populatio parameters ad Sample Statistics October 25, 2018 BIM 105 Probability ad Statistics for Biomedical Egieers 2 Ifereces
More informationDr. Maddah ENMG 617 EM Statistics 11/26/12. Multiple Regression (2) (Chapter 15, Hines)
Dr Maddah NMG 617 M Statistics 11/6/1 Multiple egressio () (Chapter 15, Hies) Test for sigificace of regressio This is a test to determie whether there is a liear relatioship betwee the depedet variable
More informationMath 140 Introductory Statistics
8.2 Testig a Proportio Math 1 Itroductory Statistics Professor B. Abrego Lecture 15 Sectios 8.2 People ofte make decisios with data by comparig the results from a sample to some predetermied stadard. These
More informationLesson 11: Simple Linear Regression
Lesso 11: Simple Liear Regressio Ka-fu WONG December 2, 2004 I previous lessos, we have covered maily about the estimatio of populatio mea (or expected value) ad its iferece. Sometimes we are iterested
More informationAgreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times
Sigificace level vs. cofidece level Agreemet of CI ad HT Lecture 13 - Tests of Proportios Sta102 / BME102 Coli Rudel October 15, 2014 Cofidece itervals ad hypothesis tests (almost) always agree, as log
More informationMidtermII Review. Sta Fall Office Hours Wednesday 12:30-2:30pm Watch linear regression videos before lab on Thursday
Aoucemets MidtermII Review Sta 101 - Fall 2016 Duke Uiversity, Departmet of Statistical Sciece Office Hours Wedesday 12:30-2:30pm Watch liear regressio videos before lab o Thursday Dr. Abrahamse Slides
More informationSimple Linear Regression
Chapter 2 Simple Liear Regressio 2.1 Simple liear model The simple liear regressio model shows how oe kow depedet variable is determied by a sigle explaatory variable (regressor). Is is writte as: Y i
More informationRegression. Correlation vs. regression. The parameters of linear regression. Regression assumes... Random sample. Y = α + β X.
Regressio Correlatio vs. regressio Predicts Y from X Liear regressio assumes that the relatioship betwee X ad Y ca be described by a lie Regressio assumes... Radom sample Y is ormally distributed with
More informationThis chapter focuses on two experimental designs that are crucial to comparative studies: (1) independent samples and (2) matched pair samples.
Chapter 9 & : Comparig Two Treatmets: This chapter focuses o two eperimetal desigs that are crucial to comparative studies: () idepedet samples ad () matched pair samples Idepedet Radom amples from Two
More informationPSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9
Hypothesis testig PSYCHOLOGICAL RESEARCH (PYC 34-C Lecture 9 Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I
More informationChapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p
Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE Part 3: Summary of CI for µ Cofidece Iterval for a Populatio Proportio p Sectio 8-4 Summary for creatig a 100(1-α)% CI for µ: Whe σ 2 is kow ad paret
More informationCorrelation and Regression
Correlatio ad Regressio Lecturer, Departmet of Agroomy Sher-e-Bagla Agricultural Uiversity Correlatio Whe there is a relatioship betwee quatitative measures betwee two sets of pheomea, the appropriate
More informationSection 14. Simple linear regression.
Sectio 14 Simple liear regressio. Let us look at the cigarette dataset from [1] (available to dowload from joural s website) ad []. The cigarette dataset cotais measuremets of tar, icotie, weight ad carbo
More information2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2
Chapter 8 Comparig Two Treatmets Iferece about Two Populatio Meas We wat to compare the meas of two populatios to see whether they differ. There are two situatios to cosider, as show i the followig examples:
More information(X i X)(Y i Y ) = 1 n
L I N E A R R E G R E S S I O N 10 I Chapter 6 we discussed the cocepts of covariace ad correlatio two ways of measurig the extet to which two radom variables, X ad Y were related to each other. I may
More informationTests of Hypotheses Based on a Single Sample (Devore Chapter Eight)
Tests of Hypotheses Based o a Sigle Sample Devore Chapter Eight MATH-252-01: Probability ad Statistics II Sprig 2018 Cotets 1 Hypothesis Tests illustrated with z-tests 1 1.1 Overview of Hypothesis Testig..........
More informationStatistical Intervals for a Single Sample
3/5/06 Applied Statistics ad Probability for Egieers Sixth Editio Douglas C. Motgomery George C. Ruger Chapter 8 Statistical Itervals for a Sigle Sample 8 CHAPTER OUTLINE 8- Cofidece Iterval o the Mea
More informationEcon 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.
Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio
More informationt distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference
EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The
More informationMATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4
MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.
More informationApril 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE
April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE TERRY SOO Abstract These otes are adapted from whe I taught Math 526 ad meat to give a quick itroductio to cofidece
More informationChapter 12 Correlation
Chapter Correlatio Correlatio is very similar to regressio with oe very importat differece. Regressio is used to explore the relatioship betwee a idepedet variable ad a depedet variable, whereas correlatio
More informationUniversity of California, Los Angeles Department of Statistics. Simple regression analysis
Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100C Istructor: Nicolas Christou Simple regressio aalysis Itroductio: Regressio aalysis is a statistical method aimig at discoverig
More information