Correlation Regression
- Theodore Holland
Transcription
1 Correlation Regression
While correlation methods measure the strength of a linear relationship between two variables, we might wish to go a little further:
- How much does one variable change for a given change in another variable?
- How accurately can the value of one variable be predicted from knowledge of the other?
Regression analysis refers to the process of studying the causal relationship between a dependent variable and a set of independent explanatory variables.
2 Two Sorts of Bivariate Relationships
Generally, we can classify the nature of the relationship between a pair of variables into two types:
- A bivariate relationship can be deterministic, where knowledge of one of the variables entails perfect knowledge of the other.
- A bivariate relationship can be probabilistic, where knowledge of one of the variables allows you to estimate the value of the other variable, but not with absolute accuracy and/or certainty.
3 A Deterministic Relationship
Suppose we are traveling from one place to another on the Interstate, and we travel at a constant speed. There is a deterministic relationship between the time spent driving and the distance traveled that we can express graphically, or using an equation:
[Figure: distance (s) plotted against time (t); a straight line with intercept $s_0$ and slope $v$]
$s = s_0 + vt$
- s: distance traveled
- $s_0$: initial distance
- v: speed
- t: time traveled
Unfortunately, few relationships are truly deterministic.
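As a small illustration of what "deterministic" means here, the sketch below evaluates $s = s_0 + vt$ directly. The function name `distance_traveled` and the trip values (10 miles, 60 mph) are made up for the example and do not come from the slides.

```python
def distance_traveled(s0, v, t):
    """Distance s = s0 + v*t for travel at constant speed v."""
    return s0 + v * t

# Hypothetical trip: start 10 miles along the route, drive at 60 mph.
for hours in (0, 1, 2):
    print(hours, distance_traveled(10, 60, hours))
```

Given the time, the distance is known exactly; there is no error term to estimate.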
4 A Probabilistic Relationship
More often, we find relationships between two variables that have a probabilistic nature. For example, suppose we compare the ages and heights of a sample of young people between 2 and 20 years old:
[Figure: scatterplot of height (meters) against age (years)]
Here, we cannot predict height from age as we could distance from time in the previous example. There is a relationship here, but there is an element of unpredictability or error contained in this model.
5 Sampling and Regression
When we are comparing a pair of variables using a sampled data set, we expect to find a relationship that is less than perfect (i.e. probabilistic and not deterministic) because:
- We expect that in the process of collecting the data there will be some measurement errors, which are another source of variation.
- We might find that there are other factors exerting some control over the relationship (which of course are not accounted for in our simple bivariate model).
6 Simple vs. Multiple Regression
Today, we are going to examine simple linear regression, where we estimate the values of a dependent variable (y) using the values of an independent variable (x). This concept can be extended to multiple linear regression, where more explanatory independent variables ($x_1, x_2, x_3, \ldots, x_n$) are used to develop estimates of the dependent variable's values. For purposes of clarity, we will first look at the simple case, so we can more easily grasp the mathematics involved.
7 Simple Linear Regression
Simple linear regression models the relationship between an independent variable (x) and a dependent variable (y) using an equation that expresses y as a linear function of x, plus an error term:
$y = a + bx + e$
[Figure: y (dependent) plotted against x (independent), showing the intercept a, the slope b, and the error ε for one point]
- x is the independent variable
- y is the dependent variable
- b is the slope of the fitted line
- a is the intercept of the fitted line
- e is the error term
8 Fitting a Line to a Set of Points
When we have a data set consisting of an independent and a dependent variable, and we plot these using a scatterplot, we need to select a line that represents the relationship between the variables in order to construct our model:
[Figure: scatterplot of y (dependent) against x (independent) with a fitted line]
We can choose the line that fits best using a least squares method. The least squares line is the line that minimizes the sum of the squared vertical distances between the points and the line, i.e. it minimizes the squared error term ε when it is considered for all points in the data set.
9 Sampling and Regression II
We usually operate using sampled data, and while we are building a model of the form
$y = a + bx + e$
from our sample, in doing so we are attempting to estimate a true regression line, describing the relationship between the independent variable (x) and the dependent variable (y) for the entire population:
$y = \alpha + \beta x + \varepsilon$
Multiple samples would yield several similar regression lines, which should approximate the population regression line.
10 Least Squares Method
The least squares method operates mathematically, minimizing the squared error term e summed over all points. We can describe the line of best fit we will find using the equation $\hat{y} = a + bx$, and you'll recall from a previous slide that the formula for our linear model was expressed as $y = a + bx + e$.
[Figure: a data point y above the fitted line $\hat{y} = a + bx$, with the vertical distance $(y - \hat{y})$ marked]
We use the value $\hat{y}$ on the line to estimate the true value, y. The difference between the two is $(y - \hat{y}) = e$. This difference is positive for points above the line, and negative for points below it.
11 Estimates and Residuals
Our simple linear regression models take the form
$y = a + bx + e$
which can alternatively be expressed as
$\hat{y} = a + bx$
where $\hat{y}$ is the estimate of y produced by the regression. We can rearrange these equations to show:
$e = y - \hat{y}$
The errors in the estimation of y using the regression equation are known as residuals, and they express, for any given value in the data set, to what extent the regression line is either underestimating or overestimating the true value of y.
12 Minimizing the Error Term
In a linear model, the error in estimating the true value of the dependent variable y is expressed by the difference between the true value and the estimated value $\hat{y}$, $e = (y - \hat{y})$ (i.e. the residuals). Sometimes this difference will be positive (when the line underestimates the value of y) and sometimes it will be negative (when the line overestimates the value of y), because there will be points above and below the line. If we were to simply sum these error terms, the positive and negative values would cancel out. Instead, we can square the differences and then sum them up to create a useful estimate of the overall error.
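The cancellation described above can be seen on a tiny example. This is a minimal sketch on made-up data (the five x, y pairs are hypothetical, not the soil moisture data used later), with the line's intercept and slope taken as given.

```python
# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Intercept and slope of the least squares line for this data (assumed here).
a, b = 2.2, 0.6

y_hat = [a + b * xi for xi in x]                    # estimates on the line
residuals = [yi - yh for yi, yh in zip(y, y_hat)]   # e = y - y_hat

sum_residuals = sum(residuals)               # positives and negatives cancel
sum_squared = sum(e ** 2 for e in residuals) # squares cannot cancel

print("residuals:", [round(e, 2) for e in residuals])
print("sum of residuals:", round(sum_residuals, 10))
print("sum of squared residuals:", round(sum_squared, 10))
```

The plain sum comes out at zero (up to floating-point rounding) even though no point lies exactly on the line, which is why the squared version is the useful measure of overall error.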
13 Error Sum of Squares
By squaring the differences between y and $\hat{y}$, and summing these values for all points in the data set, we calculate the error sum of squares (usually denoted by SSE):
$SSE = \sum (y - \hat{y})^2$
The least squares method of selecting a line of best fit functions by finding the parameters of a line (intercept a and slope b) that minimize the error sum of squares, i.e. it is known as the least squares method because it finds the line that makes the SSE as small as it can possibly be, minimizing the squared vertical distances between the line and the points.
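To make "as small as it can possibly be" concrete, the sketch below evaluates the SSE for a few candidate lines on made-up data. The helper name `sse` and the candidate (a, b) pairs are illustrative assumptions; (2.2, 0.6) happens to be the least squares solution for this particular data.

```python
def sse(a, b, x, y):
    """Error sum of squares for the line y_hat = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

candidates = {
    "least squares line": (2.2, 0.6),
    "lower intercept": (2.0, 0.6),
    "steeper slope": (2.2, 0.7),
}

for name, (a, b) in candidates.items():
    print(f"{name}: a={a}, b={b}, SSE={sse(a, b, x, y):.2f}")
```

Any change to the intercept or slope away from the least squares values increases the SSE.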
14 Minimizing the SSE
We need to find the values of a and b that minimize the error sum of squares:
$\min_{a,b} \sum (y_i - \hat{y}_i)^2 = \min_{a,b} \sum (y_i - a - b x_i)^2$
Solving this problem requires calculus: take the derivative of the expression with respect to a and with respect to b, set each to 0, and solve for the 2 unknowns. It is graphically equivalent to finding the lowest point of a 3-dimensional bowl-shaped surface (a paraboloid) that plots the SSE over all combinations of a and b.
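The calculus step can be written out as follows. This is a sketch of the standard derivation, using the same symbols as the slides ($a$, $b$, $x_i$, $y_i$) plus $\bar{x}$ and $\bar{y}$ for the sample means and $n$ for the number of points.

```latex
\begin{align*}
\frac{\partial}{\partial a} \sum_{i=1}^{n} (y_i - a - b x_i)^2
  &= -2 \sum_{i=1}^{n} (y_i - a - b x_i) = 0
  &&\Rightarrow\quad a = \bar{y} - b\,\bar{x} \\
\frac{\partial}{\partial b} \sum_{i=1}^{n} (y_i - a - b x_i)^2
  &= -2 \sum_{i=1}^{n} x_i\,(y_i - a - b x_i) = 0
\end{align*}
% Substituting a = \bar{y} - b\bar{x} into the second equation and solving for b:
\begin{equation*}
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\end{equation*}
```

The first condition also says that the residuals of the fitted line sum to zero.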
15 Finding Regression Coefficients
The equations used to find the values for the slope (b) and intercept (a) of the line of best fit using the least squares method are:
$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
$a = \bar{y} - b\bar{x}$
Where:
- $x_i$ is the i-th independent variable value
- $y_i$ is the i-th dependent variable value
- $\bar{x}$ is the mean value of all the $x_i$ values
- $\bar{y}$ is the mean value of all the $y_i$ values
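The two formulas translate almost line for line into code. The sketch below is one way to do it in plain Python; the function name `fit_line` and the sample data are assumptions made for illustration.

```python
def fit_line(x, y):
    """Return (a, b): intercept and slope of the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: sum of squared deviations of x from its mean.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)
print(f"a = {a:.3f}, b = {b:.3f}")
```

For this data the means are 3 and 4, the cross-product sum is 6 and the squared-deviation sum is 10, so the slope is 0.6 and the intercept is 4 - 0.6 × 3 = 2.2.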
16 Interpreting Slope (b)
The slope of the line (b) gives the change in y (dependent variable) due to a unit change in x (independent variable):
- b > 0: positive relationship. As the values of x increase, the values of y increase too.
- b < 0: negative (a.k.a. inverse) relationship. As the values of x increase, the values of y decrease.
17 Regression Slope and Correlation
The interpretation of the sign of the slope parameter and the correlation coefficient is identical, and this is no coincidence: the numerator of the slope expression is identical to that of the correlation coefficient.
$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n - 1)\, s_X s_Y}$
The regression slope can be expressed in terms of the correlation coefficient:
$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = r \frac{s_Y}{s_X}$
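The identity $b = r\, s_Y / s_X$ can be checked numerically. The sketch below uses Python's standard `statistics` module on made-up data; `stdev` is the sample standard deviation (n - 1 in the denominator), which matches the (n - 1) in the formula for r.

```python
from statistics import mean, stdev

# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations

# Shared numerator of both the slope and the correlation coefficient.
cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = cross / ((n - 1) * s_x * s_y)
b_direct = cross / sum((xi - x_bar) ** 2 for xi in x)
b_from_r = r * s_y / s_x

print(f"r = {r:.4f}")
print(f"b (direct) = {b_direct:.4f}, b (from r) = {b_from_r:.4f}")
```

Both routes give the same slope, and because the standard deviations are always positive, b and r necessarily share a sign.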
18 Coefficient of Determination ($r^2$)
For example, suppose we have two datasets, and we fit a regression line to each using the least squares method:
[Figure: two scatterplots of y against x, labeled (a) and (b), each with a fitted line]
While the same approach (the least squares method) has been used to select the line of best fit for both data sets, the relationship between x and y is clearly stronger in (a) than in (b), because the points are closer to the line. We have a numerical measure to express the strength of the relationship: the coefficient of determination ($r^2$).
19 Coefficient of Determination ($r^2$)
[Figure: a data point y, its estimate $\hat{y}$ on the regression line, and the mean $\bar{y}$]
If we use $\bar{y}$ to estimate y, the error is $(y - \bar{y})$. If we use $\hat{y}$ to estimate y, the error is $(y - \hat{y})$. Thus, $(\hat{y} - \bar{y})$ is the improvement in our model. To account for the total improvement for the model, we can calculate this distance and sum it for all points in the data set, first taking the square of the difference $(\hat{y} - \bar{y})$.
20 Coefficient of Determination ($r^2$)
The regression sum of squares (SSR) expresses the improvement made in estimating y by using the regression line:
$SSR = \sum (\hat{y}_i - \bar{y})^2$
The total sum of squares (SST) expresses the overall variation between the values of y and their mean $\bar{y}$:
$SST = \sum (y_i - \bar{y})^2$
The coefficient of determination ($r^2$) expresses the amount of variation in y explained by the regression line (the strength of the relationship):
$r^2 = \frac{SSR}{SST}$
21 Partitioning the Total Sum of Squares
We can also think of regression as a way to partition the variation in the values of the dependent variable y. We can take the total variation, and divide it into two components:
- The component explained by the regression line
- The component that remains unexplained
We can characterize the total variability in y using the sum of the squared deviations of the $y_i$ values from their mean. The total variability is expressed by the total sum of squares:
$SST = \sum (y_i - \bar{y})^2$
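22 Partitioning the Total Sum of Squares
We can decompose the total sum of squares into those two components:
$\underbrace{\sum (y_i - \bar{y})^2}_{SST} = \underbrace{\sum (\hat{y}_i - \bar{y})^2}_{SSR} + \underbrace{\sum (y_i - \hat{y}_i)^2}_{SSE}$
[Figure: a data point y, its estimate $\hat{y}$, and the mean $\bar{y}$, with the SSR and SSE distances marked]
In other words, SST = SSR + SSE, and the coefficient of determination expresses the portion of the total variation in y explained by the regression line.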
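The partition can be verified numerically. The following is a minimal sketch on made-up data (not the slides' data set) that fits the least squares line, computes the three sums of squares, and derives $r^2$ from them.

```python
# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a = y_bar - b * x_bar
y_hat = [a + b * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained by the line
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # left unexplained
r2 = ssr / sst

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"SSR + SSE = {ssr + sse:.2f}, r^2 = {r2:.2f}")
```

The equality SST = SSR + SSE holds for the least squares line; for an arbitrary line through the data the two components generally do not add up to SST.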
23 Regression ANOVA Table
We can create an analysis of variance table that allows us to display the sums of squares, their degrees of freedom, mean square values (for the regression and error sums of squares), and an F-statistic:

Component        | Sum of Squares                  | df    | Mean Square          | F
Regression (SSR) | $\sum (\hat{y}_i - \bar{y})^2$  | 1     | MSSR = SSR / 1       | MSSR / MSSE
Error (SSE)      | $\sum (y_i - \hat{y}_i)^2$      | n - 2 | MSSE = SSE / (n - 2) |
Total (SST)      | $\sum (y_i - \bar{y})^2$        | n - 1 |                      |
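A table like this can be assembled directly from the data. The sketch below is one possible layout; the function name `anova_table`, the dictionary keys, and the sample data are all assumptions made for illustration.

```python
def anova_table(x, y):
    """Build the regression ANOVA quantities for a simple linear fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    a = y_bar - b * x_bar
    y_hat = [a + b * xi for xi in x]

    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    sst = sum((yi - y_bar) ** 2 for yi in y)

    mssr = ssr / 1        # regression mean square, 1 degree of freedom
    msse = sse / (n - 2)  # error mean square, n - 2 degrees of freedom
    return {
        "regression": {"ss": ssr, "df": 1, "ms": mssr},
        "error": {"ss": sse, "df": n - 2, "ms": msse},
        "total": {"ss": sst, "df": n - 1},
        "F": mssr / msse,
    }

# Hypothetical data, not taken from the slides.
table = anova_table([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name in ("regression", "error", "total"):
    print(name, table[name])
print("F =", round(table["F"], 3))
```

The degrees of freedom add up the same way the sums of squares do: 1 + (n - 2) = n - 1.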
24 Regression Example
We can use the data set we used to illustrate covariance and correlation. It was a set of 10 values of TVDI for remotely sensed pixels containing the Glyndon catchment in Baltimore County, and accompanying soil moisture measurements taken in the catchment on matching dates:
[Figure: Glyndon Field Sampled Soil Moisture versus TVDI from a 3x3 kernel; volumetric soil moisture plotted against TVDI (3x3 kernel)]
[Table: the 10 paired values of TVDI and Soil Moisture]
25 Regression Example
To find the optimal values for the slope (b) and the intercept (a), we must first calculate the mean values of the independent variable (TVDI) and the dependent variable (soil moisture):
Mean TVDI = 0.501, mean soil moisture = [value missing]
We can now use these values to calculate the optimal slope according to the formula:
$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
26 Regression Example
[Table: columns TVDI (x), Soil Moisture (y), (x - xbar), (y - ybar), (x - xbar) * (y - ybar), (x - xbar)^2; summary rows Mean, Sum, Slope]
We can now substitute the slope value into the intercept equation to calculate the intercept:
$a = \bar{y} - b\bar{x}$
$a = \bar{y} - (b \times 0.501) = 0.603$
27 Regression Example
We can now use our regression equation $\hat{y} = a + bx$ (with the intercept a = 0.603 found above) to calculate estimates for each of the values of x in the dataset, and then proceed to calculate the SSR, SSE and SST.
[Table: columns TVDI (x), Soil Moisture (y), Estimate (yhat), (yhat - ybar), (yhat - ybar)^2, (y - yhat), (y - yhat)^2, (y - ybar), (y - ybar)^2; summary values Mean, Slope, Intercept, SSR, SSE, SST, SSR+SSE]
28 Regression Example
Now that we have all the necessary values, we can fill in the ANOVA table:
[Table: columns Component, Sum of Squares, Degrees of Freedom, Mean Square, F-Test; rows Regression (SSR), Error (SSE), Total (SST)]
We can also calculate the coefficient of determination:
$r^2 = SSR / SST = 0.76$
29 A Significance Test for $r^2$
We can test to see if the regression line has been successful in explaining a significant portion of the variation in y by performing an F-test. This operates in a similar fashion to how we used the F-test in ANOVA, this time testing the null hypothesis that the true coefficient of determination of the population $\rho^2 = 0$, using an F-test formulated as:
$F_{test} = \frac{r^2 (n - 2)}{1 - r^2} = \frac{MSSR}{MSSE}$
which has an F-distribution with degrees of freedom df = (1, n - 2).
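The two forms of the statistic are algebraically the same, which the sketch below checks on made-up sums of squares. The function names and the values SSR = 3.6, SSE = 2.4, n = 5 are hypothetical and not taken from the slides.

```python
def f_from_r2(r2, n):
    """F statistic from the coefficient of determination."""
    return r2 * (n - 2) / (1 - r2)

def f_from_mean_squares(ssr, sse, n):
    """F statistic as MSSR / MSSE, with df = (1, n - 2)."""
    return (ssr / 1) / (sse / (n - 2))

# Hypothetical sums of squares for a sample of n = 5 points.
ssr, sse, n = 3.6, 2.4, 5
r2 = ssr / (ssr + sse)  # r^2 = SSR / SST, with SST = SSR + SSE

print(f"F from r^2:          {f_from_r2(r2, n):.3f}")
print(f"F from mean squares: {f_from_mean_squares(ssr, sse, n):.3f}")
```

Dividing the numerator and denominator of MSSR / MSSE by SST turns SSR into $r^2$ and SSE into $1 - r^2$, which is where the first form comes from.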
30 Hypothesis Testing - Significance of $r^2$ F-test Example
Research question: Is the regression line explaining a significant proportion of the variation in y (Soil Moisture)?
1. $H_0$: $\rho^2 = 0$ (Explanation of variation not significant)
2. $H_A$: $\rho^2 \neq 0$ (Significant variation explained)
3. Select α = 0.05, one-tailed because we are using an F-test
4. In order to compute the F-test statistic, we need to first calculate either the coefficient of determination or the mean sums of squares for both the regression and error terms (in this case we have already done both):
$F_{test} = \frac{0.76\,(8)}{1 - 0.76}$
31 Hypothesis Testing - Significance of $r^2$ F-test Example
5. We now need to find the critical F-value, first calculating the degrees of freedom:
df = (1, n - 2) = (1, 10 - 2) = (1, 8)
We can now look up the $F_{crit}$ value for our α (0.05 in one tail) and df = (1, 8). $F_{test} > F_{crit}$, therefore we reject $H_0$ and accept $H_A$, finding that the regression explains a significant portion of the variation in y (i.e. the population coefficient of determination $\rho^2$, which we have estimated using the sample coefficient of determination $r^2$, is not equal to 0).
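The decision in step 5 can be reproduced approximately from the figures that do survive in the example. The sketch below assumes the rounded $r^2 = 0.76$ and n = 10 reported above, so the resulting statistic is only an approximation of the value on the original slide; the critical value 5.32 is the standard tabulated F value for α = 0.05 with df = (1, 8), rounded to two decimals.

```python
r2 = 0.76  # rounded coefficient of determination from the example
n = 10     # number of paired observations in the example

# F statistic from r^2; approximate because r2 is rounded.
f_test = r2 * (n - 2) / (1 - r2)

# Tabulated critical value F(0.05; 1, 8), rounded to two decimals.
f_crit = 5.32

reject_h0 = f_test > f_crit
print(f"F_test = {f_test:.2f}, F_crit = {f_crit}, reject H0: {reject_h0}")
```

With a statistic of roughly 25 against a critical value of about 5.3, the conclusion does not hinge on the rounding of $r^2$.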
Lecture 22: Review for Exam 2 Basic Model Assumptios (without Gaussia Noise) We model oe cotiuous respose variable Y, as a liear fuctio of p umerical predictors, plus oise: Y = β 0 + β X +... β p X p +
More informationt distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference
EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The
More informationStatistics Lecture 27. Final review. Administrative Notes. Outline. Experiments. Sampling and Surveys. Administrative Notes
Admiistrative Notes s - Lecture 7 Fial review Fial Exam is Tuesday, May 0th (3-5pm Covers Chapters -8 ad 0 i textbook Brig ID cards to fial! Allowed: Calculators, double-sided 8.5 x cheat sheet Exam Rooms:
More informationChapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more
More informationBivariate Sample Statistics Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 7
Bivariate Sample Statistics Geog 210C Itroductio to Spatial Data Aalysis Chris Fuk Lecture 7 Overview Real statistical applicatio: Remote moitorig of east Africa log rais Lead up to Lab 5-6 Review of bivariate/multivariate
More informationECON 3150/4150, Spring term Lecture 1
ECON 3150/4150, Sprig term 2013. Lecture 1 Ragar Nymoe Uiversity of Oslo 15 Jauary 2013 1 / 42 Refereces to Lecture 1 ad 2 Hill, Griffiths ad Lim, 4 ed (HGL) Ch 1-1.5; Ch 2.8-2.9,4.3-4.3.1.3 Bårdse ad
More informationCorrelation and Covariance
Correlatio ad Covariace Tom Ilveto FREC 9 What is Next? Correlatio ad Regressio Regressio We specify a depedet variable as a liear fuctio of oe or more idepedet variables, based o co-variace Regressio
More informationOpen book and notes. 120 minutes. Cover page and six pages of exam. No calculators.
IE 330 Seat # Ope book ad otes 120 miutes Cover page ad six pages of exam No calculators Score Fial Exam (example) Schmeiser Ope book ad otes No calculator 120 miutes 1 True or false (for each, 2 poits
More informationChapter 13: Tests of Hypothesis Section 13.1 Introduction
Chapter 13: Tests of Hypothesis Sectio 13.1 Itroductio RECAP: Chapter 1 discussed the Likelihood Ratio Method as a geeral approach to fid good test procedures. Testig for the Normal Mea Example, discussed
More informationRecall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.
Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad o-users, x - y. Such studies are sometimes viewed
More informationSTP 226 EXAMPLE EXAM #1
STP 226 EXAMPLE EXAM #1 Istructor: Hoor Statemet: I have either give or received iformatio regardig this exam, ad I will ot do so util all exams have bee graded ad retured. PRINTED NAME: Siged Date: DIRECTIONS:
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationImportant Formulas. Expectation: E (X) = Σ [X P(X)] = n p q σ = n p q. P(X) = n! X1! X 2! X 3! X k! p X. Chapter 6 The Normal Distribution.
Importat Formulas Chapter 3 Data Descriptio Mea for idividual data: X = _ ΣX Mea for grouped data: X= _ Σf X m Stadard deviatio for a sample: _ s = Σ(X _ X ) or s = 1 (Σ X ) (Σ X ) ( 1) Stadard deviatio
More informationSTP 226 ELEMENTARY STATISTICS
TP 6 TP 6 ELEMENTARY TATITIC CHAPTER 4 DECRIPTIVE MEAURE IN REGREION AND CORRELATION Liear Regressio ad correlatio allows us to examie the relatioship betwee two or more quatitative variables. 4.1 Liear
More information(X i X)(Y i Y ) = 1 n
L I N E A R R E G R E S S I O N 10 I Chapter 6 we discussed the cocepts of covariace ad correlatio two ways of measurig the extet to which two radom variables, X ad Y were related to each other. I may
More informationBIOS 4110: Introduction to Biostatistics. Breheny. Lab #9
BIOS 4110: Itroductio to Biostatistics Brehey Lab #9 The Cetral Limit Theorem is very importat i the realm of statistics, ad today's lab will explore the applicatio of it i both categorical ad cotiuous
More information4 Multidimensional quantitative data
Chapter 4 Multidimesioal quatitative data 4 Multidimesioal statistics Basic statistics are ow part of the curriculum of most ecologists However, statistical techiques based o such simple distributios as
More informationDealing with Data and Fitting Empirically
Fittig Fuctios to Data: Regressio Dealig with Data ad Fittig Empirically Notes by Holly Hirst Whe workig with a situatio for which we do t have a physical law (ad hece a equatio, fuctio or iequality),
More informationMA Advanced Econometrics: Properties of Least Squares Estimators
MA Advaced Ecoometrics: Properties of Least Squares Estimators Karl Whela School of Ecoomics, UCD February 5, 20 Karl Whela UCD Least Squares Estimators February 5, 20 / 5 Part I Least Squares: Some Fiite-Sample
More informationSTA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:
STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio
More informationRegression and correlation
Cotets 43 Regressio ad correlatio 1. Regressio. Correlatio Learig outcomes You will lear how to explore relatioships betwee variables ad how to measure the stregth of such relatioships. You should ote
More informationTests of Hypotheses Based on a Single Sample (Devore Chapter Eight)
Tests of Hypotheses Based o a Sigle Sample Devore Chapter Eight MATH-252-01: Probability ad Statistics II Sprig 2018 Cotets 1 Hypothesis Tests illustrated with z-tests 1 1.1 Overview of Hypothesis Testig..........
More informationSummary: CORRELATION & LINEAR REGRESSION. GC. Students are advised to refer to lecture notes for the GC operations to obtain scatter diagram.
Key Cocepts: 1) Sketchig of scatter diagram The scatter diagram of bivariate (i.e. cotaiig two variables) data ca be easily obtaied usig GC. Studets are advised to refer to lecture otes for the GC operatios
More informationLecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS
Lecture 5: Parametric Hypothesis Testig: Comparig Meas GENOME 560, Sprig 2016 Doug Fowler, GS (dfowler@uw.edu) 1 Review from last week What is a cofidece iterval? 2 Review from last week What is a cofidece
More informationSoo King Lim Figure 1: Figure 2: Figure 3: Figure 4: Figure 5: Figure 6: Figure 7:
0 Multivariate Cotrol Chart 3 Multivariate Normal Distributio 5 Estimatio of the Mea ad Covariace Matrix 6 Hotellig s Cotrol Chart 6 Hotellig s Square 8 Average Value of k Subgroups 0 Example 3 3 Value
More informationComparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading
Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual
More informationSample Size Determination (Two or More Samples)
Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie
More informationLesson 11: Simple Linear Regression
Lesso 11: Simple Liear Regressio Ka-fu WONG December 2, 2004 I previous lessos, we have covered maily about the estimatio of populatio mea (or expected value) ad its iferece. Sometimes we are iterested
More informationDAWSON COLLEGE DEPARTMENT OF MATHEMATICS 201-BZS-05 PROBABILITY AND STATISTICS FALL 2015 FINAL EXAM
DAWSON COLLEGE DEPARTMENT OF MATHEMATICS 201-BZS-05 PROBABILITY AND STATISTICS FALL 2015 FINAL EXAM Name: Date: December 24th, 2015 Studet Number: Time: 9:30 12:30 Grade: / 116 Examier: Matthew MARCHANT
More information