Predicting the eruption time after observed an eruption of 4 minutes in duration.

Similar documents
r y Simple Linear Regression How To Study Relation Between Two Quantitative Variables? Scatter Plot Pearson s Sample Correlation Correlation

Simple Linear Regression. How To Study Relation Between Two Quantitative Variables? Scatter Plot. Pearson s Sample Correlation.

Summarizing Bivariate Data. Correlation. Scatter Plot. Pearson s Sample Correlation. Summarizing Bivariate Data SBD - 1

Linear Regression. Can height information be used to predict weight of an individual? How long should you wait till next eruption?

Correlation: Examine Quantitative Bivariate Data

Correlation. Pearson s Sample Correlation. Correlation and Linear Regression. Scatter Plot

Reaction Time VS. Drug Percentage Subject Amount of Drug Times % Reaction Time in Seconds 1 Mary John Carl Sara William 5 4

REVIEW OF SIMPLE LINEAR REGRESSION SIMPLE LINEAR REGRESSION

Simple Linear Regression Analysis

Quiz 1- Linear Regression Analysis (Based on Lectures 1-14)

Chapter Two. An Introduction to Regression ( )

Regression. Chapter 11 Part 4. More than you ever wanted to know about how to interpret the computer printout

8 The independence problem

ENGI 3423 Simple Linear Regression Page 12-01

Objectives of Multiple Regression

Multiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades

STA302/1001-Fall 2008 Midterm Test October 21, 2008

12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model

Correlation and Regression Analysis

Correlation and Simple Linear Regression

Statistics. Correlational. Dr. Ayman Eldeib. Simple Linear Regression and Correlation. SBE 304: Linear Regression & Correlation 1/3/2018

Example: Multiple linear regression. Least squares regression. Repetition: Simple linear regression. Tron Anders Moger

σ σ r = x i x N Statistics Formulas Sample Mean Population Mean Interquartile Range Population Variance Population Standard Deviation

Summary of the lecture in Biostatistics

Simple Linear Regression - Scalar Form

KR20 & Coefficient Alpha Their equivalence for binary scored items

1. a. Houston Chronicle, Des Moines Register, Chicago Tribune, Washington Post

Simple Linear Regression

Chapter 13 Student Lecture Notes 13-1

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1

STS 414 ANALYSIS OF VARIANCE (ANOVA) REVIEW OF SIMPLEREGRESSION

Simple Linear Regression

Econometric Methods. Review of Estimation

b. There appears to be a positive relationship between X and Y; that is, as X increases, so does Y.

Simple Linear Regression and Correlation.

CLASS NOTES. for. PBAF 528: Quantitative Methods II SPRING Instructor: Jean Swanson. Daniel J. Evans School of Public Affairs

Statistics MINITAB - Lab 5

Statistics: Unlocking the Power of Data Lock 5

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Multiple Choice Test. Chapter Adequacy of Models for Regression

European Journal of Mathematics and Computer Science Vol. 5 No. 2, 2018 ISSN

STA 105-M BASIC STATISTICS (This is a multiple choice paper.)

: At least two means differ SST

Formulas and Tables from Beginning Statistics

Probability and. Lecture 13: and Correlation

Mean is only appropriate for interval or ratio scales, not ordinal or nominal.

European Journal of Mathematics and Computer Science Vol. 5 No. 2, 2018 ISSN

Lecture 8: Linear Regression

Third handout: On the Gini Index

Regression. Linear Regression. A Simple Data Display. A Batch of Data. The Mean is 220. A Value of 474. STAT Handout Module 15 1 st of June 2009

Lecture Notes Types of economic variables

Linear Regression with One Regressor


Chapter 11 Systematic Sampling

Chapter Business Statistics: A First Course Fifth Edition. Learning Objectives. Correlation vs. Regression. In this chapter, you learn:

Previous lecture. Lecture 8. Learning outcomes of this lecture. Today. Statistical test and Scales of measurement. Correlation

Chapter 8: Statistical Analysis of Simulated Data

Multiple Linear Regression Analysis

ESS Line Fitting

CS473-Algorithms I. Lecture 12b. Dynamic Tables. CS 473 Lecture X 1

ε. Therefore, the estimate

( ) = ( ) ( ) Chapter 13 Asymptotic Theory and Stochastic Regressors. Stochastic regressors model

FRST 531 Applied multivariate statistics

Handout #4. Statistical Inference. Probability Theory. Data Generating Process (i.e., Probability distribution) Observed Data (i.e.

ENGI 4421 Propagation of Error Page 8-01

Simulation Output Analysis

Lecture 3. Sampling, sampling distributions, and parameter estimation

UNIT 7 RANK CORRELATION

Lecture 3 Probability review (cont d)

2.3 Least-Square regressions

ANOVA with Summary Statistics: A STATA Macro

Lecture Notes Forecasting the process of estimating or predicting unknown situations

Continuous Distributions

A Result of Convergence about Weighted Sum for Exchangeable Random Variable Sequence in the Errors-in-Variables Model

Johns Hopkins University Department of Biostatistics Math Review for Introductory Courses

Lecture Notes 2. The ability to manipulate matrices is critical in economics.

Johns Hopkins University Department of Biostatistics Math Review for Introductory Courses

Lecture 2: The Simple Regression Model

ECON 482 / WH Hong The Simple Regression Model 1. Definition of the Simple Regression Model

x z Increasing the size of the sample increases the power (reduces the probability of a Type II error) when the significance level remains fixed.

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

Midterm Exam 1, section 1 (Solution) Thursday, February hour, 15 minutes

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

Lecture 1: Introduction to Regression

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions.

Collapsing to Sample and Remainder Means. Ed Stanek. In order to collapse the expanded random variables to weighted sample and remainder

Midterm Exam 1, section 2 (Solution) Thursday, February hour, 15 minutes

Linear Regression. Hsiao-Lung Chan Dept Electrical Engineering Chang Gung University, Taiwan

MEASURES OF DISPERSION

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x

Quantitative analysis requires : sound knowledge of chemistry : possibility of interferences WHY do we need to use STATISTICS in Anal. Chem.?

hp calculators HP 30S Statistics Averages and Standard Deviations Average and Standard Deviation Practice Finding Averages and Standard Deviations

Parameter, Statistic and Random Samples

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

Chapter 8. Inferences about More Than Two Population Central Values

Linear Regression. Hsiao-Lung Chan Dept Electrical Engineering Chang Gung University, Taiwan

Lecture 1: Introduction to Regression

Statistical Inference Procedures

residual. (Note that usually in descriptions of regression analysis, upper-case

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

Transcription:

Lear Regreo ad Correlato 00 Predctg the erupto tme after oberved a erupto of 4 mute durato. 90 80 70 Iter-erupto Tme.5.0.5 3.0 3.5 4.0 4.5 5.0 5.5 Durato A ample of tererupto tme wa take durg Augut -8, 978 ad Augut 6-3, 979. The data et aved the fle g.dat whch a ASCII fle, ad the SPSS fle gr.av. The frt colum of the data et for date, the ecod colum erupto durato tme ( mute) ad the thrd colum tererupto tme (mute). How ca we help the tourt? Tr ome eplorator techque uch a htogram, boplot, catter plot,..., to ee f there a patter. A charactertc of the geer the durato of the prevou erupto. We could thk of our data a par of the form (durato of erupto, tme utl et erupto). Do ou ee two ubgroup th data et? Do ou ee that a loger durato ted to be followed b a loger tme terval utl the et erupto? J. S. Rehart, a 969 paper the Joural of Geophcal Reearch, provde a mecham for th patter baed o the temperature level of the water at the bottom of a geer tube at the tme the water at the top reache the bolg temperature. That a horter erupto would be followed b a horter tererupto tme (ad a loger erupto would be followed b a loger tererupto tme) alo cotet wth Rehart' model, ce a hort erupto characterzed b havg more water at the bottom of the geer beg heated hort of bolg temperature, ad left the tube. Th water ha bee heated omewhat, however, o t take le tme for the et erupto to occur. A log erupto reult the tube beg empted, o the water mut be heated from a colder temperature, whch take loger. Awer the followg queto:. I there a correlato betwee the durato of erupto ad the tme betwee erupto? Ca we model the relato ug a mple regreo modelg techque ad fd the equato of the le that bet decrbe the data.. Ue lear regreo techque to predct the average watg tme utl et erupto after a erupto that lat 4.3 mute. (Ue a 95% cofdece terval for predctg the average watg tme utl et erupto.)

Lear Regreo ad Correlato Smple Lear Regreo Model the Lear Relato Betwee Two Quattatve varable, (repoe varable) ad (eplaator varable). µ + ε (Smple Lear Regreo Model) that to determe µ α + β. What are the value of α ad β? Leat-Square Etmate of Regreo Parameter Fd oluto of α ad β of a traght le that mmze the followg varablt meaure (total accumulated quared error). Σ [ (α + β )] Maatee Klled 30? +? Redual (or Error) 0 0 0 0 0 700 800 Leat-Square Etmate of Regreo Parameter ˆ SS β, ˆ α ˆ β Aother formula: ˆ β r, SS where ad are tadard devato of ad Eample: (Maatee) 37 7945 4 ˆ β. 486 ˆ α. 486 4. 430439 09809. 5 4 4 Equato of the regreo le: ˆ α ˆ + ˆ β ; ˆ 4. 430439 +. 486 If at a certa ear the umber of power boat regtered 700,000, etmate how ma maatee o average would be klled. ŷ 4.430439 +.486, at 700 4.430439 +.486 700 45.973 Number of Power Boat

Lear Regreo ad Correlato Problem of etrapolato Etrapolated reult for a value out of the cope of A poble tred Scope of data Regreo ad caualt Regreo telf provde o formato about caual patter ad mut upplemeted b addtoal aal (wth deged ad cotrolled epermet) to obta ght about caual relatohp. Frt Order Smple Lear Regreo Model Model aumpto: α + β + ε wth error depedet ad detcall dtrbuted a ε Ν (0, σ ) ad mea of at µ α + β. Mea of at 4 3 4 3

Lear Regreo ad Correlato Redual: e ˆ ˆ ˆ α + β ˆ ŷ Eample: Fd the redual at 4. Oberved, ad predcted 4.430439 +.486 4 6.0. The redual 6.0 4.99. Source of varablt: Repoe varable, Eplaator varable, Error Redual Sum of Square (SSRed), or Sum of Square of Error (SSE) Total Sum of Square (SSTo) SSTo SSR + SSE ( ) Sum of Square due to Regreo (SSR) SSTo SSE ANOVA Table Source of Var. S.S. d.f. M.S. F Regreo SSR SSR/MSR MSR/MSE Error SSE SSE/( )MSE Total (corrected) SSTo ( ˆ ) e ˆ ) ( Etmato of σ : MSE SSE / ( ) 8.87 (Degree of freedom ) Etmated Stadard Error of the regreo model: 4.8 4

Lear Regreo ad Correlato Model R Model Summar R Square Adjuted R Square Std. Error of the Etmate.94 a.886.877 4.8 a. Predctor: (Cotat), NPOWERBT ANOVA b Model Sum of Square df Mea Square F Sg. Regreo 7.979 7.979 93.65.000 a Redual 9.4 8.87 Total 93.49 3 a. Predctor: (Cotat), NPOWERBT b. Depedet Varable: MANKILL Model (Cotat) NPOWERBT a. Depedet Varable: MANKILL Utadardzed Coeffcet B Coeffcet a Std. Error Stadardzed Coeffcet Beta -4.430 7.4-5.589.000.5.03.94 9.675.000 t Sg. Equato of the regreo le: ˆ α ˆ + ˆ β ; ˆ 4. 430439 +. 486 Iferece for Regreo Coeffcet Hpothe: Ho: β β 0, v.. Ha: β β 0 (It ofte tetg for Ho: β 0 v.. Ha: β 0.) Tet Stattc: ˆ β β 0 t ~ t-dtrbuto d.f., where e ˆ ˆ ( β ).03 e ˆ ( ˆ β ) ( ) Deco rule: Reject Ho, f C.V. approach: t < t α/ or t > t α/ p-value approach: p-value < α Hpothe: Ho: α α 0, v.. Ha: α α 0 (It ofte tetg for Ho: α 0 v.. Ha: α 0.) Tet Stattc: ˆ α α0 t ~ t-dtrbuto d.f., where e ˆ ( ˆ α) Deco rule: Reject Ho, f C.V. approach: t < t α/ or t > t α/ p-value approach: p-value < α α 7.4 ( ) e ˆ ( ˆ ) + 5

Lear Regreo ad Correlato Iferece for Predcted Value The (-α) 00% cofdece terval for predctg the mea repoe at : ˆ tα / e ˆ ( ˆ) where ± ( ) e ˆ ( ˆ) d.f. ( ) + Predcted Number of Maatee Klled o Average at 4 > 6.0 ± 3.9 > (.09, 9.9) The (-α) 00% cofdece terval for predctg a dvdual outcome at : ˆ t ˆ ( ~ α / e ) where ± ( ) e ˆ ( ~ ) d.f. ( ) + + Predcted Number of Maatee Klled at 4 > 6.0 ± 0.0 > (5.9, 6.) Cofdece Iterval Bad Number of maatee klled 30 0 0 0 0 0 700 800 Number of Powerboat 6

Lear Regreo ad Correlato Evaluato of the Model Coeffcet of Determato It the proporto of varato oberved that ca be eplaed b the varable wth the lear regreo model. It the quare of the Pearo correlato coeffcet, r. r SSE/SSTo SSR/SSTo Redual Plot A catter plot of the redual agat the predcted value of the repoe varable to verf the aumpto behd the regreo model. (Homogeet of varace, radom ormal error, appropratee of the lear model.).5 Scatterplot Depedet Varable: Number of maatee klle Regreo Stadardzed Redual.0.5 0.0 -.5 -.0 -.5 -.0 -.5 -.5 -.0 -.5 0.0.5.0.5.0 Regreo Stadardzed Predcted Value 0 0 Model ot a good lear ft Volato of the equal varace aumpto 7

Lear Regreo ad Correlato Traformato The crcle of power rule. Whe the relato betwee ad ot lear, we beg lookg at traformato of the form p ad p. p, -3, -, -, -/, l, /,,, 3, Whe the catter plot how patter of data a Quadrat I, oe ca traform the varable wth a p fucto b creag power p to a umber larger tha oe, or traformato varable wth p fucto b creag power p to a umber large tha oe. Th what the up ad up mea. If the patter mlar to the catter plot Quadrat II, the ether creae the power of p traformato to a umber larger tha oe ( up) or decreae the power of p traformato to a umber le tha oe ( dow). up Quadrat II Quadrat I dow up Quadrat III Quadrat IV dow Eample: female lfe epectac, GDP (Gro dometc product) Plot of v.. Plot of v.. l() 90 90 80 80 Female lfe epectac 99 70-0000 0 0000 0000 30000 Female lfe epectac 99 70 4 5 6 7 8 9 0 GDP per capta Natural log of GDP ˆ αˆ + ˆ β l( ) 8

Lear Regreo ad Correlato Correlato: Eame Quattatve Bvarate Data The correlato, ρ, betwee two radom varable, X ad Y, defed a, ( X µ X ) ( Y µ Y ) ρ average σ X σ Y product of the tadard devate of X ad Y, quatfe the tregth of lear relatohp. Regreo Aal I there a relato betwee umber of power boat the area ad umber of maatee klled? Year NPB( ) Nkll( ) 77 447 3 78 4 79 48 4 80 498 6 8 53 4 8 5 0 83 56 5 84 559 34 85 585 33 86 64 33 87 645 39 88 675 43 89 7 90 79 47 Maatee Klled 30 0 0 0 0 Scatter Plot 0 700 800 Number of Power Boat Eample: Calculato of Correlato Coeffcet Σ 7945 (Σ) 63305 Σ 4 (Σ) 69744 Σ 468597 Σ 56 Σ 475 S 09809.5, S 93.4857 S 37 r 0.9447789, r 0.886379485 9

Lear Regreo ad Correlato 0 Formula for correlato coeffcet: Properte of correlato coeffcet: < r < It meaure the tregth ad drecto of the lear relato betwee two quattatve varable. r f all pot le eactl o a traght le. ρ he otato for populato correlato coeffcet. Correlato Doe Not Impl Cauato Eample: The umber of powerboat regtered ma ot be the drect caue for the death of Maatee..,,, ) ( ) ( ) )( ( where r r,.) correlato ample Pearo' ( r cloe to r cloe to r cloe to zero r cloe to zero

Lear Regreo ad Correlato t-tet for correlato Hpothe: Ho: ρ ρ 0, v.. Ha: ρ ρ 0 Tet Stattc: t r ρ0 ( r ) /( ) ~ t-dtrbuto d.f. If tud rak correlato, ue r : Tet Stattc: t r r Deco rule: Reject Ho, f C.V. approach: t < t α/ or t > t α/ p-value approach: p-value < α Eample: (Maatee) I there a gfcat correlato betwee powerboat ad death of maatee? Ho: ρ 0, v.. Ha: ρ 0. 94 0 t 9. 65 (. 886) /( 4 ) d.f. 4 -, p-value <.0005, reject Ho, there gfcat lear relato. Spearma Rak Correlato Coeffcet Spearma correlato coeffcet calculated baed o the rak of the obervato,.e., r rak of amog all, ad r rak of amog all, where d r r. If te ue average rak. Tet for correlato ug rak correlato coeffcet If ample ze le tha 0 ue pecal table. If ample ze large ue t-dtrbuto table. The t-tet the ame a the tet procedure for o-rak data. It le etve to outler wth the dadvatage of lo of formato. Eample: (Maatee) ( )( ) r r r r 6 r ( ( ) ( ) r r r r d ) Year NPB( ) Rak Nkll( ) Rak 77 447 3 78 4 5 79 48 3 4 6.5 80 498 4 6 3 8 53 6 4 6.5 8 5 5 0 4 83 56 7 5 84 559 8 34 0 85 585 9 33 8.5 86 64 0 33 8.5 87 645 39 88 675 43 89 7 3 4 90 79 4 47 3 Σ d (-) + (-5) + (3-6.5) + 57 r t 6 57. 875 4( 4 ) 4. 875 6. 6. 875 p-value <.0005 <.05. There gfcat correlato.

Lear Regreo ad Correlato Eample: I a vetgato of the relato betwee the average aual temperature ad the mortalt de for a tpe of breat cacer wome certa rego of Europe, data were collected from 5 Europea coutre. 0 00 90 SPSS Output 80 Model Summar 70 Mortalt Ide 30 Average Temperature Model R R Square Adjuted R Square Std. Error of the Etmate.94 a.853.84 5.936 a. Predctor: (Cotat), Average Temperature Sample correlato coeffcet: Eample: I a vetgato, coutre were cluded to tud the relato betwee female lfe epectac ad the brthrate. 90 80 70 Female lfe epectac 99 0 0 0 30 Brth per 000 populato, 99 Model R Model Summar R Square Adjuted R Square Std. Error of the Etmate.870 a.756.754 5.6 a. Predctor: (Cotat), Brth per 000 populato, 99 Sample correlato coeffcet: