Chapter 12 Simple Linear Regression


Introduction

Exam Score vs. Hours Studied Scenario

Regression Analysis
- used to quantify the relation between 2 (or more) variables so you can predict the value of one variable based on the value of another
- develops an equation to predict the value of a dependent variable based on the value of one or more independent variables

Correlation Analysis
- measures the strength of the linear relation between a pair of variables
- if you plan to predict Y from X, they ought to be related!

Simple vs. Multiple Regression

Simple Regression Analysis
- uses a single independent variable to predict the dependent variable
- estimated Score = 40.0816 + 1.4966(Hours), $r^2 = .743$

Multiple Regression Analysis
- uses multiple independent variables to predict the dependent variable
- the set of independent variables should be independent of one another, and each should be highly related to the dependent variable
- estimated Score = 33.914 + 3.47(GPA) - 1.698(Absences) + 1.395(Hours), $r^2 = .7654$

Characterizing Relationships

Direct Relation
- line of best fit has positive slope

Inverse Relation
- line of best fit has negative slope

Deterministic (Functional) Relation
- 100% pure relation between the pair of variables
- there is no scatter with respect to the line of best fit, so the value of Y can be determined exactly (without error) based on the value of X

Stochastic (Statistical, Random) Relation
- a less than perfect relation between the pair of variables
- since variables other than X impact Y, there is scatter with respect to the line of best fit, and there will be error when using x to predict y

How would you characterize the apparent relation between Exam Score and Hours Studied?

Simple Linear Regression Model

Population Linear Regression Equation
- $y = \beta_0 + \beta_1 x + \varepsilon$
- $\varepsilon$ represents the combined effect of other variables and is assumed to have a mean of 0 and a variance of $\sigma^2$

Sample Linear Regression Equation
- $\hat{y} = b_0 + b_1 x$

Least Squares Method: Line Of Best Fit

- The sample regression line won't perfectly fit the sample points; there will be error in fit. Why?
- error in fit = residual = $(y - \hat{y})$
- Provides the best fitting line in the sense that it has the minimum amount of squared deviation between each observed value and the corresponding point on the regression line
- Minimizes the sum of squared residuals in order to:
  - prevent (+) and (-) errors from cancelling
  - draw added attention to any large errors
  - prefer making several small errors in order to avoid large errors

Least Squares Method: Line Of Best Fit

Properties of the Least Squares regression equation
1) $b_0$ and $b_1$ are unbiased estimators of $\beta_0$ and $\beta_1$
2) the line passes through the point $(\bar{x}, \bar{y})$
3) the sum of the residuals is zero: $\sum (y - \hat{y}) = 0$
4) the sum of the squared residuals is minimized: $\sum (y - \hat{y})^2 = \text{minimum}$

slope: $b_1 = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$

intercept: $b_0 = \bar{y} - b_1 \bar{x}$

Exam Score vs. Hrs Studied
- obtain the sample regression equation
- compute the predicted values
- compute the residuals and squared residuals

Conditional Distribution Of y

- Figure 12.8 on page 511
- Why is y variable at any given value of x?
- The distribution of y is assumed Normal with mean $\hat{y}$
- The regression equation is the line which connects the mean values of y at each value of x
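The slope and intercept formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own worked example: the `hours` and `scores` values are hypothetical (the slides do not list the n = 8 exam observations), and the function name `least_squares` is an assumption made for this sketch.

```python
# Hypothetical data (NOT from the slides): hours studied and exam scores.
hours = [5, 10, 12, 15, 20, 22, 25, 30]
scores = [48, 55, 62, 60, 71, 70, 78, 88]


def least_squares(x, y):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(hours, scores)
predicted = [b0 + b1 * xi for xi in hours]
residuals = [yi - yh for yi, yh in zip(scores, predicted)]

print(f"y-hat = {b0:.4f} + {b1:.4f} x")
print(f"sum of residuals         = {sum(residuals):.10f}")
print(f"sum of squared residuals = {sum(e * e for e in residuals):.4f}")
```

Up to floating-point rounding, the residuals sum to zero and the fitted line passes through the point of means, which matches properties 2 and 3 listed above.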

Correlation Analysis Concepts

- Measures the strength of the linear relation between two variables
- If you intend to use X to predict Y, how strongly related are they?
- The slope of the sample regression equation was +1.4965, so these variables seem to move together
- The mean exam score was 76 and the variation among student scores was 11.504
- Some of the variation in scores can be explained by taking into account hours studied

Strength of Relationship

[Figure: nine scatter plots labeled with their r and r² values: r = .98, r² = .96; r = .78, r² = .61; r = .34, r² = .1; r = .1, r² = .01; r = -.01, r² = .00; r = -.99, r² = .98; r = -.64, r² = .41; r = -.33, r² = .1; r = -.11, r² = .01]

Correlation Analysis

TOTAL VARIATION IN SCORES = EXPLAINABLE BY HOURS STUDIED + UNEXPLAINABLE BY HOURS STUDIED

SST = SSR + SSE

$\sum (y - \bar{y})^2 = \sum (\hat{y} - \bar{y})^2 + \sum (y - \hat{y})^2$

For a single observation with $y = 92$, $\hat{y} = 88$, and $\bar{y} = 76$: $(92 - 76) = (88 - 76) + (92 - 88)$

[Figure: scatter plot of exam score (vertical axis, 35 to 95) against hours studied (horizontal axis, 0 to 35), with $\bar{y} = 76$ marked]

Exam Score vs. Hours Studied: SST, SSR, SSE

Coefficient Of Determination

- Measures the proportion of variation in variable y that is explained by variable x
- Indicates how well the sample regression line fits the sample data
- $\rho^2$ is estimated by $r^2$, where $0 \le r^2 \le 1$

$r^2 = \dfrac{\text{explained variation}}{\text{total variation}} = \dfrac{SSR}{SST} = \dfrac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2}$

Exam Score vs. Hrs Studied

Coefficient Of Correlation

- $\rho$ is estimated by $r$, where $-1 \le r \le +1$
- $r = (\text{sign of } b_1)\sqrt{r^2}$
- Interpretation: There is a (strength) (direct or inverse) correlation between (variable X) and (variable Y)

Value of r (in absolute value)    Strength of correlation
.9 to 1                           very high
.7 to .9                          high
.5 to .7                          moderate
.3 to .5                          weak
.0 to .3                          little if any

Exam Score vs. Hrs Studied
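The decomposition SST = SSR + SSE and the two coefficients defined above can be checked numerically. The sketch below again uses hypothetical `hours` and `scores` values (the slides' own data are not reproduced in the text), so the printed numbers illustrate the method only.

```python
import math

# Hypothetical data (NOT from the slides).
hours = [5, 10, 12, 15, 20, 22, 25, 30]
scores = [48, 55, 62, 60, 71, 70, 78, 88]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n

# Least squares slope and intercept.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
sxx = sum((x - x_bar) ** 2 for x in hours)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
predicted = [b0 + b1 * x for x in hours]

# Total, explained, and unexplained variation.
sst = sum((y - y_bar) ** 2 for y in scores)
ssr = sum((yh - y_bar) ** 2 for yh in predicted)
sse = sum((y - yh) ** 2 for y, yh in zip(scores, predicted))

# Coefficient of determination, and correlation carrying the sign of b1.
r_squared = ssr / sst
r = math.copysign(math.sqrt(r_squared), b1)

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
print(f"r^2 = {r_squared:.4f}, r = {r:.4f}")
```

Taking the sign from the slope is what makes r negative for an inverse relation even though the square root of r² is always non-negative.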

Coefficient Of Correlation

- When working with multiple variables, it is common to obtain the correlation between each pair of variables as a triangular correlation matrix
- Can investigate whether or not the potential independent variables are truly independent of one another

            Score     Hours     GPA
Hours       0.86
GPA         0.489     0.566
Absences   -0.343    -0.34      0.08

Limitations Of Regression Analysis

- Regression/Correlation cannot prove cause-and-effect relationships (Brightman article)
- Don't use the regression model to predict beyond the range of observed X-values
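A triangular correlation matrix like the one above can be built by computing the correlation for each pair of variables. In this sketch all four data columns are hypothetical, so the printed values will not match the matrix in the slides; `pearson_r` is a helper name chosen for illustration.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data (NOT from the slides), one list per variable.
data = {
    "Score": [48, 55, 62, 60, 71, 70, 78, 88],
    "Hours": [5, 10, 12, 15, 20, 22, 25, 30],
    "GPA": [2.1, 2.8, 2.5, 3.0, 3.2, 2.9, 3.6, 3.8],
    "Absences": [6, 5, 4, 5, 2, 3, 1, 0],
}

names = list(data)
# Header row: every variable except the last one.
print(" " * 9 + " ".join(f"{c:>7}" for c in names[:-1]))
# Lower triangle: each row is correlated with the variables listed before it.
for i, row in enumerate(names[1:], start=1):
    cells = [f"{pearson_r(data[row], data[col]):7.3f}" for col in names[:i]]
    print(f"{row:<9}" + " ".join(cells))
```

Large correlations between two candidate independent variables (for example Hours and GPA) would suggest they are not truly independent of one another.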

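Mean Square Error & Standard Error of Estimate

Mean Square Error
- Measures the amount of scatter around the regression line
- Serves as an estimate of $\sigma^2$
- $\text{M.S.E.} = \dfrac{SSE}{n - 2} = \dfrac{\sum (y - \hat{y})^2}{n - 2}$

Standard Error of Estimate
- Square root of MSE
- Serves as an estimate of $\sigma$
- Used for inference regarding the regression line: hypothesis tests and interval estimates
- $s_{est} = \sqrt{\dfrac{SSE}{n - 2}} = \sqrt{\dfrac{\sum (y - \hat{y})^2}{n - 2}}$

Exam Score vs. Hrs Studied

t-test for Significance of the Slope

- $b_1$ estimates $\beta_1$
- $H_0: \beta_1 = 0$ (there is no relation between the two variables)
- $H_A: \beta_1 \ne 0$ (there is a relation between the two variables)
- test statistic: $t = \dfrac{b_1}{s_{b_1}}$, whose sampling distribution follows $t_{n-2}$

Standard Error of the Slope
- measures ROSE when using $b_1$ to estimate $\beta_1$
- $s_{b_1} = \sqrt{\dfrac{\text{M.S.E.}}{\sum (x - \bar{x})^2}} = \dfrac{s_{est}}{\sqrt{\sum (x - \bar{x})^2}}$

Exam Score vs. Hrs Studied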
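The chain from SSE to the t statistic for the slope can be traced step by step. This is a sketch under stated assumptions: the data are hypothetical, and the critical value 2.447 is the standard two-tailed 5% table value for 6 degrees of freedom (n = 8), supplied by hand so that no statistics library is needed.

```python
import math

# Hypothetical data (NOT from the slides).
hours = [5, 10, 12, 15, 20, 22, 25, 30]
scores = [48, 55, 62, 60, 71, 70, 78, 88]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n
sxx = sum((x - x_bar) ** 2 for x in hours)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Scatter around the fitted line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(hours, scores))
mse = sse / (n - 2)              # estimate of sigma squared
s_est = math.sqrt(mse)           # standard error of estimate
s_b1 = math.sqrt(mse / sxx)      # standard error of the slope

# Test H0: beta1 = 0 against HA: beta1 != 0.
t_stat = b1 / s_b1
t_crit = 2.447  # assumed table value: two-tailed, alpha = .05, df = n - 2 = 6

print(f"MSE = {mse:.4f}, s_est = {s_est:.4f}, s_b1 = {s_b1:.4f}")
print(f"t = {t_stat:.3f}, reject H0: {abs(t_stat) > t_crit}")
```

A different sample size needs a different critical value, since the degrees of freedom are n - 2.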

Interval Estimation In Regression Analysis

- What score would you predict for students who study 30 hours?
- We've estimated that the mean score of all students studying 30 hours is 85. This is a point estimate based on a sample of n = 8.
- The estimate could be in error due to 2 sources:
  1) sampling error: since $b_0$ and $b_1$ are sample results, they may differ from $\beta_0$ and $\beta_1$, so we're not certain where the true population regression equation is
  2) stochastic relation: wherever the true population regression equation actually is, there is scatter around it due to the combined effect of other variables

Confidence Interval Estimate of the Mean Value of y

- Estimates the mean value of y at a given value of x
- Standard Error of the Conditional Mean (nib): accounts for sampling error in estimating $b_0$ and $b_1$, which would affect our predicted value $\hat{y}$

$s_{\hat{y}} = s_{est}\sqrt{\dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum (x - \bar{x})^2}}$

CI for the mean of y: $\hat{y} \pm t_{n-2}\, s_{\hat{y}}$

- Pg. 529: notice that the width of the confidence band increases as you predict further away from x-bar

Exam Score vs. Hours Studied
- 95% CI for the mean score of students who study 30 hours
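As a check on the point estimate of 85 quoted for 30 hours of study, substituting x = 30 into the simple regression equation given earlier (estimated Score = 40.0816 + 1.4966(Hours)) gives:

$$\hat{y} = 40.0816 + 1.4966(30) = 40.0816 + 44.898 = 84.9796 \approx 85$$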

Prediction Interval Estimate of an Individual Value of y

- Estimates an individual value of y at a given x
- Standard Error of the Forecast (nib): accounts for sampling error and the fact that there is dispersion around the regression line

$s_{ind} = s_{est}\sqrt{1 + \dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum (x - \bar{x})^2}}$

PI for $y_{ind}$: $\hat{y} \pm t_{n-2}\, s_{ind}$

- Pg. 531: notice that the PI bands are wider than the CI bands and that each is wider as you predict further away from x-bar

Exam Score vs. Hours Studied
- 95% PI for the individual score of a particular student who studies 30 hrs
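The confidence interval for the mean of y and the prediction interval for an individual y differ only in the extra "1 +" under the square root, which the following sketch makes explicit. The data are hypothetical, the function name `interval_estimates` is an assumption, and the critical value 2.447 (two-tailed 5%, 6 degrees of freedom) is taken from a standard t table rather than computed.

```python
import math


def interval_estimates(x, y, x0, t_crit):
    """Return (y_hat, ci, pi) at x0 for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_est = math.sqrt(sse / (n - 2))

    # Shared term: 1/n plus the squared distance of x0 from x-bar.
    term = 1 / n + (x0 - x_bar) ** 2 / sxx
    s_mean = s_est * math.sqrt(term)      # standard error of the conditional mean
    s_ind = s_est * math.sqrt(1 + term)   # standard error of the forecast

    y_hat = b0 + b1 * x0
    ci = (y_hat - t_crit * s_mean, y_hat + t_crit * s_mean)
    pi = (y_hat - t_crit * s_ind, y_hat + t_crit * s_ind)
    return y_hat, ci, pi


# Hypothetical data (NOT from the slides).
hours = [5, 10, 12, 15, 20, 22, 25, 30]
scores = [48, 55, 62, 60, 71, 70, 78, 88]
T_CRIT = 2.447  # assumed table value: two-tailed, alpha = .05, df = 6

y_hat, ci, pi = interval_estimates(hours, scores, 30, T_CRIT)
print(f"point estimate at 30 hours: {y_hat:.2f}")
print(f"95% CI for the mean score:  ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI for one student:     ({pi[0]:.2f}, {pi[1]:.2f})")
```

Calling the function at a value of x closer to the mean of `hours` yields narrower intervals, which is the widening of the bands away from x-bar noted above.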