Handout #8: Matrix Framework for Simple Linear Regression
- Gervase Palmer
Example 8.1: Consider again the Wendy's subset of the Nutrition dataset that was initially presented in Handout #7. Assume the following structure for the mean and variance functions.

o Mean function: E(SaturatedFat | Calories) = β0 + β1 * Calories
o Variance function: Var(SaturatedFat | Calories) = σ²

Simple Linear Regression Output

[Figures: scatterplot showing the conditional distribution of SaturatedFat given Calories; basic regression output; standard parameter-estimate output (with 95% confidence intervals); output for the 95% confidence interval and prediction interval.]
Matrix Representation of the Data

The data structure can easily be represented with vectors and matrices. For example, the response column of the data will be represented by a vector, say y, and the predictor variable will be represented by a second vector, say x1. A theoretical representation and a representation for the observed data are presented here for comparison purposes.

Theoretical Representation:

    y = [Y1, Y2, ..., Yn]'        x1 = [x1, x2, ..., xn]'

Representation for Observed Data: the same vectors, with the entries filled in by the observed SaturatedFat and Calories values from the Wendy's subset.
The Theoretical Framework is Easier with Matrix Representation

Theoretical representation using standard notation:

    Yi = β0 + β1*xi + εi,   with εi ~ N(0, σ²) independently, for i = 1, ..., n

Theoretical representation using matrix notation:

    Y = Xβ + ε,   with ε ~ N(0, σ²I)

where I is an n x n identity matrix, with n equal to the number of observations.

Some people emphasize the fact that all the variability in the response is represented in the error term and instead state the following results:

    Yi ~ N(β0 + β1*xi, σ²), for all i        (standard notation)
    Y ~ N(Xβ, σ²I)                           (matrix notation)

The quantity Var(ε) = σ²I has the following form when it is written out in its entirety:

    Var(ε) = [ σ²   0   ...  0
               0    σ²  ...  0
               ...  ... ...  ...
               0    0   ...  σ² ]
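The matrix form above can be sketched in a few lines of R. This is a minimal illustration using simulated data (the values of n, β, and σ below are hypothetical choices, not from the handout):

```r
# Sketch of Y = X beta + epsilon with Var(epsilon) = sigma^2 * I,
# using simulated data (n, beta, and sigma are hypothetical choices).
set.seed(1)
n     <- 10
x1    <- 1:10                   # hypothetical predictor values
X     <- cbind(1, x1)           # n x 2 design matrix: column of 1s, then x1
beta  <- c(2, 0.5)              # hypothetical true parameters (beta0, beta1)
sigma <- 1
eps <- rnorm(n, mean = 0, sd = sigma)  # iid errors, so Var(eps) = sigma^2 * I
Y   <- X %*% beta + eps         # the model written in matrix notation
dim(X)                          # 10 x 2
```

Because the errors are independent with common variance, the covariance matrix of eps is exactly the diagonal matrix written out above.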
Example 8.2: There are certainly situations in which this simple form may be inadequate. Consider the following data structure in which glucose levels were measured on subjects at four repeated time points, i.e. baseline, 30 minutes, 60 minutes, and 90 minutes. In this particular situation, the standard error assumptions are not appropriate.

[Figure: data structure and snippet of the estimated mean functions of interest.]

A better modeling approach for the error structure would be to allow the errors within a subject to be correlated with each other. For the four measurements on a single subject, this replaces σ²I with a matrix that keeps σ² on the diagonal but allows nonzero covariances off the diagonal:

    [ε1]        ( [0]   [σ²    σ12   σ13   σ14] )
    [ε2]  ~  N  ( [0] , [σ12   σ²    σ23   σ24] )
    [ε3]        ( [0]   [σ13   σ23   σ²    σ34] )
    [ε4]        ( [0]   [σ14   σ24   σ34   σ² ] )
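One common special case of a correlated within-subject error structure is compound symmetry, where every pair of time points shares the same correlation. A sketch (the variance and correlation values here are hypothetical, and compound symmetry is just one choice of structure):

```r
# Sketch of a 4 x 4 within-subject error covariance matrix under a
# compound-symmetry assumption: sigma^2 on the diagonal, sigma^2 * rho
# off the diagonal.  sigma2 and rho are hypothetical values.
sigma2 <- 4      # hypothetical error variance
rho    <- 0.6    # hypothetical within-subject correlation
V <- sigma2 * (rho * matrix(1, 4, 4) + (1 - rho) * diag(4))
V                # diagonal entries 4, off-diagonal entries 2.4
```

Under the handout's standard assumptions, rho would be 0 and V would reduce to sigma2 * diag(4).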
Working with Matrix Representation in R Studio

To read a dataset into R Studio, select Import Dataset in the Workspace box (upper right corner), then select From Text File. The most common format for text files that I use is comma delimited, which simply means that the values in the dataset are separated by commas. R Studio can automatically identify a comma-delimited file type and produces the following window when reading in this type of file.
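The same import can be done from the console with read.csv(), which is what the Import Dataset dialog calls behind the scenes. A runnable sketch (the file name "Nutrition.csv" is hypothetical, so the example below writes and re-reads a small comma-delimited file instead):

```r
# Console equivalent of the Import Dataset dialog (sketch).
# With the real file you would simply run:
#   Nutrition <- read.csv("Nutrition.csv")     # hypothetical path
# For a self-contained illustration, write a tiny comma-delimited file
# and read it back:
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(Calories = c(250, 580), SaturatedFat = c(3, 12)),
          tmp, row.names = FALSE)
d <- read.csv(tmp)    # parses the comma-delimited file into a data.frame
d
```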
Data Structure in R Studio

In R (and R Studio), data is stored in a data.frame structure. This is not necessarily equivalent to a matrix, but for our purposes a data.frame can be thought of as a matrix.

Getting the dimensions of our Nutrition data.frame, i.e. the number of observations and the number of variables:
> dim(Nutrition)
[1]

Getting the variable names of the Nutrition data.frame:
> names(Nutrition)
 [1] "RowID"        "Restaurant"   "Item"         "Type"         "Breakfast"    "ServingSize"  "Calories"
 [8] "TotalFat"     "SaturatedFat" "Cholesterol"  "Sodium"       "Fiber"        "Sugar"        "TotalCarbs"
[15] "Protein"

Getting the elements in the 1st row of the Nutrition data.frame:
> Nutrition[1,]

Getting the elements of the 2nd column, i.e. the restaurant for each observation:
> Nutrition[,2]
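The same indexing operations can be tried immediately on a built-in data.frame. A sketch using R's mtcars data as a stand-in for the Nutrition data:

```r
# The same data.frame operations, sketched on the built-in mtcars data.
dim(mtcars)      # number of observations and number of variables: 32 11
names(mtcars)    # the variable names
mtcars[1, ]      # elements of the 1st row (one full observation)
mtcars[, 2]      # elements of the 2nd column (cyl, for every observation)
```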
Simple Plotting and Model Fitting in R Studio

Note: Before you can use specific variable names from a dataset, you must attach the dataset. This essentially tells R which dataset you'd like to work with. You should detach() a dataset after you are done to prevent confusion. If you fail to attach() a dataset, you will get an "object not found" type of error.

Attach the Nutrition dataset so that R can identify the dataset that you intend to work with:
> attach(Nutrition)

Creating a simple plot in R:
> plot(Calories, SaturatedFat)

A simple linear regression model fit can be done using the lm() function:
> slr.fit = lm(SaturatedFat ~ Calories)

To see the initial output, simply type slr.fit. A more detailed summary can be obtained using the summary() function; for example, summary(slr.fit) will produce additional summaries for this model.
> slr.fit

Call:
lm(formula = SaturatedFat ~ Calories)

Coefficients:
(Intercept)     Calories

Adding the estimated model to the plot:
> abline(slr.fit)
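The same plot-and-fit workflow can be run on a built-in dataset. A sketch using cars (speed and stopping distance) as a stand-in for Calories and SaturatedFat; note that passing data= to lm() avoids the attach()/detach() bookkeeping entirely:

```r
# Plot, fit, and overlay the estimated line, sketched with the built-in
# cars data.  Using data= inside lm() avoids attach()/detach().
slr.fit <- lm(dist ~ speed, data = cars)
coef(slr.fit)              # (Intercept) and slope estimates
plot(cars$speed, cars$dist)
abline(slr.fit)            # add the estimated regression line to the plot
```

The data= argument is generally safer than attach() because there is no lingering attached environment to forget about.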
In R, there are often several characteristics of a fitted model that are retained but not necessarily easily identified or known. The names() function can be used to identify the names of these often hidden quantities. For example, slr.fit$residuals will produce a vector of all the residuals from the fit.
> names(slr.fit)

Using the residuals from the fit to easily obtain a plot of the estimated variance function:
> plot(Calories, abs(slr.fit$residuals))
> lines(lowess(Calories, abs(slr.fit$residuals)))

You can very easily get help on most functions in R through the use of the help() function. For example, if you'd like information regarding the use of the lowess() function, type
> help(lowess)
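A runnable sketch of the same variance-function diagnostic, again with the built-in cars data standing in for the Nutrition variables:

```r
# Absolute residuals plotted against the predictor, with a lowess smooth,
# as a rough picture of the variance function (cars data as a stand-in).
slr.fit <- lm(dist ~ speed, data = cars)
r <- abs(slr.fit$residuals)       # one absolute residual per observation
plot(cars$speed, r)
lines(lowess(cars$speed, r))      # smoothed trend in the spread
length(slr.fit$residuals)         # 50, matching the number of observations
```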
Example 8.3: Working again with the Wendys subset of the Nutrition dataset.

The first step is to obtain only the observations from Wendys. This can be done as follows:
> Wendys=Nutrition[Restaurant=="Wendys",]

To obtain only the variables needed, we will ask for only certain columns; these columns will also be reordered:
> Wendys=Nutrition[Restaurant=="Wendys",c(2,3,9,7)]
> Wendys

Next, we will construct the X matrix, i.e. the design matrix. Recall the matrix notation structure for our simple linear regression model, Y = Xβ + ε, with

    Y = [Y1]      X = [1  x1]      β = [β0]      ε = [ε1]
        [Y2]          [1  x2]          [β1]          [ε2]
        [...]         [......]                       [...]
        [Yn]          [1  xn]                        [εn]
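The same row-and-column subsetting can be sketched on mtcars (a stand-in for Nutrition): keep only the rows matching a condition, and ask for a reordered subset of columns.

```r
# Row subsetting by a condition plus column selection/reordering,
# sketched on mtcars: keep the 6-cylinder cars and columns cyl, mpg, wt.
six <- mtcars[mtcars$cyl == 6, c(2, 1, 6)]
dim(six)       # 7 rows, 3 columns
names(six)     # "cyl" "mpg" "wt" -- note the reordered columns
```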
Creating the X Matrix

Step #1: Creating the column of 1s.
> dim(Wendys)
[1] 28 4
> x0=rep(1,28)
> x0
 [1]

Step #2: Creating the 2nd column.
> x1=Wendys[,4]
> x1
 [1]
[23]

Step #3: Putting the columns together in a matrix.
> x=cbind(x0,x1)
> View(x)
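The same three steps, sketched with the built-in cars data standing in for the Wendys subset (n = 50 here instead of 28):

```r
# Building the n x 2 design matrix: a column of 1s bound to the predictor.
# cars data as a stand-in for the Wendys subset.
n  <- nrow(cars)        # 50 observations
x0 <- rep(1, n)         # Step #1: the column of 1s
x1 <- cars$speed        # Step #2: the predictor column
x  <- cbind(x0, x1)     # Step #3: bind the columns into a matrix
dim(x)                  # 50 2
```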
Creating the Y Vector
> y=Wendys[,3]
> View(y)

Obtaining the Estimated Parameters, i.e. the Vector β̂

We know from the JMP output that the estimated y-intercept is about -5.8, along with the corresponding slope estimate. Putting these quantities into vector format yields

    β̂ = [b0, b1]'

This vector can be obtained using the following matrix formula:

    β̂ = (X'X)⁻¹ X'y
Getting the First Quantity, i.e. (X'X)⁻¹, in R

First, get the transpose of the matrix X:
> xprime=t(x)
> View(xprime)

Next, multiply the transpose of X by X:
> xprimex=xprime %*% x
> View(xprimex)

Now, get the inverse of X'X:
> xprimex.inv=solve(xprimex, diag(2) )
> View(xprimex.inv)

Now, we can multiply the pieces together to get the estimated parameters, i.e. β̂ = (X'X)⁻¹X'y:
> beta.hat = xprimex.inv %*% xprime %*% y
> View(beta.hat)
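A self-contained check that the matrix formula reproduces lm() exactly, using the built-in cars data as a stand-in:

```r
# beta.hat = (X'X)^{-1} X'y computed by hand, then compared against lm().
# cars data as a stand-in for the Wendys subset.
x <- cbind(1, cars$speed)                   # design matrix
y <- cars$dist                              # response vector
xprimex.inv <- solve(t(x) %*% x)            # (X'X)^{-1}
beta.hat <- xprimex.inv %*% t(x) %*% y      # (X'X)^{-1} X'y
fit <- lm(dist ~ speed, data = cars)
cbind(matrix = as.vector(beta.hat), lm = unname(coef(fit)))  # two identical columns
```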
Predicted Values and Residuals

Predicted values:
> y.pred = x %*% beta.hat
> View(y.pred)

Residuals:
> resid = y - y.pred
> View(resid)
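These two lines can also be verified against lm()'s own fitted values and residuals. A sketch with the cars data as a stand-in:

```r
# Predicted values X beta.hat and residuals y - X beta.hat, checked
# against lm() (cars data as a stand-in).
x <- cbind(1, cars$speed)
y <- cars$dist
beta.hat <- solve(t(x) %*% x) %*% t(x) %*% y
y.pred <- x %*% beta.hat       # predicted values
resid  <- y - y.pred           # residuals
fit <- lm(dist ~ speed, data = cars)
max(abs(as.vector(resid) - unname(residuals(fit))))   # essentially 0
```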
Some Other Commonly Used Quantities

Summary of Fit and ANOVA table from JMP.

Getting the Sum of Squares for C. Total, i.e. the total unexplained variation in the marginal distribution (here n = 28, so n - 1 = 27):
> C.Total = 27 * var(Wendys$SaturatedFat)
> C.Total
[1]

Getting the total unexplained variation in the conditional distribution can be done quite easily using the residual vector:
> Sum.Squared.Error = t(resid) %*% resid
> Sum.Squared.Error
     [,1]
[1,]

Dividing the quantity above by n - 2 = 26 yields our variance estimate under a constant variance assumption. That is, the Mean Squared Error is given by
> Mean.Squared.Error = Sum.Squared.Error / 26
> Mean.Squared.Error
     [,1]
[1,] 6.57

Taking the square root yields the estimated standard deviation, i.e. σ̂ = sqrt(MSE):
> sqrt(Mean.Squared.Error)
     [,1]
[1,]
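The same quantities, sketched with the cars data (n = 50, so n - 1 = 49 and n - 2 = 48 take the place of the handout's 27 and 26), checked against summary()'s residual standard error:

```r
# C.Total, SSE, MSE, and sigma-hat from the matrix quantities,
# with cars data as a stand-in (n = 50).
fit   <- lm(dist ~ speed, data = cars)
resid <- residuals(fit)
C.Total <- 49 * var(cars$dist)     # total SS in the marginal distribution
SSE <- sum(resid^2)                # t(resid) %*% resid as a scalar
MSE <- SSE / 48                    # variance estimate, constant-variance case
sqrt(MSE)                          # matches summary(fit)$sigma
```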
R² and a Visualization of R²

Getting the R² value via the reduction in unexplained variation:
> RSquared = (C.Total - Sum.Squared.Error)/C.Total
> RSquared
     [,1]
[1,]

Visualization of R²

There is a visual interpretation of R², which we have not yet discussed in this class. This visualization is given by plotting the y values against the predicted values. If the model provides a good fit, then the points on this plot should follow the y = x line, which has been included on the plot below.
> plot(y,y.pred,xlim=c(0,30),ylim=c(0,30))
> abline(0,1)

Questions
1. What would this plot look like if the R² value were very close to 1?
2. Consider the 1st observation in our dataset -- Daves Hot N Juicy ¼ lb Single. This item has a SaturatedFat value of 14, and its predicted SaturatedFat was determined from the regression line.
   a. Find this point on the graph above.
   b. Identify the residual for this point on the graph.

The R² quantity calculated above can also be computed by squaring the correlation measurement from the plot above. Traditionally, ρ (the Greek r) is used to identify a correlation; thus, I'd guess that this is where the R² notation was derived from.
> cor(y,y.pred)^2
     [,1]
[1,]
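Both routes to R² can be confirmed side by side. A sketch with the cars data as a stand-in, compared against the R² reported by summary():

```r
# R^2 computed two ways -- reduction in unexplained variation, and the
# squared correlation between y and y.pred -- checked against summary().
# cars data as a stand-in.
fit    <- lm(dist ~ speed, data = cars)
y      <- cars$dist
y.pred <- fitted(fit)
SST <- sum((y - mean(y))^2)          # C.Total
SSE <- sum((y - y.pred)^2)           # Sum.Squared.Error
RSquared <- (SST - SSE) / SST
c(RSquared, cor(y, y.pred)^2, summary(fit)$r.squared)   # all three agree
```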
Obtaining the Standard Errors for the Estimated Parameters

The standard error quantities for the y-intercept and slope were discussed in a previous handout; however, the formulation of such quantities was not given. Standard error values for the y-intercept and slope are provided in standard regression output.

The standard error of the slope quantifies the degree to which the estimated slope of the regression line will vary over repeated samples. From the above plot, we can see that variation in the estimated slope certainly affects variation in the estimated y-intercept. That is, these two quantities are said to co-vary, i.e. a covariation exists between them.

The variation in the estimated parameter vector is given by the following variance/covariance matrix:

    Var(β̂) = Var([b0, b1]') = [ Var(b0)       Cov(b0, b1)
                                Cov(b0, b1)   Var(b1)    ]

The estimated variance/covariance matrix is given by the following quantity:

    Var̂(β̂) = MSE * (X'X)⁻¹
Getting the Variance/Covariance Matrix of the Estimated Parameter Vector

In R, this is the Mean Squared Error times the (X'X)⁻¹ matrix computed earlier:

    Var̂(β̂) = MSE * (X'X)⁻¹

Thus, the standard error, i.e. standard deviation, of the estimated y-intercept is given by the square root of the [1,1] element of this matrix, and the standard error of the estimated slope is given by the square root of the [2,2] element.

Comment: The co-variation that exists between the model parameters is ignored when the 95% confidence intervals are individually considered. A 95% joint confidence region does not ignore such covariation. This confidence region is constructed using a multivariate normal distribution. (Take STAT 415: Multivariate Statistics for all the details!)

[Figures: individual 95% confidence intervals for the model parameters; 95% joint confidence region for the model parameters.]
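The formula MSE * (X'X)⁻¹ can be checked directly against R's built-in vcov() extractor. A sketch with the cars data as a stand-in:

```r
# Estimated variance/covariance matrix MSE * (X'X)^{-1}, checked
# against vcov() (cars data as a stand-in).
x   <- cbind(1, cars$speed)
fit <- lm(dist ~ speed, data = cars)
MSE <- sum(residuals(fit)^2) / (nrow(cars) - 2)
V <- MSE * solve(t(x) %*% x)       # 2 x 2 variance/covariance matrix
sqrt(diag(V))                      # SEs of the intercept and the slope
max(abs(V - unname(vcov(fit))))    # essentially 0
```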
Predictions and Standard Errors for Predictions

Goal: Obtain a prediction (and its associated standard errors for CIs and PIs) for the expected SaturatedFat level of a Wendy's menu item with 900 Calories.

[Figure: output from JMP regarding the prediction and estimated standard errors.]

Creating a row vector that contains the information for our new observation:
> xnew=cbind(1,900)

Note: Column binding is needed here because [1] and [900] are being put together side by side to form a row vector. Thus, cbind(1,900) creates the row vector needed to make a prediction for a food item with 900 Calories.

To obtain the predicted SaturatedFat, simply multiply this row vector by β̂:
> y.pred.900 = xnew %*% beta.hat
     [,1]
[1,]
Multiplication Properties for Variances

Variance of a constant, say c, times a random quantity:

    Var(c * b1) = c² * Var(b1)

Variance of a row vector, say r, times β̂. This is commonly referred to as a linear combination of the estimated parameter vector:

    Var(r β̂) = r Var(β̂) r'

Getting the variance for the linear combination of interest when making a prediction for a food item with 900 Calories:

    Var(xnew β̂) = xnew Var̂(β̂) xnew',   with xnew = [1  900]

Taking the square root of this quantity yields the prediction standard error provided by JMP.
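This r V r' computation can be verified against predict() with se.fit = TRUE. A sketch with the cars data, predicting at speed = 21 (a hypothetical new value standing in for Calories = 900):

```r
# Var(xnew beta.hat) = xnew V xnew' for a new row vector, checked
# against predict(..., se.fit = TRUE).  cars data; speed = 21 is a
# hypothetical new value standing in for Calories = 900.
x   <- cbind(1, cars$speed)
fit <- lm(dist ~ speed, data = cars)
MSE <- sum(residuals(fit)^2) / (nrow(cars) - 2)
V <- MSE * solve(t(x) %*% x)               # estimated Var(beta.hat)
xnew <- cbind(1, 21)                       # row vector for the new observation
var.mean <- xnew %*% V %*% t(xnew)         # variance of the mean prediction
se.mean  <- sqrt(as.numeric(var.mean))
p <- predict(fit, newdata = data.frame(speed = 21), se.fit = TRUE)
c(se.mean, p$se.fit)                       # the two standard errors agree
```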
The standard error for an individual prediction (versus the average predicted value) requires the addition of the variability present in the conditional distribution. That is, the variation for an individual prediction involves the variation in estimating the regression line plus the variation in the conditional distribution:

    Var̂(individual prediction) = xnew Var̂(β̂) xnew' + MSE

A visualization of the 95% prediction interval and its corresponding standard error is given below.

[Figure: 95% prediction interval for Calories = 900.]

Prediction intervals certainly vary over repeated samples. The standard error for an individual prediction measures such variation.
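The add-MSE formula can be checked against the width of R's own prediction interval. A sketch with the cars data, again predicting at the hypothetical value speed = 21:

```r
# SE for an individual prediction: sqrt(se.fit^2 + MSE), checked against
# the half-width of predict()'s 95% prediction interval (df = n - 2 = 48).
# cars data; speed = 21 is a hypothetical new value.
fit <- lm(dist ~ speed, data = cars)
MSE <- sum(residuals(fit)^2) / (nrow(cars) - 2)
p <- predict(fit, newdata = data.frame(speed = 21), se.fit = TRUE)
se.indiv <- sqrt(p$se.fit^2 + MSE)     # estimation + conditional variation
pi <- predict(fit, newdata = data.frame(speed = 21), interval = "prediction")
half.width <- (pi[, "upr"] - pi[, "lwr"]) / 2
c(half.width / qt(0.975, df = 48), se.indiv)   # the two quantities agree
```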
More informationApplied Regression Analysis
Applied Regression Analysis Lecture 2 January 27, 2005 Lecture #2-1/27/2005 Slide 1 of 46 Today s Lecture Simple linear regression. Partitioning the sum of squares. Tests of significance.. Regression diagnostics
More informationInference with Heteroskedasticity
Inference with Heteroskedasticity Note on required packages: The following code requires the packages sandwich and lmtest to estimate regression error variance that may change with the explanatory variables.
More informationSTATS DOESN T SUCK! ~ CHAPTER 16
SIMPLE LINEAR REGRESSION: STATS DOESN T SUCK! ~ CHAPTER 6 The HR manager at ACME food services wants to examine the relationship between a workers income and their years of experience on the job. He randomly
More informationIntro to Linear Regression
Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor
More informationBivariate data analysis
Bivariate data analysis Categorical data - creating data set Upload the following data set to R Commander sex female male male male male female female male female female eye black black blue green green
More information1 A Review of Correlation and Regression
1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then
More informationInference for the Regression Coefficient
Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates
More informationExamination paper for TMA4255 Applied statistics
Department of Mathematical Sciences Examination paper for TMA4255 Applied statistics Academic contact during examination: Anna Marie Holand Phone: 951 38 038 Examination date: 16 May 2015 Examination time
More informationChapter 7 Linear Regression
Chapter 7 Linear Regression 1 7.1 Least Squares: The Line of Best Fit 2 The Linear Model Fat and Protein at Burger King The correlation is 0.76. This indicates a strong linear fit, but what line? The line
More informationSTAT 572 Assignment 5 - Answers Due: March 2, 2007
1. The file glue.txt contains a data set with the results of an experiment on the dry sheer strength (in pounds per square inch) of birch plywood, bonded with 5 different resin glues A, B, C, D, and E.
More information15.063: Communicating with Data
15.063: Communicating with Data Summer 2003 Recitation 6 Linear Regression Today s Content Linear Regression Multiple Regression Some Problems 15.063 - Summer '03 2 Linear Regression Why? What is it? Pros?
More informationInteractions between Binary & Quantitative Predictors
Interactions between Binary & Quantitative Predictors The purpose of the study was to examine the possible joint effects of the difficulty of the practice task and the amount of practice, upon the performance
More information1 The Classic Bivariate Least Squares Model
Review of Bivariate Linear Regression Contents 1 The Classic Bivariate Least Squares Model 1 1.1 The Setup............................... 1 1.2 An Example Predicting Kids IQ................. 1 2 Evaluating
More informationGeneral Linear Model (Chapter 4)
General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients
More informationRegression used to predict or estimate the value of one variable corresponding to a given value of another variable.
CHAPTER 9 Simple Linear Regression and Correlation Regression used to predict or estimate the value of one variable corresponding to a given value of another variable. X = independent variable. Y = dependent
More information1 Introduction to Minitab
1 Introduction to Minitab Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at my.calpoly.edu. When you
More information1 The basics of panel data
Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Related materials: Steven Buck Notes to accompany fixed effects material 4-16-14 ˆ Wooldridge 5e, Ch. 1.3: The Structure of Economic Data ˆ Wooldridge
More informationReview of the General Linear Model
Review of the General Linear Model EPSY 905: Multivariate Analysis Online Lecture #2 Learning Objectives Types of distributions: Ø Conditional distributions The General Linear Model Ø Regression Ø Analysis
More informationOne-Way Repeated Measures Contrasts
Chapter 44 One-Way Repeated easures Contrasts Introduction This module calculates the power of a test of a contrast among the means in a one-way repeated measures design using either the multivariate test
More informationMultilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2
Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today s Lecture Linear models from a matrix perspective An example of how to do
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 12: Frequentist properties of estimators (v4) Ramesh Johari ramesh.johari@stanford.edu 1 / 39 Frequentist inference 2 / 39 Thinking like a frequentist Suppose that for some
More informationChapter 9. Correlation and Regression
Chapter 9 Correlation and Regression Lesson 9-1/9-2, Part 1 Correlation Registered Florida Pleasure Crafts and Watercraft Related Manatee Deaths 100 80 60 40 20 0 1991 1993 1995 1997 1999 Year Boats in
More informationSection 5.3: Linear Inequalities
336 Section 5.3: Linear Inequalities In the first section, we looked at a company that produces a basic and premium version of its product, and we determined how many of each item they should produce fully
More informationMultiple Regression Introduction to Statistics Using R (Psychology 9041B)
Multiple Regression Introduction to Statistics Using R (Psychology 9041B) Paul Gribble Winter, 2016 1 Correlation, Regression & Multiple Regression 1.1 Bivariate correlation The Pearson product-moment
More information